GPT-4o / Claude 3.5 Sonnet

Robust Python Automation Web Scraper

By SR Prompts Board
June 5, 2026

Terminal Blueprint & Commands

Act as a Senior Python Automation Developer.

Task: Write a production-ready web scraping script using Python.

Technical Requirements:
1. Use `asyncio` and `httpx` for efficient, non-blocking asynchronous requests.
2. Implement robust error handling (try-except blocks for connection errors, timeouts, and HTTP status codes).
3. Include a user-agent rotation mechanism and custom request headers.
4. Parse the page contents using `BeautifulSoup4` with exact CSS selectors.
5. Save the parsed data to a local JSON or CSV file automatically.
6. Enforce a rate-limiting delay between requests (e.g. 1.5 seconds) and check the site's robots.txt rules before fetching.
7. Return ONLY the complete Python script inside a single markdown code block. Do not add conversational text.

Target URL and Scraping Objective: [insert target website details and data points to extract]

How to use with ChatGPT

Step 1: Setup Workspace

Navigate to chatgpt.com. Select GPT-4o from the model dropdown menu in the upper-left corner.

Step 2: Paste & Formulate

Paste the copied prompt from above directly into the chat input bar. Replace the bracketed placeholder (`[insert target website details and data points to extract]`) with your own target URL and the data points you want.

Step 3: Refine Results

If the output needs adjustments, follow up with feedback like: "Add retry logic with exponential backoff" or "Write the results to CSV instead of JSON".

Pro Tip for ChatGPT: ChatGPT tends to respond well to role constraints. If you customize the prompt, you can add a style instruction such as 'Adopt the style of Steve Jobs' at the beginning to steer presentation style. For code output, the existing role line ('Act as a Senior Python Automation Developer') is usually the more relevant constraint.

Editorial Insight

This automation template guides the AI to build asynchronous, rate-limited Python web scrapers that run efficiently and are less likely to trigger anti-bot defenses.
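
Rate limiting alone does not tell a scraper which paths it may fetch, so the prompt's robots.txt requirement needs its own check. The following is a minimal sketch using the standard library's `urllib.robotparser`. The sample rules, the `example.com` URLs, and the helper names `build_robot_rules`, `is_allowed`, and `delay_for` are illustrative assumptions, not part of the original template.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real script would download this
# from the target site's /robots.txt before crawling.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
Crawl-delay: 2
"""


def build_robot_rules(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a rule set without any network access."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_allowed(rules: RobotFileParser, url: str, user_agent: str = "*") -> bool:
    """Return True if the rules permit this user agent to fetch the URL."""
    return rules.can_fetch(user_agent, url)


def delay_for(rules: RobotFileParser, user_agent: str = "*", default: float = 1.5) -> float:
    """Use the site's Crawl-delay if it declares one, else the default delay."""
    delay = rules.crawl_delay(user_agent)
    return float(delay) if delay is not None else default


rules = build_robot_rules(SAMPLE_ROBOTS)
```

A generated scraper could call `is_allowed` before each request and pass `delay_for(rules)` to its rate limiter in place of a hard-coded 1.5 seconds.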

Creates robust, asynchronous web scraping scripts in Python using BeautifulSoup, httpx, and asyncio, including error handling, rate limiting, and output to JSON/CSV.
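
As a rough illustration of the JSON/CSV output step, the sketch below serializes a list of parsed records with the standard library only. The record fields (`name`, `price`) and the function names are hypothetical; a script produced by the prompt may structure this differently.

```python
import csv
import io
import json
from pathlib import Path


def to_json_text(records: list[dict]) -> str:
    """Serialize records as pretty-printed JSON."""
    return json.dumps(records, indent=2, ensure_ascii=False)


def to_csv_text(records: list[dict]) -> str:
    """Serialize records as CSV, using the first record's keys as the header."""
    if not records:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(records[0].keys()))
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


def save_text(path: str, text: str) -> None:
    """Write text to a local file; newline='' keeps the CSV line endings intact."""
    with Path(path).open("w", encoding="utf-8", newline="") as handle:
        handle.write(text)


# Hypothetical parsed records, standing in for scraped data.
records = [
    {"name": "Widget", "price": 9.99},
    {"name": "Gadget", "price": 19.5},
]
```

Calling `save_text("products.csv", to_csv_text(records))` or `save_text("products.json", to_json_text(records))` would write the files; the file names here are placeholders.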

Calibrated Model Settings

Recommended Engine: GPT-4o / Claude 3.5 Sonnet
Temperature Values: 0.2 (deterministic coding) to 0.7 (creative scripting)
Token Length Limits: 2048 to 4096 tokens
Formatting Directives: Clean, markdown-formatted code blocks

Expected Result Output

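    # Python Asynchronous Scraper Blueprint
    import asyncio
    import json

    import httpx
    from bs4 import BeautifulSoup


    async def fetch_page(client, url):
        """Fetch one page and return its HTML, or None on failure."""
        headers = {"User-Agent": "Mozilla/5.0"}
        try:
            response = await client.get(url, headers=headers, timeout=10.0)
            response.raise_for_status()
            return response.text
        except httpx.HTTPStatusError as e:
            # Raised by raise_for_status() for 4xx and 5xx responses
            print(f"HTTP status error: {e}")
            return None
        except httpx.RequestError as e:
            # Covers timeouts and connection failures
            print(f"Request failed: {e}")
            return None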
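
The blueprint above fetches a single page with a fixed user agent. One way the prompt's rate-limiting and user-agent rotation requirements might be layered on top is sketched below. To keep it runnable without network access, the fetch step is passed in as any async callable; `crawl_politely`, the `USER_AGENTS` strings, and the `fetch(url, headers)` signature are assumptions made for illustration.

```python
import asyncio
import itertools

# Hypothetical user-agent pool; real scripts would use full browser strings.
USER_AGENTS = [
    "Mozilla/5.0 (example-agent-1)",
    "Mozilla/5.0 (example-agent-2)",
    "Mozilla/5.0 (example-agent-3)",
]


async def crawl_politely(urls, fetch, delay_seconds=1.5):
    """Fetch URLs one at a time, pausing between requests and rotating agents.

    `fetch` is any async callable taking (url, headers) and returning a result.
    """
    agents = itertools.cycle(USER_AGENTS)
    results = {}
    for index, url in enumerate(urls):
        if index > 0:
            # Wait before every request except the first
            await asyncio.sleep(delay_seconds)
        headers = {"User-Agent": next(agents)}
        results[url] = await fetch(url, headers)
    return results
```

With httpx, `fetch` could be a small wrapper around `client.get(url, headers=headers)`. Fetching sequentially trades throughput for predictable pacing, which is usually the safer default for a single target site.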

Workflow Use Cases

Market Research: Automatically pull product details, prices, and review scores.
Data Pipelines: Feed datasets into analytical dashboards or databases daily.
Site Monitoring: Check for inventory changes or news alerts on custom schedules.

💡 Creator Tips & Variations

  • Headless Browsing: If the target site renders its content with a JavaScript framework (React/Next.js), have the script load pages with Playwright or Selenium instead of httpx; the rendered HTML can still be parsed with bs4.
  • Proxy Rotation: Supply a pool of proxies and instruct the script to rotate through them when it constructs its HTTP clients.
  • Database Storage: Instruct the script to write directly to PostgreSQL or SQLite instead of local CSV files.
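
For the database storage variation, a minimal sketch with the standard library's `sqlite3` might look like the following. The `products` table, its columns, and the `save_to_sqlite` name are hypothetical; PostgreSQL would need a separate driver and different connection code.

```python
import sqlite3


def save_to_sqlite(records, db_path="scraped.db"):
    """Upsert records into a SQLite table and return the resulting row count.

    Each record is a dict with 'url', 'name', and 'price' keys (assumed schema).
    """
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "CREATE TABLE IF NOT EXISTS products ("
                "url TEXT PRIMARY KEY, name TEXT, price REAL)"
            )
            # INSERT OR REPLACE keeps one row per URL across repeated runs
            conn.executemany(
                "INSERT OR REPLACE INTO products (url, name, price) "
                "VALUES (:url, :name, :price)",
                records,
            )
        return conn.execute("SELECT COUNT(*) FROM products").fetchone()[0]
    finally:
        conn.close()
```

Keying the table on `url` means a daily re-scrape updates existing rows instead of appending duplicates.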

⚠️ Pitfalls & Mistakes to Avoid

  • Leaving brackets empty without filling in your custom variables (e.g., submitting '[insert target website details and data points to extract]' directly to the model).
  • Not enforcing formatting limits, allowing the AI to generate long conversational intros and outros instead of clean, copyable code.
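
The first pitfall can be caught mechanically before a prompt is sent. The sketch below assumes placeholders follow the `[insert ...]` pattern used in this template; the `find_unfilled_placeholders` helper is a hypothetical name.

```python
import re

# Matches bracketed placeholders that begin with "insert", e.g. "[insert topic]"
PLACEHOLDER = re.compile(r"\[insert\b[^\]]*\]", re.IGNORECASE)


def find_unfilled_placeholders(prompt: str) -> list[str]:
    """Return every '[insert ...]' placeholder still present in the prompt."""
    return PLACEHOLDER.findall(prompt)
```

An empty list means the prompt is ready to submit; anything else lists the brackets that still need real values.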

FAQ & Help

Q. Which AI model works best with this text template?

This prompt is optimized for advanced reasoning models like GPT-4o, Claude 3.5 Sonnet, and Gemini 1.5 Pro. It also works with smaller local models like Llama 3 but requires clear constraint enforcement.

Q. How do I prevent the model from generating conversational text?

Add a system rule: 'Do not write any introductory or concluding text. Return ONLY the raw formatted output.'
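
If a model still wraps the script in conversational text despite that rule, the code can be recovered after the fact. This is a small sketch that pulls the first markdown code block out of a reply; `extract_code` is a hypothetical helper, and it assumes the reply contains at most one block that matters.

```python
import re

FENCE = "`" * 3  # the three-backtick markdown fence marker

# Capture everything between an opening fence (with optional language tag)
# and the next closing fence.
CODE_BLOCK = re.compile(FENCE + r"[A-Za-z0-9_+-]*\n(.*?)" + FENCE, re.DOTALL)


def extract_code(reply: str) -> str:
    """Return the first fenced code block, or the whole reply if none is found."""
    match = CODE_BLOCK.search(reply)
    return match.group(1).strip() if match else reply.strip()
```

Falling back to the whole reply covers the case where the model returned raw code without a fence.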
