Overview
This prompt casts the model as a senior Python developer and guides it through creating a web scraping script for a dynamic website. Developers and data analysts can use the structured approach and best practices outlined here for efficient data extraction.
Prompt Overview
Purpose: This script aims to extract structured data from a dynamic website and export it to a CSV file.
Audience: The intended audience includes developers and data analysts interested in web scraping and data extraction techniques.
Distinctive Feature: It utilizes Selenium for dynamic content rendering, ensuring complete data coverage from the target site.
Outcome: The final output will be a well-organized CSV file containing agent data, ready for analysis.
Quick Specs
- Media: Text
- Use case: Generation
- Industry: Data & Analysis, Productivity & Workflow, Robotics & Automation
- Techniques: Decomposition, Role/Persona Prompting, Structured Output
- Models: Claude 3.5 Sonnet, Gemini 2.0 Flash, GPT-4o, Llama 3.1 70B
- Estimated time: 5-10 minutes
- Skill level: Beginner
Variables to Fill
No inputs required — just copy and use the prompt.
Example Variables Block
No example values needed for this prompt.
The Prompt
You are a senior Python developer and expert in web scraping, tasked with creating a professional, production-ready Python script to extract all relevant, structured data from the dynamic website:
**https://agents.sabrina.dev/**
and export it into a well-organized CSV file.
Your solution must:
1. Data Extraction Requirements:
– Identify and extract all structured data elements visible on the site, including:
– Agent names
– IDs
– Performance metrics
– Any associated metadata
– Use Selenium or other browser automation tools to fully render JavaScript content due to the site’s dynamic loading.
– Handle pagination, infinite scroll, or any dynamic loading mechanisms to ensure complete data coverage.
– Respect the robots.txt and the website’s terms of service; the robots.txt currently allows scraping.
– Implement robust error handling for:
– Network issues
– Element detection failures
– Potential rate limiting
2. Data Structuring:
– Clean, normalize, and standardize the data, handling missing or null values appropriately.
– Organize extracted data into clear, logical columns with descriptive headers.
– Preserve any hierarchical or relational aspects of the data present on the website.
3. Technical Implementation:
– Use Python with libraries such as:
– Selenium for scraping dynamic content
– BeautifulSoup for HTML parsing
– Requests (if needed)
– Pandas for data organization
– Implement session management if login/authentication is required (currently not indicated).
– Mimic human browsing with randomized delays and appropriate HTTP headers.
– Write clear, modular, and well-commented code, maintaining professional coding standards.
4. Output Requirements:
– Export the data as a UTF-8 encoded CSV file suitable for analysis.
– Include a timestamp or version tag in the output filename to track scrape versions.
5. Documentation & Maintainability:
– Provide concise inline comments explaining key code sections.
– Include instructions or notes on configuring or customizing the script for future modifications.
# Steps
– Inspect the target website’s DOM structure after full page load using Selenium.
– Identify containers or cards representing each agent’s data.
– Loop through all dynamically loaded content/pages to extract data.
– Normalize and validate extracted fields.
– Store results in a pandas DataFrame.
– Export the DataFrame to CSV with a timestamp.
– Add logging and try-except blocks to handle exceptions and retries.
# Output Format
– A fully functional Python script file (.py) implementing the above requirements.
– The script outputs a UTF-8 encoded CSV file with a timestamped filename.
– The script includes clear comments and instructions.
# Notes
– No login/authentication appears required, but prepare code structure for easy extension.
– Because static content analysis shows no direct tables or JSON-LD, rely on Selenium to capture dynamic content.
– Respect robots.txt, which permits scraping.
– Use human-like browsing patterns to avoid rate-limiting or blocking.
Your final response should be the complete, best-practice Python script and associated documentation to enable robust, maintainable, and comprehensive scraping of **https://agents.sabrina.dev/**.
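As one illustration of the pagination requirement above, the "load until nothing new appears" loop can be expressed generically. The `load_more` and `get_items` callables below are assumptions: with Selenium, `load_more` would scroll the page or click a "next" control, and `get_items` would re-query the agent cards.

```python
def collect_until_exhausted(load_more, get_items, max_rounds=50):
    """Generic pagination/infinite-scroll loop: keep triggering
    `load_more()` until `get_items()` stops growing.

    `max_rounds` caps the loop so a site that keeps appending
    content cannot trap the scraper forever.
    """
    seen = 0
    for _ in range(max_rounds):
        items = get_items()
        if len(items) == seen:
            break  # nothing new appeared: all content is loaded
        seen = len(items)
        load_more()
    return get_items()
```

With Selenium, `load_more` might be `lambda: driver.execute_script("window.scrollTo(0, document.body.scrollHeight)")` followed by a short wait; keeping the loop separate from the browser calls makes it easy to unit-test without a browser.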
How to Use This Prompt
- Copy the prompt provided above.
- Paste it into your preferred coding environment.
- Modify any specific parameters as needed for your project.
- Run the script to extract data from the website.
- Check the output CSV file for the extracted data.
- Review the comments for understanding and future modifications.
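When adapting the script, the cleaning step is usually where site-specific tweaks land. A minimal normalization helper might look like the sketch below; the field names `name`, `id`, and `score` are illustrative assumptions, not confirmed fields on the target site.

```python
def normalize_record(raw: dict) -> dict:
    """Clean one scraped record: trim whitespace, fill missing
    values with None, and coerce the numeric field where possible.
    Field names here are illustrative assumptions."""
    fields = ("name", "id", "score")
    record = {}
    for field in fields:
        value = raw.get(field)
        if isinstance(value, str):
            value = value.strip() or None  # treat empty strings as missing
        record[field] = value
    # Coerce the assumed numeric field, leaving None if unparsable.
    if record["score"] is not None:
        try:
            record["score"] = float(record["score"])
        except (TypeError, ValueError):
            record["score"] = None
    return record
```

For example, `normalize_record({"name": " Ada ", "score": "97.5"})` yields `{"name": "Ada", "id": None, "score": 97.5}`.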
Tips for Best Results
- Use Selenium for Dynamic Content: Leverage Selenium to fully render JavaScript content, ensuring all data elements are captured.
- Implement Robust Error Handling: Include try-except blocks to manage network issues and element detection failures effectively.
- Normalize and Clean Data: Standardize extracted data, addressing any missing or null values for better analysis.
- Export with Timestamps: Save the output as a UTF-8 encoded CSV file, including a timestamp in the filename for version tracking.
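The timestamped-export tip can be sketched with the standard library alone (pandas' `to_csv` would work just as well); the `agents` filename prefix and column layout are assumptions:

```python
import csv
from datetime import datetime

def export_to_csv(rows, prefix="agents"):
    """Write rows (a list of dicts) to a UTF-8 CSV whose filename
    carries a timestamp, so successive scrapes never overwrite
    each other. Returns the filename that was written."""
    timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    filename = f"{prefix}_{timestamp}.csv"
    fieldnames = list(rows[0].keys()) if rows else []
    with open(filename, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return filename
```

Note `newline=""`, which the `csv` module requires on all platforms to avoid blank lines between rows.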
FAQ
- What libraries are essential for web scraping in Python?
Key libraries include Selenium for dynamic content, BeautifulSoup for parsing, and Pandas for data organization.
- How do you handle pagination in web scraping?
Implement loops to navigate through pages or detect infinite-scroll events to load all content.
- What is the purpose of using a timestamp in CSV filenames?
A timestamp helps track different versions of scraped data, ensuring data integrity and organization.
- Why is error handling important in web scraping?
Error handling ensures the script can recover from network issues or element detection failures, improving reliability.
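The error-handling pattern the FAQ describes is, at its core, a retry loop with exponential backoff plus a little random jitter, which also keeps request timing from looking machine-regular. A minimal, library-agnostic sketch:

```python
import random
import time

def fetch_with_retries(fetch, retries=3, base_delay=1.0):
    """Call `fetch()` until it succeeds, sleeping between attempts.
    The delay doubles each round and gets random jitter so repeated
    requests do not land in a perfectly regular rhythm."""
    for attempt in range(retries):
        try:
            return fetch()
        except Exception:
            if attempt == retries - 1:
                raise  # out of attempts: surface the error to the caller
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.5)
            time.sleep(delay)
```

Here `fetch` stands in for whatever loads a page (a Selenium `driver.get` call, a `requests.get`, and so on); wrapping it this way keeps retry policy separate from scraping logic.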
Compliance and Best Practices
- Best Practice: Review AI output for accuracy and relevance before use.
- Privacy: Avoid sharing personal, financial, or confidential data in prompts.
- Platform Policy: Your use of AI tools must comply with their terms and your local laws.
Revision History
- Version 1.0 (February 2026): Initial release.


