Overview
This prompt casts the model as a senior Python developer and guides it through creating a web scraping script for a dynamic website. Developers and data analysts can use the structured approach and best practices outlined here for efficient data extraction.
Prompt Overview
Purpose: This script aims to extract structured data from a dynamic website and export it to a CSV file.
Audience: The intended audience includes developers and data analysts interested in web scraping and data extraction techniques.
Distinctive Feature: It utilizes Selenium for dynamic content rendering, ensuring complete data coverage from the target site.
Outcome: The final output will be a well-organized CSV file containing agent data, ready for analysis.
Quick Specs
- Media: Text
- Use case: Generation
- Industry: Data & Analysis, Productivity & Workflow, Robotics & Automation
- Techniques: Decomposition, Role/Persona Prompting, Structured Output
- Models: Claude 3.5 Sonnet, Gemini 2.0 Flash, GPT-4o, Llama 3.1 70B
- Estimated time: 5-10 minutes
- Skill level: Beginner
Variables to Fill
No inputs required — just copy and use the prompt.
Example Variables Block
No example values needed for this prompt.
The Prompt
You are a senior Python developer and expert in web scraping, tasked with creating a professional, production-ready Python script to extract all relevant, structured data from the dynamic website:
**https://agents.sabrina.dev/**
and export it into a well-organized CSV file.
Your solution must:
1. Data Extraction Requirements:
– Identify and extract all structured data elements visible on the site, including:
– Agent names
– IDs
– Performance metrics
– Any associated metadata
– Use Selenium or other browser automation tools to fully render JavaScript content due to the site’s dynamic loading.
– Handle pagination, infinite scroll, or any dynamic loading mechanisms to ensure complete data coverage.
– Respect the robots.txt and the website’s terms of service; the robots.txt currently allows scraping.
– Implement robust error handling for:
– Network issues
– Element detection failures
– Potential rate limiting
2. Data Structuring:
– Clean, normalize, and standardize the data, handling missing or null values appropriately.
– Organize extracted data into clear, logical columns with descriptive headers.
– Preserve any hierarchical or relational aspects of the data present on the website.
3. Technical Implementation:
– Use Python with libraries such as:
– Selenium for scraping dynamic content
– BeautifulSoup for HTML parsing
– Requests (if needed)
– Pandas for data organization
– Implement session management if login/authentication is required (currently not indicated).
– Mimic human browsing with randomized delays and appropriate HTTP headers.
– Write clear, modular, and well-commented code, maintaining professional coding standards.
4. Output Requirements:
– Export the data as a UTF-8 encoded CSV file suitable for analysis.
– Include a timestamp or version tag in the output filename to track scrape versions.
5. Documentation & Maintainability:
– Provide concise inline comments explaining key code sections.
– Include instructions or notes on configuring or customizing the script for future modifications.
# Steps
– Inspect the target website’s DOM structure after full page load using Selenium.
– Identify containers or cards representing each agent’s data.
– Loop through all dynamically loaded content/pages to extract data.
– Normalize and validate extracted fields.
– Store results in a pandas DataFrame.
– Export the DataFrame to CSV with a timestamp.
– Add logging and try-except blocks to handle exceptions and retries.
# Output Format
– A fully functional Python script file (.py) implementing the above requirements.
– The script outputs a UTF-8 encoded CSV file with a timestamped filename.
– The script includes clear comments and instructions.
# Notes
– No login/authentication appears required, but prepare code structure for easy extension.
– Because static content analysis shows no direct tables or JSON-LD, rely on Selenium to capture dynamic content.
– Respect robots.txt, which permits scraping.
– Use human-like browsing patterns to avoid rate-limiting or blocking.
Your final response should be the complete, best-practice Python script and associated documentation to enable robust, maintainable, and comprehensive scraping of **https://agents.sabrina.dev/**.
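As one illustration of the pagination requirement above, the "load until nothing new appears" loop can be expressed generically. The `load_more` and `get_items` callables below are assumptions: with Selenium, `load_more` would scroll the page or click a "next" control, and `get_items` would re-query the agent cards.

```python
def collect_until_exhausted(load_more, get_items, max_rounds=50):
    """Generic pagination/infinite-scroll loop: keep triggering
    `load_more()` until `get_items()` stops growing.

    `max_rounds` caps the loop so a site that keeps appending
    content cannot trap the scraper forever.
    """
    seen = 0
    for _ in range(max_rounds):
        items = get_items()
        if len(items) == seen:
            break  # nothing new appeared: all content is loaded
        seen = len(items)
        load_more()
    return get_items()
```

With Selenium, `load_more` might be `lambda: driver.execute_script("window.scrollTo(0, document.body.scrollHeight)")` followed by a short wait; keeping the loop separate from the browser calls makes it easy to unit-test without a browser.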
How to Use This Prompt
- Copy the prompt provided above.
- Paste it into your preferred coding environment.
- Modify any specific parameters as needed for your project.
- Run the script to extract data from the website.
- Check the output CSV file for the extracted data.
- Review the comments for understanding and future modifications.
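When adapting the script, the cleaning step is usually where site-specific tweaks land. A minimal normalization helper might look like the sketch below; the field names `name`, `id`, and `score` are illustrative assumptions, not confirmed fields on the target site.

```python
def normalize_record(raw: dict) -> dict:
    """Clean one scraped record: trim whitespace, fill missing
    values with None, and coerce the numeric field where possible.
    Field names here are illustrative assumptions."""
    fields = ("name", "id", "score")
    record = {}
    for field in fields:
        value = raw.get(field)
        if isinstance(value, str):
            value = value.strip() or None  # treat empty strings as missing
        record[field] = value
    # Coerce the assumed numeric field, leaving None if unparsable.
    if record["score"] is not None:
        try:
            record["score"] = float(record["score"])
        except (TypeError, ValueError):
            record["score"] = None
    return record
```

For example, `normalize_record({"name": " Ada ", "score": "97.5"})` yields `{"name": "Ada", "id": None, "score": 97.5}`.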
Tips for Best Results
- Use Selenium for Dynamic Content: Leverage Selenium to fully render JavaScript content, ensuring all data elements are captured.
- Implement Robust Error Handling: Include try-except blocks to manage network issues and element detection failures effectively.
- Normalize and Clean Data: Standardize extracted data, addressing any missing or null values for better analysis.
- Export with Timestamps: Save the output as a UTF-8 encoded CSV file, including a timestamp in the filename for version tracking.
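The timestamped-export tip can be sketched with the standard library alone (pandas' `to_csv` would work just as well); the `agents` filename prefix and column layout are assumptions:

```python
import csv
from datetime import datetime

def export_to_csv(rows, prefix="agents"):
    """Write rows (a list of dicts) to a UTF-8 CSV whose filename
    carries a timestamp, so successive scrapes never overwrite
    each other. Returns the filename that was written."""
    timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    filename = f"{prefix}_{timestamp}.csv"
    fieldnames = list(rows[0].keys()) if rows else []
    with open(filename, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return filename
```

Note `newline=""`, which the `csv` module requires on all platforms to avoid blank lines between rows.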
FAQ
- What libraries are essential for web scraping in Python?
Key libraries include Selenium for dynamic content, BeautifulSoup for parsing, and Pandas for data organization.
- How do you handle pagination in web scraping?
Implement loops to navigate through pages or detect infinite-scroll events to load all content.
- What is the purpose of using a timestamp in CSV filenames?
A timestamp helps track different versions of scraped data, ensuring data integrity and organization.
- Why is error handling important in web scraping?
Error handling ensures the script can recover from network issues or element detection failures, improving reliability.
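The error-handling pattern the FAQ describes is, at its core, a retry loop with exponential backoff plus a little random jitter, which also keeps request timing from looking machine-regular. A minimal, library-agnostic sketch:

```python
import random
import time

def fetch_with_retries(fetch, retries=3, base_delay=1.0):
    """Call `fetch()` until it succeeds, sleeping between attempts.
    The delay doubles each round and gets random jitter so repeated
    requests do not land in a perfectly regular rhythm."""
    for attempt in range(retries):
        try:
            return fetch()
        except Exception:
            if attempt == retries - 1:
                raise  # out of attempts: surface the error to the caller
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.5)
            time.sleep(delay)
```

Here `fetch` stands in for whatever loads a page (a Selenium `driver.get` call, a `requests.get`, and so on); wrapping it this way keeps retry policy separate from scraping logic.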
Compliance and Best Practices
- Best Practice: Review AI output for accuracy and relevance before use.
- Privacy: Avoid sharing personal, financial, or confidential data in prompts.
- Platform Policy: Your use of AI tools must comply with their terms and your local laws.
Revision History
- Version 1.0 (February 2026): Initial release.


