Key Features

Pydoll brings groundbreaking capabilities to browser automation, making it significantly more powerful than traditional tools while being easier to use.

Core Capabilities

Zero WebDrivers

Unlike traditional browser automation tools like Selenium, Pydoll eliminates the need for WebDrivers entirely. By connecting directly to browsers through the Chrome DevTools Protocol, Pydoll:

  • Eliminates version compatibility issues between browser and driver
  • Reduces setup complexity and maintenance overhead
  • Provides more reliable connections without driver-related issues
  • Allows for automation of all Chromium-based browsers with a unified API

No more "chromedriver version doesn't match Chrome version" errors or mysterious webdriver crashes.

Async-First Architecture

Built from the ground up with Python's asyncio, Pydoll provides:

  • True Concurrency: Run multiple operations in parallel without blocking
  • Efficient Resource Usage: Manage many browser instances with minimal overhead
  • Modern Python Patterns: Context managers, async iterators, and other asyncio-friendly interfaces
  • Performance Optimizations: Reduced latency and increased throughput for automation tasks
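Because every operation is a coroutine, standard asyncio tooling applies directly. A minimal sketch that loads two pages at once with asyncio.gather (the Concurrent Scraping section below shows a fuller version):

import asyncio
from pydoll.browser.chrome import Chrome

async def fetch_title(url):
    # Each task drives its own browser instance independently
    async with Chrome() as browser:
        await browser.start()
        page = await browser.get_page()
        await page.go_to(url)
        return await page.execute_script('return document.title')

async def main():
    # Both pages load concurrently instead of one after the other
    titles = await asyncio.gather(
        fetch_title('https://example.com'),
        fetch_title('https://example.org'),
    )
    print(titles)

asyncio.run(main())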

Human-Like Interactions

Avoid detection by mimicking real user behavior:

  • Natural Typing: Type text with randomized timing between keystrokes
  • Realistic Clicking: Click with human-like timing and movement, including configurable offsets from the element center
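A short sketch of both behaviors. type_keys with an interval is shown again in the Advanced Keyboard Control section; the x_offset/y_offset arguments to click are an assumption about the element API, so check them against your installed version:

import asyncio
from pydoll.browser.chrome import Chrome
from pydoll.constants import By

async def human_like_example():
    async with Chrome() as browser:
        await browser.start()
        page = await browser.get_page()
        await page.go_to('https://example.com/form')

        # Randomized delay between keystrokes, as a real user would type
        name_field = await page.find_element(By.ID, 'name')
        await name_field.type_keys("Jane Doe", interval=0.12)

        # Assumed offset parameters: click slightly off the element's
        # exact center rather than at a pixel-perfect midpoint
        submit = await page.find_element(By.ID, 'submit')
        await submit.click(x_offset=3, y_offset=-2)

asyncio.run(human_like_example())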

Event-Driven Capabilities

Respond to browser events in real-time:

  • Network Monitoring: Track requests, responses, and failed loads
  • DOM Observation: React to changes in the page structure
  • Page Lifecycle Events: Capture navigation, loading, and rendering events
  • Custom Event Handlers: Register callbacks for specific events of interest

Multi-Browser Support

Pydoll works seamlessly with:

  • Google Chrome: Primary support with all features available
  • Microsoft Edge: Full support for Edge-specific features
  • Chromium: Support for other Chromium-based browsers

Screenshot and PDF Export

Capture visual content from web pages:

  • Full Page Screenshots: Capture entire page content, even beyond the viewport
  • Element Screenshots: Target specific elements for capture
  • High-Quality PDF Export: Generate PDF documents from web pages
  • Custom Formatting: Coming soon!
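A minimal sketch, assuming the page object exposes get_screenshot and print_to_pdf helpers (verify the exact names and parameters against your installed version):

import asyncio
from pydoll.browser.chrome import Chrome

async def capture_example():
    async with Chrome() as browser:
        await browser.start()
        page = await browser.get_page()
        await page.go_to('https://example.com')

        # Assumed helpers: save the full page as an image and as a PDF
        await page.get_screenshot('page.png')
        await page.print_to_pdf('page.pdf')

asyncio.run(capture_example())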

Native Cloudflare Captcha Bypass

Important Information About Captcha Bypass

The effectiveness of Cloudflare Turnstile bypass depends on several factors:

  • IP Reputation: Cloudflare assigns a "trust score" to each IP address. Clean residential IPs typically receive higher scores.
  • Previous History: IPs with a history of suspicious activity may be permanently flagged.

Pydoll can achieve scores comparable to a regular browser session, but cannot overcome IP-based blocks or extremely restrictive configurations. For best results, use residential IPs with good reputation.

Remember that captcha bypass techniques operate in a gray area and should be used responsibly.

One of Pydoll's most powerful features is its ability to automatically bypass Cloudflare Turnstile captchas that block most automation tools:

Context Manager Approach (blocks until the captcha is handled)

import asyncio
from pydoll.browser import Chrome
from pydoll.constants import By

async def bypass_cloudflare_example():
    browser = Chrome()
    await browser.start()
    page = await browser.get_page()

    # The context manager will wait for the captcha to be processed
    # before continuing execution
    async with page.expect_and_bypass_cloudflare_captcha():
        await page.go_to('https://site-with-cloudflare.com')
        print("Waiting for captcha to be handled...")

    # This code runs only after the captcha is successfully bypassed
    print("Captcha bypassed! Continuing with automation...")
    element = await page.find_element(By.ID, 'protected-content')
    content = await element.get_element_text()

    await browser.stop()

asyncio.run(bypass_cloudflare_example())

Background Processing Approach

import asyncio
from pydoll.browser import Chrome
from pydoll.constants import By

async def background_bypass_example():
    browser = Chrome()
    await browser.start()
    page = await browser.get_page()

    # Enable automatic captcha solving before navigating
    await page.enable_auto_solve_cloudflare_captcha()

    # Navigate to the protected site - captcha handled automatically in background
    await page.go_to('https://site-with-cloudflare.com')
    print("Page loaded, captcha will be handled in the background...")

    # Add a small delay to allow captcha solving to complete
    await asyncio.sleep(3)

    # Continue with automation
    element = await page.find_element(By.ID, 'protected-content')
    content = await element.get_element_text()

    # Disable auto-solving when no longer needed
    await page.disable_auto_solve_cloudflare_captcha()

    await browser.stop()

asyncio.run(background_bypass_example())

Access websites that actively block automation tools without using third-party captcha solving services. This native captcha handling makes Pydoll suitable for automating previously inaccessible websites.

Concurrent Scraping

Pydoll's async architecture allows you to scrape multiple pages or websites simultaneously for maximum efficiency:

import asyncio
from pydoll.browser.chrome import Chrome
from pydoll.constants import By

async def scrape_page(url):
    """Process a single page and extract data"""
    async with Chrome() as browser:
        await browser.start()
        page = await browser.get_page()
        await page.go_to(url)

        # Extract data
        title = await page.execute_script('return document.title')

        # Find elements and extract content
        elements = await page.find_elements(By.CSS_SELECTOR, '.article-content')
        content = []
        for element in elements:
            text = await element.get_element_text()
            content.append(text)

        return {
            "url": url,
            "title": title,
            "content": content
        }

async def main():
    # List of URLs to scrape in parallel
    urls = [
        'https://example.com/page1',
        'https://example.com/page2',
        'https://example.com/page3',
        'https://example.com/page4',
        'https://example.com/page5',
    ]

    # Process all URLs concurrently
    results = await asyncio.gather(*(scrape_page(url) for url in urls))

    # Print results
    for result in results:
        print(f"Scraped {result['url']}: {result['title']}")
        print(f"Found {len(result['content'])} content blocks")

    return results

# Run the concurrent scraping
all_data = asyncio.run(main())

This approach provides dramatic performance improvements over sequential scraping, especially for I/O-bound tasks like web scraping. Instead of waiting for each page to load one after another, Pydoll processes them all simultaneously, reducing total execution time significantly. For example, scraping 10 pages that each take 2 seconds to load would take just over 2 seconds total instead of 20+ seconds with sequential processing.

Advanced Keyboard Control

Pydoll provides human-like keyboard interaction with precise control over typing behavior:

import asyncio
from pydoll.browser.chrome import Chrome
from pydoll.constants import By
from pydoll.common.keys import Keys

async def realistic_typing_example():
    async with Chrome() as browser:
        await browser.start()
        page = await browser.get_page()
        await page.go_to('https://example.com/login')

        # Find login form elements
        username = await page.find_element(By.ID, 'username')
        password = await page.find_element(By.ID, 'password')

        # Type with realistic timing (interval between keystrokes)
        await username.type_keys("user@example.com", interval=0.15)

        # Use special key combinations
        await password.click()
        await password.key_down(Keys.SHIFT)
        await password.send_keys("PASSWORD")
        await password.key_up(Keys.SHIFT)

        # Press Enter to submit
        await password.send_keys(Keys.ENTER)

        # Wait for navigation
        await asyncio.sleep(2)
        print("Logged in successfully!")

asyncio.run(realistic_typing_example())

This realistic typing helps avoid detection by websites that look for automation patterns. The natural timing and the ability to use special key combinations make Pydoll's interactions virtually indistinguishable from those of a human user.

Powerful Event System

Pydoll's event system allows you to react to browser events in real-time:

import asyncio
from pydoll.browser.chrome import Chrome
from pydoll.events.network import NetworkEvents
from pydoll.events.page import PageEvents
from functools import partial

async def event_monitoring_example():
    async with Chrome() as browser:
        await browser.start()
        page = await browser.get_page()

        # Monitor page load events
        async def on_page_loaded(event):
            print(f"🌐 Page loaded: {event['params'].get('url')}")

        await page.enable_page_events()
        await page.on(PageEvents.PAGE_LOADED, on_page_loaded)

        # Monitor network requests
        async def on_request(page, event):
            url = event['params']['request']['url']
            print(f"🔄 Request to: {url}")

        await page.enable_network_events()
        await page.on(NetworkEvents.REQUEST_WILL_BE_SENT, 
                      partial(on_request, page))

        # Navigate and see events in action
        await page.go_to('https://example.com')
        await asyncio.sleep(5)  # Allow time to see events

asyncio.run(event_monitoring_example())

The event system makes Pydoll uniquely powerful for monitoring API requests and responses, creating reactive automations, debugging complex web applications, and building comprehensive web monitoring tools.

File Upload Support

Seamlessly handle file uploads in your automation:

import asyncio
import os
from pydoll.browser.chrome import Chrome
from pydoll.constants import By

async def file_upload_example():
    async with Chrome() as browser:
        await browser.start()
        page = await browser.get_page()
        await page.go_to('https://example.com/upload')

        # Method 1: Direct file input
        file_input = await page.find_element(By.XPATH, '//input[@type="file"]')
        await file_input.set_input_files('path/to/document.pdf')

        # Method 2: Using file chooser with an upload button
        sample_file = os.path.join(os.getcwd(), 'sample.jpg')
        async with page.expect_file_chooser(files=sample_file):
            upload_button = await page.find_element(By.ID, 'upload-button')
            await upload_button.click()

        # Submit the form
        submit = await page.find_element(By.ID, 'submit-button')
        await submit.click()

        print("Files uploaded successfully!")

asyncio.run(file_upload_example())

File uploads are notoriously difficult to automate in other frameworks, often requiring workarounds. Pydoll makes it straightforward with both direct file input and file chooser dialog support.

Multi-Browser Example

Pydoll works with different browsers through a consistent API:

import asyncio
from pydoll.browser.chrome import Chrome
from pydoll.browser.edge import Edge

async def multi_browser_example():
    # Run the same automation in Chrome
    async with Chrome() as chrome:
        await chrome.start()
        chrome_page = await chrome.get_page()
        await chrome_page.go_to('https://example.com')
        chrome_title = await chrome_page.execute_script('return document.title')
        print(f"Chrome title: {chrome_title}")

    # Run the same automation in Edge
    async with Edge() as edge:
        await edge.start()
        edge_page = await edge.get_page()
        await edge_page.go_to('https://example.com')
        edge_title = await edge_page.execute_script('return document.title')
        print(f"Edge title: {edge_title}")

asyncio.run(multi_browser_example())

Cross-browser compatibility without changing your code. Test your automations across different browsers to ensure they work everywhere.

Proxy Integration

Unlike many automation tools that struggle with proxy implementation, Pydoll offers native proxy support with full authentication capabilities. This makes it ideal for:

  • Web scraping projects that need to rotate IPs
  • Geo-targeted testing of applications across different regions
  • Privacy-focused automation that requires anonymizing traffic
  • Testing web applications through corporate proxies

Configuring proxies in Pydoll is straightforward:

import asyncio
from pydoll.browser.chrome import Chrome
from pydoll.browser.options import Options

async def proxy_example():
    # Create browser options
    options = Options()

    # Simple proxy without authentication
    options.add_argument('--proxy-server=192.168.1.100:8080')
    # Or proxy with authentication
    # options.add_argument('--proxy-server=username:password@192.168.1.100:8080')

    # Bypass proxy for specific domains
    options.add_argument('--proxy-bypass-list=*.internal.company.com,localhost')

    # Start browser with proxy configuration
    async with Chrome(options=options) as browser:
        await browser.start()
        page = await browser.get_page()

        # Test the proxy by visiting an IP echo service
        await page.go_to('https://api.ipify.org')
        ip_address = await page.page_source
        print(f"Current IP address: {ip_address}")

        # Continue with your automation
        await page.go_to('https://example.com')
        title = await page.execute_script('return document.title')
        print(f"Page title: {title}")

asyncio.run(proxy_example())

Request Interception

Intercept and modify network requests before they're sent:

import asyncio
from pydoll.browser.chrome import Chrome
from pydoll.events.fetch import FetchEvents
from pydoll.commands.fetch import FetchCommands
from functools import partial

async def request_interception_example():
    async with Chrome() as browser:
        await browser.start()
        page = await browser.get_page()

        # Define the request interceptor
        async def intercept_request(page, event):
            request_id = event['params']['requestId']
            url = event['params']['request']['url']

            if '/api/' in url:
                # Get original headers
                original_headers = event['params']['request'].get('headers', {})

                # Add custom headers
                custom_headers = {
                    **original_headers,
                    'Authorization': 'Bearer my-token-123',
                    'X-Custom-Header': 'CustomValue'
                }

                print(f"🔄 Modifying request to: {url}")
                await page._execute_command(
                    FetchCommands.continue_request(
                        request_id=request_id,
                        headers=custom_headers
                    )
                )
            else:
                # Continue normally for non-API requests
                await page._execute_command(
                    FetchCommands.continue_request(
                        request_id=request_id
                    )
                )

        # Enable interception and register handler
        await page.enable_fetch_events()
        await page.on(FetchEvents.REQUEST_PAUSED, 
                      partial(intercept_request, page))

        # Navigate to trigger requests
        await page.go_to('https://example.com')
        await asyncio.sleep(5)  # Allow time for requests to process

asyncio.run(request_interception_example())

This powerful capability allows you to add authentication headers dynamically, modify request payloads before they're sent, mock API responses for testing, and analyze and debug network traffic.
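The same REQUEST_PAUSED hook can also answer requests itself instead of forwarding them. A sketch of response mocking, assuming FetchCommands exposes a fulfill_request wrapper mirroring CDP's Fetch.fulfillRequest (the name and signature here are assumptions, so check them against your installed version):

import json
from pydoll.commands.fetch import FetchCommands

async def mock_api_response(page, event):
    request_id = event['params']['requestId']
    url = event['params']['request']['url']

    if url.endswith('/api/config'):
        # Hypothetical wrapper around Fetch.fulfillRequest: serve a
        # canned JSON body without the request ever reaching the network
        await page._execute_command(
            FetchCommands.fulfill_request(
                request_id=request_id,
                response_code=200,
                body=json.dumps({'feature_flags': {'beta': True}}),
            )
        )
    else:
        # Let everything else proceed untouched
        await page._execute_command(
            FetchCommands.continue_request(request_id=request_id)
        )

Register it exactly like the interceptor above, via page.on(FetchEvents.REQUEST_PAUSED, partial(mock_api_response, page)).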

Each of these features showcases what makes Pydoll a next-generation browser automation tool, combining the power of direct browser control with an intuitive, async-native API.