proxywhirl.browser

Browser-based rendering for JavaScript-heavy proxy sources using Playwright.

Classes

BrowserRenderer

Browser-based page renderer using Playwright for JavaScript execution.

Module Contents

class proxywhirl.browser.BrowserRenderer(headless=True, browser_type='chromium', timeout=30000, wait_until='load', user_agent=None, viewport=None, max_contexts=3)

Browser-based page renderer using Playwright for JavaScript execution.

Renders pages that require full browser JavaScript execution, useful for proxy sources that use client-side rendering or dynamic content loading.

Supports context pooling for improved performance when rendering multiple pages concurrently. Pooled contexts are reused instead of being created fresh for each render operation.

Example

>>> renderer = BrowserRenderer(headless=True)
>>> await renderer.start()
>>> html = await renderer.render("https://example.com/proxies")
>>> await renderer.close()

Or use as a context manager:

>>> async with BrowserRenderer() as renderer:
...     html = await renderer.render("https://example.com/proxies")

Pooled mode for concurrent rendering:

>>> async with BrowserRenderer(max_contexts=5) as renderer:
...     results = await asyncio.gather(
...         renderer.render("https://site1.com"),
...         renderer.render("https://site2.com"),
...         renderer.render("https://site3.com"),
...     )

Initialize browser renderer.

Parameters:
  • headless (bool) – Run browser in headless mode (default: True)

  • browser_type (Literal['chromium', 'firefox', 'webkit']) – Browser engine to use (default: chromium)

  • timeout (int) – Page load timeout in milliseconds (default: 30000)

  • wait_until (Literal['load', 'domcontentloaded', 'networkidle']) – When to consider navigation complete (default: load)

  • user_agent (str | None) – Custom user agent string (optional)

  • viewport (dict[str, int] | None) – Custom viewport size, e.g. {"width": 1280, "height": 720}

  • max_contexts (int) – Maximum number of pooled browser contexts (default: 3). Higher values allow more concurrent rendering but use more memory.
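As a rough sketch of how these parameters fit together, the helper below builds a keyword-argument dictionary for the constructor. The helper name `renderer_options` and its own arguments are hypothetical; only the dictionary keys come from the parameter list above, and the choice of `networkidle` is an illustrative assumption for client-side-rendered sources.

```python
from __future__ import annotations

from typing import Any


def renderer_options(
    concurrency: int,
    *,
    user_agent: str | None = None,
    viewport: dict[str, int] | None = None,
) -> dict[str, Any]:
    """Build keyword arguments for BrowserRenderer (hypothetical helper)."""
    options: dict[str, Any] = {
        "headless": True,
        "browser_type": "chromium",
        "timeout": 30_000,  # milliseconds, matching the documented default
        "wait_until": "networkidle",  # assumption: suits pages that load data late
        "max_contexts": max(1, concurrency),  # keep at least one pooled context
    }
    # Only pass the optional settings when the caller supplied them.
    if user_agent is not None:
        options["user_agent"] = user_agent
    if viewport is not None:
        options["viewport"] = viewport
    return options
```

The result can then be unpacked into the constructor, for example `BrowserRenderer(**renderer_options(5, viewport={"width": 1280, "height": 720}))`.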

async acquire_context(timeout=None)

Acquire a browser context from the pool.

Blocks until a context becomes available or, if timeout is set, until that many seconds have elapsed. For concurrent rendering, pair each call with release_context() to manage the context lifecycle manually.

Parameters:

timeout (float | None) – Maximum time to wait for a context (seconds). None means wait forever.

Returns:

A browser context from the pool

Raises:
Return type:

playwright.async_api.BrowserContext

async close()

Close the browser instance and all pooled contexts.

Closes all browser contexts in the pool and the browser itself. Safe to call multiple times.

Return type:

None

async release_context(context)

Release a browser context back to the pool.

Returns a previously acquired context to the pool for reuse.

Parameters:

context (playwright.async_api.BrowserContext) – The browser context to release

Raises:
Return type:

None
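Manual context management with acquire_context() and release_context() might look like the sketch below. The function name `fetch_title` is hypothetical, and `renderer` is assumed to be an already started BrowserRenderer. The page calls (`new_page`, `goto`, `title`, `close`) are standard Playwright async API methods. Note that the acquire timeout is in seconds, whereas the constructor's timeout is in milliseconds.

```python
import asyncio


async def fetch_title(renderer, url: str, acquire_timeout: float = 10.0) -> str:
    """Load a page in a manually acquired pooled context and return its title."""
    # Wait at most `acquire_timeout` seconds for a free context.
    context = await renderer.acquire_context(timeout=acquire_timeout)
    try:
        page = await context.new_page()
        try:
            await page.goto(url)
            return await page.title()
        finally:
            # Close the page so the pooled context is handed back clean.
            await page.close()
    finally:
        # Always return the context, even if navigation failed.
        await renderer.release_context(context)
```

The nested try/finally blocks are there so that a failed navigation cannot leak a page or leave a context checked out of the pool.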

async render(url, wait_for_selector=None, wait_for_timeout=None)

Render a page and return its HTML content.

Uses the context pool for concurrent-safe rendering. Acquires a context from the pool, renders the page, and releases the context back.

Parameters:
  • url (str) – URL to render

  • wait_for_selector (str | None) – Optional CSS selector to wait for before returning

  • wait_for_timeout (int | None) – Optional additional wait time in milliseconds

Returns:

Rendered HTML content as string

Raises:
Return type:

str
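Because render() draws from the context pool, several URLs can be rendered concurrently with asyncio.gather. The sketch below separates successful pages from failures instead of letting one error abort the batch. The function name `render_many` is hypothetical, and `renderer` is assumed to be a started BrowserRenderer; the exception types render() raises are not listed here, so the code catches whatever is returned.

```python
import asyncio
from typing import Optional


async def render_many(renderer, urls: list, selector: Optional[str] = None):
    """Render all URLs concurrently; return (pages, failures) keyed by URL."""
    results = await asyncio.gather(
        *(renderer.render(url, wait_for_selector=selector) for url in urls),
        return_exceptions=True,  # collect errors instead of raising the first one
    )
    pages, failures = {}, {}
    for url, result in zip(urls, results):
        if isinstance(result, BaseException):
            failures[url] = result
        else:
            pages[url] = result
    return pages, failures
```

The number of renders that run at the same time is presumably bounded by max_contexts; any additional calls would wait for a context to be released.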

async start()

Start the browser instance and initialize the context pool.

Initializes Playwright, launches the browser, and pre-creates the browser contexts for the pool. The call is idempotent, so it is safe to invoke more than once.

Raises:
Return type:

None
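When the async context manager form is not convenient, start() and close() can be paired explicitly. The sketch below wraps the render in try/finally so the browser is shut down even if rendering fails. The function name `render_once` is hypothetical, and `renderer` is assumed to be a BrowserRenderer instance.

```python
import asyncio


async def render_once(renderer, url: str) -> str:
    """Start the renderer, render one page, and always close the browser."""
    await renderer.start()  # documented as idempotent
    try:
        return await renderer.render(url)
    finally:
        # Runs on success and on failure; close() is documented as safe to repeat.
        await renderer.close()
```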

property pool_capacity: int

Total capacity of the context pool.

Return type:

int

property pool_size: int

Number of contexts currently available in the pool.

Return type:

int
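The two pool properties can be combined into a simple utilization snapshot, for example for logging. This is a minimal sketch: the helper name `pool_usage` is hypothetical, and it assumes the renderer has been started and that every context not currently available is checked out.

```python
def pool_usage(renderer) -> dict:
    """Summarize context pool usage from pool_capacity and pool_size."""
    capacity = renderer.pool_capacity  # total contexts the pool can hold
    available = renderer.pool_size  # contexts currently free
    in_use = capacity - available  # assumption: the remainder is checked out
    utilization = in_use / capacity if capacity else 0.0
    return {
        "capacity": capacity,
        "available": available,
        "in_use": in_use,
        "utilization": utilization,
    }
```

A utilization that stays near 1.0 under load may suggest raising max_contexts, at the cost of more memory.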