proxywhirl.browser
Browser-based rendering for JavaScript-heavy proxy sources using Playwright.
Classes
BrowserRenderer | Browser-based page renderer using Playwright for JavaScript execution.
Module Contents
- class proxywhirl.browser.BrowserRenderer(headless=True, browser_type='chromium', timeout=30000, wait_until='load', user_agent=None, viewport=None, max_contexts=3)
Browser-based page renderer using Playwright for JavaScript execution.
Renders pages that require full browser JavaScript execution, useful for proxy sources that use client-side rendering or dynamic content loading.
Supports context pooling for improved performance when rendering multiple pages concurrently. Pooled contexts are reused instead of being created fresh for each render operation.
Example
>>> renderer = BrowserRenderer(headless=True)
>>> await renderer.start()
>>> html = await renderer.render("https://example.com/proxies")
>>> await renderer.close()
Or use as a context manager:
>>> async with BrowserRenderer() as renderer:
...     html = await renderer.render("https://example.com/proxies")
Pooled mode for concurrent rendering:
>>> async with BrowserRenderer(max_contexts=5) as renderer:
...     results = await asyncio.gather(
...         renderer.render("https://site1.com"),
...         renderer.render("https://site2.com"),
...         renderer.render("https://site3.com"),
...     )
Initialize browser renderer.
- Parameters:
headless (bool) – Run browser in headless mode (default: True)
browser_type (Literal['chromium', 'firefox', 'webkit']) – Browser engine to use (default: chromium)
timeout (int) – Page load timeout in milliseconds (default: 30000)
wait_until (Literal['load', 'domcontentloaded', 'networkidle']) – When to consider navigation complete (default: load)
user_agent (str | None) – Custom user agent string (optional)
viewport (dict[str, int] | None) – Custom viewport size, e.g. {"width": 1280, "height": 720}
max_contexts (int) – Maximum number of pooled browser contexts (default: 3). Higher values allow more concurrent rendering but use more memory.
- async acquire_context(timeout=None)
Acquire a browser context from the pool.
Blocks until a context is available. For concurrent rendering, use this with release_context() to manage context lifecycle manually.
- Parameters:
timeout (float | None) – Maximum time to wait for a context (seconds). None means wait forever.
- Returns:
A browser context from the pool
- Raises:
RuntimeError – If browser is not started
asyncio.TimeoutError – If timeout expires before a context is available
- Return type:
playwright.async_api.BrowserContext
- async close()
Close the browser instance and all pooled contexts.
Closes all browser contexts in the pool and the browser itself. Safe to call multiple times.
- Return type:
None
- async release_context(context)
Release a browser context back to the pool.
Returns a previously acquired context to the pool for reuse.
- Parameters:
context (playwright.async_api.BrowserContext) – The browser context to release
- Raises:
RuntimeError – If browser is not started
ValueError – If context is not from this pool
- Return type:
None
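Manual use of acquire_context() and release_context() might look like the following sketch. It assumes renderer is an already started BrowserRenderer; the helper name render_with_manual_context and the 10-second default are hypothetical. The page calls (new_page, goto, content, close) are Playwright's async API, and the try/finally ensures the context goes back to the pool even when navigation fails.

```python
import asyncio


async def render_with_manual_context(renderer, url, acquire_timeout=10.0):
    """Fetch page HTML through a manually acquired pooled context.

    Returns None if no context becomes free within acquire_timeout seconds.
    """
    try:
        context = await renderer.acquire_context(timeout=acquire_timeout)
    except asyncio.TimeoutError:
        return None  # the pool stayed busy for the whole wait

    try:
        page = await context.new_page()
        try:
            await page.goto(url)
            return await page.content()
        finally:
            await page.close()  # do not leave pages open in a reused context
    finally:
        # Always return the context, even if navigation raised.
        await renderer.release_context(context)
```

Returning None on a pool timeout is a design choice for this sketch; re-raising the asyncio.TimeoutError would be equally reasonable if the caller prefers to handle it.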
- async render(url, wait_for_selector=None, wait_for_timeout=None)
Render a page and return its HTML content.
Uses the context pool for concurrent-safe rendering. Acquires a context from the pool, renders the page, and releases the context back.
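- Parameters:
url – Address of the page to render
wait_for_selector – Optional; defaults to None
wait_for_timeout – Optional; defaults to None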
- Returns:
Rendered HTML content as string
- Raises:
RuntimeError – If browser is not started
TimeoutError – If page load times out
- Return type:
str
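Because render() is documented to raise TimeoutError when a page load times out, callers may want a retry wrapper. The sketch below is one possible approach, not part of proxywhirl: render_with_retry, attempts, and backoff are hypothetical names, and it assumes the exception is Python's built-in TimeoutError. If the library raises Playwright's own timeout exception instead, the except clause would need adjusting.

```python
import asyncio


async def render_with_retry(renderer, url, attempts=3, backoff=1.0, **render_kwargs):
    """Call renderer.render(), retrying when a TimeoutError is raised.

    Extra keyword arguments are passed through to render() unchanged.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")

    for attempt in range(1, attempts + 1):
        try:
            return await renderer.render(url, **render_kwargs)
        except TimeoutError:  # assumption: the built-in exception type
            if attempt == attempts:
                raise  # out of retries, surface the timeout to the caller
            # Linear backoff: wait a little longer after each failed attempt.
            await asyncio.sleep(backoff * attempt)
```

A call such as `await render_with_retry(renderer, url, wait_for_selector="table")` would forward wait_for_selector to render(); the "table" value is only an illustration.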
- async start()
Start the browser instance and initialize the context pool.
Initializes Playwright, launches the browser, and pre-creates browser contexts for the pool. Idempotent - safe to call multiple times.
- Raises:
ImportError – If playwright is not installed
RuntimeError – If browser fails to start
- Return type:
None
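Since start() raises ImportError when Playwright is missing, an application that treats browser rendering as optional could check for the package up front. The following is a minimal sketch under that assumption; playwright_available and fetch_rendered are hypothetical helpers, and the explicit start()/close() pairing mirrors the lifecycle shown in the class example.

```python
import importlib.util
from typing import Optional


def playwright_available() -> bool:
    """Return True if the `playwright` package can be found for import."""
    return importlib.util.find_spec("playwright") is not None


async def fetch_rendered(url: str) -> Optional[str]:
    """Render url in a browser, or return None when Playwright is not installed."""
    if not playwright_available():
        return None  # caller can fall back to a plain HTTP fetch

    # Imported lazily so this module still loads without the browser extras.
    from proxywhirl.browser import BrowserRenderer

    renderer = BrowserRenderer(headless=True)
    await renderer.start()
    try:
        return await renderer.render(url)
    finally:
        await renderer.close()  # documented as safe to call multiple times
```

Catching ImportError around start() would achieve the same result; the up-front check simply avoids constructing a renderer that cannot run.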