proxywhirl.strategies.core
Rotation strategies for proxy selection.
Classes
CompositeStrategy | Composite strategy that applies filtering and selection strategies in sequence.
CostAwareStrategy | Cost-aware proxy selection strategy.
GeoTargetedStrategy | Geo-targeted proxy selection strategy.
LeastUsedStrategy | Least-used proxy selection strategy with SelectionContext support.
PerformanceBasedStrategy | Performance-based proxy selection using EMA response times.
ProxyMetrics | Per-proxy mutable metrics maintained by a strategy.
RandomStrategy | Random proxy selection strategy with SelectionContext support.
RotationStrategy | Protocol defining interface for proxy rotation strategies.
RoundRobinStrategy | Round-robin proxy selection strategy with SelectionContext support.
SessionManager | Thread-safe session manager for sticky proxy assignments.
SessionPersistenceStrategy | Session persistence strategy (sticky sessions).
StrategyRegistry | Singleton registry for custom rotation strategies.
StrategyState | Per-strategy mutable state for managing proxy metrics.
WeightedStrategy | Weighted proxy selection strategy with SelectionContext support.
Module Contents
- class proxywhirl.strategies.core.CompositeStrategy(filters=None, selector=None)[source]¶
Composite strategy that applies filtering and selection strategies in sequence.
This strategy implements the filter + select pattern:
1. Filter strategies narrow down the proxy pool based on criteria (e.g., geography)
2. Selector strategy chooses the best proxy from the filtered set
Example
>>> # Filter by geography, then select by performance
>>> # (pool is an existing proxywhirl.models.ProxyPool)
>>> from proxywhirl.models import SelectionContext
>>> from proxywhirl.strategies import CompositeStrategy, GeoTargetedStrategy, PerformanceBasedStrategy
>>> strategy = CompositeStrategy(
...     filters=[GeoTargetedStrategy()],
...     selector=PerformanceBasedStrategy()
... )
>>> proxy = strategy.select(pool, SelectionContext(target_country="US"))
- Thread Safety:
Thread-safe if all component strategies are thread-safe.
- Performance:
Selection time is sum of filter and selector times. Target: <5ms total (SC-007).
Initialize composite strategy.
- Parameters:
filters (list[RotationStrategy] | None) – List of filtering strategies to apply sequentially
selector (RotationStrategy | None) – Final selection strategy to choose from filtered pool
- Raises:
ValueError – If both filters and selector are None
- configure(config)[source]¶
Configure all component strategies.
- Parameters:
config (proxywhirl.models.StrategyConfig) – Strategy configuration to apply
- Return type:
None
- classmethod from_config(config)[source]¶
Create CompositeStrategy from configuration dictionary.
- Parameters:
config (dict[str, Any]) – Configuration dict with keys:
- filters: List of filter strategy names or instances
- selector: Selector strategy name or instance
- Returns:
Configured CompositeStrategy instance
- Return type:
CompositeStrategy
Example
>>> config = {
...     "filters": ["geo-targeted"],
...     "selector": "performance-based"
... }
>>> strategy = CompositeStrategy.from_config(config)
- Raises:
ValueError – If config is invalid
- record_result(proxy, success, response_time_ms)[source]¶
Record result by delegating to selector strategy.
- select(pool, context=None)[source]¶
Select a proxy by applying filters then selector.
Process:
1. Start with full pool of healthy proxies
2. Apply each filter strategy sequentially
3. Apply selector strategy to filtered set
4. Return selected proxy
- Parameters:
pool (proxywhirl.models.ProxyPool) – The proxy pool to select from
context (proxywhirl.models.SelectionContext | None) – Request context with filtering criteria
- Returns:
Selected proxy from filtered pool
- Raises:
ProxyPoolEmptyError – If filters eliminate all proxies
- Return type:
proxywhirl.models.Proxy
- Performance:
Target: <5ms total including all filters and selector (SC-007)
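The filter-then-select process above can be sketched without the library. This is a minimal illustration, not ProxyWhirl's implementation: `DemoProxy`, `composite_select`, `only_us`, and `fastest` are hypothetical names, and `LookupError` stands in for `ProxyPoolEmptyError`.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class DemoProxy:
    """Hypothetical stand-in for a proxy record; not the library's Proxy model."""
    url: str
    country: str
    latency_ms: float


ProxyFilter = Callable[[List[DemoProxy]], List[DemoProxy]]
ProxySelector = Callable[[List[DemoProxy]], DemoProxy]


def composite_select(
    proxies: Iterable[DemoProxy],
    filters: List[ProxyFilter],
    selector: ProxySelector,
) -> DemoProxy:
    """Apply each filter in order, then let the selector pick from what is left."""
    candidates = list(proxies)
    for narrow in filters:
        candidates = narrow(candidates)
    if not candidates:
        # Stand-in for ProxyPoolEmptyError in this sketch.
        raise LookupError("filters eliminated all proxies")
    return selector(candidates)


def only_us(proxies: List[DemoProxy]) -> List[DemoProxy]:
    """Example filter: keep proxies located in the US."""
    return [p for p in proxies if p.country == "US"]


def fastest(proxies: List[DemoProxy]) -> DemoProxy:
    """Example selector: lowest latency wins."""
    return min(proxies, key=lambda p: p.latency_ms)


demo_pool = [
    DemoProxy("http://a.example:8080", "US", 220.0),
    DemoProxy("http://b.example:8080", "US", 90.0),
    DemoProxy("http://c.example:8080", "DE", 40.0),
]
chosen = composite_select(demo_pool, [only_us], fastest)
```

Here the German proxy is the fastest overall, but the filter removes it before the selector runs, so the faster of the two US proxies is returned.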
- class proxywhirl.strategies.core.CostAwareStrategy(max_cost_per_request=None)[source]¶
Cost-aware proxy selection strategy.
Prioritizes free proxies over paid ones, with configurable cost thresholds. Uses weighted random selection based on inverse cost - lower cost proxies are more likely to be selected.
Features:
- Free proxies (cost_per_request = 0.0) are heavily favored
- Paid proxies are selected based on inverse cost weighting
- Configurable cost threshold to filter out expensive proxies
- Supports fallback to any proxy when no low-cost options available
- Thread Safety:
Uses Python’s random.choices() which is thread-safe via GIL.
Example
>>> from proxywhirl.models import StrategyConfig
>>> from proxywhirl.strategies import CostAwareStrategy
>>> strategy = CostAwareStrategy()
>>> config = StrategyConfig(metadata={"max_cost_per_request": 0.5})
>>> strategy.configure(config)
>>> proxy = strategy.select(pool)  # Weighted random choice that favors cheaper proxies
Initialize cost-aware strategy.
- Parameters:
max_cost_per_request (float | None) – Maximum acceptable cost per request. Proxies exceeding this cost will be filtered out. None means no cost limit (default).
- configure(config)[source]¶
Configure cost-aware parameters.
- Parameters:
config (proxywhirl.models.StrategyConfig) – Strategy configuration with optional metadata:
- max_cost_per_request: Maximum cost threshold
- free_proxy_boost: Weight multiplier for free proxies (default: 10.0)
- Return type:
None
- select(pool, context=None)[source]¶
Select a proxy based on cost optimization.
Selection logic:
1. Get healthy proxies
2. Filter by context.failed_proxy_ids if present
3. Filter by max_cost_per_request threshold if configured
4. Apply inverse cost weighting (lower cost = higher weight)
5. Free proxies get boost multiplier (default 10x weight)
6. Use weighted random selection
- Parameters:
pool (proxywhirl.models.ProxyPool) – The proxy pool to select from
context (proxywhirl.models.SelectionContext | None) – Optional selection context for filtering
- Returns:
Cost-optimized proxy selection
- Raises:
ProxyPoolEmptyError – If no proxies meet criteria
- Return type:
proxywhirl.models.Proxy
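One way to picture the cost weighting is the sketch below. The exact weighting formula is not stated in this reference, so `1 / (1 + cost)` is an assumption chosen to keep paid weights at or below 1 and free proxies clearly ahead once boosted. `cost_weights` and `pick_index` are hypothetical helpers, and `LookupError` stands in for `ProxyPoolEmptyError`.

```python
import random
from typing import List, Optional, Sequence, Tuple


def cost_weights(
    costs: Sequence[float],
    max_cost: Optional[float] = None,
    free_boost: float = 10.0,
) -> List[Tuple[int, float]]:
    """Return (index, weight) pairs for proxies that pass the cost threshold."""
    weighted = []
    for index, cost in enumerate(costs):
        if max_cost is not None and cost > max_cost:
            continue  # too expensive, filtered out
        base = 1.0 / (1.0 + cost)  # assumed inverse-cost formula
        weighted.append((index, base * free_boost if cost == 0.0 else base))
    return weighted


def pick_index(
    costs: Sequence[float],
    max_cost: Optional[float] = None,
    free_boost: float = 10.0,
    rng: Optional[random.Random] = None,
) -> int:
    """Weighted random choice over the proxies that survive the threshold."""
    weighted = cost_weights(costs, max_cost, free_boost)
    if not weighted:
        raise LookupError("no proxies meet the cost criteria")
    chooser = rng or random
    indices = [index for index, _ in weighted]
    weights = [weight for _, weight in weighted]
    return chooser.choices(indices, weights=weights, k=1)[0]


demo_costs = [0.0, 0.5, 1.0, 2.0]
demo_weights = cost_weights(demo_costs, max_cost=1.0)
```

With a threshold of 1.0 the proxy costing 2.0 never enters the draw, and the free proxy carries the largest weight among the remaining three.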
- validate_metadata(pool)[source]¶
Validate that pool has cost metadata.
Cost field is optional, so always returns True. Proxies without cost data are treated as free (cost = 0.0).
- Parameters:
pool (proxywhirl.models.ProxyPool) – The proxy pool to validate
- Returns:
Always True - cost data is optional
- Return type:
bool
- class proxywhirl.strategies.core.GeoTargetedStrategy[source]¶
Geo-targeted proxy selection strategy.
Filters proxies based on geographical location (country or region) specified in the SelectionContext. Supports fallback to any proxy when no matches found.
Features:
- Country-based filtering (ISO 3166-1 alpha-2 codes)
- Region-based filtering (custom region names)
- Country takes precedence over region when both specified
- Configurable fallback behavior
- Secondary strategy for selection from filtered proxies
- Thread Safety:
Stateless per-request operations, thread-safe.
- Success Criteria:
SC-006: 100% correct region selection when available
- Performance:
O(n) filtering + O(1) or O(n) secondary selection
Initialize geo-targeted strategy.
- configure(config)[source]¶
Configure geo-targeting parameters.
- Parameters:
config (proxywhirl.models.StrategyConfig) – Strategy configuration with geo settings
- Return type:
None
- record_result(proxy, success, response_time_ms)[source]¶
Record the result of a request through a proxy.
Updates proxy completion statistics via Proxy.complete_request().
- select(pool, context=None)[source]¶
Select a proxy based on geographical targeting.
Selection logic:
1. If context has target_country: filter by country (exact match)
2. Else if context has target_region: filter by region (exact match)
3. If no target specified: use all healthy proxies
4. Apply context.failed_proxy_ids filtering
5. If filtered list empty and fallback enabled: use all healthy proxies
6. If filtered list empty and fallback disabled: raise error
7. Apply secondary strategy to filtered proxies
- Parameters:
pool (proxywhirl.models.ProxyPool) – The proxy pool to select from
context (proxywhirl.models.SelectionContext | None) – Selection context with target_country or target_region
- Returns:
Proxy matching geo criteria (or any proxy if fallback enabled)
- Raises:
ProxyPoolEmptyError – If no proxies match criteria and fallback disabled
- Return type:
proxywhirl.models.Proxy
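The selection logic for geo-targeting can be sketched as a plain function. This is an illustration under assumptions: `GeoProxy` and `geo_candidates` are hypothetical names, the fallback follows the documented wording literally (all healthy proxies), and `LookupError` stands in for `ProxyPoolEmptyError`. The secondary strategy that picks one proxy from the returned list is left out.

```python
from dataclasses import dataclass
from typing import AbstractSet, List, Optional, Sequence


@dataclass(frozen=True)
class GeoProxy:
    """Hypothetical proxy record carrying optional geo metadata."""
    id: str
    country: Optional[str] = None
    region: Optional[str] = None


def geo_candidates(
    healthy: Sequence[GeoProxy],
    target_country: Optional[str] = None,
    target_region: Optional[str] = None,
    failed_ids: AbstractSet[str] = frozenset(),
    fallback: bool = True,
) -> List[GeoProxy]:
    """Return the proxies a secondary strategy would choose from."""
    if target_country is not None:
        # Country takes precedence over region when both are given.
        matched = [p for p in healthy if p.country == target_country]
    elif target_region is not None:
        matched = [p for p in healthy if p.region == target_region]
    else:
        matched = list(healthy)
    matched = [p for p in matched if p.id not in failed_ids]
    if not matched:
        if not fallback:
            raise LookupError("no proxies match the geo criteria")
        matched = list(healthy)  # fallback: any healthy proxy
    return matched


geo_pool = [
    GeoProxy("p1", "US", "na"),
    GeoProxy("p2", "DE", "eu"),
    GeoProxy("p3"),  # no geo data: never matches a geo filter
]
german = geo_candidates(geo_pool, target_country="DE")
both_given = geo_candidates(geo_pool, target_country="US", target_region="eu")
no_match = geo_candidates(geo_pool, target_country="FR")
```

`both_given` contains only the US proxy because the country filter wins, and `no_match` falls back to the whole pool because no French proxy exists.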
- validate_metadata(pool)[source]¶
Validate that pool has geo metadata.
Geo-targeting is optional, so always returns True. Proxies without geo data will simply not match geo filters.
- Parameters:
pool (proxywhirl.models.ProxyPool) – The proxy pool to validate
- Returns:
Always True - geo data is optional
- Return type:
bool
- class proxywhirl.strategies.core.LeastUsedStrategy[source]¶
Least-used proxy selection strategy with SelectionContext support.
Selects the proxy with the fewest started requests, helping to balance load across all available proxies. Uses a min-heap for efficient O(log n) selection.
- Performance:
O(log n) selection using min-heap
O(n) heap rebuild when pool composition changes
Lazy heap invalidation for optimal performance
- Thread Safety:
Uses threading.Lock to ensure atomic select-and-mark operations, preventing TOCTOU race conditions where multiple threads could select the same “least used” proxy simultaneously.
- Implementation:
Uses a min-heap with lazy invalidation. The heap is rebuilt when:
1. Pool composition changes (detected via proxy ID set)
2. Heap becomes empty after filtering
The heap stores tuples of (requests_started, proxy_id, proxy) for efficient comparison and retrieval.
Initialize least-used strategy.
- configure(config)[source]¶
Configure the strategy with custom settings.
- Parameters:
config (proxywhirl.models.StrategyConfig)
- Return type:
None
- select(pool, context=None)[source]¶
Select the least-used healthy proxy using min-heap.
Uses min-heap for O(log n) selection. The heap is lazily rebuilt when pool composition changes, providing optimal performance for stable pools.
The selection and usage marking are performed atomically under a lock to prevent TOCTOU race conditions where multiple threads could select the same “least used” proxy simultaneously.
- Parameters:
pool (proxywhirl.models.ProxyPool) – The proxy pool to select from
context (proxywhirl.models.SelectionContext | None) – Optional selection context for filtering
- Returns:
Healthy proxy with fewest started requests
- Raises:
ProxyPoolEmptyError – If no healthy proxies are available
- Return type:
proxywhirl.models.Proxy
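The atomic select-and-mark idea can be illustrated with `heapq` and a lock. This is a simplified sketch, not the library's code: `LeastUsedPicker` is a hypothetical class that owns its own counters, stores `(requests_started, proxy_id)` pairs rather than full proxy objects, and raises `LookupError` in place of `ProxyPoolEmptyError`.

```python
import heapq
import threading
from typing import Dict, FrozenSet, List, Tuple


class LeastUsedPicker:
    """Pick the ID with the fewest started requests, atomically."""

    def __init__(self, started: Dict[str, int]) -> None:
        self._lock = threading.Lock()
        self._started = dict(started)
        self._heap: List[Tuple[int, str]] = []
        self._heap_ids: FrozenSet[str] = frozenset()

    def add(self, proxy_id: str) -> None:
        """Add a new proxy; the heap is rebuilt lazily on the next pick."""
        with self._lock:
            self._started.setdefault(proxy_id, 0)

    def pick(self) -> str:
        with self._lock:
            if not self._started:
                raise LookupError("no proxies available")
            ids = frozenset(self._started)
            if ids != self._heap_ids or not self._heap:
                # O(n) rebuild, only when the set of IDs has changed.
                self._heap = [(n, pid) for pid, n in self._started.items()]
                heapq.heapify(self._heap)
                self._heap_ids = ids
            count, proxy_id = heapq.heappop(self._heap)
            # Mark usage before releasing the lock so that no other thread
            # can observe the same proxy as "least used".
            self._started[proxy_id] = count + 1
            heapq.heappush(self._heap, (count + 1, proxy_id))
            return proxy_id


picker = LeastUsedPicker({"a": 2, "b": 0, "c": 1})
demo_picks = [picker.pick() for _ in range(4)]
```

Because the pop, the counter update, and the push all happen under one lock, two threads calling `pick()` concurrently cannot both receive the same entry at the same count. Ties are broken by ID in this sketch simply because tuples compare element by element.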
- class proxywhirl.strategies.core.PerformanceBasedStrategy(exploration_count=5)[source]¶
Performance-based proxy selection using EMA response times.
Selects proxies using weighted random selection based on inverse EMA response times - faster proxies (lower EMA) get higher weights. This adaptively favors better-performing proxies while still giving all proxies a chance to be selected.
Cold Start Handling: New proxies without performance data are given exploration trials (default: 5 trials, set by exploration_count) before being deprioritized. This ensures new proxies can build up performance data and prevents proxy starvation.
Thread Safety: Uses Python’s random.choices() which is thread-safe via GIL-protected random number generation. No additional locking required.
Initialize performance-based strategy.
- Parameters:
exploration_count (int) – Minimum trials for new proxies before performance-based selection applies. Default is 5 trials. Set to 0 to disable exploration.
- configure(config)[source]¶
Configure the strategy with custom settings.
- Parameters:
config (proxywhirl.models.StrategyConfig) – Strategy configuration with optional exploration_count
- Return type:
None
- record_result(proxy, success, response_time_ms)[source]¶
Record the result of using a proxy.
The EMA is updated using the strategy’s configured alpha value, ensuring consistent metric calculations regardless of proxy state.
- select(pool, context=None)[source]¶
Select a proxy weighted by inverse EMA response time.
Faster proxies (lower EMA) receive higher weights for selection. New proxies with insufficient trials (< exploration_count) are given priority to ensure they can build performance data.
- Parameters:
pool (proxywhirl.models.ProxyPool) – The proxy pool to select from
context (proxywhirl.models.SelectionContext | None) – Optional selection context for filtering
- Returns:
Healthy proxy chosen by performance-weighted random selection, or a new proxy still within its exploration trials
- Raises:
ProxyPoolEmptyError – If no healthy proxies are available
- Return type:
proxywhirl.models.Proxy
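A rough sketch of this selection rule follows, assuming each proxy is summarized as an `(ema_ms, trials)` pair. `pick_by_performance` is a hypothetical helper, the uniform choice among exploring proxies is an assumption, and `LookupError` stands in for `ProxyPoolEmptyError`.

```python
import random
from typing import Dict, Optional, Tuple


def pick_by_performance(
    stats: Dict[str, Tuple[Optional[float], int]],
    exploration_count: int = 5,
    rng: Optional[random.Random] = None,
) -> str:
    """Choose a proxy ID from {id: (ema_ms, trials)}."""
    if not stats:
        raise LookupError("no healthy proxies available")
    chooser = rng or random
    # Proxies with too few trials (or no EMA yet) are explored first.
    exploring = sorted(
        pid
        for pid, (ema, trials) in stats.items()
        if trials < exploration_count or ema is None
    )
    if exploring:
        return chooser.choice(exploring)
    ids = sorted(stats)
    # Inverse EMA: a lower response time yields a higher weight.
    weights = [1.0 / max(stats[pid][0], 1e-9) for pid in ids]
    return chooser.choices(ids, weights=weights, k=1)[0]


demo_stats = {"fast": (50.0, 10), "slow": (500.0, 10), "new": (None, 0)}
first_pick = pick_by_performance(demo_stats, rng=random.Random(1))
```

While `new` has fewer than five trials it is the only exploring proxy, so it is returned. Once every proxy has enough data, `fast` is drawn about ten times as often as `slow`, yet `slow` keeps a non-zero chance.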
- validate_metadata(pool)[source]¶
Validate that pool is usable for performance-based selection.
With exploration support, we only need at least one healthy proxy. Returns True if pool has healthy proxies (exploration will handle cold start).
- Returns:
True if pool has at least one healthy proxy
- Parameters:
pool (proxywhirl.models.ProxyPool)
- Return type:
bool
- class proxywhirl.strategies.core.ProxyMetrics[source]¶
Per-proxy mutable metrics maintained by a strategy.
This class encapsulates performance metrics that a strategy tracks for each proxy. By storing these separately from the Proxy model, strategies can maintain independent metric state with their own configuration (e.g., EMA alpha values).
- class proxywhirl.strategies.core.RandomStrategy[source]¶
Random proxy selection strategy with SelectionContext support.
Randomly selects a proxy from the pool of healthy proxies. Provides unpredictable rotation for scenarios where sequential patterns should be avoided.
Thread Safety: Uses Python’s random module which is thread-safe via GIL-protected random number generation. No additional locking required.
Initialize random strategy.
- configure(config)[source]¶
Configure the strategy with custom settings.
- Parameters:
config (proxywhirl.models.StrategyConfig)
- Return type:
None
- select(pool, context=None)[source]¶
Select a random healthy proxy.
- Parameters:
pool (proxywhirl.models.ProxyPool) – The proxy pool to select from
context (proxywhirl.models.SelectionContext | None) – Optional selection context for filtering
- Returns:
Randomly selected healthy proxy
- Raises:
ProxyPoolEmptyError – If no healthy proxies are available
- Return type:
proxywhirl.models.Proxy
- class proxywhirl.strategies.core.RotationStrategy[source]¶
Bases: Protocol
Protocol defining interface for proxy rotation strategies.
- select(pool, context=None)[source]¶
Select a proxy from the pool based on strategy logic.
- Parameters:
pool (proxywhirl.models.ProxyPool) – The proxy pool to select from
context (proxywhirl.models.SelectionContext | None) – Optional selection context for filtering
- Returns:
Selected proxy
- Raises:
ProxyPoolEmptyError – If no suitable proxy is available
- Return type:
proxywhirl.models.Proxy
- class proxywhirl.strategies.core.RoundRobinStrategy[source]¶
Round-robin proxy selection strategy with SelectionContext support.
Selects proxies in sequential order, wrapping around to the first proxy after reaching the end of the list. Only selects healthy proxies. Supports filtering based on SelectionContext (e.g., failed_proxy_ids).
- Thread Safety:
Uses threading.Lock to protect _current_index access, ensuring atomic index increment and preventing proxy skipping or duplicate selection in multi-threaded environments.
Initialize round-robin strategy.
- configure(config)[source]¶
Configure the strategy with custom settings.
- Parameters:
config (proxywhirl.models.StrategyConfig) – Strategy configuration object
- Return type:
None
- record_result(proxy, success, response_time_ms)[source]¶
Record the result of using a proxy.
Updates proxy statistics based on request outcome and completes the request tracking.
- select(pool, context=None)[source]¶
Select next proxy in round-robin order.
- Parameters:
pool (proxywhirl.models.ProxyPool) – The proxy pool to select from
context (proxywhirl.models.SelectionContext | None) – Optional selection context for filtering
- Returns:
Next healthy proxy in rotation
- Raises:
ProxyPoolEmptyError – If no healthy proxies are available
- Return type:
proxywhirl.models.Proxy
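The lock-protected index can be illustrated in a few lines. `RoundRobinPicker` is a hypothetical class for this sketch. It rotates over whatever list it is given and uses `LookupError` in place of `ProxyPoolEmptyError`.

```python
import threading
from typing import List, Sequence


class RoundRobinPicker:
    """Rotate through a sequence, reading and advancing the index atomically."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._current_index = 0

    def pick(self, healthy: Sequence[str]) -> str:
        if not healthy:
            raise LookupError("no healthy proxies available")
        with self._lock:
            # Read and increment under the same lock so concurrent callers
            # neither skip an entry nor receive the same one twice.
            chosen = healthy[self._current_index % len(healthy)]
            self._current_index = (self._current_index + 1) % len(healthy)
        return chosen


rr = RoundRobinPicker()
rotation: List[str] = [rr.pick(["a", "b", "c"]) for _ in range(4)]
```

Taking the index modulo the current length on every call keeps the sketch valid even when the healthy list shrinks between calls.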
- class proxywhirl.strategies.core.SessionManager(max_sessions=10000, auto_cleanup_threshold=100)[source]¶
Thread-safe session manager for sticky proxy assignments.
Manages the mapping between session IDs and their assigned proxies, with automatic expiration and cleanup. All operations are thread-safe.
Features:
- Automatic TTL-based expiration
- LRU eviction when max_sessions limit is reached
- Periodic cleanup of expired sessions
Initialize the session manager.
- Parameters:
- cleanup_expired()[source]¶
Remove all expired sessions.
- Returns:
Number of expired sessions removed
- Return type:
int
- create_session(session_id, proxy, timeout_seconds=300)[source]¶
Create or update a session assignment.
- get_all_sessions()[source]¶
Get all active (non-expired) sessions.
- Returns:
List of active Session objects
- Return type:
list[proxywhirl.models.Session]
- get_session(session_id)[source]¶
Get an active session by ID.
- Parameters:
session_id (str) – The session ID to look up
- Returns:
Session object if found and not expired, None otherwise
- Return type:
proxywhirl.models.Session | None
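The combination of TTL expiration and LRU eviction can be sketched with an `OrderedDict`. `DemoSessionStore` is a hypothetical class that stores proxy URLs instead of Session objects. The injectable `clock` parameter is an assumption added so that expiry can be demonstrated without sleeping, and the `auto_cleanup_threshold` behavior is not modeled.

```python
import threading
import time
from collections import OrderedDict
from typing import Callable, Optional, Tuple


class DemoSessionStore:
    """Session ID -> (proxy URL, expiry time), with TTL and LRU eviction."""

    def __init__(
        self,
        max_sessions: int = 10000,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._lock = threading.Lock()
        self._sessions: "OrderedDict[str, Tuple[str, float]]" = OrderedDict()
        self._max_sessions = max_sessions
        self._clock = clock

    def create(self, session_id: str, proxy_url: str, timeout_seconds: float = 300) -> None:
        with self._lock:
            self._sessions.pop(session_id, None)
            # Evict the least recently used entries once the limit is reached.
            while self._sessions and len(self._sessions) >= self._max_sessions:
                self._sessions.popitem(last=False)
            self._sessions[session_id] = (proxy_url, self._clock() + timeout_seconds)

    def get(self, session_id: str) -> Optional[str]:
        with self._lock:
            entry = self._sessions.get(session_id)
            if entry is None:
                return None
            proxy_url, expires_at = entry
            if self._clock() >= expires_at:
                del self._sessions[session_id]  # expired: drop on access
                return None
            self._sessions.move_to_end(session_id)  # mark as recently used
            return proxy_url

    def cleanup_expired(self) -> int:
        with self._lock:
            now = self._clock()
            expired = [sid for sid, (_, exp) in self._sessions.items() if now >= exp]
            for sid in expired:
                del self._sessions[sid]
            return len(expired)


fake_now = [0.0]
store = DemoSessionStore(max_sessions=2, clock=lambda: fake_now[0])
store.create("s1", "http://a.example:8080", timeout_seconds=10)
store.create("s2", "http://b.example:8080", timeout_seconds=10)
store.get("s1")  # touching s1 makes s2 the least recently used entry
store.create("s3", "http://c.example:8080", timeout_seconds=10)
```

After the third `create`, `s2` has been evicted because it was the least recently used, while `s1` survives until the fake clock passes its expiry.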
- class proxywhirl.strategies.core.SessionPersistenceStrategy(max_sessions=10000, auto_cleanup_threshold=100)[source]¶
Session persistence strategy (sticky sessions).
Maintains consistent proxy assignment for a given session ID across multiple requests. Ensures that all requests within a session use the same proxy unless the proxy becomes unavailable.
Features:
- Session-to-proxy binding with configurable TTL
- Automatic failover when assigned proxy becomes unhealthy
- Thread-safe session management
- Session expiration and cleanup
- Thread Safety:
Uses SessionManager which has internal locking for thread-safe operations.
- Success Criteria:
SC-005: 99.9% same-proxy guarantee for session requests
- Performance:
O(1) session lookup, <1ms overhead for session management
Initialize session persistence strategy.
- Parameters:
- cleanup_expired_sessions()[source]¶
Remove expired sessions.
- Returns:
Number of sessions removed
- Return type:
int
- close_session(session_id)[source]¶
Explicitly close a session.
- Parameters:
session_id (str) – The session ID to close
- Return type:
None
- configure(config)[source]¶
Configure session persistence parameters.
- Parameters:
config (proxywhirl.models.StrategyConfig) – Strategy configuration with session_stickiness_duration_seconds
- Return type:
None
- record_result(proxy, success, response_time_ms)[source]¶
Record the result of a request through a proxy.
Updates proxy completion statistics via Proxy.complete_request().
- select(pool, context=None)[source]¶
Select a proxy with session persistence.
If session_id exists and proxy is healthy, returns same proxy. If session_id is new or assigned proxy is unhealthy, assigns new proxy.
- Parameters:
pool (proxywhirl.models.ProxyPool) – The proxy pool to select from
context (proxywhirl.models.SelectionContext | None) – Selection context with session_id (required)
- Returns:
Healthy proxy assigned to the session
- Raises:
ValueError – If context is None or session_id is missing
ProxyPoolEmptyError – If no healthy proxies available
- Return type:
proxywhirl.models.Proxy
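The sticky-with-failover rule reduces to a small function. In this sketch `sticky_select` is a hypothetical helper, assignments are kept in a plain dict with no TTL or locking, the default `random.choice` fallback is an assumption, and `LookupError` stands in for `ProxyPoolEmptyError`.

```python
import random
from typing import AbstractSet, Callable, Dict, List


def sticky_select(
    session_id: str,
    assignments: Dict[str, str],
    healthy_urls: AbstractSet[str],
    choose: Callable[[List[str]], str] = random.choice,
) -> str:
    """Return the session's proxy, reassigning only if it is no longer healthy."""
    if not session_id:
        raise ValueError("session_id is required")
    if not healthy_urls:
        raise LookupError("no healthy proxies available")
    current = assignments.get(session_id)
    if current in healthy_urls:
        return current  # same proxy for every request in the session
    # New session, or the assigned proxy went unhealthy: fail over.
    replacement = choose(sorted(healthy_urls))
    assignments[session_id] = replacement
    return replacement


def pick_first(urls: List[str]) -> str:
    """Deterministic chooser used only for this demonstration."""
    return urls[0]


bindings: Dict[str, str] = {}
first = sticky_select("user-1", bindings, {"http://a", "http://b"}, pick_first)
second = sticky_select("user-1", bindings, {"http://a", "http://b"}, pick_first)
after_failure = sticky_select("user-1", bindings, {"http://b"}, pick_first)
```

The first two calls return the same proxy. Once that proxy drops out of the healthy set, the session is rebound to the remaining one and stays there.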
- validate_metadata(pool)[source]¶
Validate that pool has necessary metadata for strategy.
Session persistence doesn’t require specific proxy metadata.
- Parameters:
pool (proxywhirl.models.ProxyPool) – The proxy pool to validate
- Returns:
Always True - session persistence works with any pool
- Return type:
bool
- class proxywhirl.strategies.core.StrategyRegistry[source]¶
Singleton registry for custom rotation strategies.
Allows registration and retrieval of custom strategy implementations, enabling plugin architecture for ProxyWhirl.
Example
>>> from proxywhirl.strategies import StrategyRegistry
>>>
>>> # Create custom strategy
>>> class MyStrategy:
...     def select(self, pool, context=None):
...         return pool.get_all_proxies()[0]
...     def record_result(self, proxy, success, response_time_ms):
...         pass
>>>
>>> # Register it
>>> registry = StrategyRegistry()
>>> registry.register_strategy("my-strategy", MyStrategy)
>>>
>>> # Retrieve and use
>>> strategy_class = registry.get_strategy("my-strategy")
>>> strategy = strategy_class()
- Thread Safety:
Thread-safe singleton implementation using double-checked locking.
- Performance:
Registration: O(1)
Retrieval: O(1)
Validation: <1ms per strategy (SC-010)
Initialize the registry (called once by __new__).
- get_strategy(name)[source]¶
Retrieve a registered strategy class.
- Parameters:
name (str) – Strategy name used during registration
- Returns:
Strategy class (not instance - caller must instantiate)
- Raises:
KeyError – If strategy name not found in registry
- Return type:
Example
>>> registry = StrategyRegistry()
>>> strategy_class = registry.get_strategy("my-strategy")
>>> strategy = strategy_class()  # Instantiate
- register_strategy(name, strategy_class, *, validate=True)[source]¶
Register a custom strategy.
- Parameters:
- Raises:
ValueError – If strategy name already registered (unless re-registering)
TypeError – If strategy doesn’t implement required protocol methods
- Return type:
None
Example
>>> class FastStrategy:
...     def select(self, pool, context=None):
...         return pool.get_all_proxies()[0]
...     def record_result(self, proxy, success, response_time_ms):
...         pass
>>>
>>> registry = StrategyRegistry()
>>> registry.register_strategy("fast", FastStrategy)
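A singleton registry with double-checked locking might look like the sketch below. `DemoRegistry` and `FirstProxyStrategy` are hypothetical. Checking for `select` and `record_result` mirrors the methods shown in the examples rather than a confirmed validation rule, and allowing the same class to be registered again under its name is one reading of "unless re-registering".

```python
import threading
from typing import Dict, Optional


class DemoRegistry:
    """Process-wide registry mapping names to strategy classes."""

    _instance: Optional["DemoRegistry"] = None
    _instance_lock = threading.Lock()

    def __new__(cls) -> "DemoRegistry":
        if cls._instance is None:  # first check, without the lock
            with cls._instance_lock:
                if cls._instance is None:  # second check, under the lock
                    instance = super().__new__(cls)
                    instance._strategies = {}
                    instance._lock = threading.Lock()
                    cls._instance = instance
        return cls._instance

    def register(self, name: str, strategy_class: type, *, validate: bool = True) -> None:
        if validate:
            missing = [
                method
                for method in ("select", "record_result")
                if not callable(getattr(strategy_class, method, None))
            ]
            if missing:
                raise TypeError(f"{strategy_class.__name__} lacks: {', '.join(missing)}")
        with self._lock:
            existing = self._strategies.get(name)
            if existing is not None and existing is not strategy_class:
                raise ValueError(f"strategy name already registered: {name}")
            self._strategies[name] = strategy_class

    def get(self, name: str) -> type:
        with self._lock:
            return self._strategies[name]  # KeyError if unknown


class FirstProxyStrategy:
    """Trivial strategy used only to exercise the registry."""

    def select(self, pool, context=None):
        return pool[0]

    def record_result(self, proxy, success, response_time_ms):
        pass


demo_registry = DemoRegistry()
demo_registry.register("first", FirstProxyStrategy)
```

The unlocked first check keeps the common path cheap once the instance exists. The second check under the lock prevents two threads from both creating an instance during the initial race.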
- class proxywhirl.strategies.core.StrategyState[source]¶
Per-strategy mutable state for managing proxy metrics.
This class separates mutable strategy state from immutable proxy identity. Each strategy instance maintains its own StrategyState, which tracks per-proxy metrics independently. This allows different strategies to:
- Use different EMA alpha values without conflicts
- Track proxy performance independently
- Maintain consistent metrics across strategy reconfiguration
The state is keyed by proxy UUID to ensure stable identity even if proxy objects are recreated.
Example
>>> state = StrategyState(ema_alpha=0.3)
>>> state.record_success(proxy.id, response_time_ms=150.0)
>>> metrics = state.get_metrics(proxy.id)
>>> print(metrics.ema_response_time_ms)  # 150.0
- Thread Safety:
Uses threading.Lock to protect all state mutations.
- get_metrics(proxy_id)[source]¶
Get or create metrics for a proxy.
- Parameters:
proxy_id (uuid.UUID) – UUID of the proxy
- Returns:
ProxyMetrics instance for this proxy
- Return type:
ProxyMetrics
- record_failure(proxy_id)[source]¶
Record a failed request.
- Parameters:
proxy_id (uuid.UUID) – UUID of the proxy
- Return type:
None
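To show why keeping metrics in per-strategy state matters, the sketch below tracks the same proxy UUID in two independent states with different alpha values. `DemoState` and `DemoMetrics` are hypothetical. The update is the standard exponential moving average, `ema = alpha * sample + (1 - alpha) * ema`, seeded with the first sample, which is consistent with the example above but not confirmed as the library's exact rule.

```python
import threading
import uuid
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class DemoMetrics:
    """Hypothetical per-proxy metrics record."""
    ema_response_time_ms: Optional[float] = None
    successes: int = 0
    failures: int = 0


class DemoState:
    """Per-strategy metrics keyed by proxy UUID, guarded by a lock."""

    def __init__(self, ema_alpha: float = 0.2) -> None:
        self.ema_alpha = ema_alpha
        self._lock = threading.Lock()
        self._metrics: Dict[uuid.UUID, DemoMetrics] = {}

    def get_metrics(self, proxy_id: uuid.UUID) -> DemoMetrics:
        with self._lock:
            return self._metrics.setdefault(proxy_id, DemoMetrics())

    def record_success(self, proxy_id: uuid.UUID, response_time_ms: float) -> None:
        with self._lock:
            metrics = self._metrics.setdefault(proxy_id, DemoMetrics())
            previous = metrics.ema_response_time_ms
            if previous is None:
                metrics.ema_response_time_ms = response_time_ms  # seed with first sample
            else:
                metrics.ema_response_time_ms = (
                    self.ema_alpha * response_time_ms + (1.0 - self.ema_alpha) * previous
                )
            metrics.successes += 1

    def record_failure(self, proxy_id: uuid.UUID) -> None:
        with self._lock:
            self._metrics.setdefault(proxy_id, DemoMetrics()).failures += 1


shared_proxy_id = uuid.uuid4()
smooth = DemoState(ema_alpha=0.5)
reactive = DemoState(ema_alpha=1.0)
for state in (smooth, reactive):
    state.record_success(shared_proxy_id, 150.0)
    state.record_success(shared_proxy_id, 50.0)
smooth.record_failure(shared_proxy_id)
```

Both states saw the same two samples for the same proxy, yet each holds its own EMA and its own failure count, so neither strategy's configuration leaks into the other.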
- class proxywhirl.strategies.core.WeightedStrategy[source]¶
Weighted proxy selection strategy with SelectionContext support.
Selects proxies based on custom weights or success rates. When custom weights are provided via StrategyConfig, they take precedence. Otherwise, weights are derived from success_rate. Uses weighted random selection to favor higher-performing proxies while still giving all proxies a chance.
Supports:
- Custom weights via StrategyConfig.weights (proxy URL -> weight mapping)
- Fallback to success_rate-based weights
- Minimum weight (0.1) to ensure all proxies have selection chance
- SelectionContext for filtering (e.g., failed_proxy_ids)
- Weight caching to avoid O(n) recalculation on every selection
- Thread Safety:
Uses threading.Lock to protect weight cache access, ensuring atomic cache validation and update operations. Prevents race conditions where multiple threads could trigger duplicate weight recalculations or inconsistent cache states.
Initialize weighted strategy.
- configure(config)[source]¶
Configure the strategy with custom settings.
Invalidates the weight cache since configuration changes may affect weights.
- Parameters:
config (proxywhirl.models.StrategyConfig) – Strategy configuration object with optional custom weights
- Return type:
None
- record_result(proxy, success, response_time_ms)[source]¶
Record the result of using a proxy.
Updates proxy statistics based on request outcome and invalidates the weight cache since success rates may have changed.
Thread-safe: Uses double-checked locking pattern to ensure atomic invalidation and update. This prevents race conditions where another thread could select using stale weights while proxy stats are being updated.
The lock ensures:
1. No thread can read cached weights between invalidation and stat update
2. Proxy stat updates are atomic with cache invalidation
3. Multiple concurrent record_result() calls don’t interfere
- select(pool, context=None)[source]¶
Select a proxy weighted by custom weights or success rate.
Uses cached weights when possible to avoid O(n) recalculation on every call. Cache is invalidated when the proxy set changes (different IDs).
- Parameters:
pool (proxywhirl.models.ProxyPool) – The proxy pool to select from
context (proxywhirl.models.SelectionContext | None) – Optional selection context for filtering
- Returns:
Weighted-random selected healthy proxy
- Raises:
ProxyPoolEmptyError – If no healthy proxies are available
- Return type:
proxywhirl.models.Proxy
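The weight derivation and caching can be sketched as follows. `WeightCache` is a hypothetical class, the cache key (the set of proxy IDs) follows the description above, and the `recalculations` counter exists only to make the caching visible. Whether the library applies the 0.1 floor to custom weights as well is an assumption here.

```python
import threading
from typing import Dict, FrozenSet, Optional


class WeightCache:
    """Cache selection weights until the proxy set changes or is invalidated."""

    def __init__(self, min_weight: float = 0.1) -> None:
        self.min_weight = min_weight
        self.recalculations = 0
        self._lock = threading.Lock()
        self._key: Optional[FrozenSet[str]] = None
        self._weights: Optional[Dict[str, float]] = None

    def weights_for(
        self,
        success_rates: Dict[str, float],
        custom: Optional[Dict[str, float]] = None,
    ) -> Dict[str, float]:
        key = frozenset(success_rates)
        with self._lock:
            if self._weights is None or key != self._key:
                overrides = custom or {}
                # Custom weights take precedence; success_rate is the fallback.
                self._weights = {
                    pid: max(overrides.get(pid, rate), self.min_weight)
                    for pid, rate in success_rates.items()
                }
                self._key = key
                self.recalculations += 1
            return dict(self._weights)

    def invalidate(self) -> None:
        """Called after stats or configuration change."""
        with self._lock:
            self._weights = None


cache = WeightCache()
rates = {"a": 0.9, "b": 0.0}
first_weights = cache.weights_for(rates)
cached_weights = cache.weights_for(rates)  # same IDs: served from cache
```

The proxy with a zero success rate still receives the minimum weight, so it keeps a small chance of being selected and of recovering its statistics.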
- validate_metadata(pool)[source]¶
Validate that pool has required metadata for weighted selection.
Weighted strategy can work with success_rate (always available) or custom weights.
- Parameters:
pool (proxywhirl.models.ProxyPool) – The proxy pool to validate
- Returns:
Always True as success_rate is always available
- Return type:
bool