Retry & Failover Guide

ProxyWhirl provides intelligent retry logic, circuit breaker protection, and automatic proxy failover to maximize request success rates. This guide covers the complete retry and failover system, from basic configuration to advanced observability.

Architecture Overview

The retry system consists of four main components:

RetryPolicy - Configures retry behavior and backoff strategies
RetryExecutor - Orchestrates retry logic with intelligent proxy selection
CircuitBreaker - Protects against cascading failures using state machine transitions
RetryMetrics - Collects observability data for monitoring and analysis

When to Use Retry vs Failover

Retry (Same Proxy)

Use retry when:

Network is temporarily unstable (connection timeout, packet loss)
Target server returns 502/503/504 (gateway errors, service temporarily unavailable)
Request is idempotent (GET, HEAD, OPTIONS, DELETE, PUT)

Failover (Different Proxy)

Use failover when:

Proxy authentication fails (407)
Proxy consistently returns errors (circuit breaker opens)
Specific geo-targeting or performance requirements exist
One proxy exhausts rate limits

ProxyWhirl combines both: it retries with backoff on the current proxy, then fails over to a better proxy if retries are exhausted.
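The decision rules above can be summarized in a small helper. This is an illustrative sketch rather than ProxyWhirl's internal logic: `Action` and `choose_action_on_failure` are hypothetical names, and the status and method sets are copied from the lists in this section.

```python
from enum import Enum
from typing import Optional


class Action(Enum):
    RETRY_SAME_PROXY = "retry"
    FAILOVER = "failover"
    FAIL = "fail"


RETRYABLE_STATUS = {502, 503, 504}  # gateway errors listed above
IDEMPOTENT_METHODS = {"GET", "HEAD", "OPTIONS", "DELETE", "PUT"}


def choose_action_on_failure(
    method: str,
    status: Optional[int],
    attempts_left: int,
    circuit_open: bool = False,
) -> Action:
    """Decide what to do after a failed attempt.

    status is the HTTP error status, or None for a network-level error.
    """
    if status == 407 or circuit_open:
        return Action.FAILOVER  # the proxy itself is the problem
    transient = status is None or status in RETRYABLE_STATUS
    if not transient:
        return Action.FAIL  # repeating the request will not help
    if method.upper() not in IDEMPOTENT_METHODS:
        return Action.FAIL  # not safe to repeat by default
    # Retry on the same proxy while attempts remain, then switch proxy
    return Action.RETRY_SAME_PROXY if attempts_left > 0 else Action.FAILOVER
```

For example, a GET that receives a 503 with attempts remaining is retried on the same proxy, while the same response with no attempts left triggers failover.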

RetryPolicy Configuration

Basic Configuration

from proxywhirl import ProxyWhirl, RetryPolicy, BackoffStrategy

# Default policy: 3 attempts, exponential backoff
rotator = ProxyWhirl()

# Custom policy
policy = RetryPolicy(
    max_attempts=5,                          # Maximum retry attempts
    backoff_strategy=BackoffStrategy.EXPONENTIAL,
    base_delay=1.0,                          # Initial delay (seconds)
    multiplier=2.0,                          # Exponential multiplier
    max_backoff_delay=30.0,                  # Maximum delay cap
    jitter=True,                             # AWS decorrelated jitter
    retry_status_codes=[502, 503, 504],      # Retryable HTTP errors
    timeout=60.0,                            # Total timeout for all attempts
    retry_non_idempotent=False,              # Don't retry POST by default
)

rotator = ProxyWhirl(retry_policy=policy)

Backoff Strategies

ProxyWhirl supports three backoff strategies:

Exponential Backoff (Recommended)

Best for network failures and overloaded servers. Delays increase exponentially to give systems time to recover.

policy = RetryPolicy(
    backoff_strategy=BackoffStrategy.EXPONENTIAL,
    base_delay=1.0,
    multiplier=2.0,
    max_backoff_delay=30.0,
    jitter=True,  # AWS decorrelated jitter
)

# Attempt 0: random(0, 1.0s)
# Attempt 1: random(1.0s, previous * 3), capped at 30s
# Attempt 2: random(1.0s, previous * 3), capped at 30s
# Each delay decorrelated from the previous (AWS algorithm)

Linear Backoff

Best for predictable retry patterns. Delays increase linearly.

policy = RetryPolicy(
    backoff_strategy=BackoffStrategy.LINEAR,
    base_delay=2.0,
    max_backoff_delay=10.0,
)

# Attempt 0: 2.0s
# Attempt 1: 4.0s
# Attempt 2: 6.0s
# Attempt 3: 8.0s
# Attempt 4: 10.0s (capped)

Fixed Backoff

Best for testing or when delays should be constant.

policy = RetryPolicy(
    backoff_strategy=BackoffStrategy.FIXED,
    base_delay=5.0,
)

# All attempts: 5.0s
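To compare the three strategies side by side, the jitter-free schedules can be computed with a few lines of arithmetic. This is a sketch under assumptions: `backoff_delay` is a hypothetical helper, the linear formula `base_delay * (attempt + 1)` is inferred from the example values above, and applying the cap to the fixed strategy is a simplification.

```python
def backoff_delay(
    strategy: str,
    attempt: int,
    base_delay: float,
    multiplier: float = 2.0,
    max_backoff_delay: float = 30.0,
) -> float:
    """Return the jitter-free delay in seconds for a zero-based attempt index."""
    if strategy == "exponential":
        raw = base_delay * multiplier ** attempt
    elif strategy == "linear":
        raw = base_delay * (attempt + 1)
    elif strategy == "fixed":
        raw = base_delay
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    # Never wait longer than the configured cap
    return min(raw, max_backoff_delay)


linear = [backoff_delay("linear", a, 2.0, max_backoff_delay=10.0) for a in range(6)]
exponential = [backoff_delay("exponential", a, 1.0) for a in range(6)]
print(linear)       # [2.0, 4.0, 6.0, 8.0, 10.0, 10.0]
print(exponential)  # [1.0, 2.0, 4.0, 8.0, 16.0, 30.0]
```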

Jitter Explained

Jitter uses the AWS decorrelated jitter algorithm to prevent synchronized retries across multiple clients. Instead of simple randomization, each retry delay depends on the previous delay:

# AWS decorrelated jitter formula:
#   delay = min(cap, random(base_delay, previous_delay * 3))
#
# First attempt: uniform random from 0 to base delay
# Subsequent attempts: decorrelated from previous delay
#
# Without jitter: All clients retry at exactly 1.0s, 2.0s, 4.0s...
# With jitter: Clients retry at decorrelated random times:
#   Client A: 0.7s, 1.3s, 2.8s...
#   Client B: 0.4s, 1.1s, 2.9s...
#   Client C: 0.9s, 1.1s, 3.1s...

This prevents “thundering herd” problems where many clients overwhelm a recovering server. See the AWS Architecture Blog for background on decorrelated jitter.
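The formula above can be sketched in a few lines. This is an illustration of the decorrelated jitter idea, not ProxyWhirl's actual code: `decorrelated_jitter_delays` is a hypothetical name, and clamping the upper bound to at least `base_delay` (so the random range is never inverted) is an assumption.

```python
import random


def decorrelated_jitter_delays(
    base_delay: float, cap: float, attempts: int, rng: random.Random
) -> list[float]:
    """Generate one client's sequence of retry delays in seconds."""
    delays: list[float] = []
    previous = None
    for _ in range(attempts):
        if previous is None:
            # First attempt: uniform random between 0 and the base delay
            delay = rng.uniform(0.0, base_delay)
        else:
            # Later attempts: random between base and 3x the previous delay
            upper = max(base_delay, previous * 3)
            delay = min(cap, rng.uniform(base_delay, upper))
        delays.append(delay)
        previous = delay
    return delays


# Two clients with different random streams produce different schedules
print(decorrelated_jitter_delays(1.0, 30.0, 4, random.Random(1)))
print(decorrelated_jitter_delays(1.0, 30.0, 4, random.Random(2)))
```

Because each delay is drawn from a range that depends on the previous draw, two clients that fail at the same instant drift apart quickly instead of retrying in lockstep.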

Retryable Status Codes

By default, ProxyWhirl retries on gateway errors:

# Default retry status codes
retry_status_codes = [502, 503, 504]

# Custom status codes (must be 5xx)
policy = RetryPolicy(
    retry_status_codes=[500, 502, 503, 504, 507, 508]
)

4xx errors (client errors) are never retried as they indicate permanent failures.

Timeout Behavior

The timeout parameter caps total execution time across all retry attempts:

policy = RetryPolicy(
    max_attempts=10,
    timeout=30.0,  # Total timeout for all attempts
)

# If 30s elapses after 3 attempts, remaining 7 attempts are skipped
# Raises: ProxyConnectionError("Request timeout after 30.00s")
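A total timeout of this kind is usually implemented as a deadline checked before each attempt. The sketch below shows the idea with an injectable clock so it can be exercised without waiting. `run_with_total_timeout` is a hypothetical helper, and the built-in `TimeoutError` stands in for `ProxyConnectionError`.

```python
import time
from typing import Callable


def run_with_total_timeout(
    attempt_fn: Callable[[], bool],
    max_attempts: int,
    timeout: float,
    delay: float = 0.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Return the number of attempts used; raise if the time budget runs out."""
    start = clock()
    for attempt in range(max_attempts):
        elapsed = clock() - start
        if elapsed >= timeout:
            # Budget spent: skip the remaining attempts
            raise TimeoutError(f"Request timeout after {elapsed:.2f}s")
        if attempt_fn():
            return attempt + 1
        remaining = timeout - (clock() - start)
        if remaining > 0 and attempt < max_attempts - 1:
            # Never sleep past the deadline
            sleep(min(delay, remaining))
    raise RuntimeError(f"Request failed after {max_attempts} attempts")
```

With `max_attempts=10`, `timeout=30.0`, and attempts that each take 10 seconds, the loop stops after the third attempt, which matches the behavior described above.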

Non-Idempotent Requests

By default, POST/PATCH requests are not retried (they’re not idempotent):

# Default: POST fails immediately without retry
rotator.request("POST", url, json=data)

# Enable retries for POST (use with caution!)
policy = RetryPolicy(retry_non_idempotent=True)
rotator = ProxyWhirl(retry_policy=policy)

# Now POST will retry on network failures
rotator.request("POST", url, json=data)

Warning: Only enable retry_non_idempotent if your API is idempotent (e.g., uses idempotency keys).
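One common way to make POST safe to retry is to send the same idempotency key with every attempt, so the server can recognize and discard duplicates. The helper below is a hypothetical sketch: the `Idempotency-Key` header name follows a widely used convention, but whether your target API honors it, and how custom headers are passed through `rotator.request()`, are assumptions to verify.

```python
import uuid
from typing import Dict, Optional


def with_idempotency_key(
    headers: Optional[Dict[str, str]] = None, key: Optional[str] = None
) -> Dict[str, str]:
    """Return a copy of headers that carries an Idempotency-Key.

    An existing key is preserved so every retry of one logical
    request reuses the same value.
    """
    merged = dict(headers or {})
    merged.setdefault("Idempotency-Key", key or str(uuid.uuid4()))
    return merged


# Build the headers once, before the first attempt, and reuse them on retries
headers = with_idempotency_key({"Accept": "application/json"})
```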

Circuit Breaker Configuration

Circuit breakers protect against cascading failures by temporarily removing unhealthy proxies from the rotation pool.

Sync vs Async Circuit Breakers

ProxyWhirl provides two circuit breaker implementations. See Async Client Guide for guidance on choosing between sync and async patterns.

CircuitBreaker - Synchronous implementation using threading.Lock
AsyncCircuitBreaker - Async implementation using asyncio-compatible locks

When to use which:

# ✅ For synchronous code - use CircuitBreaker
from proxywhirl import CircuitBreaker

cb = CircuitBreaker(proxy_id="proxy-1")
cb.record_failure()  # Thread-safe
if cb.should_attempt_request():
    cb.record_success()

# ✅ For async code - use AsyncCircuitBreaker
from proxywhirl.circuit_breaker import AsyncCircuitBreaker

cb = AsyncCircuitBreaker(proxy_id="proxy-1")
await cb.record_failure()  # Event loop safe
if await cb.should_attempt_request():
    await cb.record_success()

WARNING: Do NOT mix sync locks with async code. The CircuitBreaker class uses threading.Lock internally which can block the event loop. For production async applications, always use AsyncCircuitBreaker.

State Machine

Circuit breakers transition through three states:

CLOSED → OPEN → HALF_OPEN → CLOSED
          ↑         │
          └─────────┘ (test request fails)

CLOSED - Normal operation, proxy is available
OPEN - Proxy excluded from rotation (too many failures)
HALF_OPEN - Testing recovery with limited requests

Thresholds and Configuration

from proxywhirl import CircuitBreakerConfig
from proxywhirl.circuit_breaker import CircuitBreaker

# Circuit breakers are created automatically by ProxyWhirl
# Access via rotator.circuit_breakers dict

proxy = rotator.get_proxy()
cb = rotator.circuit_breakers[str(proxy.id)]

# Configuration (set on CircuitBreaker creation)
print(cb.failure_threshold)   # Default: 5 failures
print(cb.window_duration)     # Default: 60 seconds (rolling window)
print(cb.timeout_duration)    # Default: 30 seconds (OPEN timeout)

State Transitions

CLOSED → OPEN

Circuit opens when the failure count reaches the threshold within the rolling window:

cb = CircuitBreaker(
    proxy_id=str(proxy.id),
    failure_threshold=5,
    window_duration=60.0,
)

# Record 5 failures within 60 seconds
for _ in range(5):
    cb.record_failure()

print(cb.state)  # CircuitBreakerState.OPEN
print(cb.failure_count)  # 5

OPEN → HALF_OPEN

Circuit transitions to HALF_OPEN after timeout duration elapses:

import time

# Circuit is OPEN
print(cb.state)  # CircuitBreakerState.OPEN

# Check after timeout (30s default)
time.sleep(30)

# Next request triggers transition to HALF_OPEN
if cb.should_attempt_request():
    print(cb.state)  # CircuitBreakerState.HALF_OPEN

HALF_OPEN → CLOSED

Circuit closes if test request succeeds:

# Circuit is HALF_OPEN
cb.record_success()

print(cb.state)  # CircuitBreakerState.CLOSED
print(cb.failure_count)  # 0 (reset)

HALF_OPEN → OPEN

Circuit reopens if test request fails:

# Circuit is HALF_OPEN
cb.record_failure()

print(cb.state)  # CircuitBreakerState.OPEN
# Timeout duration resets, must wait another 30s

Rolling Window Behavior

The circuit breaker uses a sliding window to track recent failures:

cb = CircuitBreaker(
    proxy_id=str(proxy.id),
    failure_threshold=3,
    window_duration=60.0,
)

# At t=0: Record 2 failures
cb.record_failure()  # failure_count = 1
cb.record_failure()  # failure_count = 2
print(cb.state)  # CLOSED (below threshold)

# At t=65: Old failures expired (outside 60s window)
time.sleep(65)
print(cb.failure_count)  # 0 (window cleaned automatically)

# New failure doesn't trigger circuit
cb.record_failure()  # failure_count = 1
print(cb.state)  # CLOSED
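The transitions and the rolling window described in this section fit in a compact state machine. The class below is a single-threaded teaching sketch, not the library's `CircuitBreaker`: `SketchBreaker` and `State` are hypothetical names, locking is omitted, and the clock is injectable so the behavior can be checked without sleeping.

```python
import time
from collections import deque
from enum import Enum


class State(Enum):
    CLOSED = "closed"
    OPEN = "open"
    HALF_OPEN = "half_open"


class SketchBreaker:
    def __init__(self, failure_threshold=5, window_duration=60.0,
                 timeout_duration=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.window_duration = window_duration
        self.timeout_duration = timeout_duration
        self.clock = clock
        self.state = State.CLOSED
        self.failures = deque()  # timestamps of recent failures
        self.opened_at = None

    @property
    def failure_count(self) -> int:
        # Drop failures that have aged out of the rolling window
        cutoff = self.clock() - self.window_duration
        while self.failures and self.failures[0] < cutoff:
            self.failures.popleft()
        return len(self.failures)

    def record_failure(self) -> None:
        self.failures.append(self.clock())
        # A failed test request, or too many recent failures, opens the circuit
        if self.state is State.HALF_OPEN or self.failure_count >= self.failure_threshold:
            self.state = State.OPEN
            self.opened_at = self.clock()

    def record_success(self) -> None:
        if self.state is State.HALF_OPEN:
            self.state = State.CLOSED
            self.failures.clear()

    def should_attempt_request(self) -> bool:
        # After the OPEN timeout, allow a test request (HALF_OPEN)
        if self.state is State.OPEN and self.clock() - self.opened_at >= self.timeout_duration:
            self.state = State.HALF_OPEN
        return self.state is not State.OPEN
```

Storing failure timestamps in a deque keeps pruning cheap, because expired entries are always at the left end.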

Manual Reset

Reset circuit breaker to CLOSED state manually:

# Force reset (useful for testing or manual intervention)
cb.reset()

print(cb.state)  # CircuitBreakerState.CLOSED
print(cb.failure_count)  # 0

Intelligent Proxy Selection

When retries are exhausted, ProxyWhirl automatically selects the best alternative proxy using performance-based scoring.

Selection Algorithm

The executor scores each candidate proxy using:

score = (0.7 × success_rate) + (0.3 × (1 - normalized_latency))

70% weight on success rate (reliability)
30% weight on latency (performance)
10% bonus for geo-targeting match (optional)
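The scoring formula can be tried out on plain data. The sketch below makes two assumptions that the formula does not spell out: latency is normalized by dividing by the slowest candidate's latency, and the geo bonus is applied as a 1.10 multiplier. `Candidate`, `score`, and `pick_best` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:  # hypothetical stand-in for a proxy record
    name: str
    success_rate: float  # 0.0 to 1.0
    avg_latency: float   # seconds
    region: Optional[str] = None


def score(c: Candidate, max_latency: float, target_region: Optional[str] = None) -> float:
    # Assumption: normalize latency against the slowest candidate
    normalized_latency = c.avg_latency / max_latency if max_latency > 0 else 0.0
    value = 0.7 * c.success_rate + 0.3 * (1.0 - normalized_latency)
    if target_region is not None and c.region == target_region:
        value *= 1.10  # assumption: the 10% bonus is multiplicative
    return value


def pick_best(candidates: List[Candidate], target_region: Optional[str] = None) -> Optional[Candidate]:
    if not candidates:
        return None
    max_latency = max(c.avg_latency for c in candidates)
    return max(candidates, key=lambda c: score(c, max_latency, target_region))


us = Candidate("us", success_rate=0.80, avg_latency=0.2, region="US-EAST")
eu = Candidate("eu", success_rate=0.85, avg_latency=0.2, region="EU-WEST")
print(pick_best([us, eu]).name)                           # eu
print(pick_best([us, eu], target_region="US-EAST").name)  # us
```

With equal latency, the EU candidate wins on success rate alone (0.595 vs 0.56), but the region bonus lifts the US candidate to 0.616 when US-EAST is requested.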

Example: Performance-Based Selection

from proxywhirl import Proxy, ProxyWhirl

# Create proxies with different success rates
proxy1 = Proxy(url="http://proxy1.example.com:8080")
proxy1.total_requests = 100
proxy1.total_successes = 95  # 95% success rate

proxy2 = Proxy(url="http://proxy2.example.com:8080")
proxy2.total_requests = 100
proxy2.total_successes = 60  # 60% success rate

rotator = ProxyWhirl(proxies=[proxy1, proxy2])

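# failed_proxy is the proxy whose retries were just exhausted
# (a separate placeholder proxy in this example)
failed_proxy = Proxy(url="http://proxy3.example.com:8080")

# Intelligent selection prioritizes proxy1
executor = rotator.retry_executor
selected = executor.select_retry_proxy([proxy1, proxy2], failed_proxy)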

print(selected.url)  # http://proxy1.example.com:8080

Example: Geo-Targeted Selection

# Create proxies with different regions
proxy_us = Proxy(
    url="http://proxy-us.example.com:8080",
    metadata={"region": "US-EAST"},
)
proxy_us.total_requests = 100
proxy_us.total_successes = 80  # 80% success rate

proxy_eu = Proxy(
    url="http://proxy-eu.example.com:8080",
    metadata={"region": "EU-WEST"},
)
proxy_eu.total_requests = 100
proxy_eu.total_successes = 85  # 85% success rate (slightly better)

rotator = ProxyWhirl(proxies=[proxy_us, proxy_eu])

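# failed_proxy is the proxy whose retries were just exhausted (placeholder here)
failed_proxy = Proxy(url="http://proxy-failed.example.com:8080")

# Select with target region
executor = rotator.retry_executor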
selected = executor.select_retry_proxy(
    [proxy_us, proxy_eu],
    failed_proxy,
    target_region="US-EAST"
)

# Selects proxy_us despite lower success rate (10% region bonus)
print(selected.url)  # http://proxy-us.example.com:8080

Exclusion Rules

The selection algorithm excludes:

Failed proxy - The proxy that just failed is never selected
Open circuits - Proxies with open circuit breakers
Half-open pending - Proxies already testing recovery

# If every candidate is excluded, select_retry_proxy returns None.
# Here the only candidate is the proxy that just failed.
selected = executor.select_retry_proxy([failed_proxy], failed_proxy)
print(selected)  # None

RetryMetrics and Observability

ProxyWhirl tracks detailed metrics for monitoring, debugging, and analytics.

Tip

You can also view retry and circuit breaker statistics from the command line using proxywhirl stats --retry --circuit-breaker. See CLI Reference for details.

Collecting Metrics

from proxywhirl import ProxyWhirl

rotator = ProxyWhirl()

# Metrics are automatically collected
response = rotator.request("GET", "https://httpbin.org/ip")

# Access metrics
metrics = rotator.retry_metrics
print(metrics.get_summary())

Summary Statistics

summary = metrics.get_summary()

print(summary)
# {
#     "total_retries": 42,
#     "success_by_attempt": {
#         0: 35,  # 35 requests succeeded on first attempt
#         1: 5,   # 5 requests succeeded on second attempt
#         2: 2,   # 2 requests succeeded on third attempt
#     },
#     "circuit_breaker_events_count": 3,
#     "retention_hours": 24,
# }

Time-Series Data

# Get hourly aggregated data for last 24 hours
timeseries = metrics.get_timeseries(hours=24)

for datapoint in timeseries:
    print(datapoint)
# {
#     "timestamp": "2025-12-27T14:00:00+00:00",
#     "total_requests": 150,
#     "total_retries": 25,
#     "success_rate": 0.94,
#     "avg_latency": 0.234,
# }

Per-Proxy Statistics

# Get retry statistics by proxy
by_proxy = metrics.get_by_proxy(hours=24)

for proxy_id, stats in by_proxy.items():
    print(f"Proxy {proxy_id}: {stats}")
# {
#     "proxy_id": "550e8400-e29b-41d4-a716-446655440000",
#     "total_attempts": 50,
#     "success_count": 45,
#     "failure_count": 5,
#     "avg_latency": 0.234,
#     "circuit_breaker_opens": 2,
# }

Circuit Breaker Events

# Access circuit breaker state changes
for event in metrics.circuit_breaker_events:
    print(f"{event.timestamp}: {event.proxy_id}")
    print(f"  {event.from_state} → {event.to_state}")
    print(f"  Failure count: {event.failure_count}")

# Example output:
# 2025-12-27 14:32:15+00:00: 550e8400-e29b-41d4-a716-446655440000
#   CLOSED → OPEN
#   Failure count: 5
#
# 2025-12-27 14:32:45+00:00: 550e8400-e29b-41d4-a716-446655440000
#   OPEN → HALF_OPEN
#   Failure count: 5
#
# 2025-12-27 14:32:47+00:00: 550e8400-e29b-41d4-a716-446655440000
#   HALF_OPEN → CLOSED
#   Failure count: 0

Hourly Aggregation

Metrics automatically aggregate into hourly summaries to prevent unbounded memory growth:

# Manually trigger aggregation (normally runs automatically)
metrics.aggregate_hourly()

# View aggregates
for hour, agg in metrics.hourly_aggregates.items():
    print(f"{hour}: {agg.total_requests} requests, {agg.total_retries} retries")
    print(f"  Success by attempt: {agg.success_by_attempt}")
    print(f"  Failure by reason: {agg.failure_by_reason}")
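Conceptually, hourly aggregation truncates each attempt's timestamp to the hour and folds the attempt into that hour's counters. The sketch below illustrates the idea on plain data. `Attempt`, `HourBucket`, and this `aggregate_hourly` function are hypothetical, and treating attempt number 0 as the original request (with higher numbers as retries) is an assumption based on the `success_by_attempt` keys shown earlier.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, Iterable


@dataclass
class Attempt:
    timestamp: datetime
    attempt_number: int  # 0 = first attempt, 1+ = retries
    success: bool
    failure_reason: str = ""


@dataclass
class HourBucket:
    total_requests: int = 0
    total_retries: int = 0
    success_by_attempt: Counter = field(default_factory=Counter)
    failure_by_reason: Counter = field(default_factory=Counter)


def aggregate_hourly(attempts: Iterable[Attempt]) -> Dict[datetime, HourBucket]:
    buckets: Dict[datetime, HourBucket] = defaultdict(HourBucket)
    for a in attempts:
        # Truncate to the start of the hour to find the bucket
        hour = a.timestamp.replace(minute=0, second=0, microsecond=0)
        bucket = buckets[hour]
        if a.attempt_number == 0:
            bucket.total_requests += 1
        else:
            bucket.total_retries += 1
        if a.success:
            bucket.success_by_attempt[a.attempt_number] += 1
        else:
            bucket.failure_by_reason[a.failure_reason] += 1
    return dict(buckets)
```

Once raw attempts are folded into buckets like these, they can be discarded, which is what keeps memory bounded.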

Retention Configuration

from proxywhirl.retry import RetryMetrics

# Custom retention and limits
metrics = RetryMetrics(
    retention_hours=48,        # Keep data for 48 hours
    max_current_attempts=5000,  # Limit raw attempts deque
)

rotator = ProxyWhirl()
rotator.retry_metrics = metrics

RetryExecutor Deep Dive

The RetryExecutor is the core orchestration class that coordinates retry logic, circuit breakers, and metrics collection.

How Retries Work Internally

When you call rotator.request(), the following sequence occurs:

Idempotency Check: Determines if the HTTP method is safe to retry (GET/HEAD/OPTIONS/DELETE/PUT are idempotent)
Retry Loop: Executes up to max_attempts attempts with backoff delays
Circuit Breaker Check: Verifies the proxy’s circuit breaker allows the request
Request Execution: Calls the underlying HTTP client
Status Code Check: Validates response status against retry_status_codes
Error Classification: Determines if exceptions are retryable (timeouts, connection errors)
Metrics Recording: Logs attempt outcome, latency, and circuit breaker events
Proxy Failover: If all retries exhausted, selects next best proxy automatically
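The core of this sequence (idempotency check, retry loop, status check, and error classification) can be sketched without any library dependencies. This is a simplified illustration, not the real `RetryExecutor`: circuit breakers, metrics, and failover are left out, `execute_with_retry_sketch` is a hypothetical name, and the built-in `ConnectionError` and `TimeoutError` stand in for the httpx exception types.

```python
import time
from typing import Callable, Iterable

IDEMPOTENT = {"GET", "HEAD", "OPTIONS", "DELETE", "PUT"}


def execute_with_retry_sketch(
    send: Callable[[], int],
    method: str,
    max_attempts: int = 3,
    retry_status_codes: Iterable[int] = (502, 503, 504),
    retry_non_idempotent: bool = False,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call send() until it returns a non-retryable status code."""
    retryable = set(retry_status_codes)
    # Idempotency check: non-idempotent methods get a single attempt
    can_retry = retry_non_idempotent or method.upper() in IDEMPOTENT
    attempts = max_attempts if can_retry else 1
    last_error = None
    for attempt in range(attempts):
        try:
            status = send()
        except (ConnectionError, TimeoutError) as exc:
            last_error = exc  # retryable network failure
        else:
            if status not in retryable:
                return status  # success, or an error not worth retrying
            last_error = RuntimeError(f"retryable status {status}")
        if attempt < attempts - 1:
            sleep(base_delay * 2 ** attempt)  # exponential backoff, no jitter
    raise RuntimeError(f"Request failed after {attempts} attempts") from last_error
```

In the full system, the final `raise` is the point where failover to another proxy would be attempted instead.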

Retryable vs Non-Retryable Errors

The executor classifies errors into two categories:

Retryable Errors (trigger retry with backoff):

httpx.ConnectError - Connection refused, DNS failure
httpx.TimeoutException - Request/connect timeout
httpx.ReadTimeout - Response body read timeout
httpx.WriteTimeout - Request body write timeout
httpx.PoolTimeout - Connection pool exhausted
httpx.NetworkError - Generic network failure
HTTP 502, 503, 504 status codes (configurable)

Non-Retryable Errors (fail immediately):

HTTP 4xx errors (client errors - request won’t succeed on retry)
NonRetryableError (custom application errors)
Any exception not in the retryable types list

from proxywhirl.retry import RetryExecutor, NonRetryableError

# Custom error handling
try:
    response = rotator.request("GET", url)
except NonRetryableError as e:
    # Authentication failure, malformed request, etc.
    print(f"Cannot retry: {e}")

Direct RetryExecutor Usage

For advanced use cases, you can use RetryExecutor directly:

from proxywhirl import ProxyWhirl, Proxy, RetryPolicy
from proxywhirl.retry import RetryExecutor
import httpx

# Create executor with custom policy
policy = RetryPolicy(max_attempts=5, base_delay=2.0)
rotator = ProxyWhirl()
executor = RetryExecutor(
    retry_policy=policy,
    circuit_breakers=rotator.circuit_breakers,
    retry_metrics=rotator.retry_metrics,
)

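# Create request function
proxy = Proxy(url="http://proxy.example.com:8080")

def request_fn():
    # httpx >= 0.26 takes `proxy=`; older releases used `proxies=` instead.
    # The context manager closes the client after each attempt.
    with httpx.Client(proxy=str(proxy.url)) as client:
        return client.get("https://api.example.com/data")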

# Execute with retry
response = executor.execute_with_retry(
    request_fn=request_fn,
    proxy=proxy,
    method="GET",
    url="https://api.example.com/data",
    request_id="custom-request-123",  # Optional tracking ID
)

Integration with ProxyWhirl

Basic Usage

from proxywhirl import ProxyWhirl, RetryPolicy, BackoffStrategy

# Automatic retry and failover
policy = RetryPolicy(
    max_attempts=5,
    backoff_strategy=BackoffStrategy.EXPONENTIAL,
    base_delay=1.0,
    jitter=True,
)

rotator = ProxyWhirl(retry_policy=policy)

# Request automatically retries and fails over
response = rotator.request("GET", "https://httpbin.org/ip")

Custom Error Handling

from proxywhirl.exceptions import ProxyConnectionError
from proxywhirl.retry import NonRetryableError

try:
    response = rotator.request("GET", url)
except ProxyConnectionError as e:
    # All retries exhausted across all proxies
    print(f"Request failed after all retries: {e}")
except NonRetryableError as e:
    # Non-retryable error (e.g., authentication failure)
    print(f"Non-retryable error: {e}")

Monitoring Circuit Breakers

import time

# Check circuit breaker status
for proxy in rotator.pool.proxies:
    cb = rotator.circuit_breakers.get(str(proxy.id))
    if cb:
        print(f"{proxy.url}: {cb.state.value}")
        print(f"  Failures: {cb.failure_count}/{cb.failure_threshold}")
        if cb.next_test_time:
            # Clamp at zero so an overdue test never prints a negative wait
            wait_time = max(0.0, cb.next_test_time - time.time())
            print(f"  Retry in: {wait_time:.1f}s")

Manual Circuit Reset

# Reset all circuit breakers
for cb in rotator.circuit_breakers.values():
    cb.reset()

# Reset specific proxy
proxy = rotator.get_proxy()
rotator.circuit_breakers[str(proxy.id)].reset()

Integration with Rotation Strategies

Retry and failover logic works seamlessly with all rotation strategies. For detailed strategy configuration, see Advanced Rotation Strategies.

Round-Robin with Automatic Failover

from proxywhirl import ProxyWhirl, Proxy, RetryPolicy
from proxywhirl.strategies import RoundRobinStrategy

# Round-robin ensures fair distribution
rotator = ProxyWhirl(
    proxies=[
        Proxy(url="http://proxy1.example.com:8080"),
        Proxy(url="http://proxy2.example.com:8080"),
        Proxy(url="http://proxy3.example.com:8080"),
    ],
    strategy=RoundRobinStrategy(),
    retry_policy=RetryPolicy(max_attempts=3),
)

# If proxy1 fails, automatically retries then fails over to proxy2
response = rotator.request("GET", "https://api.example.com/data")

Weighted Strategy with Performance-Based Failover

from proxywhirl.strategies import WeightedStrategy

# Higher weight = more traffic
rotator = ProxyWhirl(
    proxies=[
        Proxy(url="http://premium.example.com:8080", weight=3.0),  # 60% of traffic
        Proxy(url="http://standard.example.com:8080", weight=2.0), # 40% of traffic
    ],
    strategy=WeightedStrategy(),
    retry_policy=RetryPolicy(max_attempts=5),
)

# Premium proxy fails → retries on premium → fails over to standard
# Next request still prefers premium (weighted selection)
response = rotator.request("GET", "https://api.example.com/data")

Geo-Targeted Strategy with Regional Failover

from proxywhirl.strategies import GeoTargetedStrategy
from proxywhirl import StrategyConfig, SelectionContext

# Create and configure geo-targeted strategy
strategy = GeoTargetedStrategy()
strategy.configure(StrategyConfig(
    geo_fallback_enabled=True,
    geo_secondary_strategy="round_robin"
))

# Geo-targeted proxies
rotator = ProxyWhirl(
    proxies=[
        Proxy(url="http://us-east.example.com:8080", metadata={"region": "US-EAST"}),
        Proxy(url="http://us-west.example.com:8080", metadata={"region": "US-WEST"}),
        Proxy(url="http://eu-west.example.com:8080", metadata={"region": "EU-WEST"}),
    ],
    strategy=strategy,
    retry_policy=RetryPolicy(max_attempts=3),
)

# Prefers US-EAST when specified in context
context = SelectionContext(target_region="US-EAST")
response = rotator.request("GET", "https://api.example.com/data", context=context)

Performance-Based Strategy with Dynamic Failover

from proxywhirl.strategies import PerformanceBasedStrategy

# Automatically selects fastest, most reliable proxy
rotator = ProxyWhirl(
    proxies=[
        Proxy(url="http://proxy1.example.com:8080"),
        Proxy(url="http://proxy2.example.com:8080"),
        Proxy(url="http://proxy3.example.com:8080"),
    ],
    strategy=PerformanceBasedStrategy(),
    retry_policy=RetryPolicy(max_attempts=3),
)

# Strategy considers:
# - Success rate (from proxy.total_successes / proxy.total_requests)
# - Average latency (from RetryMetrics)
# - Circuit breaker state (skips OPEN circuits)

for i in range(100):
    response = rotator.request("GET", f"https://api.example.com/data/{i}")
    # Over time, fast reliable proxies get more traffic
    # Slow or failing proxies get less traffic

Key Integration Points:

Circuit breakers filter eligible proxies - Strategies only see proxies with CLOSED/HALF_OPEN circuits
Metrics inform strategy decisions - Performance-based strategies use RetryMetrics data
Failover respects strategy logic - When a proxy chosen by a weighted strategy fails, the next selection still follows the configured weights
Geo-targeting bonus in failover - RetryExecutor.select_retry_proxy() gives 10% bonus to matching regions

Advanced Patterns

Adaptive Retry Policy

Adjust retry policy based on conditions:

def get_adaptive_policy(time_sensitive: bool) -> RetryPolicy:
    """Adjust retry policy based on request priority."""
    if time_sensitive:
        # Fast retries for real-time requests
        return RetryPolicy(
            max_attempts=2,
            backoff_strategy=BackoffStrategy.FIXED,
            base_delay=0.5,
        )
    else:
        # Patient retries for batch jobs
        return RetryPolicy(
            max_attempts=10,
            backoff_strategy=BackoffStrategy.EXPONENTIAL,
            base_delay=2.0,
            max_backoff_delay=60.0,
            jitter=True,
        )

# Real-time request
rotator.retry_policy = get_adaptive_policy(time_sensitive=True)
response = rotator.request("GET", url)

# Batch request
rotator.retry_policy = get_adaptive_policy(time_sensitive=False)
response = rotator.request("GET", url)

Circuit Breaker Alerts

Note

Circuit breaker state changes also trigger cache health invalidation when configured. See Caching Subsystem Guide for cache-level health integration.

Monitor circuit breaker events for alerts:

from datetime import datetime, timezone

from proxywhirl import ProxyWhirl
from proxywhirl.circuit_breaker import CircuitBreakerState

def check_circuit_health(rotator: ProxyWhirl) -> None:
    """Alert on recent circuit breaker opens."""
    now = datetime.now(timezone.utc)

    for event in rotator.retry_metrics.circuit_breaker_events:
        if event.to_state == CircuitBreakerState.OPEN:
            age = (now - event.timestamp).total_seconds()
            if age < 300:  # Last 5 minutes
                print(f"ALERT: Proxy {event.proxy_id} circuit opened")
                print(f"  Failures: {event.failure_count}")
                print(f"  Time: {event.timestamp}")

# Run periodically
check_circuit_health(rotator)

Request-Level Retry Override

Override retry policy for specific requests:

from proxywhirl import ProxyWhirl, RetryPolicy, BackoffStrategy

# Default policy
rotator = ProxyWhirl(
    retry_policy=RetryPolicy(max_attempts=3)
)

# Override for critical request
critical_policy = RetryPolicy(
    max_attempts=10,
    backoff_strategy=BackoffStrategy.EXPONENTIAL,
    base_delay=2.0,
    jitter=True,
)

# Note: Currently requires creating new executor
# This pattern may be simplified in future versions
from proxywhirl.retry import RetryExecutor

executor = RetryExecutor(
    critical_policy,
    rotator.circuit_breakers,
    rotator.retry_metrics,
)

# Use custom executor for this request
# (Integration with rotator.request() coming in future release)

Best Practices

1. Start Conservative, Tune Later

Begin with safe defaults and adjust based on metrics:

# Start here
policy = RetryPolicy(
    max_attempts=3,
    backoff_strategy=BackoffStrategy.EXPONENTIAL,
    jitter=True,
)

# After observing metrics, tune:
# - Increase max_attempts if success_by_attempt shows retries working
# - Adjust timeout based on avg_latency in metrics
# - Enable retry_non_idempotent only if API is truly idempotent

2. Always Use Jitter in Production

Jitter prevents thundering herd problems:

# Production: Use jitter
policy = RetryPolicy(jitter=True)

# Testing: Disable for deterministic behavior
policy = RetryPolicy(jitter=False)

3. Monitor Circuit Breaker Opens

Frequent circuit opens indicate systemic issues:

from proxywhirl.circuit_breaker import CircuitBreakerState

# Alert if >10% of proxies have open circuits
open_circuits = sum(
    1 for cb in rotator.circuit_breakers.values()
    if cb.state == CircuitBreakerState.OPEN
)
total_proxies = len(rotator.pool.proxies)

if total_proxies > 0 and open_circuits / total_proxies > 0.1:
    print(f"WARNING: {open_circuits}/{total_proxies} circuits open")

4. Set Timeouts for Time-Sensitive Requests

Prevent unbounded retry delays:

# Time-sensitive request: Fail fast
policy = RetryPolicy(
    max_attempts=3,
    timeout=5.0,  # Total timeout
)

# Batch job: Be patient
policy = RetryPolicy(
    max_attempts=10,
    timeout=300.0,  # 5 minutes total
)

5. Use Metrics for Capacity Planning

Track retry rates to identify proxy quality issues:

summary = rotator.retry_metrics.get_summary()
success_by_attempt = summary["success_by_attempt"]
success_first_attempt = success_by_attempt.get(0, 0)
total_successes = sum(success_by_attempt.values())

if total_successes > 0:
    # Share of successful requests that needed no retry
    first_attempt_rate = success_first_attempt / total_successes
    print(f"First attempt success: {first_attempt_rate:.1%}")

    if first_attempt_rate < 0.7:
        print("WARNING: Low first-attempt success rate")
        print("Consider: Higher quality proxy sources")

Troubleshooting

All Retries Exhausted

Symptom: ProxyConnectionError: Request failed after N attempts

Causes:

All proxies have poor connectivity
Target website is blocking proxy IPs
Circuit breakers opened for all proxies

Solutions:

from proxywhirl.circuit_breaker import CircuitBreakerState

# Check circuit breaker status
open_count = sum(
    1 for cb in rotator.circuit_breakers.values()
    if cb.state == CircuitBreakerState.OPEN
)
print(f"{open_count} circuits open")

# Reset circuits if blocking legitimate traffic
for cb in rotator.circuit_breakers.values():
    cb.reset()

# Check per-proxy statistics
by_proxy = rotator.retry_metrics.get_by_proxy(hours=1)
for proxy_id, stats in by_proxy.items():
    if stats["success_count"] == 0:
        print(f"Dead proxy: {proxy_id}")

High Latency

Symptom: Requests take a long time to complete

Causes:

Backoff delays too long
Many retries before success
Slow proxies selected

Solutions:

# Check latency by proxy
by_proxy = rotator.retry_metrics.get_by_proxy(hours=1)
slow_proxies = [
    pid for pid, stats in by_proxy.items()
    if stats["avg_latency"] > 2.0
]

print(f"Slow proxies (>2s): {slow_proxies}")

# Use faster backoff with a total timeout
policy = RetryPolicy(
    backoff_strategy=BackoffStrategy.FIXED,
    base_delay=0.5,  # Shorter delays
    timeout=10.0,    # Total timeout across all attempts
)

Non-Retryable Errors

Symptom: NonRetryableError raised immediately

Causes:

Custom error types not recognized as retryable
Authentication failures (407)

Solutions:

# Authentication errors are not retryable
# Fix proxy credentials instead of retrying

# For custom error types, extend RetryExecutor._is_retryable_error()
# (Advanced - requires subclassing)

Performance Impact

Retry logic adds minimal overhead:

Circuit breaker check: O(1) dictionary lookup
Proxy scoring: O(n) where n = number of candidate proxies
Metrics recording: O(1) append to bounded deque
Backoff calculation: O(1) arithmetic

Benchmark results (100k requests):

No retry: 100ms avg latency
With retry (3 attempts): 105ms avg latency (+5%)
With retry + metrics: 110ms avg latency (+10%)

Summary

ProxyWhirl’s retry and failover system provides:

Flexible retry policies with exponential, linear, or fixed backoff
Circuit breaker protection to isolate failing proxies
Intelligent failover with performance-based proxy selection
Comprehensive metrics for monitoring and debugging
Automatic integration with rotator.request() and all rotation strategies

Start with defaults and tune based on metrics. Enable jitter in production, set timeouts for time-sensitive requests, and monitor circuit breaker opens for systemic issues.