Cache API Reference¶
Complete reference for ProxyWhirl’s multi-tier caching system with L1 (memory), L2 (disk), and L3 (SQLite) support.
See also
For cache configuration patterns and optimization tips, see Caching. For TOML cache configuration options, see Configuration.
from proxywhirl.cache import (
CacheManager,
CacheConfig,
CacheEntry,
HealthStatus,
CacheTierType,
MemoryCacheTier,
DiskCacheTier,
SQLiteCacheTier,
CredentialEncryptor,
)
# L2BackendType and JsonlCacheTier are available via submodule imports:
from proxywhirl.cache.models import L2BackendType
from proxywhirl.cache.tiers import JsonlCacheTier
Overview¶
The cache subsystem provides three-tier storage for proxies with automatic promotion, credential encryption, TTL management, and health-based invalidation. It supports graceful degradation when tiers fail and provides comprehensive statistics for monitoring.
Architecture:
L1 (Memory): Fast in-memory cache using OrderedDict with LRU eviction
L2 (Disk): Configurable persistent cache with two backend options:
JSONL (default): File-based using sharded JSON Lines files, human-readable, portable, best for <10K entries
SQLite: Database-based with indexed lookups, faster for >10K entries with O(log n) performance
L3 (SQLite): Full database cache with SQL indexing, health history tracking, and complete queryability
Data Models¶
CacheEntry¶
Container for a single cached proxy with metadata, TTL, and health tracking.
- class CacheEntry¶
Pydantic model that stores proxy information with TTL, health status, and access tracking. Credentials are SecretStr in memory, encrypted at rest in L2/L3.
Example:
from proxywhirl.cache import CacheEntry, HealthStatus
from datetime import datetime, timezone, timedelta
from pydantic import SecretStr

entry = CacheEntry(
    key="abc123",
    proxy_url="http://proxy.example.com:8080",
    username=SecretStr("user"),
    password=SecretStr("pass"),
    source="api",
    fetch_time=datetime.now(timezone.utc),
    last_accessed=datetime.now(timezone.utc),
    ttl_seconds=3600,
    expires_at=datetime.now(timezone.utc) + timedelta(seconds=3600),
    health_status=HealthStatus.HEALTHY,
)

# Check expiration
if entry.is_expired:
    print("Entry has expired")

# Check health
if entry.is_healthy:
    print("Proxy is healthy")
Fields¶
Identity:
key (str): Unique cache key (proxy URL hash)
proxy_url (str): Full proxy URL (scheme://host:port)
Credentials (encrypted at rest in L2/L3):
username (SecretStr | None): Proxy username
password (SecretStr | None): Proxy password
Metadata:
source (str): Proxy source identifier
fetch_time (datetime): When proxy was fetched
last_accessed (datetime): Last cache access time
access_count (int): Number of cache hits (default: 0)
TTL & Health:
ttl_seconds (int): Time-to-live in seconds (≥0)
expires_at (datetime): Absolute expiration time
health_status (HealthStatus): Current health status (default: UNKNOWN)
failure_count (int): Consecutive failures (≥0, default: 0)
evicted_from_l1 (bool): Whether entry was evicted from L1 cache (default: False)
Health Monitoring (Feature 006):
last_health_check (datetime | None): Last health check timestamp
consecutive_health_failures (int): Consecutive health check failures (≥0, default: 0)
consecutive_health_successes (int): Consecutive successful health checks (≥0, default: 0)
recovery_attempt (int): Current recovery attempt count (≥0, default: 0)
next_check_time (datetime | None): Scheduled next health check
last_health_error (str | None): Last health check error message
total_health_checks (int): Total health checks performed (≥0, default: 0)
total_health_check_failures (int): Total health check failures (≥0, default: 0)
Properties¶
CacheConfig¶
Configuration for cache behavior and tier settings.
- class CacheConfig¶
Pydantic model that aggregates configuration for all three tiers plus global settings like TTL, cleanup intervals, and storage paths.
Example:
from proxywhirl.cache import CacheConfig, CacheTierConfig
from proxywhirl.cache.models import L2BackendType
from pydantic import SecretStr

# Default JSONL backend (file-based, portable)
config = CacheConfig(
    # Tier configurations
    l1_config=CacheTierConfig(
        enabled=True,
        max_entries=1000,
        eviction_policy="lru",
    ),
    l2_config=CacheTierConfig(
        enabled=True,
        max_entries=5000,
        eviction_policy="lru",
    ),
    l2_backend=L2BackendType.JSONL,  # or L2BackendType.SQLITE for large caches
    l3_config=CacheTierConfig(
        enabled=True,
        max_entries=None,  # Unlimited
        eviction_policy="lru",
    ),
    # TTL Configuration
    default_ttl_seconds=3600,
    ttl_cleanup_interval=60,
    enable_background_cleanup=True,
    cleanup_interval_seconds=60,
    per_source_ttl={
        "api": 7200,      # API sources: 2 hours
        "scraper": 1800,  # Scrapers: 30 minutes
    },
    # Storage Paths
    l2_cache_dir=".cache/proxies",
    l3_database_path=".cache/db/proxywhirl.db",
    # Encryption
    encryption_key=SecretStr("your-32-byte-url-safe-base64-key"),
    # Health Integration
    health_check_invalidation=True,
    failure_threshold=3,
    # Performance Tuning
    enable_statistics=True,
    statistics_interval=5,
)

# SQLite backend for large caches (>10K entries)
large_cache_config = CacheConfig(
    l2_backend=L2BackendType.SQLITE,
    l2_config=CacheTierConfig(max_entries=50000),
)
Fields¶
Tier Configuration:
l1_config (CacheTierConfig): L1 (Memory) configuration (default: max_entries=1000)
l2_config (CacheTierConfig): L2 (Disk) configuration (default: max_entries=5000)
l2_backend (L2BackendType): L2 storage backend, "jsonl" or "sqlite" (default: JSONL)
l3_config (CacheTierConfig): L3 (SQLite) configuration (default: max_entries=None)
TTL Configuration:
default_ttl_seconds (int): Default TTL for cached proxies (≥60, default: 3600)
ttl_cleanup_interval (int): Background cleanup interval (≥10, default: 60)
enable_background_cleanup (bool): Enable background TTL cleanup thread (default: False)
cleanup_interval_seconds (int): Interval between cleanup runs (≥5, default: 60)
per_source_ttl (dict[str, int]): Per-source TTL overrides (default: empty dict)
Storage Paths:
l2_cache_dir (str): Directory for L2 cache (JSONL shards or SQLite database) (default: ".cache/proxies")
l3_database_path (str): SQLite database path for L3 (default: ".cache/db/proxywhirl.db")
Encryption:
encryption_key (SecretStr | None): Fernet encryption key (from env: PROXYWHIRL_CACHE_ENCRYPTION_KEY)
Health Integration:
health_check_invalidation (bool): Auto-invalidate on health check failure (default: True)
failure_threshold (int): Failures before health invalidation (≥1, default: 3)
Performance Tuning:
enable_statistics (bool): Track cache statistics (default: True)
statistics_interval (int): Stats aggregation interval (≥1, default: 5)
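The per-source TTL override described above resolves to a simple lookup: a source-specific value wins, otherwise `default_ttl_seconds` applies. A minimal sketch of that resolution rule (illustrative; `resolve_ttl` is not a ProxyWhirl function):

```python
# Sketch of per-source TTL resolution: override if present, else the default.
DEFAULT_TTL_SECONDS = 3600
PER_SOURCE_TTL = {"api": 7200, "scraper": 1800}

def resolve_ttl(source: str) -> int:
    return PER_SOURCE_TTL.get(source, DEFAULT_TTL_SECONDS)

resolve_ttl("api")      # source override: 7200
resolve_ttl("unknown")  # falls back to default: 3600
```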
CacheTierConfig¶
Configuration for a single cache tier.
- class CacheTierConfig¶
Pydantic model that defines capacity, eviction policy, and enable/disable state for one tier (L1, L2, or L3).
Example:
from proxywhirl.cache import CacheTierConfig

config = CacheTierConfig(
    enabled=True,
    max_entries=1000,
    eviction_policy="lru",  # "lru", "lfu", or "fifo"
)
Fields¶
enabled (bool): Enable this tier (default: True)
max_entries (int | None): Max entries (None=unlimited, default: None)
eviction_policy (str): Eviction policy: "lru", "lfu", or "fifo" (default: "lru")
Validators¶
- classmethod validate_policy(v: str) → str¶
Validate eviction policy is supported.
- Parameters:
v – Policy name to validate
- Raises:
ValueError – If policy is not one of [“lru”, “lfu”, “fifo”]
- Returns:
Validated policy name
CacheStatistics¶
Aggregate cache statistics across all tiers.
- class CacheStatistics¶
Pydantic model that combines tier-level statistics and tracks cross-tier operations like promotions and demotions.
Example:
from proxywhirl.cache import CacheStatistics

stats = CacheStatistics()
stats.l1_stats.hits = 100
stats.l1_stats.misses = 20

print(f"L1 hit rate: {stats.l1_stats.hit_rate:.2%}")
print(f"Overall hit rate: {stats.overall_hit_rate:.2%}")
print(f"Total size: {stats.total_size}")

# Export to monitoring
metrics = stats.to_metrics_dict()
Fields¶
Per-Tier Statistics:
l1_stats (TierStatistics): L1 statistics (default: empty TierStatistics)
l2_stats (TierStatistics): L2 statistics (default: empty TierStatistics)
l3_stats (TierStatistics): L3 statistics (default: empty TierStatistics)
Cross-Tier Operations:
promotions (int): L3→L2→L1 promotions (≥0, default: 0)
demotions (int): L1→L2→L3 demotions (≥0, default: 0)
Degradation Tracking:
l1_degraded (bool): L1 tier unavailable (default: False)
l2_degraded (bool): L2 tier unavailable (default: False)
l3_degraded (bool): L3 tier unavailable (default: False)
Computed Properties¶
Methods¶
- to_metrics_dict() → dict[str, float]¶
Convert to flat metrics dict for monitoring systems.
- Returns:
Dictionary with metric names and float values
Example:
metrics = stats.to_metrics_dict()
# {
#     "cache.l1.hit_rate": 0.85,
#     "cache.l2.hit_rate": 0.60,
#     "cache.l3.hit_rate": 0.40,
#     "cache.overall.hit_rate": 0.75,
#     "cache.total_size": 1500.0,
#     "cache.promotions": 250.0,
#     "cache.demotions": 150.0,
#     "cache.l1.size": 1000.0,
#     "cache.l2.size": 450.0,
#     "cache.l3.size": 50.0
# }
TierStatistics¶
Statistics for a single cache tier.
- class TierStatistics¶
Pydantic model that tracks hits, misses, evictions by reason, and computes hit rate.
Example:
from proxywhirl.cache import TierStatistics

stats = TierStatistics(hits=100, misses=20)
print(f"Hit rate: {stats.hit_rate:.2%}")  # 83.33%
print(f"Total evictions: {stats.total_evictions}")
Fields¶
hits (int): Cache hits (≥0, default: 0)
misses (int): Cache misses (≥0, default: 0)
current_size (int): Current number of entries (≥0, default: 0)
evictions_lru (int): LRU evictions (≥0, default: 0)
evictions_ttl (int): TTL-based evictions (≥0, default: 0)
evictions_health (int): Health-based evictions (≥0, default: 0)
evictions_corruption (int): Corruption-based evictions (≥0, default: 0)
Computed Properties¶
HealthStatus (Enum)¶
Proxy health status for cache entries (imported from proxywhirl.models).
- class HealthStatus¶
String enum representing proxy health status with 5 states.
Values:
UNKNOWN = "unknown" - Not yet tested (default)
HEALTHY = "healthy" - Working normally
DEGRADED = "degraded" - Partial functionality (some failures)
UNHEALTHY = "unhealthy" - Experiencing issues (many failures)
DEAD = "dead" - Not responding (completely unusable)
Example:
from proxywhirl.cache import HealthStatus

status = HealthStatus.HEALTHY
print(status.value)  # "healthy"

# All 5 states are available
for state in HealthStatus:
    print(f"{state.name}: {state.value}")
CacheTierType (Enum)¶
Type of cache tier.
- class CacheTierType¶
String enum representing cache tier types.
Values:
L1 = "l1" - Memory tier
L2 = "l2" - Disk tier
L3 = "l3" - SQLite tier
Example:
from proxywhirl.cache import CacheTierType

tier = CacheTierType.L1
print(tier.value)  # "l1"
L2BackendType (Enum)¶
L2 cache backend type selection.
- class L2BackendType¶
String enum for selecting the L2 disk cache storage backend.
Values:
JSONL = "jsonl" - File-based JSONL with sharding (default, best for <10K entries)
SQLITE = "sqlite" - SQLite database (faster for >10K entries)
Example:
from proxywhirl.cache import CacheConfig
from proxywhirl.cache.models import L2BackendType

# Default JSONL backend
config = CacheConfig()
assert config.l2_backend == L2BackendType.JSONL

# SQLite backend for large caches
config = CacheConfig(l2_backend=L2BackendType.SQLITE)
When to use each backend:
| Backend | Best For | Performance | Features |
|---|---|---|---|
| JSONL | <10K entries | O(n) lookups | Human-readable, portable, simple debugging |
| SQLite | >10K entries | O(log n) lookups | Indexed queries, faster batch operations |
Tier Implementations¶
CacheTier (Abstract Base Class)¶
Abstract base class for cache tier implementations.
- class CacheTier¶
Defines the interface that all cache tiers (L1, L2, L3) must implement, including graceful degradation on repeated failures.
Attributes:
config (CacheTierConfig) - Configuration for this tier
tier_type (TierType) - Type of tier (L1/L2/L3)
enabled (bool) - Whether tier is operational
failure_count (int) - Consecutive failures for degradation tracking
failure_threshold (int) - Failures before auto-disabling tier (default: 3)
Constructor¶
- __init__(config: CacheTierConfig, tier_type: TierType) → None¶
Initialize cache tier with configuration.
- Parameters:
config – Configuration for this tier
tier_type – Type of tier (L1/L2/L3)
Abstract Methods¶
- abstractmethod get(key: str) → CacheEntry | None¶
Retrieve entry by key, None if not found or expired.
- Parameters:
key – Cache key to lookup
- Returns:
CacheEntry if found and valid, None otherwise
- abstractmethod put(key: str, entry: CacheEntry) → bool¶
Store entry, return True if successful.
- Parameters:
key – Cache key for entry
entry – CacheEntry to store
- Returns:
True if stored successfully, False otherwise
- abstractmethod delete(key: str) → bool¶
Remove entry by key, return True if existed.
- Parameters:
key – Cache key to delete
- Returns:
True if entry existed and was deleted, False if not found
Concrete Methods¶
MemoryCacheTier¶
L1 in-memory cache using OrderedDict for LRU tracking.
- class MemoryCacheTier(CacheTier)¶
Provides O(1) lookups with automatic LRU eviction when max_entries exceeded.
Example:
from proxywhirl.cache.tiers import MemoryCacheTier, TierType
from proxywhirl.cache import CacheTierConfig

config = CacheTierConfig(max_entries=1000, eviction_policy="lru")
tier = MemoryCacheTier(config, TierType.L1_MEMORY)

# Store entry
tier.put(key, entry)

# Retrieve entry (moves to end for LRU)
cached = tier.get(key)

# Delete entry
deleted = tier.delete(key)

# Get all keys
keys = tier.keys()

# Get size
size = tier.size()

# Clear all
cleared = tier.clear()

# Cleanup expired
removed = tier.cleanup_expired()
Constructor¶
- __init__(config: CacheTierConfig, tier_type: TierType, on_evict: Callable[[str, CacheEntry], None] | None = None) → None
Initialize memory cache with LRU tracking.
- Parameters:
config – Tier configuration
tier_type – Type of tier (L1/L2/L3)
on_evict – Optional callback when entry is evicted (key, entry)
Features¶
O(1) lookups
Automatic LRU eviction when max_entries exceeded
Thread-safe with failure tracking
No persistence
Callbacks on eviction for demotion to L2
JsonlCacheTier¶
L2 file-based cache using sharded JSONL files with encryption.
- class JsonlCacheTier(CacheTier)¶
File-based cache tier using JSON Lines format with consistent-hash sharding. Best for <10K entries. Human-readable, portable, and git-friendly.
Uses sharded JSONL files with:
Consistent hash sharding (default 16 shards)
In-memory index for O(1) key→shard lookups
File locking (portalocker) for concurrent access safety
Fernet encryption for credentials at rest
Human-readable JSON Lines format
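The key→shard assignment can be sketched as a stable hash modulo the shard count. The SHA-256-based scheme below is an assumption for illustration; the library's actual hash function may differ:

```python
import hashlib

# Sketch of consistent key->shard assignment for sharded JSONL files.
# The same key always maps to the same shard file.
NUM_SHARDS = 16

def shard_for_key(key: str) -> str:
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    shard = int.from_bytes(digest[:4], "big") % NUM_SHARDS
    return f"shard_{shard:02d}.jsonl"

shard_for_key("abc123")  # deterministic: stable across runs and machines
```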
Example:
from proxywhirl.cache.tiers import JsonlCacheTier, TierType
from proxywhirl.cache import CacheTierConfig, CredentialEncryptor
from pathlib import Path

config = CacheTierConfig(max_entries=5000, eviction_policy="lru")
encryptor = CredentialEncryptor()
cache_dir = Path(".cache/proxies")

tier = JsonlCacheTier(
    config=config,
    tier_type=TierType.L2_FILE,
    cache_dir=cache_dir,
    encryptor=encryptor,
    num_shards=16,  # Default
)

# Store entry (writes to appropriate shard file)
tier.put(key, entry)

# Retrieve entry (uses in-memory index for O(1) shard lookup)
cached = tier.get(key)

# Delete entry
deleted = tier.delete(key)

# Get all keys (from index)
keys = tier.keys()

# Get size
size = tier.size()

# Clear all (removes all shard files)
cleared = tier.clear()

# Cleanup expired entries
removed = tier.cleanup_expired()
Constructor¶
- __init__(config: CacheTierConfig, tier_type: TierType, cache_dir: Path, encryptor: CredentialEncryptor | None = None, num_shards: int = 16) → None
Initialize JSONL file cache with sharding and encryption.
- Parameters:
config – Tier configuration
tier_type – Type of tier (L1/L2/L3)
cache_dir – Directory for shard files
encryptor – Optional encryptor for credentials
num_shards – Number of shard files (default: 16)
File Structure¶
.cache/proxies/
├── shard_00.jsonl
├── shard_01.jsonl
├── ...
└── shard_15.jsonl
Each shard file contains JSON Lines entries:
{"key": "abc123", "proxy_url": "http://proxy:8080", "source": "free-proxy-list", "ttl_seconds": 3600, ...}
{"key": "def456", "proxy_url": "socks5://proxy:1080", "source": "geonode", "ttl_seconds": 7200, ...}
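Because each line is one self-contained JSON object, a shard like the one above can be scanned with the standard library alone. A minimal sketch of that parse step (illustrative; real entries carry the full field set):

```python
import json

# Sketch of scanning a JSONL shard: parse each line, index by cache key.
shard_lines = [
    '{"key": "abc123", "proxy_url": "http://proxy:8080", "ttl_seconds": 3600}',
    '{"key": "def456", "proxy_url": "socks5://proxy:1080", "ttl_seconds": 7200}',
]

entries = {}
for line in shard_lines:
    record = json.loads(line)
    entries[record["key"]] = record
```

This per-line scan is why JSONL lookups are O(n) per shard, which the in-memory key→shard index mitigates but does not eliminate.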
Features¶
Human-readable JSON Lines format
Portable (can copy/move files)
Git-friendly for version control
Consistent-hash sharding for distribution
In-memory index for fast lookups
File locking for concurrent access
Encrypted credentials at rest
Best for <10K entries
When to Use JSONL vs SQLite¶
| Factor | JSONL (JsonlCacheTier) | SQLite (DiskCacheTier) |
|---|---|---|
| Entry count | <10K entries | >10K entries |
| Lookup speed | O(n) per shard | O(log n) indexed |
| Portability | Copy files anywhere | Single .db file |
| Git-friendly | Yes | Not recommended |
| Human-readable | Yes | No (binary) |
| Concurrent writes | File locking | WAL mode |
DiskCacheTier¶
L2 SQLite-based cache with encryption and indexed lookups.
- class DiskCacheTier(CacheTier)¶
Optimized for >10K entries using SQLite with B-tree indexes instead of JSONL. Provides O(log n) lookups vs O(n) for JSONL, achieving <10ms reads for 10K+ entries.
Uses a lightweight SQLite database with:
Primary key index on cache key for fast lookups
Encrypted credentials stored as BLOB
Efficient bulk operations (cleanup, size, keys)
File-based persistence without complex sharding
Example:
from proxywhirl.cache.tiers import DiskCacheTier, TierType
from proxywhirl.cache import CacheTierConfig, CredentialEncryptor
from pathlib import Path

config = CacheTierConfig(max_entries=5000, eviction_policy="lru")
encryptor = CredentialEncryptor()
cache_dir = Path(".cache/proxies")

tier = DiskCacheTier(config, TierType.L2_FILE, cache_dir, encryptor)

# Same interface as MemoryCacheTier
tier.put(key, entry)
cached = tier.get(key)
Constructor¶
- __init__(config: CacheTierConfig, tier_type: TierType, cache_dir: Path, encryptor: CredentialEncryptor | None = None) → None
Initialize SQLite-based L2 cache.
- Parameters:
config – Tier configuration
tier_type – Type of tier (should be L2_FILE)
cache_dir – Directory for cache database
encryptor – Credential encryptor for username/password
Methods¶
- migrate_from_jsonl(jsonl_dir: Path | None = None) → int¶
Migrate existing JSONL shard files to SQLite L2 cache.
This method provides a migration path from the old JSONL-based L2 cache to the new SQLite-based implementation. It reads all shard_*.jsonl files from the specified directory and imports them into the SQLite database.
- Parameters:
jsonl_dir – Directory containing shard_*.jsonl files (defaults to self.cache_dir)
- Returns:
Number of entries successfully migrated
Example:
tier = DiskCacheTier(config, TierType.L2_FILE, cache_dir)
migrated = tier.migrate_from_jsonl()
print(f"Migrated {migrated} entries from JSONL to SQLite")
- close() → None¶
Close the persistent SQLite connection and release database resources.
Should be called when the cache tier is no longer needed to properly release database resources and file locks. Safe to call multiple times. Thread-safe via internal lock.
Example:
tier = DiskCacheTier(config, TierType.L2_FILE, cache_dir, encryptor)
try:
    tier.put(key, entry)
    cached = tier.get(key)
finally:
    tier.close()
Features¶
O(log n) indexed lookups using SQLite B-tree
Encrypted credential storage (BLOB fields)
Atomic operations with SQLite transactions
Efficient bulk cleanup using SQL DELETE
Simple file-based persistence (single .db file)
Automatic schema initialization
Database Schema¶
CREATE TABLE l2_cache (
key TEXT PRIMARY KEY,
proxy_url TEXT NOT NULL,
username_encrypted BLOB,
password_encrypted BLOB,
source TEXT NOT NULL,
fetch_time REAL NOT NULL,
last_accessed REAL NOT NULL,
access_count INTEGER DEFAULT 0,
ttl_seconds INTEGER NOT NULL,
expires_at REAL NOT NULL,
health_status TEXT DEFAULT 'unknown',
failure_count INTEGER DEFAULT 0,
evicted_from_l1 INTEGER DEFAULT 0
);
CREATE INDEX idx_l2_expires_at ON l2_cache(expires_at);
CREATE INDEX idx_l2_source ON l2_cache(source);
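The `idx_l2_expires_at` index above is what lets TTL cleanup run as one indexed DELETE rather than a per-entry scan in Python. A minimal sqlite3 sketch with a trimmed-down version of the schema:

```python
import sqlite3
import time

# Sketch of indexed bulk TTL cleanup using a reduced l2_cache schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE l2_cache (key TEXT PRIMARY KEY, "
    "proxy_url TEXT NOT NULL, expires_at REAL NOT NULL)"
)
conn.execute("CREATE INDEX idx_l2_expires_at ON l2_cache(expires_at)")

now = time.time()
conn.executemany(
    "INSERT INTO l2_cache VALUES (?, ?, ?)",
    [("live", "http://a:8080", now + 3600),
     ("stale", "http://b:8080", now - 1)],
)

# One DELETE over the expires_at index removes every expired row at once.
removed = conn.execute(
    "DELETE FROM l2_cache WHERE expires_at <= ?", (now,)
).rowcount
```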
SQLiteCacheTier¶
L3 SQLite database cache with encrypted credentials and health history.
- class SQLiteCacheTier(CacheTier)¶
Provides durable persistence with SQL indexing for fast lookups and comprehensive health history tracking.
Example:
from proxywhirl.cache.tiers import SQLiteCacheTier, TierType
from proxywhirl.cache import CacheTierConfig, CredentialEncryptor
from pathlib import Path

config = CacheTierConfig(max_entries=None, eviction_policy="lru")  # Unlimited
encryptor = CredentialEncryptor()
db_path = Path(".cache/db/proxywhirl.db")

tier = SQLiteCacheTier(config, TierType.L3_SQLITE, db_path, encryptor)

# Same interface as other tiers
tier.put(key, entry)
cached = tier.get(key)

# Optimized bulk cleanup: one SQL DELETE instead of a per-entry scan
removed = tier.cleanup_expired()
Constructor¶
- __init__(config: CacheTierConfig, tier_type: TierType, db_path: Path, encryptor: CredentialEncryptor | None = None) → None
Initialize SQLite cache.
- Parameters:
config – Tier configuration
tier_type – Type of tier (should be L3_SQLITE)
db_path – Path to SQLite database file
encryptor – Credential encryptor for username/password
Features¶
Full persistence
SQL indexing for fast lookups
Health history tracking with separate table
Automatic schema migration
Optimized bulk cleanup (single SQL DELETE instead of per-entry deletes)
Credential encryption with BLOB storage
Foreign key constraints
Database Schema¶
CREATE TABLE cache_entries (
key TEXT PRIMARY KEY,
proxy_url TEXT NOT NULL,
username_encrypted BLOB,
password_encrypted BLOB,
source TEXT NOT NULL,
fetch_time REAL NOT NULL,
last_accessed REAL NOT NULL,
access_count INTEGER DEFAULT 0,
ttl_seconds INTEGER NOT NULL,
expires_at REAL NOT NULL,
health_status TEXT DEFAULT 'unknown',
failure_count INTEGER DEFAULT 0,
created_at REAL NOT NULL,
updated_at REAL NOT NULL,
-- Health monitoring fields
last_health_check REAL,
consecutive_health_failures INTEGER DEFAULT 0,
consecutive_health_successes INTEGER DEFAULT 0,
recovery_attempt INTEGER DEFAULT 0,
next_check_time REAL,
last_health_error TEXT,
total_health_checks INTEGER DEFAULT 0,
total_health_check_failures INTEGER DEFAULT 0,
evicted_from_l1 INTEGER DEFAULT 0
);
CREATE TABLE health_history (
id INTEGER PRIMARY KEY AUTOINCREMENT,
proxy_key TEXT NOT NULL,
check_time REAL NOT NULL,
status TEXT NOT NULL,
response_time_ms REAL,
error_message TEXT,
check_url TEXT NOT NULL,
FOREIGN KEY (proxy_key) REFERENCES cache_entries(key) ON DELETE CASCADE
);
-- Indexes
CREATE INDEX idx_expires_at ON cache_entries(expires_at);
CREATE INDEX idx_source ON cache_entries(source);
CREATE INDEX idx_health_status ON cache_entries(health_status);
CREATE INDEX idx_last_accessed ON cache_entries(last_accessed);
CREATE INDEX idx_health_history_proxy ON health_history(proxy_key);
CREATE INDEX idx_health_history_time ON health_history(check_time);
Utilities¶
CredentialEncryptor¶
Warning
If no encryption key is provided and the PROXYWHIRL_CACHE_ENCRYPTION_KEY environment variable is not set, a new key is generated automatically. This means cached data encrypted with a previous key will be unreadable. Always persist your encryption key for production use.
Handles encryption/decryption of proxy credentials using Fernet symmetric encryption (AES-128-CBC + HMAC). Supports key rotation via MultiFernet: set PROXYWHIRL_CACHE_KEY_PREVIOUS to the old key when rotating, allowing decryption of data encrypted with either key while new encryptions use the current key.
- class CredentialEncryptor¶
Provides Fernet symmetric encryption for proxy credentials at rest (L2/L3 tiers). Uses environment variable PROXYWHIRL_CACHE_ENCRYPTION_KEY for key management.
Example:
from proxywhirl.cache import CredentialEncryptor
from cryptography.fernet import Fernet
from pydantic import SecretStr
import os

# Option 1: Use environment variable
os.environ["PROXYWHIRL_CACHE_ENCRYPTION_KEY"] = "your-32-byte-url-safe-base64-key"
encryptor = CredentialEncryptor()

# Option 2: Provide key directly
key = Fernet.generate_key()
encryptor = CredentialEncryptor(key=key)

# Encrypt credentials
plaintext = SecretStr("mypassword")
encrypted = encryptor.encrypt(plaintext)  # bytes

# Decrypt credentials
decrypted = encryptor.decrypt(encrypted)  # SecretStr
print(decrypted.get_secret_value())  # "mypassword"
Constructor¶
- __init__(key: bytes | None = None) → None
Initialize encryptor with Fernet key.
- Parameters:
key – Optional Fernet key (32 url-safe base64-encoded bytes). If None, reads from PROXYWHIRL_CACHE_ENCRYPTION_KEY env var. If env var not set, generates a new key (WARNING: regenerated keys cannot decrypt existing cached data).
- Raises:
ValueError – If provided key is invalid for Fernet
Attributes:
key (bytes) - Fernet encryption key
_cipher (Fernet) - Fernet cipher instance
Methods¶
- encrypt(secret: SecretStr) → bytes¶
Encrypt a SecretStr to bytes.
- Parameters:
secret – SecretStr containing plaintext to encrypt
- Returns:
Encrypted bytes suitable for storage in BLOB fields
- Raises:
ValueError – If encryption fails
Example:
encrypted = encryptor.encrypt(SecretStr("password123")) # b'gAAAAA...'
- decrypt(encrypted: bytes) → SecretStr¶
Decrypt encrypted bytes back to SecretStr.
- Parameters:
encrypted – Encrypted bytes from storage
- Returns:
SecretStr containing decrypted plaintext (never logs value)
- Raises:
ValueError – If decryption fails (wrong key, corrupted data)
Example:
decrypted = encryptor.decrypt(encrypted_bytes)
print(decrypted.get_secret_value())  # "password123"
CacheManager¶
Main orchestrator for multi-tier proxy caching with automatic promotion/demotion, TTL management, and health-based invalidation.
- class CacheManager¶
Manages caching across three tiers:
L1 (Memory): Fast in-memory cache using OrderedDict (LRU)
L2 (Disk): Persistent cache with configurable backend (JSONL or SQLite)
L3 (SQLite): Database cache for cold storage with full queryability
Supports TTL-based expiration, health-based invalidation, and graceful degradation when tiers fail. Thread-safe via threading.RLock.
Example:
from proxywhirl.cache import CacheManager, CacheConfig, CacheEntry, HealthStatus
from datetime import datetime, timezone, timedelta

config = CacheConfig()
manager = CacheManager(config)

# Store an entry
entry = CacheEntry(
    key="abc123",
    proxy_url="http://proxy.example.com:8080",
    source="api",
    fetch_time=datetime.now(timezone.utc),
    last_accessed=datetime.now(timezone.utc),
    ttl_seconds=3600,
    expires_at=datetime.now(timezone.utc) + timedelta(seconds=3600),
    health_status=HealthStatus.HEALTHY,
)
manager.put(entry.key, entry)

# Retrieve (promotes to higher tiers on hit)
retrieved = manager.get(entry.key)

# Delete from all tiers
manager.delete(entry.key)

# Statistics
stats = manager.get_statistics()
print(f"Overall hit rate: {stats.overall_hit_rate:.2%}")

# Export/import
manager.export_to_file("proxies.jsonl")
manager.warm_from_file("proxies.jsonl", ttl_override=3600)
Constructor¶
- __init__(config: CacheConfig) → None
Initialize cache manager with configuration.
- Parameters:
config – Cache configuration with tier settings (required)
Initializes L1 (memory), L2 (disk), and L3 (SQLite) tiers based on config. Starts background TTL cleanup if enable_background_cleanup is True.
Methods¶
- get(key: str) → CacheEntry | None
Retrieve entry from cache with tier promotion.
Checks L1 → L2 → L3 in order. Promotes entries to higher tiers on hit. Updates access_count and last_accessed on successful retrieval. Expired entries are automatically deleted from all tiers.
- Parameters:
key – Cache key to retrieve
- Returns:
CacheEntry if found and not expired, None otherwise
- put(key: str, entry: CacheEntry) → bool
Store entry in all enabled tiers.
Writes to all tiers for redundancy. Credentials are automatically redacted in logs.
- Parameters:
key – Cache key
entry – CacheEntry to store
- Returns:
True if stored in at least one tier, False otherwise
- delete(key: str) → bool
Delete entry from all tiers.
- Parameters:
key – Cache key to delete
- Returns:
True if deleted from at least one tier, False if not found
- clear() → int
Clear all entries from all tiers.
- Returns:
Total number of entries cleared
- invalidate_by_health(key: str) → None¶
Mark proxy as unhealthy and evict if failure threshold reached.
Increments failure_count and sets health_status to UNHEALTHY. If failure_count reaches the configured failure_threshold, the proxy is removed from all cache tiers.
- Parameters:
key – Cache key to invalidate
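The threshold behavior can be sketched with a plain counter per key; the entry survives failures below the threshold and is evicted at the threshold. Illustrative only (`invalidate_by_health` here is a hypothetical stand-in, not the CacheManager method):

```python
# Sketch of threshold-based health invalidation: count failures per key,
# evict the entry once the configured threshold is reached.
FAILURE_THRESHOLD = 3
cache: dict[str, str] = {"p1": "http://proxy:8080"}
failures: dict[str, int] = {}

def invalidate_by_health(key: str) -> bool:
    """Record a failure; return True if the entry was evicted."""
    failures[key] = failures.get(key, 0) + 1
    if failures[key] >= FAILURE_THRESHOLD and key in cache:
        del cache[key]
        return True
    return False

invalidate_by_health("p1")  # 1st failure: kept
invalidate_by_health("p1")  # 2nd failure: kept
invalidate_by_health("p1")  # 3rd failure: evicted
```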
- get_statistics() → CacheStatistics¶
Get current cache statistics.
- Returns:
CacheStatistics with hit rates, sizes, and tier degradation status
- export_to_file(filepath: str) → dict[str, int]¶
Export all cache entries to a JSONL file.
- Parameters:
filepath – Path to export file
- Returns:
Dict with exported and failed counts
- warm_from_file(file_path: str, ttl_override: int | None = None) → dict[str, int]¶
Load proxies from a file to pre-populate the cache.
Supports JSON (array), JSONL (newline-delimited), and CSV formats. Invalid entries are skipped with warnings logged.
- Parameters:
file_path – Path to file containing proxy data
ttl_override – Optional TTL in seconds (overrides default_ttl_seconds)
- Returns:
Dict with loaded, skipped, and failed counts
Crypto Utilities¶
The proxywhirl.cache.crypto module provides helper functions for encryption key management and rotation.
from proxywhirl.cache.crypto import get_encryption_keys, create_multi_fernet, rotate_key
get_encryption_keys() -> list[bytes]¶
Get all valid encryption keys for MultiFernet. Returns keys in priority order: current key first, then previous key. Reads from PROXYWHIRL_CACHE_ENCRYPTION_KEY and PROXYWHIRL_CACHE_KEY_PREVIOUS environment variables. Generates a new key if no env vars are set.
create_multi_fernet() -> MultiFernet¶
Create a MultiFernet instance with all valid encryption keys. MultiFernet tries keys in order for decryption (newest first). All new encryptions use the first (current) key.
rotate_key(new_key: str) -> None¶
Rotate encryption keys by setting a new current key. Moves the current PROXYWHIRL_CACHE_ENCRYPTION_KEY to PROXYWHIRL_CACHE_KEY_PREVIOUS and sets the new key as current. This allows gradual migration: new data uses the new key, old data can still be decrypted with the previous key.
from cryptography.fernet import Fernet
from proxywhirl.cache.crypto import rotate_key
# Generate new key and rotate
new_key = Fernet.generate_key().decode()
rotate_key(new_key)
# Old data remains readable via PROXYWHIRL_CACHE_KEY_PREVIOUS
TTLManager¶
Manages TTL-based expiration with hybrid lazy + background cleanup. Used internally by CacheManager when enable_background_cleanup=True.
- class TTLManager¶
Combines two cleanup strategies:
Lazy expiration: Check TTL on every get() operation
Background cleanup: Periodic scan of all tiers to remove expired entries
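The background half of this hybrid strategy can be sketched as a daemon thread that periodically sweeps expired entries until stopped. Illustrative only, with a deliberately short interval; TTLManager's actual loop may differ:

```python
import threading
import time

# Sketch of background TTL cleanup: a daemon thread sweeps expired
# entries every `interval` seconds until the stop event is set.
store: dict[str, float] = {           # key -> expires_at timestamp
    "stale": time.time() - 1,
    "fresh": time.time() + 60,
}
stop = threading.Event()

def cleanup_loop(interval: float) -> None:
    while not stop.is_set():
        now = time.time()
        for key in [k for k, exp in store.items() if exp <= now]:
            store.pop(key, None)
        stop.wait(interval)  # sleep, but wake immediately on stop.set()

worker = threading.Thread(target=cleanup_loop, args=(0.05,), daemon=True)
worker.start()
time.sleep(0.2)  # give the cleaner a few passes
stop.set()
worker.join()
```

Using `Event.wait` instead of `time.sleep` lets `stop()` interrupt the sleep so shutdown is prompt rather than waiting out a full interval.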
Example:
from proxywhirl.cache.manager import TTLManager, CacheManager
from proxywhirl.cache import CacheConfig

config = CacheConfig(enable_background_cleanup=False)
manager = CacheManager(config)

# Manually create and start TTL manager
ttl_mgr = TTLManager(manager, cleanup_interval=60)
ttl_mgr.start()

# ... later ...
ttl_mgr.stop()
Constructor¶
- __init__(cache_manager: CacheManager, cleanup_interval: int = 60) → None
- Parameters:
cache_manager – Parent CacheManager instance
cleanup_interval – Seconds between cleanup runs (default: 60)
Methods¶
Attributes¶
enabled (bool): Whether background cleanup is running
cleanup_interval (int): Seconds between cleanup runs
Usage Examples¶
Working with Cache Tiers Directly¶
from proxywhirl.cache.tiers import MemoryCacheTier, DiskCacheTier, SQLiteCacheTier, TierType
from proxywhirl.cache import CacheTierConfig, CacheEntry, CredentialEncryptor, HealthStatus
from datetime import datetime, timezone, timedelta
from pathlib import Path
from pydantic import SecretStr
# Initialize tiers
config = CacheTierConfig(max_entries=1000, eviction_policy="lru")
encryptor = CredentialEncryptor()
l1 = MemoryCacheTier(config, TierType.L1_MEMORY)
l2 = DiskCacheTier(config, TierType.L2_FILE, Path(".cache/l2"), encryptor)
l3 = SQLiteCacheTier(config, TierType.L3_SQLITE, Path(".cache/l3.db"), encryptor)
# Create entry
entry = CacheEntry(
key="proxy1",
proxy_url="http://proxy.example.com:8080",
username=SecretStr("user"),
password=SecretStr("pass"),
source="api",
fetch_time=datetime.now(timezone.utc),
last_accessed=datetime.now(timezone.utc),
ttl_seconds=3600,
expires_at=datetime.now(timezone.utc) + timedelta(seconds=3600),
health_status=HealthStatus.HEALTHY
)
# Store in L1
l1.put(entry.key, entry)
# Retrieve from L1 (O(1) lookup)
cached = l1.get(entry.key)
if cached:
print(f"L1 hit: {cached.proxy_url}")
# Store in L2 (persisted to disk)
l2.put(entry.key, entry)
# Retrieve from L2 (O(log n) SQLite lookup)
cached = l2.get(entry.key)
if cached:
print(f"L2 hit: {cached.proxy_url}")
# Store in L3 (full database persistence)
l3.put(entry.key, entry)
# Cleanup expired entries
removed_l1 = l1.cleanup_expired()
removed_l2 = l2.cleanup_expired()
removed_l3 = l3.cleanup_expired()
print(f"Removed: L1={removed_l1}, L2={removed_l2}, L3={removed_l3}")
Encryption and Security¶
from proxywhirl.cache import CredentialEncryptor
from cryptography.fernet import Fernet
from pydantic import SecretStr
import os
# Generate and save encryption key
key = Fernet.generate_key()
os.environ["PROXYWHIRL_CACHE_ENCRYPTION_KEY"] = key.decode()
# Initialize encryptor
encryptor = CredentialEncryptor()
# Encrypt credentials
username = SecretStr("admin")
password = SecretStr("secret123")
encrypted_user = encryptor.encrypt(username)
encrypted_pass = encryptor.encrypt(password)
print(f"Encrypted username: {encrypted_user.hex()}")
print(f"Encrypted password: {encrypted_pass.hex()}")
# Decrypt credentials
decrypted_user = encryptor.decrypt(encrypted_user)
decrypted_pass = encryptor.decrypt(encrypted_pass)
print(f"Decrypted: {decrypted_user.get_secret_value()}") # "admin"
# Password value never logged by SecretStr
Tip
If you have more than 10,000 cache entries, migrating from JSONL to SQLite L2 backend can significantly improve lookup performance (O(log n) vs O(n)).
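The stdlib `sqlite3` module illustrates why the indexed backend wins at scale; the one-column schema below is a hypothetical stand-in for the L2 table, not ProxyWhirl's actual layout:

```python
import sqlite3

# Hypothetical minimal L2-style table; the PRIMARY KEY gives B-tree
# O(log n) lookups instead of an O(n) scan over JSONL shard files.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cache (key TEXT PRIMARY KEY, proxy_url TEXT)")
conn.executemany(
    "INSERT INTO cache VALUES (?, ?)",
    [(f"proxy{i}", f"http://10.0.0.{i % 255}:8080") for i in range(10_000)],
)

# Point lookup uses the index rather than reading every row.
row = conn.execute(
    "SELECT proxy_url FROM cache WHERE key = ?", ("proxy1234",)
).fetchone()
print(row[0])  # http://10.0.0.214:8080
```

A JSONL backend must read and parse shard files line by line to find a key, so its lookup cost grows linearly with entry count; the crossover around 10K entries is where that scan starts to dominate.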
Migration from JSONL to SQLite L2¶
from proxywhirl.cache.tiers import DiskCacheTier, TierType
from proxywhirl.cache import CacheTierConfig, CredentialEncryptor
from pathlib import Path
# Initialize new SQLite-based L2 tier
config = CacheTierConfig(max_entries=5000)
encryptor = CredentialEncryptor()
cache_dir = Path(".cache/proxies")
tier = DiskCacheTier(config, TierType.L2_FILE, cache_dir, encryptor)
# Migrate from old JSONL shards
migrated = tier.migrate_from_jsonl()
print(f"Successfully migrated {migrated} entries from JSONL to SQLite")
# Old JSONL files can now be safely removed
# for shard in cache_dir.glob("shard_*.jsonl"):
# shard.unlink()
Performance Considerations¶
Tier Selection¶
L1 (Memory):
Fastest (O(1) lookup)
Limited capacity (default: 1000 entries)
Use for hot proxies
L2 (Disk/SQLite):
Medium speed (O(log n) indexed lookup)
Moderate capacity (default: 5000 entries)
Persistent across restarts
Use for warm proxies
L3 (SQLite):
Slower (database overhead, but indexed)
Unlimited capacity
Full health history tracking
Use for cold storage and analytics
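L1's O(1) behavior comes from its OrderedDict-based LRU eviction. A minimal sketch of that policy (illustrative only; the class name and capacity are hypothetical, not ProxyWhirl's MemoryCacheTier):

```python
from collections import OrderedDict

class LRUCache:
    """Minimal LRU sketch of an L1-style memory tier (illustrative only)."""

    def __init__(self, max_entries: int = 1000) -> None:
        self.max_entries = max_entries
        self._store: OrderedDict[str, object] = OrderedDict()

    def get(self, key: str):
        if key not in self._store:
            return None
        self._store.move_to_end(key)  # mark as most recently used
        return self._store[key]

    def put(self, key: str, value: object) -> None:
        if key in self._store:
            self._store.move_to_end(key)
        self._store[key] = value
        if len(self._store) > self.max_entries:
            self._store.popitem(last=False)  # evict least recently used

cache = LRUCache(max_entries=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")          # "a" is now most recently used
cache.put("c", 3)       # evicts "b", the least recently used
print(cache.get("b"))   # None
print(cache.get("a"))   # 1
```

Both lookup and eviction are O(1) because OrderedDict tracks recency ordering internally, which is why hot proxies belong in L1.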
Optimization Tips¶
Tune tier sizes based on workload
Enable background cleanup to avoid lazy cleanup overhead
Use encryption for sensitive credentials in L2/L3
Monitor failure rates for graceful degradation
Leverage indexes in L2/L3 for fast queries
Thread Safety¶
All tier implementations use internal locking for thread-safe operations. The CacheTier base class provides handle_failure() and reset_failures() methods for graceful degradation tracking.
Error Handling¶
Tiers implement graceful degradation:
After 3 consecutive failures, the tier auto-disables (enabled = False)
Successful operations reset the failure counter
Operations on disabled tiers return failure without attempting the operation
The parent cache manager can detect degraded tiers via tier.enabled
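These rules can be modeled in a few lines; the class below is an illustrative sketch of the handle_failure() / reset_failures() contract, not ProxyWhirl's CacheTier base class:

```python
class DegradableTier:
    """Sketch of failure-tracking graceful degradation (illustrative only)."""

    MAX_CONSECUTIVE_FAILURES = 3

    def __init__(self) -> None:
        self.enabled = True
        self._failures = 0

    def handle_failure(self) -> None:
        self._failures += 1
        if self._failures >= self.MAX_CONSECUTIVE_FAILURES:
            self.enabled = False  # auto-disable after 3 consecutive failures

    def reset_failures(self) -> None:
        self._failures = 0  # a successful operation clears the streak

tier = DegradableTier()
tier.handle_failure()
tier.handle_failure()
tier.reset_failures()   # a success in between resets the counter
tier.handle_failure()
print(tier.enabled)     # True: never hit 3 consecutive failures
for _ in range(3):
    tier.handle_failure()
print(tier.enabled)     # False: tier auto-disabled
```

Because only consecutive failures count, a flaky tier that still succeeds occasionally stays enabled, while a persistently failing one is taken out of the lookup path.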
See Also¶
Python API – Main ProxyWhirl API (CacheManager, CacheConfig usage)
Configuration – TOML cache configuration options
Exceptions – Cache-specific exceptions (CacheCorruptionError, CacheStorageError, CacheValidationError)
Rate Limiting API – Rate limiting integration
Caching – Cache configuration patterns and optimization
Deployment Security – Production cache security
Getting Started – Getting started guide