# Idempotency: Preventing Duplicate Execution
Idempotency means an operation can be safely repeated without unintended side effects. OnceOnly guarantees this through distributed locking.
## The Problem: Why Duplication Happens
Timeline of a payment with no idempotency protection:
```text
10:30:00 │ Agent: "Charge $100"
         │ → Stripe API processes charge
         │   Returns: Charge ID ch_123
         │   But network latency occurs...
10:30:05 │ Agent: (timeout, no response)
         │   Network still processing
10:30:10 │ Agent: Retry! "Charge $100 again"
         │ → Stripe: OK! New charge: ch_124
RESULT:  │ Customer charged $200 instead of $100! 💥
```

This happens because:
- Network timeouts — Agent doesn’t receive response
- Server errors — Stripe charges, but returns 500 error
- Retry logic — Agent automatically retries, not knowing it already succeeded
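This failure mode is easy to reproduce with a toy stand-in for the payment API. The following sketch is hypothetical (`FakePaymentAPI` is not Stripe); it only shows why a blind retry after a timeout double-bills:

```python
class FakePaymentAPI:
    """Stands in for a payment provider; every call creates a new charge."""
    def __init__(self):
        self.charges = []

    def charge(self, amount):
        # The server-side charge succeeds even if the response is lost.
        self.charges.append(amount)
        return f"ch_{len(self.charges)}"

def charge_with_naive_retry(api, amount, first_call_times_out=True):
    """Retries blindly on timeout, not knowing the first call succeeded."""
    try:
        charge_id = api.charge(amount)
        if first_call_times_out:
            raise TimeoutError("response lost in transit")
        return charge_id
    except TimeoutError:
        return api.charge(amount)  # retry: creates a second, duplicate charge

api = FakePaymentAPI()
charge_with_naive_retry(api, 100)
print(sum(api.charges))  # customer billed 200 instead of 100
```

The retry is invisible to the agent but not to the customer: two server-side charges exist for one intended payment.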
## The Solution: OnceOnly’s Idempotency Lock
OnceOnly combines a unique key per action with a Redis lock to ensure the action runs only once.
Timeline of a Payment WITH OnceOnly:
```text
10:30:00 │ Agent: POST /check-lock  key="payment_123"
         │ → OnceOnly: "locked" ✓ (first time)
         │ Agent: Charge $100 → Stripe: OK ch_123
10:30:05 │ (Network timeout, agent confused)
10:30:10 │ Agent: Retry POST /check-lock  key="payment_123"
         │ → OnceOnly: "duplicate" ⚠️
         │ Agent: Returns previous result
         │        Does NOT charge again!
RESULT:  │ Customer charged $100 (once) ✓
```
## How It Works

Request 1:

```text
POST /v1/check-lock {"key": "payment_123", "ttl": 3600}
  ↓
Check Redis: Does lock exist? NO
  ↓
Create lock in Redis with 1 hour expiry
  ↓
Return: {"success": true, "status": "locked", "first_seen_at": null}
  ↓
Agent executes payment
```
Request 2 (within 1 hour):

```text
POST /v1/check-lock {"key": "payment_123", "ttl": 3600}
  ↓
Check Redis: Does lock exist? YES
  ↓
Return: {"success": false, "status": "duplicate", "first_seen_at": "10:30:00Z"}
  ↓
Agent: Do NOT execute payment
       Return cached result instead
```
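The two flows above can be modeled in-process with a dictionary standing in for Redis. This is a sketch of the semantics only (`ToyLockStore` is hypothetical, not the real service or its SDK):

```python
import time

class ToyLockStore:
    """In-process model of OnceOnly's Redis-backed check-lock semantics."""
    def __init__(self):
        self._locks = {}  # key -> (first_seen_at, expiry)

    def check_lock(self, key, ttl):
        now = time.time()
        entry = self._locks.get(key)
        if entry and now < entry[1]:
            # Lock exists and has not expired: this is a duplicate.
            return {"success": False, "status": "duplicate",
                    "first_seen_at": entry[0]}
        # First time (or the lock expired): create the lock.
        self._locks[key] = (now, now + ttl)
        return {"success": True, "status": "locked", "first_seen_at": None}

store = ToyLockStore()
first = store.check_lock("payment_123", ttl=3600)
retry = store.check_lock("payment_123", ttl=3600)
print(first["status"], retry["status"])  # locked duplicate
```

The real service does the existence check and lock creation atomically (a single Redis operation), which is what makes it safe under concurrent agents; this toy version only illustrates the request/response contract.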
## Choosing the Right Key

Your key should be unique, stable, and deterministic.
### ✅ Good Keys
```python
# Invoice payment — unique per invoice
key = f"payment_invoice_{invoice_id}"          # payment_invoice_INV-123

# Email sending — unique per user and email type
key = f"email_{user_id}_welcome"               # email_user_456_welcome

# Database record creation — unique per intended record
key = f"create_user_{email}"                   # create_user_customer@example.com

# API call — unique per resource and operation
key = f"update_profile_{user_id}_{timestamp}"  # Includes timestamp if needed
```
### ❌ Bad Keys

```python
# Too generic — will collide with other operations
key = "payment"                     # ❌ Multiple payments will share the lock!

# Non-deterministic — different each time
key = f"payment_{uuid.uuid4()}"     # ❌ New UUID on each retry!

# External ID — may change unexpectedly
key = f"payment_{api_response.id}"  # ❌ ID might change!

# Time-based — defeats idempotency
key = f"payment_{time.time()}"      # ❌ Different each second!
```
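One way to keep keys unique, stable, and deterministic is a small helper that namespaces and normalizes the parts. This is a sketch; `make_key` and its 128-character limit are assumptions, not part of OnceOnly's API:

```python
import hashlib

def make_key(namespace: str, *parts) -> str:
    """Build a stable, deterministic idempotency key.

    Normalizes each part (stripped, lowercased) so the same logical
    action always produces the same key, and hashes overly long keys
    (e.g. ones built from email addresses) to a bounded length.
    """
    normalized = "_".join(str(p).strip().lower() for p in parts)
    key = f"{namespace}_{normalized}"
    if len(key) > 128:
        digest = hashlib.sha256(key.encode()).hexdigest()[:32]
        key = f"{namespace}_{digest}"
    return key

print(make_key("payment_invoice", "INV-123"))  # payment_invoice_inv-123
```

Because the output depends only on the inputs, a retry with the same invoice ID always lands on the same lock.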
## TTL: How Long Should the Lock Last?

The `ttl` parameter determines how long OnceOnly remembers this action.
### For Network Issues (< 1 min operations)
```python
# Quick operations: database writes, API calls
ttl = 60   # Remember for 1 minute (example; default depends on plan/server config)

# Operations with a longer network retry window
ttl = 300  # 5 minutes
```

Reason: Retries typically happen within seconds. After 1-5 minutes, it’s probably a different user action, not a retry.
### For Operations with Backoff Retry
```python
# If your retry logic has exponential backoff
# (1s, 2s, 4s, 8s, 16s...)
ttl = 60  # 1 minute covers most retry attempts
```
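As a sanity check, the cumulative delay of such a backoff schedule can be computed to confirm it fits inside the lock TTL. A sketch (the 5-attempt schedule and helper name are assumptions):

```python
def total_backoff_window(base_delay=1.0, attempts=5):
    """Sum of exponential backoff delays: 1s + 2s + 4s + 8s + 16s."""
    return sum(base_delay * 2 ** n for n in range(attempts))

window = total_backoff_window()
print(window)  # 31.0 — comfortably inside a 60-second TTL
```

If your retry policy uses more attempts or a larger base delay, recompute this window and pick a TTL that covers it with margin.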
### For Long-running Tasks

```python
# ❌ DON'T use check-lock for long-running tasks!
# These take minutes or hours. Use AI Lease instead.
```
```text
# ✅ Use AI Lease
POST /v1/ai/lease {"key": "support_chat_1", "ttl": 1800}
```
## Caching Results

OnceOnly only tells you whether an action is new (`"locked"`) or a repeat (`"duplicate"`). You must cache the result yourself so you have something to return on duplicates.
### Redis Caching Example
```python
import json

import redis
import requests

redis_client = redis.Redis(host='localhost', port=6379)
api_key = "YOUR_ONCEONLY_API_KEY"

class CacheExpired(Exception):
    """Raised when a duplicate is detected but its cached result is gone."""

def process_with_cache(action_key: str, processor_fn, ttl: int = 3600) -> dict:
    """
    Execute an action with idempotency and result caching.

    processor_fn: callable that does the actual work.
    Returns the result (cached on duplicate).
    """
    # Step 1: Check OnceOnly for idempotency
    result = requests.post(
        "https://api.onceonly.tech/v1/check-lock",
        headers={"Authorization": f"Bearer {api_key}"},
        json={"key": action_key, "ttl": ttl},
    ).json()

    # Step 2: If duplicate, return the cached result
    if result["status"] == "duplicate":
        cached = redis_client.get(f"result:{action_key}")
        if cached:
            return json.loads(cached)
        # Cache expired (shouldn't happen if TTLs match)
        raise CacheExpired(f"Cache miss for {action_key}")

    # Step 3: If new, execute the processor
    output = processor_fn()

    # Step 4: Cache the result for future duplicates
    redis_client.setex(f"result:{action_key}", ttl, json.dumps(output))

    return output
```
```python
# Usage
def charge_customer(invoice_id: str, amount: float) -> dict:
    key = f"payment_inv_{invoice_id}"
    return process_with_cache(
        action_key=key,
        processor_fn=lambda: stripe.Charge.create(amount=int(amount * 100)),
        ttl=3600,
    )
```
### In-Memory Caching Example

```python
from datetime import datetime, timedelta

cache = {}

def process_with_memory_cache(action_key: str, processor_fn, ttl: int = 3600) -> dict:
    """Cache results in memory (suitable for single-process apps)."""
    # Check OnceOnly
    result = requests.post(...).json()

    if result["status"] == "duplicate":
        if action_key in cache:
            cached_result, expiry = cache[action_key]
            if datetime.now() < expiry:
                return cached_result
        raise CacheExpired(f"Cache miss for {action_key}")

    # New action
    output = processor_fn()

    # Cache result
    cache[action_key] = (output, datetime.now() + timedelta(seconds=ttl))

    return output
```
## Handling Duplicate Responses

When you get `"duplicate"`, you have several options:
### Option 1: Return Cached Result (Recommended)
```python
def charge_payment(invoice_id):
    lock = check_lock(f"payment_inv_{invoice_id}")

    if lock["status"] == "locked":
        # New action
        charge = stripe.create_charge()
        cache[invoice_id] = charge
        return {"status": "charged", "id": charge.id}
    else:
        # Duplicate — return the cached result
        cached = cache.get(invoice_id)
        if cached:
            return {"status": "cached", "id": cached.id}
        return {"error": "cache_miss"}
```
### Option 2: Log and Skip (For Read-only)

```python
def fetch_user_data(user_id):
    lock = check_lock(f"fetch_user_{user_id}")

    if lock["status"] == "locked":
        data = db.query(user_id)
        cache[user_id] = data
        return data
    else:
        # For read-only ops, logging dupes is fine
        logger.info(f"Duplicate fetch attempt for user {user_id}")
        return cache.get(user_id)
```
### Option 3: Return Error (For Critical Ops)

```python
def transfer_money(account_a, account_b, amount):
    lock = check_lock(f"transfer_{account_a}_{account_b}_{amount}")

    if lock["status"] == "locked":
        return perform_transfer(account_a, account_b, amount)
    else:
        # For financial ops, fail safely if the cache is lost
        return {
            "error": "duplicate_detected",
            "message": "This transfer was already attempted",
            "first_seen_at": lock["first_seen_at"],
        }
```
## Best Practices

- Use stable identifiers — Base keys on resource IDs, not random tokens
- Match cache TTL to lock TTL — Keep them in sync
- Store full results — Cache the complete response
- Log duplicates — For debugging retry patterns
- Use descriptive keys — Makes logs easier to read
```python
# Good practice example
key = f"email_user_{user_id}_reset_password"
ttl = 3600  # 1 hour

result = check_lock(key, ttl)
if result["status"] == "locked":
    output = send_reset_email(user_id)
    cache.set(f"result:{key}", output, ttl)
else:
    output = cache.get(f"result:{key}")
    logger.info(f"Duplicate email send: {key}")
```
### ❌ DON’T

- Don’t retry with new keys — Defeats idempotency
- Don’t use timestamps in keys — TTL should be enough
- Don’t forget to cache results — Then duplicates have nothing to return
- Don’t use for long operations — Use AI Lease instead
- Don’t ignore duplicate responses — They’re important signals
## Comparison: Check-Lock vs AI Lease
| Feature | Check-Lock | AI Lease |
|---|---|---|
| Best for | < 1 min operations | 1-24 hour operations |
| Max TTL | 3600 seconds | 86400 seconds (24h) |
| Can extend | No | Yes (via /extend) |
| Ownership | Any agent can query | Only owner can complete |
| Use case | Payments, emails, DB writes | Support chats, document processing |
For long operations, use AI Leases instead.
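The ownership and extend semantics in the table can be modeled in-process the same way as check-lock. This is a hypothetical sketch (`ToyLeaseStore` is not the real API); it shows why only the owning agent can extend or release a lease:

```python
import time
import uuid

class ToyLeaseStore:
    """In-process model of an AI Lease: owned by one agent, extendable."""
    def __init__(self):
        self._leases = {}  # key -> (owner_token, expiry)

    def acquire(self, key, ttl):
        now = time.time()
        lease = self._leases.get(key)
        if lease and now < lease[1]:
            return None  # another agent holds an unexpired lease
        token = uuid.uuid4().hex  # ownership token, returned only to the acquirer
        self._leases[key] = (token, now + ttl)
        return token

    def extend(self, key, token, ttl):
        lease = self._leases.get(key)
        if not lease or lease[0] != token:
            return False  # only the owner may extend
        self._leases[key] = (token, time.time() + ttl)
        return True

store = ToyLeaseStore()
token = store.acquire("support_chat_1", ttl=1800)
stolen = store.acquire("support_chat_1", ttl=1800)  # second agent: refused
extended = store.extend("support_chat_1", token, ttl=1800)
```

The ownership token is what distinguishes a lease from a plain lock: a second agent can see that work is in progress, but cannot extend or complete it.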
## Common Implementation Errors
### Error 1: New Key on Each Retry
```python
# ❌ WRONG
def pay_invoice(invoice_id):
    key = f"payment_{uuid.uuid4()}"  # New UUID!
    check_lock(key)  # Lock is different each time
    charge_stripe()
```
### Error 2: TTL Too Short

```python
# ❌ WRONG
def send_email(user_id):
    key = f"email_{user_id}"
    check_lock(key, ttl=5)  # Too short!
    # If the user's email client retries after 6 seconds: duplicate!
    send_email_api()
```
### Error 3: No Caching

```python
# ❌ WRONG
def create_record(name):
    key = f"create_{name}"
    result = check_lock(key)

    if result["status"] == "duplicate":
        return "already created"  # But which one? The result is lost!

    return create_db_record(name)
```
### Error 4: Using for Long Tasks

```python
# ❌ WRONG
def process_file(file_id):
    key = f"file_{file_id}"
    check_lock(key, ttl=3600)  # File might take 2 hours!
    process_heavy_file()

# If another agent retries after 1 hour, it will also start processing!
```
## Correct Implementation

```python
# ✅ RIGHT
def create_record(name):
    key = f"create_record_{name}"
    result = check_lock(key, ttl=300)  # 5 min window

    if result["status"] == "locked":
        # New action
        record = db.create(name)
        cache.set(f"result:{key}", record, ttl=300)
        return record
    else:
        # Duplicate
        cached = cache.get(f"result:{key}")
        if cached:
            return cached  # Return the original record
        raise CacheExpiredError()
```
## Monitoring

Track idempotency patterns in your logs:
```python
import logging

logger = logging.getLogger(__name__)

def tracked_check_lock(key, ttl=3600):
    result = check_lock(key, ttl)

    if result["status"] == "locked":
        logger.info(f"idempotency_new key={key}")
    else:
        logger.warning(
            f"idempotency_duplicate key={key} "
            f"first_seen={result['first_seen_at']}"
        )

    return result
```
```python
# A high duplicate rate might indicate:
# - Overly aggressive retry logic
# - Network instability
# - Clock skew (same timestamp used)
```
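To act on that signal, a small counter can turn the log lines into a measurable duplicate rate. A sketch; `DuplicateRateTracker` and the 20% threshold are assumptions, not part of OnceOnly:

```python
class DuplicateRateTracker:
    """Tracks what fraction of check-lock calls are duplicates."""
    def __init__(self, warn_threshold=0.2):
        self.total = 0
        self.duplicates = 0
        self.warn_threshold = warn_threshold

    def record(self, status):
        # Feed in the "status" field of each check-lock response.
        self.total += 1
        if status == "duplicate":
            self.duplicates += 1

    @property
    def rate(self):
        return self.duplicates / self.total if self.total else 0.0

    def should_warn(self):
        return self.rate > self.warn_threshold

tracker = DuplicateRateTracker()
for status in ["locked", "locked", "locked", "duplicate"]:
    tracker.record(status)
print(tracker.rate)  # 0.25
```

A sustained rate above your baseline is worth investigating before tuning TTLs or retry policies.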
## Summary

- Idempotency prevents duplicate execution via distributed locks
- Key selection is critical — be descriptive and stable
- TTL depends on your retry window (usually 60-300 seconds)
- Caching results is your responsibility
- For long tasks use AI Leases instead