Audit Logging

Audit Logging: Complete Action Trail

Audit logging creates a permanent record of every decision OnceOnly makes:

What did the agent try to do?
Was it allowed or blocked?
Why (which policy rule)?
How much did it cost?
When did it happen?

This enables compliance, debugging, and security investigations.

Events Logged

OnceOnly logs these event types:

1. lock:checked — Idempotency check

{
  "event": "lock:checked",
  "ts": 1705322400,
  "agent_id": "support_bot",
  "key": "payment_invoice_123",
  "status": "locked",  // or "duplicate"
  "first_seen_at": "2025-01-15T10:30:00Z",
  "metadata": {
    "invoice_id": "INV-123",
    "amount_usd": 99.99
  }
}

2. lease:acquired — Lease created

{
  "event": "lease:acquired",
  "ts": 1705322400,
  "agent_id": "support_bot",
  "lease_id": "lease_abc123xyz",
  "key": "support_chat_001",
  "ttl": 1800,
  "status": "acquired",
  "metadata": {
    "customer_id": "cust_123"
  }
}

3. lease:extended — Lease TTL extended

{
  "event": "lease:extended",
  "ts": 1705322600,
  "agent_id": "support_bot",
  "lease_id": "lease_abc123xyz",
  "key": "support_chat_001",
  "old_ttl": 1800,
  "new_ttl": 3600,
  "reason": "task_needs_more_time"
}

4. lease:completed — Lease finished successfully

{
  "event": "lease:completed",
  "ts": 1705325200,
  "agent_id": "support_bot",
  "lease_id": "lease_abc123xyz",
  "key": "support_chat_001",
  "duration_seconds": 2800,
  "result_hash": "abc123def456",
  "spent_usd": 0.05,
  "version": 2
}

5. lease:failed — Lease failed with error

{
  "event": "lease:failed",
  "ts": 1705325200,
  "agent_id": "support_bot",
  "lease_id": "lease_abc123xyz",
  "key": "support_chat_001",
  "error_code": "email_service_timeout",
  "error_hash": "xyz789abc",
  "version": 2
}

6. tool:called — Tool invoked

{
  "event": "tool:called",
  "ts": 1705322450,
  "agent_id": "support_bot",
  "tool": "send_email",
  "args_hash": "sha256_of_args",
  "metadata": {
    "recipient_count": 1,
    "template": "password_reset"
  }
}

7. policy:allowed — Action passed policy check

{
  "event": "policy:allowed",
  "ts": 1705322450,
  "agent_id": "support_bot",
  "tool": "send_email",
  "decision": "executed",
  "policy_reason": "ok",
  "risk_level": "low",
  "actions_this_hour": 45,
  "max_actions_per_hour": 100,
  "spend_estimate": 0.001
}

8. policy:blocked — Action blocked by policy

{
  "event": "policy:blocked",
  "ts": 1705322500,
  "agent_id": "support_bot",
  "tool": "delete_user",
  "decision": "blocked",
  "policy_reason": "tool_not_in_allowed_list",
  "risk_level": "high",
  "policy_version": "v1"
}

9. policy:rate_limit_exceeded — Rate limit hit

{
  "event": "policy:rate_limit_exceeded",
  "ts": 1705322600,
  "agent_id": "support_bot",
  "tool": "send_email",
  "decision": "blocked",
  "policy_reason": "rate_limit_exceeded",
  "risk_level": "medium",
  "actions_this_hour": 100,
  "max_actions_per_hour": 100,
  "message": "Max 100 actions/hour reached"
}

10. policy:budget_exceeded — Daily spend limit hit

{
  "event": "policy:budget_exceeded",
  "ts": 1705322700,
  "agent_id": "billing_bot",
  "tool": "process_payment",
  "decision": "blocked",
  "policy_reason": "budget_exceeded",
  "risk_level": "critical",
  "spend_today": 10000.00,
  "max_spend_per_day": 10000.00,
  "spend_estimate": 1500.00,
  "message": "Daily budget exhausted"
}

11. agent:disabled — Agent turned off

{
  "event": "agent:disabled",
  "ts": 1705322800,
  "agent_id": "suspicious_bot",
  "disabled_reason": "repeated_policy_violations",
  "disabled_by": "admin_user@example.com",
  "violation_count": 15
}

12. agent:enabled — Agent re-enabled

{
  "event": "agent:enabled",
  "ts": 1705322900,
  "agent_id": "suspicious_bot",
  "enabled_by": "admin_user@example.com",
  "policy_updated": true,
  "new_policy_version": "v2"
}
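
Because every event carries a `ts` and an `event` name, and lease events share a `lease_id`, a lease's lifecycle can be reconstructed from the raw event stream. The sketch below shows one way to do that in Python. The helper names `to_iso` and `lease_timelines` are illustrative and not part of OnceOnly, and the sample events are trimmed copies of the examples above.

```python
from collections import defaultdict
from datetime import datetime, timezone


def to_iso(ts: int) -> str:
    """Convert a Unix `ts` value to an ISO 8601 UTC string."""
    return datetime.fromtimestamp(ts, tz=timezone.utc).isoformat()


def lease_timelines(events: list) -> dict:
    """Group lease:* events by lease_id and list them in time order."""
    grouped = defaultdict(list)
    for event in events:
        if event.get("event", "").startswith("lease:") and "lease_id" in event:
            grouped[event["lease_id"]].append(event)
    return {
        lease_id: [
            f'{to_iso(e["ts"])} {e["event"]}'
            for e in sorted(items, key=lambda e: e["ts"])
        ]
        for lease_id, items in grouped.items()
    }


# Sample events (hypothetical input, fields trimmed for brevity)
events = [
    {"event": "lease:completed", "ts": 1705325200, "lease_id": "lease_abc123xyz"},
    {"event": "tool:called", "ts": 1705322450, "tool": "send_email"},
    {"event": "lease:acquired", "ts": 1705322400, "lease_id": "lease_abc123xyz"},
    {"event": "lease:extended", "ts": 1705322600, "lease_id": "lease_abc123xyz"},
]

for lease_id, timeline in lease_timelines(events).items():
    print(lease_id)
    for line in timeline:
        print("  ", line)
```

Non-lease events such as `tool:called` are skipped, so the output contains only the acquired, extended, and completed steps for each lease.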

Querying Audit Logs

Get Agent Logs

curl -H "Authorization: Bearer once_live_xxxxxxxxxxxxx" \
  "https://api.onceonly.tech/v1/agents/support_bot/logs?limit=100"

Query Parameters:

limit — Max results (default: 100, max: 500)

Response:

[
  {
    "ts": 1705322450,
    "agent_id": "support_bot",
    "tool": "send_email",
    "allowed": true,
    "decision": "executed",
    "policy_reason": "ok",
    "risk_level": "low",
    "spend_usd": 0.001
  },
  {
    "ts": 1705322500,
    "agent_id": "support_bot",
    "tool": "delete_user",
    "allowed": false,
    "decision": "blocked",
    "policy_reason": "tool_not_in_allowed_list",
    "risk_level": "high"
  }
]
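
The same request can be made from Python with only the standard library. This is a minimal sketch, not an official client: the function names, the `ONCEONLY_API_KEY` environment variable, and the client-side clamping of `limit` are assumptions made for illustration. Only the URL, the bearer header, and the `limit` bounds come from the text above.

```python
import json
import os
import urllib.parse
import urllib.request
from typing import Optional

API_BASE = "https://api.onceonly.tech/v1"  # base URL from the curl example
MAX_LIMIT = 500  # documented upper bound for `limit`


def build_logs_request(agent_id: str, limit: int = 100,
                       api_key: Optional[str] = None) -> urllib.request.Request:
    """Build the GET request for an agent's logs, clamping `limit` to 1..500."""
    # Hypothetical convention: fall back to an environment variable for the key
    api_key = api_key or os.environ["ONCEONLY_API_KEY"]
    limit = max(1, min(limit, MAX_LIMIT))
    query = urllib.parse.urlencode({"limit": limit})
    agent = urllib.parse.quote(agent_id, safe="")
    url = f"{API_BASE}/agents/{agent}/logs?{query}"
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {api_key}"})


def get_agent_logs(agent_id: str, limit: int = 100,
                   api_key: Optional[str] = None) -> list:
    """Fetch and decode the JSON array of log entries (performs a network call)."""
    request = build_logs_request(agent_id, limit, api_key)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)
```

Splitting request construction from the network call keeps the URL and header logic testable without contacting the API.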

Get Metrics for Time Period

curl -H "Authorization: Bearer once_live_xxxxxxxxxxxxx" \
  "https://api.onceonly.tech/v1/agents/support_bot/metrics?period=day"

Query Parameters:

period — “hour”, “day”, or “week”

Response:

{
  "agent_id": "support_bot",
  "period": "day",
  "total_actions": 450,
  "blocked_actions": 12,
  "total_spend_usd": 125.50,
  "top_tools": [
    {"tool": "send_email", "count": 400},
    {"tool": "create_ticket", "count": 50}
  ]
}
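
The raw counts in a metrics response are often easier to read as ratios. The sketch below derives a block rate, an average spend per action, and each tool's share of traffic from a response shaped like the one above. `summarize_metrics` is a hypothetical helper, and the derived field names are chosen here for illustration.

```python
def summarize_metrics(metrics: dict) -> dict:
    """Derive ratios from a metrics response; safe when total_actions is 0."""
    total = metrics.get("total_actions", 0)
    blocked = metrics.get("blocked_actions", 0)
    spend = metrics.get("total_spend_usd", 0.0)
    return {
        "block_rate": blocked / total if total else 0.0,
        "avg_spend_usd": spend / total if total else 0.0,
        "tool_share": {
            item["tool"]: item["count"] / total
            for item in metrics.get("top_tools", [])
        } if total else {},
    }


# Sample response copied from the example above
sample = {
    "agent_id": "support_bot",
    "period": "day",
    "total_actions": 450,
    "blocked_actions": 12,
    "total_spend_usd": 125.50,
    "top_tools": [
        {"tool": "send_email", "count": 400},
        {"tool": "create_ticket", "count": 50},
    ],
}

summary = summarize_metrics(sample)
print(f"block rate: {summary['block_rate']:.1%}")
print(f"avg spend per action: ${summary['avg_spend_usd']:.4f}")
```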

Real-World Investigation Examples

Scenario 1: Investigate Policy Block

An admin notices an agent was blocked. What happened?

def investigate_block(agent_id: str):
    """Investigate why an agent was blocked"""

    logs = get_agent_logs(agent_id)
    logs = [entry for entry in logs if entry.get("decision") == "blocked"]

    if not logs:
        return {"status": "no_blocks", "message": "Agent hasn't been blocked"}

    # Group by policy reason
    blocks_by_reason = {}
    for log in logs:
        reason = log.get("policy_reason")
        blocks_by_reason[reason] = blocks_by_reason.get(reason, 0) + 1

    return {
        "total_blocks": len(logs),
        "blocks_by_reason": blocks_by_reason,
        "recent_blocks": logs[-5:],
        "recommendation": recommend_action(blocks_by_reason)
    }

def recommend_action(blocks_by_reason):
    if "rate_limit_exceeded" in blocks_by_reason:
        return "Increase max_actions_per_hour"
    elif "budget_exceeded" in blocks_by_reason:
        return "Increase max_spend_usd_per_day"
    elif "tool_not_in_allowed_list" in blocks_by_reason:
        return "Add needed tools to allowed_tools list"
    else:
        return "Review policy configuration"

Scenario 2: Detect Suspicious Activity

Check if an agent is behaving abnormally:

def detect_anomalies(agent_id: str):
    """Detect unusual activity patterns"""

    metrics = get_agent_metrics(agent_id, period="hour")

    # Red flags
    red_flags = []

    # High block rate
    block_rate = metrics["blocked_actions"] / (metrics["total_actions"] or 1)
    if block_rate > 0.2:  # More than 20% blocked
        red_flags.append(f"High block rate: {block_rate:.1%}")

    # Rapid spending
    hourly_spend = metrics["total_spend_usd"]
    if hourly_spend > 100:
        red_flags.append(f"High hourly spend: ${hourly_spend}")

    # No tool usage recorded
    if not metrics.get("top_tools"):
        red_flags.append("No tool usage in last hour")

    return {
        "agent_id": agent_id,
        "is_suspicious": len(red_flags) > 0,
        "red_flags": red_flags,
        "metrics": metrics
    }

# Usage
import logging

logger = logging.getLogger(__name__)

anomalies = detect_anomalies("support_bot")
if anomalies["is_suspicious"]:
    logger.warning(f"Suspicious activity: {anomalies}")
    # Could auto-disable agent:
    # disable_agent("support_bot", reason="anomalous_activity")

Scenario 3: Compliance Audit

Generate a compliance report for auditors:

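from datetime import date, datetime, timezone

def generate_compliance_report(agent_id: str, start_date: date, end_date: date):
    """Generate compliance report for auditors"""

    # Fetch recent logs and filter by date range client-side.
    # The logs endpoint returns at most 500 entries per call, so a busy agent
    # may have older entries in the range that this call does not see.
    logs = get_agent_logs(agent_id, limit=500)
    logs = [
        entry for entry in logs
        if start_date <= datetime.fromtimestamp(int(entry.get("ts", 0)), tz=timezone.utc).date() <= end_date
    ]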

    report = {
        "agent_id": agent_id,
        "period": f"{start_date} to {end_date}",
        "total_actions": len(logs),
        "breakdown": {
            "allowed": sum(1 for log in logs if log.get("allowed", True)),
            "blocked": sum(1 for log in logs if not log.get("allowed", True))
        },
        "policy_blocks": {
            "tool_not_in_allowed_list": sum(1 for log in logs if log.get("policy_reason") == "tool_not_in_allowed_list"),
            "rate_limit_exceeded": sum(1 for log in logs if log.get("policy_reason") == "rate_limit_exceeded"),
            "budget_exceeded": sum(1 for log in logs if log.get("policy_reason") == "budget_exceeded")
        },
        "total_spend": sum(log.get("spend_usd", 0) for log in logs),
        "policy_versions_used": sorted({log["policy_version"] for log in logs if log.get("policy_version")})
    }

    return report

# Usage
report = generate_compliance_report("support_bot", date(2025, 1, 1), date(2025, 1, 31))
# Export to CSV/PDF for auditors

Retention and Archival

Log Retention Policy

Hot storage (Redis) — Last 7 days, fast access
Cold storage (PostgreSQL) — 90+ days, archival
Export to S3 — 1 year+ retention for compliance
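
To make the tiers concrete, the sketch below maps a log entry's age to the tier where it would be expected to live. It assumes the tiers are strict age thresholds (up to 7 days in Redis, up to 90 days in PostgreSQL, older entries in S3). That reading of the list above, along with the name `storage_tier`, is an assumption and not documented behavior.

```python
from datetime import datetime, timedelta, timezone

HOT_WINDOW = timedelta(days=7)    # assumed upper age for Redis
COLD_WINDOW = timedelta(days=90)  # assumed upper age for PostgreSQL


def storage_tier(ts: int, now: datetime) -> str:
    """Return the tier expected to hold a log entry with Unix timestamp `ts`."""
    age = now - datetime.fromtimestamp(ts, tz=timezone.utc)
    if age <= HOT_WINDOW:
        return "redis"
    if age <= COLD_WINDOW:
        return "postgresql"
    return "s3"


# An entry from 2024-01-15, viewed at three later dates
entry_ts = 1705322400
for when in (datetime(2024, 1, 20, tzinfo=timezone.utc),
             datetime(2024, 3, 1, tzinfo=timezone.utc),
             datetime(2024, 6, 1, tzinfo=timezone.utc)):
    print(when.date(), storage_tier(entry_ts, when))
```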

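# Automated archival
# `db` and `s3` are assumed to be a PostgreSQL client and a boto3 S3 client
# configured elsewhere.
import json
from datetime import datetime, timedelta, timezone

async def archive_old_logs():
    """Move logs older than 90 days from PostgreSQL to S3"""
    cutoff = datetime.now(timezone.utc) - timedelta(days=90)

    # Get logs from PostgreSQL
    old_logs = db.query(
        "SELECT * FROM audit_logs WHERE ts < %s",
        cutoff.timestamp()
    )
    if not old_logs:
        return  # Nothing to archive

    # Export to S3. The key carries the full cutoff timestamp so that a later
    # run in the same month does not overwrite an earlier archive.
    s3_key = f"audit-logs/{cutoff:%Y/%m}/before-{cutoff:%Y%m%dT%H%M%SZ}.json"
    s3.put_object(
        Bucket="compliance-archive",
        Key=s3_key,
        Body=json.dumps([dict(log) for log in old_logs])
    )

    # Clean up PostgreSQL only after the upload has succeeded
    db.execute(
        "DELETE FROM audit_logs WHERE ts < %s",
        cutoff.timestamp()
    )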

Alerting on Audit Events

Set up alerts for important events:

def setup_audit_alerts(agent_id: str):
    """Configure alerts for suspicious activity"""

    alerts = [
        {
            "event_type": "policy:blocked",
            "condition": "count > 5 in 1 hour",
            "action": "email_admin",
            "message": f"Agent {agent_id} blocked >5 times in 1 hour"
        },
        {
            "event_type": "policy:budget_exceeded",
            "condition": "any occurrence",
            "action": "disable_agent",
            "message": f"Agent {agent_id} exceeded daily budget"
        },
        {
            "event_type": "lease:failed",
            "condition": "count > 3 in 1 hour",
            "action": "email_admin",
            "message": f"Agent {agent_id} had >3 failures in 1 hour"
        },
        {
            "event_type": "agent:disabled",
            "condition": "any occurrence",
            "action": "create_incident",
            "message": f"Agent {agent_id} was disabled"
        }
    ]

    return alerts
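
The alert definitions above describe conditions such as "count > 5 in 1 hour" as plain strings. How OnceOnly evaluates them is not specified here. As an illustration only, a condition of that shape can be checked with a sliding-window counter like the one sketched below. `WindowCounter` is a hypothetical class, and it assumes events arrive in timestamp order.

```python
from collections import deque


class WindowCounter:
    """Counts events inside a sliding time window (events must arrive in order)."""

    def __init__(self, threshold: int, window_seconds: int):
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._timestamps = deque()

    def record(self, ts: int) -> bool:
        """Record one event; return True if the window count exceeds the threshold."""
        self._timestamps.append(ts)
        # Drop events that have fallen out of the window ending at `ts`
        while self._timestamps and self._timestamps[0] <= ts - self.window_seconds:
            self._timestamps.popleft()
        return len(self._timestamps) > self.threshold


# "count > 5 in 1 hour" for policy:blocked events
blocked_counter = WindowCounter(threshold=5, window_seconds=3600)
base = 1705322500
fired = [blocked_counter.record(base + i * 60) for i in range(6)]
print(fired)
```

The first five blocks stay at or under the threshold. The sixth block within the hour is the one that trips the alert.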

Log Format Specification

Each log entry has this structure:

{
  "ts": 1705322400,                    // Unix timestamp
  "type": "string",                    // Event type
  "agent_id": "string",                // Agent identifier
  "req_id": "string",                  // Request ID for tracing

  // Conditional fields depending on event type
  "tool": "string",                    // If tool-related
  "key": "string",                     // If lock/lease-related
  "lease_id": "string",                // If lease operation
  "decision": "executed|blocked",      // If policy check
  "policy_reason": "string",           // Why was it allowed/blocked
  "risk_level": "low|medium|high|critical",  // Risk assessment

  // Financial tracking
  "spend_usd": 0.0,                    // Cost of this action
  "spend_estimate": 0.0,               // Estimated cost

  // Metadata
  "metadata": {},                      // Arbitrary data
  "error_code": "string",              // If error occurred
  "error_hash": "string"               // Partial error trace
}
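
A consumer of these entries may want to check them against the structure above before processing. The sketch below assumes the four fields listed before the conditional section (`ts`, `type`, `agent_id`, `req_id`) are always present, and that `risk_level` is limited to the four values shown. `validate_entry` is a hypothetical helper, not part of OnceOnly.

```python
REQUIRED_FIELDS = {"ts": int, "type": str, "agent_id": str, "req_id": str}
RISK_LEVELS = {"low", "medium", "high", "critical"}


def validate_entry(entry: dict) -> list:
    """Return a list of problems found in a log entry (empty if none)."""
    problems = []
    for name, expected in REQUIRED_FIELDS.items():
        if name not in entry:
            problems.append(f"missing field: {name}")
        elif not isinstance(entry[name], expected):
            problems.append(f"{name} should be {expected.__name__}")
    risk = entry.get("risk_level")
    if risk is not None and risk not in RISK_LEVELS:
        problems.append(f"unknown risk_level: {risk}")
    return problems


# Hypothetical entries for illustration
good = {"ts": 1705322400, "type": "policy:blocked", "agent_id": "support_bot",
        "req_id": "req_001", "risk_level": "high"}
bad = {"ts": "1705322400", "agent_id": "support_bot", "risk_level": "severe"}

print(validate_entry(good))
print(validate_entry(bad))
```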

Best Practices

✅ DO

Review logs regularly — Weekly or daily depending on traffic
Set up alerts — For policy violations and failures
Archive logs — Keep 90+ days for compliance
Export for audit — Regular compliance reports
Correlate events — Track agent activity patterns

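# Good practice: Weekly audit review
import logging
import time

import schedule  # third-party package

logger = logging.getLogger(__name__)

def weekly_audit_review():
    """Review audit logs weekly"""
    week_ago = time.time() - 7 * 24 * 3600
    for agent_id in get_all_agents():
        logs = get_agent_logs(agent_id, limit=200)

        # Keep only blocked actions from the last seven days
        blocks = [
            log for log in logs
            if not log.get("allowed", True) and log.get("ts", 0) >= week_ago
        ]
        if len(blocks) > 10:
            logger.warning(f"Agent {agent_id} had {len(blocks)} blocks this week")
            # Investigate
            investigate_block(agent_id)

schedule.every().monday.at("09:00").do(weekly_audit_review)

# Jobs only fire while run_pending() is being polled
while True:
    schedule.run_pending()
    time.sleep(60)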

❌ DON’T

Don’t ignore blocks — They signal misconfigured policies
Don’t delete logs — Keep for compliance (90+ days)
Don’t miss failures — High failure rate indicates issues
Don’t disable logging — Always log for safety

Plan-Based Audit Features

Feature	Free	Starter	Pro	Agency
Basic logging	✓	✓	✓	✓
7-day retention	✓	✓	✓	✓
View agent logs	-	-	✓	✓
Download logs	-	-	✓	✓
Metrics API	-	-	✓	✓
Custom alerts	-	-	-	✓
90-day retention	-	-	-	✓
Compliance exports	-	-	-	✓

Summary

Audit logs record every OnceOnly decision
Events include checks, leases, policies, and errors
Logs enable investigation and compliance
Metrics show performance and usage
Retention tiers keep recent logs fast to query and older logs archived for compliance
Alerts detect suspicious activity

Use audit logs for:

Debugging — Why was an action blocked?
Compliance — What did agents do?
Security — Detect malicious activity
Performance — Find bottlenecks
Cost tracking — Monitor spending per agent

Next, explore Pricing Plans to understand what OnceOnly costs.