Skip to main content

The contract

Send Idempotency-Key: <opaque-string> on a mutating request. Within 5 minutes of the first response, any retry with the same key returns the same status code and body, byte-for-byte, without re-executing the mutation. The cache is scoped per (API key, Idempotency-Key), so different accounts sending the same key never collide.

Where it applies

EndpointIdempotent
POST /v1/workflows
POST /v1/workflows/{id}/clips/{clip_id}/tracks
POST /v1/workflows/{id}/renders
POST /v1/ledger/reserve❌ (each call is a distinct hold)
POST /v1/ledger/commit✅ (committing an unknown token is a no-op)
POST /v1/ledger/refund
Use it whenever the same logical action might fire twice, network retries, browser double-clicks, agent tool-call timeouts.

Pattern: generate the key once

// Bad: a new key every render attempt = duplicate spend on retry.
const key = `render-${Date.now()}`;

// Good: deterministic per logical action.
const key = `render-${workflow_id}-${dateBucket}-${attempt_number}`;
For agents, a stable pattern is ${tool_name}-${workflow_id}-${YYYY-MM-DD} so the same prompt on the same day collapses to one underlying mutation.

What gets cached

The cache stores { status, body, ts }. After 5 min the entry is evicted; the cache is capped at 500 entries (LRU). A retry after the TTL window does re-execute, design for that.

What doesn’t get cached

Failures with code: 'unauthenticated' or other 4xx envelopes still bypass, we don’t want a retry blocked by a transient auth issue.