Skip to main content
Code Guide
Architecture Performance

Prompt caching

API feature that caches the static prefix of a prompt across calls, reducing latency and cost for repeated prefixes. Marked with `cache_control: {"type": "ephemeral"}`. Cache TTL: 5 minutes. Significant savings on long system prompts.