Architecture Performance
Prompt caching
API feature that caches the static prefix of a prompt across calls, reducing latency and cost for repeated prefixes. Marked with `cache_control: {"type": "ephemeral"}`. Cache TTL: 5 minutes. Significant savings on long system prompts.