Prompt caching amortizes the cost of a long system prompt by reusing the model's work on the shared prefix across requests: you pay the full input price once to populate the cache, then a discounted rate for each follow-up that reuses it. The prefix must match byte-for-byte, so whitespace, ordering, and dynamic variables all matter; anything that changes per request needs to live after the cached block. Providers typically keep caches warm for about five minutes, which suits bursty traffic well but means quiet workloads fall off the cache and recompute the prefix on every request.
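The byte-for-byte constraint can be sketched with a small request builder. The payload shape, field names, and `STATIC_SYSTEM_PROMPT` here are illustrative assumptions, not any specific provider's API; the point is only the ordering discipline: keep the static prefix literally identical across calls and append per-request data after it.

```python
import json

# Assumption: a long, fully static system prompt. Interpolating any
# per-request value into this string would break the prefix match.
STATIC_SYSTEM_PROMPT = (
    "You are a support assistant for Acme Corp.\n"
    "Follow the policy manual below when answering.\n"
    # ...imagine several thousand tokens of policy text here...
)

def build_messages(user_name: str, question: str) -> list[dict]:
    """Static prefix first; everything dynamic strictly after it."""
    return [
        # Cacheable block: byte-identical on every request.
        {"role": "system", "content": STATIC_SYSTEM_PROMPT},
        # Dynamic block: per-user context lives after the prefix,
        # so it never invalidates the cached portion.
        {"role": "user", "content": f"User: {user_name}\n{question}"},
    ]

a = build_messages("alice", "How do I reset my password?")
b = build_messages("bob", "What is the refund window?")
# The serialized prefix is identical across requests, so a provider
# can reuse its cached work on that span.
assert json.dumps(a[0]) == json.dumps(b[0])
assert a[1] != b[1]
```

A common mistake this guards against is putting a timestamp, request ID, or user name at the top of the system prompt: one changed byte at the front invalidates the entire cached prefix on every call.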