# Streaming

> Receive responses token-by-token as Server-Sent Events by setting `stream: true`.

- Canonical: https://tchavi.com/en/docs/streaming

---


Set `stream: true` to receive the response token-by-token as Server-Sent Events (SSE). This lets you display text as it arrives rather than waiting for the full response.

## How it works

When streaming is enabled, the API returns the response as a series of SSE lines. Each data line begins with `data: ` followed by a JSON chunk, and the text fragment for that chunk lives in `choices[0].delta.content`. The stream ends with a final `data: [DONE]` sentinel line. The sentinel is not JSON, so skip it rather than parse it.
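To make the line format concrete, here is a minimal parsing sketch that needs no network access. The `iter_deltas` helper and the sample lines are hypothetical and written by hand for illustration; they are not captured API output. The sketch assumes each chunk carries only the fields described above, and it stops reading at the sentinel because that line marks the end of the stream.

```python
import json
from typing import Iterable, Iterator

DONE_SENTINEL = "[DONE]"


def iter_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from SSE lines (hypothetical helper, not part of any SDK)."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # blank separator lines and any non-data lines
        payload = line[len("data:"):].strip()
        if payload == DONE_SENTINEL:
            break  # the sentinel is not JSON, so never pass it to json.loads
        chunk = json.loads(payload)
        choices = chunk.get("choices") or []
        if not choices:
            continue
        content = choices[0].get("delta", {}).get("content")
        if content:  # some chunks may carry no text at all
            yield content


# Hand-written sample lines, for illustration only.
sample = [
    'data: {"choices":[{"delta":{"role":"assistant"}}]}',
    "",
    'data: {"choices":[{"delta":{"content":"Hel"}}]}',
    "",
    'data: {"choices":[{"delta":{"content":"lo"}}]}',
    "",
    "data: [DONE]",
]

text = "".join(iter_deltas(sample))
print(text)  # prints "Hello"
```

Keeping the parser separate from the HTTP call makes it easy to test against recorded lines before wiring it to a live response.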

## Examples

<CodeTabs>

```tchavi
import Tchavi from '@tchavi/sdk';

const client = new Tchavi({ apiKey: process.env.TCHAVI_API_KEY });

const stream = await client.chat.completions.create({
  model: 'gpt-4o-mini',
  messages: [{ role: 'user', content: 'Tell me a short story.' }],
  stream: true,
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? '');
}
```

```javascript
const response = await fetch('https://tchavi.com/api/v1/chat/completions', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    Authorization: 'Bearer YOUR_API_KEY',
  },
  body: JSON.stringify({
    model: 'gpt-4o-mini',
    messages: [{ role: 'user', content: 'Tell me a short story.' }],
    stream: true,
  }),
});

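const reader = response.body.getReader();
const decoder = new TextDecoder();
let buffer = '';

while (true) {
  const { done, value } = await reader.read();
  if (done) break;

  // A network chunk can end mid-line (or mid-character), so decode in
  // streaming mode and keep the trailing partial line for the next read.
  buffer += decoder.decode(value, { stream: true });
  const lines = buffer.split('\n');
  buffer = lines.pop();

  for (const line of lines) {
    const trimmed = line.trim();
    // Skip blank/non-data lines and the exact [DONE] sentinel (not JSON).
    if (!trimmed.startsWith('data: ') || trimmed === 'data: [DONE]') continue;
    const delta = JSON.parse(trimmed.slice(6)).choices[0]?.delta?.content ?? '';
    process.stdout.write(delta);
  }
}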
```

```python
import json

import requests

response = requests.post(
    "https://tchavi.com/api/v1/chat/completions",
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer YOUR_API_KEY",
    },
    json={
        "model": "gpt-4o-mini",
        "messages": [{"role": "user", "content": "Tell me a short story."}],
        "stream": True,
    },
    stream=True,
)

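for line in response.iter_lines():
    # Skip blank separators, non-data lines, and the [DONE] sentinel (not JSON).
    if not line.startswith(b"data: ") or line == b"data: [DONE]":
        continue
    data = json.loads(line.decode("utf-8").removeprefix("data: "))
    # Fall back to "" when a chunk has no content or an explicit null.
    print(data["choices"][0]["delta"].get("content") or "", end="", flush=True)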
```

</CodeTabs>

<Callout type="tip">
  Streaming uses the same OpenAI-compatible request format as a normal call; you only add
  `stream: true`.
</Callout>

See [Chat Completions](/en/docs/chat-completions) for the full request and response reference.

