# Chat Completions

> Generate chat completions from a list of messages — the primary endpoint for text generation with all supported models.

- Canonical: https://tchavi.com/en/docs/chat-completions

---


<Endpoint method="POST" path="/v1/chat/completions" />

Generate a chat completion from a list of messages. This is the primary endpoint for text generation with all supported models.

<Callout type="note">
  The parameter tables below list the common fields shared across models. Each model may support
  additional model-specific parameters (e.g. vision input, tool calling, JSON mode). For the
  exhaustive list of parameters a given model accepts, open its details page at [Models](/en/models)
  and switch to the **API** tab — the parameter reference there is generated from the model's
  declared capabilities.
</Callout>

## Request body

| Parameter           | Type               | Required | Description                                                                                                                                                                                                                                                                              |
| ------------------- | ------------------ | -------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `model`             | string             | Yes      | Model ID (e.g. "gpt-4o-mini", "claude-sonnet-4-6")                                                                                                                                                                                                                                       |
| `messages`          | array              | Yes      | Array of message objects. Each has a `role` (`system`, `user`, or `assistant`) and `content` that's either a plain string or an array of content blocks (`text`, `image_url`, `document_url`). `system` sets the AI's behavior; `user` is your message; `assistant` is a prior AI reply. |
| `temperature`       | number             | No       | Controls randomness. 0 = deterministic/focused, 1 = balanced (default), 2 = highly creative/random.                                                                                                                                                                                      |
| `max_tokens`        | integer            | No       | Maximum tokens to generate                                                                                                                                                                                                                                                               |
| `stream`            | boolean            | No       | Stream response as SSE. Default: false                                                                                                                                                                                                                                                   |
| `top_p`             | number             | No       | Nucleus sampling parameter (0–1)                                                                                                                                                                                                                                                         |
| `stop`              | string \| string[] | No       | Up to 4 stop sequences. The model stops generating when it hits one.                                                                                                                                                                                                                     |
| `frequency_penalty` | number             | No       | -2.0 to 2.0. Positive values penalize repeated tokens. Default: 0                                                                                                                                                                                                                        |
| `presence_penalty`  | number             | No       | -2.0 to 2.0. Positive values push the model toward new topics. Default: 0                                                                                                                                                                                                                |
| `seed`              | integer            | No       | Reproducibility seed. Same seed + params returns similar output (best-effort).                                                                                                                                                                                                           |
| `response_format`   | object             | No       | `{ type: "json_object" }` or `{ type: "json_schema", json_schema: ... }` for structured output. Model support varies — see the model's API tab.                                                                                                                                          |
| `tools`             | array              | No       | Function definitions the model can call. Paired with `tool_choice`. Available on tool-capable models only.                                                                                                                                                                               |

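As a rough illustration of how the fields in the table fit together, the sketch below assembles a request body and checks the documented ranges before sending. The helper name `build_chat_request` is hypothetical and not part of any Tchavi SDK, the 0 to 2 bound on `temperature` is inferred from the description above, and the API may validate these fields differently.

```python
def build_chat_request(model, messages, **options):
    """Assemble a request body, checking the ranges listed in the table."""
    if not model or not messages:
        raise ValueError("model and messages are required")

    # Ranges copied from the parameter table; the checks are illustrative only.
    ranges = {
        "temperature": (0, 2),  # assumed bound, inferred from the description
        "top_p": (0, 1),
        "frequency_penalty": (-2.0, 2.0),
        "presence_penalty": (-2.0, 2.0),
    }
    for name, (low, high) in ranges.items():
        value = options.get(name)
        if value is not None and not low <= value <= high:
            raise ValueError(f"{name} must be between {low} and {high}")

    stop = options.get("stop")
    if isinstance(stop, list) and len(stop) > 4:
        raise ValueError("stop accepts at most 4 sequences")

    # Leave unset options out so the API falls back to its defaults.
    body = {"model": model, "messages": messages}
    body.update({k: v for k, v in options.items() if v is not None})
    return body


body = build_chat_request(
    "gpt-4o-mini",
    [{"role": "user", "content": "What is the capital of Benin?"}],
    temperature=0.2,
    stop=["\n\n"],
    seed=42,
    max_tokens=None,  # dropped from the body because it is unset
)
```
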
<Callout type="note">
**PDF input.** Attach a PDF to a `user` message by passing a content block of the form `{ type: "document_url", document_url: { url: ... } }`. A request can carry up to 5 PDFs. Only models with `supportsDocuments: true` accept this block; currently these are `claude-opus-4-7`, `claude-sonnet-4-6`, `claude-haiku-4-5-20251001`, `gpt-4.1`, and `gpt-4.1-mini`.
</Callout>
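
A minimal sketch of the PDF note above, building a `user` message that mixes a text prompt with `document_url` blocks. The helper `pdf_user_message` and the example URL are hypothetical, and the `{"type": "text", "text": ...}` shape for the text block is an assumption borrowed from the OpenAI-style convention, since only the `document_url` shape is spelled out here.

```python
MAX_PDFS_PER_REQUEST = 5  # limit stated in the PDF input note


def pdf_user_message(prompt, pdf_urls):
    """Build a `user` message that pairs a text prompt with PDF attachments."""
    if len(pdf_urls) > MAX_PDFS_PER_REQUEST:
        raise ValueError(f"at most {MAX_PDFS_PER_REQUEST} PDFs per request")

    # Assumed shape for the text block; the document_url shape is from the note.
    content = [{"type": "text", "text": prompt}]
    for url in pdf_urls:
        content.append({"type": "document_url", "document_url": {"url": url}})
    return {"role": "user", "content": content}


message = pdf_user_message(
    "Summarize the attached report.",
    ["https://example.com/report.pdf"],  # placeholder URL
)
```

The resulting `message` can be placed in the `messages` array of a request to one of the document-capable models listed in the note.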

## Example request

<CodeTabs>

```tchavi
import Tchavi from '@tchavi/sdk';

const client = new Tchavi({ apiKey: 'YOUR_API_KEY' });

const response = await client.chat.completions.create({
  model: 'gpt-4o-mini',
  messages: [
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'What is the capital of Benin?' },
  ],
  temperature: 0.7,
});

console.log(response.choices[0].message.content);
console.log('Credits used:', response.tchavi.credits_used);
```

```openai
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://tchavi.com/api/v1",
)

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of Benin?"},
    ],
    temperature=0.7,
)

print(response.choices[0].message.content)
```

```javascript
const response = await fetch('https://tchavi.com/api/v1/chat/completions', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    Authorization: 'Bearer YOUR_API_KEY',
  },
  body: JSON.stringify({
    model: 'gpt-4o-mini',
    messages: [
      { role: 'system', content: 'You are a helpful assistant.' },
      { role: 'user', content: 'What is the capital of Benin?' },
    ],
    temperature: 0.7,
  }),
});

const data = await response.json();
console.log(data.choices[0].message.content);

// Check remaining credits
console.log('Credits:', response.headers.get('X-Credits-Remaining'));
```

```python
import requests

response = requests.post(
    "https://tchavi.com/api/v1/chat/completions",
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer YOUR_API_KEY",
    },
    json={
        "model": "gpt-4o-mini",
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "What is the capital of Benin?"},
        ],
        "temperature": 0.7,
    },
)

data = response.json()
print(data["choices"][0]["message"]["content"])

# Check remaining credits
print("Credits:", response.headers.get("X-Credits-Remaining"))
```

```curl
curl -X POST https://tchavi.com/api/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [
      { "role": "system", "content": "You are a helpful assistant." },
      { "role": "user", "content": "What is the capital of Benin?" }
    ],
    "temperature": 0.7
  }'
```

</CodeTabs>

## Example response

```json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1711234567,
  "model": "gpt-4o-mini",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The capital of Benin is Porto-Novo."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 24,
    "completion_tokens": 12,
    "total_tokens": 36
  },
  "tchavi": {
    "credits_used": 2,
    "credits_remaining": 498,
    "model_tier": "budget"
  }
}
```
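
To show how the fields of this response might be read in practice, here is a small sketch that pulls out the reply text, token usage, and credit information. The helper name `summarize_completion` is hypothetical. It treats the `tchavi` block as optional, which is a defensive assumption rather than documented behavior.

```python
def summarize_completion(data):
    """Extract the commonly used fields from a chat completion response."""
    choice = data["choices"][0]
    extras = data.get("tchavi", {})  # assumed optional; read defensively
    return {
        "text": choice["message"]["content"],
        "finish_reason": choice["finish_reason"],
        "total_tokens": data["usage"]["total_tokens"],
        "credits_used": extras.get("credits_used"),
        "credits_remaining": extras.get("credits_remaining"),
    }


# The example response above, trimmed to the fields the helper reads.
sample = {
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "The capital of Benin is Porto-Novo.",
            },
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 24, "completion_tokens": 12, "total_tokens": 36},
    "tchavi": {"credits_used": 2, "credits_remaining": 498, "model_tier": "budget"},
}

summary = summarize_completion(sample)
print(summary["text"])
```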

## Streaming

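Set `stream: true` to receive the response token-by-token as Server-Sent Events (SSE), so you can display text as it arrives rather than waiting for the full response. Each event is a `data:` line carrying a JSON chunk whose `choices[0].delta.content` holds the next piece of text, and the stream ends with `data: [DONE]`.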

<CodeTabs>

```tchavi
import Tchavi from '@tchavi/sdk';

const client = new Tchavi({ apiKey: process.env.TCHAVI_API_KEY });

const stream = await client.chat.completions.create({
  model: 'gpt-4o-mini',
  messages: [{ role: 'user', content: 'Tell me a short story.' }],
  stream: true,
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? '');
}
```

```javascript
const response = await fetch('https://tchavi.com/api/v1/chat/completions', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    Authorization: 'Bearer YOUR_API_KEY',
  },
  body: JSON.stringify({
    model: 'gpt-4o-mini',
    messages: [{ role: 'user', content: 'Tell me a short story.' }],
    stream: true,
  }),
});

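const reader = response.body.getReader();
const decoder = new TextDecoder();
let buffer = '';

while (true) {
  const { done, value } = await reader.read();
  if (done) break;

  // A network chunk can end mid-line, so keep the trailing partial line for the next read.
  buffer += decoder.decode(value, { stream: true });
  const lines = buffer.split('\n');
  buffer = lines.pop();

  for (const line of lines) {
    if (!line.startsWith('data: ')) continue;
    const payload = line.slice(6).trim();
    if (payload === '[DONE]') continue;
    const delta = JSON.parse(payload).choices?.[0]?.delta?.content ?? '';
    process.stdout.write(delta);
  }
}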
```

```python
import json

import requests

response = requests.post(
    "https://tchavi.com/api/v1/chat/completions",
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer YOUR_API_KEY",
    },
    json={
        "model": "gpt-4o-mini",
        "messages": [{"role": "user", "content": "Tell me a short story."}],
        "stream": True,
    },
    stream=True,
)

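for line in response.iter_lines():
    # Skip blank separator lines and anything that is not an SSE data field.
    if not line.startswith(b"data: "):
        continue
    payload = line[len(b"data: "):]
    if payload == b"[DONE]":
        break
    data = json.loads(payload)
    # `content` can be missing or null on some chunks (e.g. the role-only first chunk).
    print(data["choices"][0]["delta"].get("content") or "", end="", flush=True)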
```

</CodeTabs>
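
The line handling in the examples above can also be factored into a small generator, which makes it easy to test without a network call. This is a sketch only: `iter_deltas` is a hypothetical helper, and the sample stream is made up to mirror the chunk shape the examples read (`choices[0].delta.content`, terminated by `data: [DONE]`).

```python
import json


def iter_deltas(lines):
    """Yield content fragments from an iterable of SSE lines (bytes or str)."""
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.strip()
        if not line.startswith("data:"):
            continue  # blank separators, comments, other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        choices = chunk.get("choices") or []
        if not choices:
            continue
        text = choices[0].get("delta", {}).get("content")
        if text:
            yield text


# Illustrative stream, not captured from the API.
sample_stream = [
    b'data: {"choices": [{"index": 0, "delta": {"role": "assistant"}}]}',
    b"",
    b'data: {"choices": [{"index": 0, "delta": {"content": "Once"}}]}',
    b'data: {"choices": [{"index": 0, "delta": {"content": " upon"}}]}',
    b"data: [DONE]",
]
story = "".join(iter_deltas(sample_stream))
```

With `requests`, the same generator could be fed directly from `response.iter_lines()` on a request made with `stream=True`.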

