Chat Completions
Generate chat completions from a list of messages — the primary endpoint for text generation with all supported models.
/v1/chat/completions
The parameter tables below list the common fields shared across models. Each model may support additional model-specific parameters (e.g. vision input, tool calling, JSON mode). For the exhaustive list of parameters a given model accepts, open its details page at Models and switch to the API tab — the parameter reference there is generated from the model's declared capabilities.
Request body
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Model ID (e.g. "gpt-4o-mini", "claude-sonnet-4-6") |
| messages | array | Yes | Array of message objects. Each has a role (system, user, or assistant) and content that's either a plain string or an array of content blocks (text, image_url, document_url). system sets the AI's behavior; user is your message; assistant is a prior AI reply. |
| temperature | number | No | Controls randomness. 0 = deterministic/focused, 1 = balanced (default), 2 = highly creative/random. |
| max_tokens | integer | No | Maximum number of tokens to generate. |
| stream | boolean | No | Stream the response as SSE. Default: false |
| top_p | number | No | Nucleus sampling parameter (0–1) |
| stop | string \| string[] | No | Up to 4 stop sequences. The model stops generating when it hits one. |
| frequency_penalty | number | No | -2.0 to 2.0. Positive values penalize repeated tokens. Default: 0 |
| presence_penalty | number | No | -2.0 to 2.0. Positive values push the model toward new topics. Default: 0 |
| seed | integer | No | Reproducibility seed. Same seed + params returns similar output (best-effort). |
| response_format | object | No | { type: "json_object" } or { type: "json_schema", json_schema: ... } for structured output. Model support varies; see the model's API tab. |
| tools | array | No | Function definitions the model can call. Paired with tool_choice. Available on tool-capable models only. |
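To make the ranges in the table concrete, here is a minimal sketch of a client-side check that could run before a request is sent. `validateChatRequest` is a hypothetical helper, not part of the SDK, and it assumes the 0 to 2 scale described for temperature is a hard bound; the API itself may validate differently.

```javascript
// Hypothetical helper (not part of the SDK): checks a request body against
// the ranges listed in the parameter table and returns a list of problems.
function validateChatRequest(body) {
  const errors = [];

  if (typeof body.model !== 'string' || body.model.length === 0) {
    errors.push('model is required and must be a string');
  }
  if (!Array.isArray(body.messages)) {
    errors.push('messages is required and must be an array');
  }

  // Optional numeric parameters: skip when unset, otherwise check the range.
  const inRange = (name, min, max) => {
    const value = body[name];
    if (value === undefined) return;
    if (!Number.isFinite(value) || value < min || value > max) {
      errors.push(`${name} must be a number between ${min} and ${max}`);
    }
  };
  inRange('temperature', 0, 2); // assumed bound, taken from the 0/1/2 scale
  inRange('top_p', 0, 1);
  inRange('frequency_penalty', -2, 2);
  inRange('presence_penalty', -2, 2);

  // stop accepts a single string or an array of up to 4 strings.
  if (body.stop !== undefined) {
    const stops = Array.isArray(body.stop) ? body.stop : [body.stop];
    if (stops.length > 4 || stops.some((s) => typeof s !== 'string')) {
      errors.push('stop must be a string or an array of up to 4 strings');
    }
  }
  return errors;
}

// A request body that combines several optional parameters.
const body = {
  model: 'gpt-4o-mini',
  messages: [{ role: 'user', content: 'List three West African capitals as JSON.' }],
  temperature: 0.2,
  max_tokens: 200,
  stop: ['\n\n'],
  seed: 42,
  response_format: { type: 'json_object' },
};

console.log(validateChatRequest(body)); // prints []
console.log(validateChatRequest({ ...body, temperature: 3 }));
```

A body that passes this check can be handed to `client.chat.completions.create` unchanged. Whether a given model honors `response_format` or `seed` still depends on the model's API tab.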
PDF input. Attach a PDF on a user message by passing a content block with { type: "document_url", document_url: { url: ... } }. Up to 5 PDFs per request. Only models with supportsDocuments: true accept this block — currently claude-opus-4-7, claude-sonnet-4-6, claude-haiku-4-5-20251001, gpt-4.1, and gpt-4.1-mini.
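As a rough illustration of the block shape described above, the sketch below builds a user message that pairs a question with one or more PDFs. `buildPdfMessage` is a hypothetical helper, the URL is a placeholder, and the `{ type: 'text', text }` shape for the text block is an assumption, since only the `document_url` shape is spelled out here.

```javascript
// Hypothetical helper (not part of the SDK): builds a user message whose
// content is an array of blocks, one text block followed by PDF blocks.
const MAX_PDFS_PER_REQUEST = 5; // limit stated in the PDF input note

function buildPdfMessage(question, pdfUrls) {
  // The limit is per request; this check only covers a single message.
  if (pdfUrls.length > MAX_PDFS_PER_REQUEST) {
    throw new Error(
      `At most ${MAX_PDFS_PER_REQUEST} PDFs per request, got ${pdfUrls.length}`
    );
  }
  return {
    role: 'user',
    content: [
      // Assumed shape for text blocks.
      { type: 'text', text: question },
      // Shape taken from the PDF input note.
      ...pdfUrls.map((url) => ({ type: 'document_url', document_url: { url } })),
    ],
  };
}

const message = buildPdfMessage('Summarize the attached report.', [
  'https://example.com/report.pdf', // placeholder URL
]);
console.log(JSON.stringify(message, null, 2));
```

The resulting object goes into the `messages` array of a request whose `model` is one of the document-capable models listed above.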
Example request
```javascript
import Tchavi from '@tchavi/sdk';

const client = new Tchavi({ apiKey: 'YOUR_API_KEY' });

const response = await client.chat.completions.create({
  model: 'gpt-4o-mini',
  messages: [
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'What is the capital of Benin?' },
  ],
  temperature: 0.7,
});

console.log(response.choices[0].message.content);
console.log('Credits used:', response.tchavi.credits_used);
```

Example response
```json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1711234567,
  "model": "gpt-4o-mini",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The capital of Benin is Porto-Novo."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 24,
    "completion_tokens": 12,
    "total_tokens": 36
  },
  "tchavi": {
    "credits_used": 2,
    "credits_remaining": 498,
    "model_tier": "budget"
  }
}
```

Streaming
Set stream: true to receive the response token-by-token as Server-Sent Events (SSE). This lets you display text as it arrives rather than waiting for the full response.
```javascript
import Tchavi from '@tchavi/sdk';

const client = new Tchavi({ apiKey: process.env.TCHAVI_API_KEY });

const stream = await client.chat.completions.create({
  model: 'gpt-4o-mini',
  messages: [{ role: 'user', content: 'Tell me a short story.' }],
  stream: true,
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? '');
}
```
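When the full text is also needed after streaming finishes (for logging or storage, say), the deltas can be accumulated while they are displayed. The sketch below is one way to do that. `collectStream` and `fakeStream` are hypothetical names, and the mock chunks only assume the `choices[0].delta.content` shape used in the loop above; real chunks may carry more fields.

```javascript
// Collects streamed deltas into the full reply text, optionally passing
// each non-empty token to a callback as it arrives.
async function collectStream(stream, onToken = () => {}) {
  let text = '';
  for await (const chunk of stream) {
    // Some chunks may carry no content (e.g. a role-only or final chunk).
    const token = chunk.choices?.[0]?.delta?.content ?? '';
    if (token) onToken(token);
    text += token;
  }
  return text;
}

// Mock stream standing in for the value returned by
// client.chat.completions.create({ ..., stream: true }).
async function* fakeStream() {
  yield { choices: [{ delta: { role: 'assistant' } }] };
  yield { choices: [{ delta: { content: 'Once ' } }] };
  yield { choices: [{ delta: { content: 'upon a time.' } }] };
  yield { choices: [{ delta: {}, finish_reason: 'stop' }] };
}

collectStream(fakeStream(), (token) => process.stdout.write(token)).then((full) => {
  console.log(`\n${full.length} characters received`);
});
```

In real use, the stream returned by the SDK would be passed in place of `fakeStream()`.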