OpenAI compatibility
Quant AI exposes an OpenAI-compatible API at /oai/v1. Point any OpenAI SDK or tool (the official Python/Node SDKs, LangChain, LiteLLM, Cline, OpenRouter-style proxies, Drupal AI, etc.) at this base URL and use your Quant API token as the API key — no adapter required.
Under the hood, Quant runs your requests on AWS Bedrock models (Anthropic Claude, Amazon Nova, and others) but returns them in the exact shapes the OpenAI SDKs expect.
Base URL
    https://dashboard.quantcdn.io/oai/v1

Set your client’s base_url (or “API base”) to this value. The SDK appends /chat/completions, /embeddings, and /models automatically.
Authentication
Use a Quant API token (the qc_... value) as the OpenAI api_key. It is sent as a standard Authorization: Bearer <token> header.
The organisation is resolved from the token automatically. If your token is scoped to more than one organisation, select one with the OpenAI-Organization header (mapped to your Quant organisation machine name).
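As a minimal sketch of the header rules in this section, the helper below builds the request headers by hand with no SDK involved. The function name build_headers and the organisation name my-org are hypothetical, chosen only for illustration.

```python
from typing import Optional


def build_headers(token: str, organisation: Optional[str] = None) -> dict:
    """Return the HTTP headers for a request to the /oai/v1 endpoints."""
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    # Only needed when the token is scoped to more than one organisation.
    if organisation:
        headers["OpenAI-Organization"] = organisation
    return headers


# "my-org" is a placeholder organisation machine name.
print(build_headers("qc_your_quant_token", organisation="my-org"))
```

With the official Python SDK, passing organization="my-org" to the OpenAI(...) constructor sets the same OpenAI-Organization header, so a hand-built header is only needed for raw HTTP clients.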
Supported endpoints
| Endpoint | OpenAI method | Notes |
|---|---|---|
| POST /oai/v1/chat/completions | Chat completions | Buffered and streaming (stream: true); tool/function calling |
| POST /oai/v1/embeddings | Embeddings | Single string or array of strings |
| GET /oai/v1/models | List models | Returns the model ids available to your organisation |
| GET /oai/v1/models/{model} | Retrieve model | |
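The embeddings endpoint has no example elsewhere on this page, so here is a hedged sketch using only the standard library. It assumes the response follows the OpenAI embeddings shape (a data list whose items carry index and embedding), which is what the compatibility claim implies. The helper names are hypothetical, and the model id is the embedding model mentioned in the Models section.

```python
import json
import urllib.request

BASE_URL = "https://dashboard.quantcdn.io/oai/v1"


def build_embeddings_request(texts, model, token):
    """Build (but do not send) a POST request for /embeddings.

    `texts` may be a single string or a list of strings.
    """
    body = json.dumps({"model": model, "input": texts}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/embeddings",
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def extract_vectors(response_json):
    """Return the embedding vectors ordered by their index field."""
    items = sorted(response_json["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


def fetch_embeddings(texts, model, token):
    """Send the request; needs network access and a valid token."""
    req = build_embeddings_request(texts, model, token)
    with urllib.request.urlopen(req) as resp:
        return extract_vectors(json.load(resp))
```

Calling fetch_embeddings(["first passage", "second passage"], "amazon.titan-embed-text-v2:0", "qc_your_quant_token") would return one vector per input string, provided that model is available to your organisation.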
Quick start
Python

    from openai import OpenAI

    client = OpenAI(
        api_key="qc_your_quant_token",
        base_url="https://dashboard.quantcdn.io/oai/v1",
    )

    resp = client.chat.completions.create(
        model="anthropic.claude-sonnet-4-6",
        messages=[{"role": "user", "content": "Write a haiku about CDNs."}],
    )
    print(resp.choices[0].message.content)

Node.js

    import OpenAI from "openai";

    const client = new OpenAI({
      apiKey: "qc_your_quant_token",
      baseURL: "https://dashboard.quantcdn.io/oai/v1",
    });

    const resp = await client.chat.completions.create({
      model: "anthropic.claude-sonnet-4-6",
      messages: [{ role: "user", content: "Write a haiku about CDNs." }],
    });
    console.log(resp.choices[0].message.content);

curl

    curl https://dashboard.quantcdn.io/oai/v1/chat/completions \
      -H "Authorization: Bearer qc_your_quant_token" \
      -H "Content-Type: application/json" \
      -d '{
        "model": "anthropic.claude-sonnet-4-6",
        "messages": [{"role": "user", "content": "Hello!"}]
      }'

Streaming
Set stream: true to receive Server-Sent Events. Each event is a chat.completion.chunk object, and the stream terminates with data: [DONE], exactly as OpenAI’s SDKs expect.

    stream = client.chat.completions.create(
        model="anthropic.claude-sonnet-4-6",
        messages=[{"role": "user", "content": "Count to five."}],
        stream=True,
    )
    for chunk in stream:
        print(chunk.choices[0].delta.content or "", end="")

Pass stream_options={"include_usage": True} to receive a final usage chunk before [DONE]. In OpenAI’s format that usage chunk has an empty choices list, so check chunk.choices before indexing it when this option is on.
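For clients that read the event stream without an SDK, the following sketch shows one way to consume it. It assumes the stream follows OpenAI's wire format (lines prefixed with data:, a usage chunk whose choices list is empty). The function name collect_stream is hypothetical, and the sample events are hand-written rather than captured from a real response.

```python
import json


def collect_stream(lines):
    """Accumulate text and usage from an iterable of SSE lines."""
    text_parts = []
    usage = None
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and SSE comments
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        # Iterate rather than index: the usage chunk may carry no choices.
        for choice in chunk.get("choices", []):
            piece = choice.get("delta", {}).get("content")
            if piece:
                text_parts.append(piece)
        if chunk.get("usage"):
            usage = chunk["usage"]
    return "".join(text_parts), usage


# Hand-written sample events for illustration only.
sample = [
    'data: {"object": "chat.completion.chunk", "choices": [{"index": 0, "delta": {"content": "One, "}}]}',
    "",
    'data: {"object": "chat.completion.chunk", "choices": [{"index": 0, "delta": {"content": "two."}}]}',
    "",
    'data: {"object": "chat.completion.chunk", "choices": [], "usage": {"prompt_tokens": 9, "completion_tokens": 4, "total_tokens": 13}}',
    "",
    "data: [DONE]",
]
text, usage = collect_stream(sample)
print(text)
print(usage["total_tokens"])
```

Iterating over choices instead of indexing choices[0] keeps the loop safe whether or not a usage chunk is requested.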
Tool (function) calling
Standard OpenAI tools are supported, including tool_choice:

    resp = client.chat.completions.create(
        model="anthropic.claude-sonnet-4-6",
        messages=[{"role": "user", "content": "What's the weather in Melbourne?"}],
        tools=[{
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Get the current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }],
        tool_choice="auto",
    )
    print(resp.choices[0].message.tool_calls)

tool_choice accepts "auto", "none", "required", or {"type": "function", "function": {"name": "..."}}. When you force a tool (required or a specific function), the tool call is returned to your client to execute, which is the OpenAI-standard behaviour.
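Because tool calls come back to the client for execution, the client needs a step that runs them and packages the results. The sketch below illustrates that step under the assumption that tool calls arrive in the OpenAI shape (an id plus a function object whose arguments field is a JSON string). The get_weather body, the TOOLS registry, and run_tool_calls are all hypothetical stand-ins.

```python
import json


def get_weather(city: str) -> str:
    """Stand-in tool: a real implementation would call a weather service."""
    return f"Sunny in {city}"


# Maps tool names declared in the request to local callables.
TOOLS = {"get_weather": get_weather}


def run_tool_calls(tool_calls):
    """Execute each tool call and return the matching role="tool" messages."""
    messages = []
    for call in tool_calls:
        fn = TOOLS[call["function"]["name"]]
        # Arguments arrive as a JSON-encoded string, not a dict.
        args = json.loads(call["function"]["arguments"])
        messages.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": fn(**args),
        })
    return messages


# Hand-written sample in the OpenAI tool_calls shape.
sample_calls = [{
    "id": "call_1",
    "type": "function",
    "function": {"name": "get_weather", "arguments": '{"city": "Melbourne"}'},
}]
print(run_tool_calls(sample_calls))
```

To finish the exchange, append the assistant message that contained the tool calls, then these tool messages, to the conversation and call chat.completions.create again. The SDK returns tool calls as objects rather than dicts, so convert them first (for example with model_dump()) or switch the helper to attribute access.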
Models
List the models available to your organisation and pass an id as the model field:

    curl https://dashboard.quantcdn.io/oai/v1/models \
      -H "Authorization: Bearer qc_your_quant_token"

Model ids are Quant/Bedrock ids such as anthropic.claude-sonnet-4-6, amazon.nova-micro-v1:0, or amazon.titan-embed-text-v2:0.
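A small sketch of picking ids out of the list response, assuming it follows the OpenAI list shape (an object with a data array of model entries). The helper model_ids is hypothetical, and the sample payload is hand-written from the ids mentioned in this section. A real response may contain other models and extra fields.

```python
def model_ids(list_response, contains=""):
    """Return model ids from a list-models payload, optionally filtered."""
    return [
        model["id"]
        for model in list_response.get("data", [])
        if contains in model["id"]
    ]


# Hand-written sample; not a captured response.
sample = {
    "object": "list",
    "data": [
        {"id": "anthropic.claude-sonnet-4-6", "object": "model"},
        {"id": "amazon.nova-micro-v1:0", "object": "model"},
        {"id": "amazon.titan-embed-text-v2:0", "object": "model"},
    ],
}
print(model_ids(sample, contains="embed"))
```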
Compatibility notes
The /oai/v1 surface targets the stateless drop-in tier of the OpenAI API. The following are intentional differences:
- Single choice: responses always contain one choice; n > 1 is not supported.
- Ignored parameters: frequency_penalty, presence_penalty, seed, logprobs, and logit_bias are accepted but not applied.
- finish_reason: streaming responses report stop rather than length on max-token truncation (buffered responses report length correctly).
- Not included: image generation, audio, moderations, the Assistants/Responses APIs, and batches. Use the native Quant AI API for agents, sessions, vector database, workflows, and other stateful features.
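Since ignored parameters are accepted silently, a client-side check can surface them before a request goes out. The sketch below encodes only the differences listed in this section. The function name check_request and the warning wording are hypothetical.

```python
# Parameters this section lists as accepted but not applied.
IGNORED_PARAMS = {
    "frequency_penalty",
    "presence_penalty",
    "seed",
    "logprobs",
    "logit_bias",
}


def check_request(params):
    """Return human-readable warnings for a chat completion request dict."""
    warnings = []
    if params.get("n", 1) > 1:
        warnings.append("n > 1 is not supported; only one choice is returned")
    # Sorted so the warnings come out in a stable order.
    for name in sorted(IGNORED_PARAMS.intersection(params)):
        warnings.append(f"{name} is accepted but not applied")
    return warnings


request = {
    "model": "anthropic.claude-sonnet-4-6",
    "messages": [{"role": "user", "content": "Hello!"}],
    "n": 2,
    "seed": 42,
}
for warning in check_request(request):
    print(warning)
```

A check like this is most useful when migrating an existing OpenAI integration, where parameters such as seed may be set far from the call site.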
For everything beyond the drop-in tier — agents, orchestrations, skills, vector search, governance — see the AI API reference.
