OpenAI compatibility

Quant AI exposes an OpenAI-compatible API at /oai/v1. Point any OpenAI SDK or tool (the official Python/Node SDKs, LangChain, LiteLLM, Cline, OpenRouter-style proxies, Drupal AI, etc.) at this base URL and use your Quant API token as the API key — no adapter required.

Under the hood, Quant runs your requests on AWS Bedrock models (Anthropic Claude, Amazon Nova, and others) and returns the responses in the exact shapes the OpenAI SDKs expect.

Base URL

https://dashboard.quantcdn.io/oai/v1

Configure your client’s base_url (or “API base”) to this value. The SDK appends /chat/completions, /embeddings, and /models automatically.

Authentication

Use a Quant API token (the qc_... value) as the OpenAI api_key. It is sent as a standard Authorization: Bearer <token> header.

The organisation is resolved from the token automatically. If your token is scoped to more than one organisation, select one with the OpenAI-Organization header (mapped to your Quant organisation machine name).
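To make the header layout concrete, here is a minimal sketch that builds the headers by hand. The helper name `build_headers` and the organisation name `my-org` are hypothetical; the OpenAI SDKs set the same headers for you when you pass `api_key` (and, in the Python SDK, `organization`).

```python
from typing import Optional


def build_headers(token: str, organisation: Optional[str] = None) -> dict:
    """Return the HTTP headers for a request to /oai/v1 (illustrative helper)."""
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    # Only needed when the token is scoped to more than one organisation.
    if organisation is not None:
        headers["OpenAI-Organization"] = organisation
    return headers


print(build_headers("qc_your_quant_token", organisation="my-org"))
```

Leaving `organisation` unset matches the default case, where the organisation is resolved from the token.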

Supported endpoints

| Endpoint | OpenAI method | Notes |
| --- | --- | --- |
| `POST /oai/v1/chat/completions` | Chat completions | Buffered and streaming (`stream: true`); tool/function calling |
| `POST /oai/v1/embeddings` | Embeddings | Single string or array of strings |
| `GET /oai/v1/models` | List models | Returns the model ids available to your organisation |
| `GET /oai/v1/models/{model}` | Retrieve model | |
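For the embeddings endpoint, a request body and a reader for the returned vectors might look like the sketch below. It assumes the response follows the standard OpenAI embeddings shape (a `data` list whose items carry `index` and `embedding`). The helper names `embedding_request` and `extract_vectors` are hypothetical, and the sample response is hand-written for illustration.

```python
import json


def embedding_request(model: str, texts) -> str:
    """Build the JSON body for POST /oai/v1/embeddings.

    `texts` may be a single string or a list of strings.
    """
    return json.dumps({"model": model, "input": texts})


def extract_vectors(response: dict) -> list:
    """Return embeddings ordered by their `index` field (assumed OpenAI shape)."""
    items = sorted(response["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


body = embedding_request("amazon.titan-embed-text-v2:0", ["first text", "second text"])

# Hand-written sample response, not real API output.
sample = {
    "object": "list",
    "data": [
        {"object": "embedding", "index": 1, "embedding": [0.3, 0.4]},
        {"object": "embedding", "index": 0, "embedding": [0.1, 0.2]},
    ],
}
print(extract_vectors(sample))
```

With the official SDK the equivalent call is `client.embeddings.create(model=..., input=...)`.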

Quick start

Python

from openai import OpenAI

client = OpenAI(
    api_key="qc_your_quant_token",
    base_url="https://dashboard.quantcdn.io/oai/v1",
)

resp = client.chat.completions.create(
    model="anthropic.claude-sonnet-4-6",
    messages=[{"role": "user", "content": "Write a haiku about CDNs."}],
)
print(resp.choices[0].message.content)

Node

import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "qc_your_quant_token",
  baseURL: "https://dashboard.quantcdn.io/oai/v1",
});

const resp = await client.chat.completions.create({
  model: "anthropic.claude-sonnet-4-6",
  messages: [{ role: "user", content: "Write a haiku about CDNs." }],
});
console.log(resp.choices[0].message.content);

curl

curl https://dashboard.quantcdn.io/oai/v1/chat/completions \
  -H "Authorization: Bearer qc_your_quant_token" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic.claude-sonnet-4-6",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

Streaming

Set stream: true to receive Server-Sent Events. Each event is a chat.completion.chunk object, and the stream terminates with data: [DONE] — exactly as OpenAI’s SDKs expect.

stream = client.chat.completions.create(
    model="anthropic.claude-sonnet-4-6",
    messages=[{"role": "user", "content": "Count to five."}],
    stream=True,
)
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="")

Pass stream_options={"include_usage": True} (true in raw JSON) to receive a final usage chunk before [DONE].
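If you consume the stream without an SDK, the event format described above can be handled roughly as follows. This is a sketch: it assumes the usage chunk mirrors OpenAI's (an empty `choices` list plus a `usage` object). The function name `collect_stream` is hypothetical, and the sample events are hand-written, not captured API output.

```python
import json


def collect_stream(lines):
    """Assemble text and usage from raw SSE lines of a streamed chat completion."""
    text_parts = []
    usage = None
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and SSE comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        # The usage chunk is assumed to carry an empty `choices` list.
        for choice in chunk.get("choices", []):
            text_parts.append(choice.get("delta", {}).get("content") or "")
        if chunk.get("usage"):
            usage = chunk["usage"]
    return "".join(text_parts), usage


# Hand-written sample events, not real API output.
sample = [
    'data: {"object": "chat.completion.chunk", "choices": [{"index": 0, "delta": {"content": "One, "}}]}',
    "",
    'data: {"object": "chat.completion.chunk", "choices": [{"index": 0, "delta": {"content": "two."}}]}',
    "",
    'data: {"object": "chat.completion.chunk", "choices": [], "usage": {"total_tokens": 12}}',
    "",
    "data: [DONE]",
]
text, usage = collect_stream(sample)
print(text, usage)
```

The same caveat applies to the SDK loop: if the usage chunk arrives with no choices, guard the print with `if chunk.choices:` before indexing `chunk.choices[0]`.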

Tool (function) calling

Standard OpenAI tools are supported, including tool_choice:

resp = client.chat.completions.create(
    model="anthropic.claude-sonnet-4-6",
    messages=[{"role": "user", "content": "What's the weather in Melbourne?"}],
    tools=[{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }],
    tool_choice="auto",
)
print(resp.choices[0].message.tool_calls)

tool_choice accepts "auto", "none", "required", or {"type": "function", "function": {"name": "..."}}. When you force a tool (required or a specific function), the tool call is returned to your client to execute — the OpenAI-standard behaviour.
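Because tool calls come back to your client for execution, you need a small dispatch step before the next request. The sketch below shows one way to do it, assuming the OpenAI-standard message shape in which `arguments` is a JSON-encoded string. The `TOOLS` registry, the `run_tool_calls` helper, and the stand-in `get_weather` implementation are hypothetical, and the sample assistant message is hand-written.

```python
import json


def get_weather(city: str) -> str:
    """Stand-in implementation; a real tool would call a weather service."""
    return f"Sunny in {city}"


# Map tool names (as declared in `tools`) to local callables.
TOOLS = {"get_weather": get_weather}


def run_tool_calls(assistant_message: dict) -> list:
    """Execute each requested tool and return the `tool` messages to send back."""
    results = []
    for call in assistant_message.get("tool_calls") or []:
        func = TOOLS[call["function"]["name"]]
        # Arguments arrive as a JSON-encoded string, not a dict.
        args = json.loads(call["function"]["arguments"])
        results.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": func(**args),
        })
    return results


# Hand-written sample assistant message, not real API output.
assistant_message = {
    "role": "assistant",
    "content": None,
    "tool_calls": [{
        "id": "call_1",
        "type": "function",
        "function": {"name": "get_weather", "arguments": '{"city": "Melbourne"}'},
    }],
}
print(run_tool_calls(assistant_message))
```

To continue the conversation, append the assistant message and the returned `tool` messages to `messages` and call `client.chat.completions.create` again.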

Models

List the models available to your organisation and pass an id as the model field:

curl https://dashboard.quantcdn.io/oai/v1/models \
  -H "Authorization: Bearer qc_your_quant_token"

Model ids are Quant/Bedrock ids such as anthropic.claude-sonnet-4-6, amazon.nova-micro-v1:0, or amazon.titan-embed-text-v2:0.
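A small check that a model id is available before you use it could look like this. It is a sketch that assumes the list response has the standard OpenAI shape (`{"object": "list", "data": [{"id": ...}]}`). The helper names `model_ids` and `require_model` are hypothetical, and the sample response is hand-written.

```python
def model_ids(response: dict) -> list:
    """Return the sorted model ids from a list-models response."""
    return sorted(model["id"] for model in response.get("data", []))


def require_model(response: dict, wanted: str) -> str:
    """Return `wanted` if it is listed, otherwise raise with the available ids."""
    ids = model_ids(response)
    if wanted not in ids:
        raise ValueError(f"{wanted!r} is not available; choose one of {ids}")
    return wanted


# Hand-written sample response, not real API output.
sample = {
    "object": "list",
    "data": [
        {"id": "anthropic.claude-sonnet-4-6", "object": "model"},
        {"id": "amazon.nova-micro-v1:0", "object": "model"},
    ],
}
print(require_model(sample, "amazon.nova-micro-v1:0"))
```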

Compatibility notes

The /oai/v1 surface targets the stateless drop-in tier of the OpenAI API. The following are intentional differences:

Single choice — responses always contain one choice; n > 1 is not supported.
Ignored parameters — frequency_penalty, presence_penalty, seed, logprobs, and logit_bias are accepted but not applied.
finish_reason — streaming responses report stop rather than length on max-token truncation (buffered responses report length correctly).
Not included — image generation, audio, moderations, the Assistants/Responses APIs, and batches. Use the native Quant AI API for agents, sessions, vector database, workflows, and other stateful features.
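Since ignored parameters are accepted silently, a client-side check can surface them before a request goes out. The following is a sketch built only from the differences listed above; the helper name `check_params` and the choice to raise on `n > 1` are illustrative, not part of the Quant API.

```python
IGNORED_PARAMS = (
    "frequency_penalty",
    "presence_penalty",
    "seed",
    "logprobs",
    "logit_bias",
)


def check_params(params: dict) -> list:
    """Return warnings for ignored parameters; raise if n > 1 is requested."""
    if (params.get("n") or 1) > 1:
        raise ValueError("n > 1 is not supported: responses always contain one choice")
    return [
        f"{name} is accepted but not applied"
        for name in IGNORED_PARAMS
        if params.get(name) is not None
    ]


warnings = check_params({"model": "anthropic.claude-sonnet-4-6", "seed": 42})
print(warnings)
```

Running a check like this in tests or at startup makes it easier to spot code that relies on `seed` or the penalty parameters when you move it from OpenAI to this endpoint.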

For everything beyond the drop-in tier — agents, orchestrations, skills, vector search, governance — see the AI API reference.