Chat inference via streaming endpoint (true HTTP streaming) with multimodal support
POST /api/v3/organizations/{organisation}/ai/chat/stream
Streams responses from the AI streaming subdomain using Server-Sent Events (SSE). Tokens are streamed in real time as they are generated.
*
* Execution Modes:
* - Streaming Mode (default): Real-time SSE token-by-token responses
* - Async Mode: Set async: true for long-running tasks with polling (202 response)
*
* Async/Durable Mode (async: true):
* - Returns immediately with requestId and pollUrl (HTTP 202)
* - Uses AWS Lambda Durable Functions for long-running inference
* - Supports client-executed tools via waiting_callback state
* - Poll /ai/chat/executions/{requestId} for status
* - Submit client tool results via /ai/chat/callback
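The async flow above can be sketched as two small helpers: one builds the `async: true` request body, the other decides what to do with a poll result. This is a minimal sketch; the `sessionId` field name and the exact status strings beyond `queued` and `waiting_callback` are assumptions about the schema, not confirmed by it.

```python
def build_async_payload(messages, model="amazon.nova-lite-v1:0", session_id=None):
    """Request body for durable execution: async: true switches the
    endpoint from SSE streaming to an immediate 202 response."""
    payload = {"messages": messages, "model": model, "async": True}
    if session_id is not None:
        payload["sessionId"] = session_id  # assumed field name
    return payload

# Terminal states are assumed; the document only names queued and
# waiting_callback explicitly.
TERMINAL = {"completed", "failed"}

def next_step(poll_response):
    """Decide what to do with a result from /ai/chat/executions/{requestId}.

    waiting_callback means a client-executed tool is pending: run the
    tool locally, submit its output via /ai/chat/callback, then poll again.
    """
    status = poll_response["status"]
    if status in TERMINAL:
        return "done"
    if status == "waiting_callback":
        return "run_client_tool_and_callback"
    return "poll_again"  # e.g. queued / running
```

Polling the returned `pollUrl` with `next_step` in a loop covers both plain long-running inference and the client-tool round trip.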
*
* Multimodal Support:
* - Text: Simple string content
* - Images: Base64-encoded PNG, JPEG, GIF, WebP (up to 25MB)
* - Videos: Base64-encoded MP4, MOV, WebM, etc. (up to 25MB)
* - Documents: Base64-encoded PDF, DOCX, CSV, etc. (up to 25MB)
*
* Supported Models (Multimodal):
* - Claude 4.5 Series: Sonnet 4.5, Haiku 4.5, Opus 4.5 (images, up to 20 per request)
* - Claude 3.5 Series: Sonnet v1/v2 (images, up to 20 per request)
* - Amazon Nova: Lite, Pro, Micro (images, videos, documents)
*
* Usage Tips:
* - Use base64 encoding for images/videos < 5-10MB
* - Place media before text prompts for best results
* - Label multiple media files (e.g., "Image 1:", "Image 2:")
* - Maximum 25MB total payload size
* - Streaming works with all content types (text, image, video, document)
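The tips above (base64 encoding, media before text, labelled files) can be combined into one message builder. The content-block field names (`type`, `mediaType`, `data`) are assumptions for illustration; the request-body schema below is authoritative.

```python
import base64

MAX_PAYLOAD = 25 * 1024 * 1024  # 25MB total payload cap per this endpoint

def image_block(raw_bytes, media_type="image/png"):
    """One base64-encoded image content block (field names assumed)."""
    return {
        "type": "image",
        "mediaType": media_type,
        "data": base64.b64encode(raw_bytes).decode("ascii"),
    }

def build_multimodal_message(prompt, images):
    """Media blocks first, each labelled "Image N:", text prompt last."""
    blocks = []
    for i, img in enumerate(images, start=1):
        blocks.append({"type": "text", "text": f"Image {i}:"})
        blocks.append(image_block(img))
    blocks.append({"type": "text", "text": prompt})
    return {"role": "user", "content": blocks}
```
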
Authorizations
Parameters
Path Parameters
The organisation ID
Request Body required
Chat request with optional multimodal content blocks
object
Array of chat messages. Content can be a simple string or an array of content blocks for multimodal input.
object
Simple text message
Multimodal content blocks
Text, image, video, or document block
object
Model ID. Use Nova models for video and document input; Claude models accept images only.
amazon.nova-lite-v1:0
Max tokens. Claude 4.5 supports up to 64k.
Optional custom system prompt. When tools are enabled, this is prepended with tool usage guidance.
Custom stop sequences
Structured JSON output (Claude 3.5 Sonnet v1/v2, Nova Pro)
object
JSON Schema defining expected structure
object
Function calling configuration (Claude 3+, Nova Pro)
object
object
object
object
JSON Schema for function parameters
object
When true, backend automatically executes tools and feeds results back to AI. For async tools (e.g., image generation), returns executionId for polling. Security: Use allowedTools to whitelist which tools can auto-execute.
Whitelist of tool names that can be auto-executed. Required when autoExecute is true for security. Example: ["get_weather", "generate_image"]
Optional session ID for conversation continuity. Omit to use stateless mode, include to continue an existing session.
Enable async/durable execution mode. When true, returns 202 with pollUrl instead of streaming. Use for long-running inference, client-executed tools, or operations longer than 30 seconds.
Top-level convenience alias for toolConfig.allowedTools. Whitelists which tools can be auto-executed.
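The autoExecute/allowedTools safety rule described above can be enforced client-side before sending the request. A minimal sketch; the per-tool entry shape (name, description, parameter schema) is an assumption about this schema.

```python
def build_tool_config(tools, auto_execute=False, allowed=None):
    """toolConfig builder enforcing the whitelist requirement:
    autoExecute without allowedTools is rejected, mirroring the
    server-side security rule."""
    if auto_execute and not allowed:
        raise ValueError("allowedTools is required when autoExecute is true")
    cfg = {"tools": tools, "autoExecute": auto_execute}
    if allowed:
        cfg["allowedTools"] = list(allowed)
    return cfg
```
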
["get_weather", "generate_image"]
AWS Bedrock guardrails configuration for content filtering and safety.
object
Guardrail identifier from AWS Bedrock
Guardrail version
Enable guardrail trace output
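The three guardrail fields above map onto a small config object. The key names here follow AWS Bedrock's own convention (guardrailIdentifier / guardrailVersion / trace) and are an assumption about this schema's field names.

```python
def build_guardrail_config(guardrail_id, version="DRAFT", trace=False):
    """Guardrail section of the request body (key names assumed,
    modelled on AWS Bedrock's converse API)."""
    return {
        "guardrailIdentifier": guardrail_id,  # identifier from AWS Bedrock
        "guardrailVersion": version,          # guardrail version
        "trace": trace,                       # enable trace output
    }
```
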
Responses
200
Streaming response (text/event-stream, sync mode)
Server-Sent Events stream with chunks of generated text. Format: id, event, data lines separated by newlines.
Example
id: chunk-0
event: start
data: {"requestId":"abc123","model":"amazon.nova-lite-v1:0","streaming":true}

id: chunk-1
event: content
data: {"delta":"Hello","complete":false}

id: chunk-2
event: content
data: {"delta":" there!","complete":false}

id: chunk-3
event: done
data: {"complete":true,"usage":{"inputTokens":8,"outputTokens":15,"totalTokens":23}}

202
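A client can parse the id/event/data framing shown above with a few lines. This is a minimal sketch that assumes one `data:` line per event and blank-line separators, as in the example; a production client should use a proper SSE library.

```python
import json

def parse_sse(stream_text):
    """Parse an SSE text stream into (event, payload) tuples,
    decoding each data: line as JSON."""
    events = []
    current = {}
    for line in stream_text.splitlines() + [""]:
        if not line.strip():  # blank line ends one event
            if "data" in current:
                events.append((current.get("event"), json.loads(current["data"])))
            current = {}
            continue
        field, _, value = line.partition(":")
        current[field.strip()] = value.strip()
    return events
```

Feeding it the example stream yields `start`, two `content` deltas, and a final `done` event carrying the usage totals.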
Async execution started (when async: true in request)
object
Unique request identifier for polling
XkdVWiEfSwMEPrw=
Session ID for conversation continuity
session-1769056496430
Initial execution status
queued
Human-readable status message
Execution started. Poll the status endpoint for updates.
URL to poll for execution status
/ai/chat/executions/XkdVWiEfSwMEPrw%3D
500
Failed to perform streaming inference