Chat inference via streaming endpoint (true HTTP streaming) with multimodal support
  POST /api/v3/organizations/{organisation}/ai/chat/stream    
Streams responses from the AI streaming subdomain using Server-Sent Events (SSE). Tokens are streamed in real time as they are generated.

Multimodal Support:
- Text: Simple string content
- Images: Base64-encoded PNG, JPEG, GIF, WebP (up to 25MB)
- Videos: Base64-encoded MP4, MOV, WebM, etc. (up to 25MB)
- Documents: Base64-encoded PDF, DOCX, CSV, etc. (up to 25MB)

Supported Models:
- Amazon Nova Lite, Micro, Pro (all support multimodal)
- Claude models (text only)

Usage Tips:
- Use base64 encoding for images/videos under 5-10MB
- Place media before text prompts for best results
- Label multiple media files (e.g., ‘Image 1:’, ‘Image 2:’)
- Maximum 25MB total payload size
- Streaming works with all content types (text, image, video, document)
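As a quick orientation, here is a minimal client sketch for calling this endpoint. The host for the AI streaming subdomain, the bearer-token Authorization header, and the exact request-body property names (model, messages, role, content) are assumptions based on the schema below, not confirmed values.

```typescript
// Minimal sketch: send a text-only chat request and print the raw SSE stream.
// The host and auth scheme are placeholders; adjust to your deployment.
async function streamChat(organisation: string, token: string): Promise<void> {
  const response = await fetch(
    `https://ai.example.com/api/v3/organizations/${organisation}/ai/chat/stream`,
    {
      method: "POST",
      headers: {
        Authorization: `Bearer ${token}`, // assumed auth scheme
        "Content-Type": "application/json",
        Accept: "text/event-stream",
      },
      body: JSON.stringify({
        model: "amazon.nova-lite-v1:0",
        messages: [{ role: "user", content: "Hello!" }], // simple string content
      }),
    },
  );
  if (!response.ok || !response.body) {
    throw new Error(`Streaming request failed: ${response.status}`);
  }

  // Read the response body incrementally; tokens arrive as they are generated.
  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  while (true) {
    const { value, done } = await reader.read();
    if (done) break;
    process.stdout.write(decoder.decode(value, { stream: true }));
  }
}
```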
Authorizations
Parameters
Path Parameters
The organisation ID
Request Body required
Chat request with optional multimodal content blocks
object
Array of chat messages. Content can be a simple string or an array of content blocks for multimodal input (an illustrative request body follows this schema).
object
Simple text message
Multimodal content blocks
Text, image, video, or document block
object
Model ID. Use Nova models for multimodal support.
amazon.nova-lite-v1:0
Optional custom system prompt. When tools are enabled, this is prepended with tool usage guidance.
Custom stop sequences
Structured JSON output (Claude 3.5 Sonnet v1/v2, Nova Pro)
object
JSON Schema defining expected structure
object
Function calling configuration (Claude 3+, Nova Pro)
object
object
object
object
JSON Schema for function parameters
object
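To tie the schema together, here is an illustrative request body exercising multimodal content blocks, structured JSON output, and function calling. The property names and nesting (type, mediaType, data, responseFormat, jsonSchema, tools, parameters) are assumptions inferred from the schema outline above, not verified field names; check the generated schema before relying on them.

```typescript
// Illustrative request body covering the optional features above.
// Property names are assumptions drawn from the schema outline, not verified.
const requestBody = {
  model: "amazon.nova-pro-v1:0", // Nova Pro supports multimodal, structured output, and tools
  system: "You are a concise assistant.", // optional custom system prompt
  stopSequences: ["</answer>"], // custom stop sequences
  messages: [
    {
      role: "user",
      content: [
        // Place media before the text prompt, and label it, per the usage tips.
        {
          type: "image",
          mediaType: "image/png",
          data: "<base64-encoded PNG, well under the 25MB payload limit>",
        },
        { type: "text", text: "Image 1: What does this chart show?" },
      ],
    },
  ],
  // Structured JSON output (Claude 3.5 Sonnet v1/v2, Nova Pro)
  responseFormat: {
    jsonSchema: {
      type: "object",
      properties: { summary: { type: "string" } },
      required: ["summary"],
    },
  },
  // Function calling configuration (Claude 3+, Nova Pro)
  tools: [
    {
      name: "get_weather",
      description: "Look up the current weather for a city.",
      parameters: {
        type: "object",
        properties: { city: { type: "string" } },
        required: ["city"],
      },
    },
  ],
};
```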
Responses
200
Streaming response (text/event-stream)
Server-Sent Events stream with chunks of generated text. Each event consists of id, event, and data lines separated by newlines.
Example
id: chunk-0
event: start
data: {"requestId":"abc123","model":"amazon.nova-lite-v1:0","streaming":true}

id: chunk-1
event: content
data: {"delta":"Hello","complete":false}

id: chunk-2
event: content
data: {"delta":" there!","complete":false}

id: chunk-3
event: done
data: {"complete":true,"usage":{"inputTokens":8,"outputTokens":15,"totalTokens":23}}

500
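A consumer might parse this stream as in the sketch below. It assumes events are separated by blank lines, as in the example above, and that data payloads are shaped like those shown; the helper names are illustrative.

```typescript
// Sketch: split an SSE stream into events matching the documented
// id / event / data format, then collect the generated text.
type SseEvent = { id?: string; event?: string; data?: string };

function parseSseEvents(raw: string): SseEvent[] {
  // Events are separated by a blank line; each field is "name: value" on its own line.
  return raw
    .split("\n\n")
    .filter((block) => block.trim().length > 0)
    .map((block) => {
      const event: SseEvent = {};
      for (const line of block.split("\n")) {
        const idx = line.indexOf(":");
        if (idx === -1) continue;
        const field = line.slice(0, idx).trim();
        const value = line.slice(idx + 1).trim();
        if (field === "id") event.id = value;
        else if (field === "event") event.event = value;
        else if (field === "data") event.data = value;
      }
      return event;
    });
}

// Usage with a small sample matching the example stream above:
const exampleStreamText =
  'id: chunk-1\nevent: content\ndata: {"delta":"Hello","complete":false}\n\n' +
  'id: chunk-2\nevent: content\ndata: {"delta":" there!","complete":false}\n\n' +
  'id: chunk-3\nevent: done\ndata: {"complete":true,"usage":{"totalTokens":23}}';

let answer = "";
for (const evt of parseSseEvents(exampleStreamText)) {
  const payload = evt.data ? JSON.parse(evt.data) : {};
  if (evt.event === "content") answer += payload.delta; // accumulate streamed tokens
  if (evt.event === "done") console.log("tokens used:", payload.usage?.totalTokens);
}
console.log(answer); // "Hello there!"
```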
Failed to perform streaming inference
