Chat inference via streaming endpoint (true HTTP streaming) with multimodal support
POST /api/v3/organizations/{organisation}/ai/chat/stream
Streams responses from the AI streaming subdomain using Server-Sent Events (SSE). Tokens are streamed in real time as they are generated.

Multimodal Support:
- Text: simple string content
- Images: Base64-encoded PNG, JPEG, GIF, or WebP (up to 25MB)
- Videos: Base64-encoded MP4, MOV, WebM, etc. (up to 25MB)
- Documents: Base64-encoded PDF, DOCX, CSV, etc. (up to 25MB)

Supported Models:
- Amazon Nova Lite, Micro, and Pro (all support multimodal input)
- Claude models (text only)

Usage Tips:
- Use Base64 encoding for images/videos under 5-10MB
- Place media before text prompts for best results
- Label multiple media files (e.g., 'Image 1:', 'Image 2:')
- Maximum 25MB total payload size
- Streaming works with all content types (text, image, video, document)
Authorizations
Parameters
Path Parameters
The organisation ID
Request Body required
Chat request with optional multimodal content blocks
object
Array of chat messages. Content can be a simple string or an array of content blocks for multimodal input.
object
Simple text message
Multimodal content blocks
Text, image, video, or document block
object
Model ID. Use Nova models for multimodal support.
amazon.nova-lite-v1:0
Optional custom system prompt. When tools are enabled, this is prepended with tool usage guidance.
Custom stop sequences
Structured JSON output (Claude 3.5 Sonnet v1/v2, Nova Pro)
object
JSON Schema defining expected structure
object
Function calling configuration (Claude 3+, Nova Pro)
object
object
object
object
JSON Schema for function parameters
object
When true, backend automatically executes tools and feeds results back to AI. For async tools (e.g., image generation), returns executionId for polling. Security: Use allowedTools to whitelist which tools can auto-execute.
Whitelist of tool names that can be auto-executed. Required when autoExecute is true for security. Example: ['get_weather', 'generate_image']
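To make the request-body schema above concrete, here is a minimal Python sketch that builds a multimodal chat payload, Base64-encoding an image and placing it before the text prompt as the usage tips recommend. The content-block field names (`type`, `format`, `data`) are illustrative assumptions inferred from the description, not confirmed by this spec; check the actual schema before relying on them.

```python
import base64
import json

def build_chat_request(image_bytes: bytes, prompt: str) -> dict:
    """Build a hypothetical multimodal chat request body.

    Content-block field names (type/format/data) are illustrative
    assumptions; verify against the real schema before use.
    """
    image_b64 = base64.b64encode(image_bytes).decode("ascii")
    return {
        "model": "amazon.nova-lite-v1:0",  # Nova models support multimodal input
        "messages": [
            {
                "role": "user",
                # Media placed before the text prompt, per the usage tips
                "content": [
                    {"type": "image", "format": "png", "data": image_b64},
                    {"type": "text", "text": f"Image 1: {prompt}"},
                ],
            }
        ],
    }

payload = build_chat_request(b"\x89PNG...", "Describe this image.")
print(json.dumps(payload)[:60])
```

Remember that the total payload, including the Base64-encoded media, must stay under the 25MB limit.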
Responses
200
Streaming response (text/event-stream)
Server-Sent Events stream with chunks of generated text. Format: id, event, data lines separated by newlines.
Example
id: chunk-0
event: start
data: {"requestId":"abc123","model":"amazon.nova-lite-v1:0","streaming":true}

id: chunk-1
event: content
data: {"delta":"Hello","complete":false}

id: chunk-2
event: content
data: {"delta":" there!","complete":false}

id: chunk-3
event: done
data: {"complete":true,"usage":{"inputTokens":8,"outputTokens":15,"totalTokens":23}}

500
Failed to perform streaming inference
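A client consuming this stream has to split the event-stream body into events and decode the JSON in each `data` field. The following is a minimal Python sketch assuming exactly the `id`/`event`/`data` framing shown in the example above, with events separated by blank lines; a production client would typically use a dedicated SSE library instead.

```python
import json

def parse_sse(stream_text: str) -> list[dict]:
    """Parse an SSE body into a list of event dicts.

    Assumes the id/event/data framing shown in the example,
    with events separated by blank lines.
    """
    events = []
    for block in stream_text.strip().split("\n\n"):
        event = {}
        for line in block.splitlines():
            field, _, value = line.partition(": ")
            if field == "data":
                event["data"] = json.loads(value)  # decode the JSON payload
            else:
                event[field] = value
        events.append(event)
    return events

sample = (
    "id: chunk-1\n"
    "event: content\n"
    'data: {"delta":"Hello","complete":false}\n'
    "\n"
    "id: chunk-3\n"
    "event: done\n"
    'data: {"complete":true,"usage":{"totalTokens":23}}\n'
)

# Concatenate the deltas from content events to assemble the generated text.
text = "".join(
    e["data"].get("delta", "")
    for e in parse_sse(sample)
    if e["event"] == "content"
)
print(text)  # → Hello
```

The `done` event carries the token usage summary, so a client should keep reading until it arrives rather than closing the connection after the last `content` chunk.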