Chat inference via streaming endpoint (true HTTP streaming) with multimodal support
POST /api/v3/organizations/{organisation}/ai/chat/stream
Streams responses from the AI streaming subdomain using Server-Sent Events (SSE). Tokens are streamed in real time as they are generated.
*
* Execution Modes:
* - Streaming Mode (default): Real-time SSE token-by-token responses
* - Async Mode: Set async: true for long-running tasks with polling (202 response)
*
* Async/Durable Mode (async: true):
* - Returns immediately with requestId and pollUrl (HTTP 202)
* - Uses AWS Lambda Durable Functions for long-running inference
* - Supports client-executed tools via waiting_callback state
* - Poll /ai/chat/executions/{requestId} for status
* - Submit client tool results via /ai/chat/callback
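The async flow above can be sketched as two small helpers: one builds the `async: true` request body, the other decides what to do with a poll result. This is a minimal sketch; the `sessionId` field name and the exact status strings beyond `queued` and `waiting_callback` are assumptions about the schema, not confirmed by it.

```python
def build_async_payload(messages, model="amazon.nova-lite-v1:0", session_id=None):
    """Request body for durable execution: async: true switches the
    endpoint from SSE streaming to an immediate 202 response."""
    payload = {"messages": messages, "model": model, "async": True}
    if session_id is not None:
        payload["sessionId"] = session_id  # assumed field name
    return payload

# Terminal states are assumed; the document only names queued and
# waiting_callback explicitly.
TERMINAL = {"completed", "failed"}

def next_step(poll_response):
    """Decide what to do with a result from /ai/chat/executions/{requestId}.

    waiting_callback means a client-executed tool is pending: run the
    tool locally, submit its output via /ai/chat/callback, then poll again.
    """
    status = poll_response["status"]
    if status in TERMINAL:
        return "done"
    if status == "waiting_callback":
        return "run_client_tool_and_callback"
    return "poll_again"  # e.g. queued / running
```

Polling the returned `pollUrl` with `next_step` in a loop covers both plain long-running inference and the client-tool round trip.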
*
* Multimodal Support:
* - Text: Simple string content
* - Images: Base64-encoded PNG, JPEG, GIF, WebP (up to 25MB)
* - Videos: Base64-encoded MP4, MOV, WebM, etc. (up to 25MB)
* - Documents: Base64-encoded PDF, DOCX, CSV, etc. (up to 25MB)
*
* Supported Models (Multimodal):
* - Claude 4.5 Series: Sonnet 4.5, Haiku 4.5, Opus 4.5 (images, up to 20 per request)
* - Claude 3.5 Series: Sonnet v1/v2 (images, up to 20 per request)
* - Amazon Nova: Lite, Pro, Micro (images, videos, documents)
*
* Usage Tips:
* - Use base64 encoding for images/videos < 5-10MB
* - Place media before text prompts for best results
* - Label multiple media files (e.g., "Image 1:", "Image 2:")
* - Maximum 25MB total payload size
* - Streaming works with all content types (text, image, video, document)
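The tips above (base64 encoding, media before text, labelled files) can be combined into one message builder. The content-block field names (`type`, `mediaType`, `data`) are assumptions for illustration; the request-body schema below is authoritative.

```python
import base64

MAX_PAYLOAD = 25 * 1024 * 1024  # 25MB total payload cap per this endpoint

def image_block(raw_bytes, media_type="image/png"):
    """One base64-encoded image content block (field names assumed)."""
    return {
        "type": "image",
        "mediaType": media_type,
        "data": base64.b64encode(raw_bytes).decode("ascii"),
    }

def build_multimodal_message(prompt, images):
    """Media blocks first, each labelled "Image N:", text prompt last."""
    blocks = []
    for i, img in enumerate(images, start=1):
        blocks.append({"type": "text", "text": f"Image {i}:"})
        blocks.append(image_block(img))
    blocks.append({"type": "text", "text": prompt})
    return {"role": "user", "content": blocks}
```
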
Authorizations
Parameters
Path Parameters
The organisation ID
Request Body required
Chat request with optional multimodal content blocks
object
Array of chat messages. Content can be a simple string or an array of content blocks for multimodal input.
object
Simple text message
Multimodal content blocks
Text, image, video, or document block
object
Model ID. Use Nova models for video and document input; Claude models accept images only.
amazon.nova-lite-v1:0
Max tokens. Claude 4.5 supports up to 64k.
Optional custom system prompt. When tools are enabled, this is prepended with tool usage guidance.
Custom stop sequences
Structured JSON output (Claude 3.5 Sonnet v1/v2, Nova Pro)
object
JSON Schema defining expected structure
object
Function calling configuration (Claude 3+, Nova Pro)
object
object
object
object
JSON Schema for function parameters
object
When true, backend automatically executes tools and feeds results back to AI. For async tools (e.g., image generation), returns executionId for polling. Security: Use allowedTools to whitelist which tools can auto-execute.
Whitelist of tool names that can be auto-executed. Required when autoExecute is true for security. Example: ["get_weather", "generate_image"]
Optional session ID for conversation continuity. Omit to use stateless mode, include to continue an existing session.
Enable async/durable execution mode. When true, returns 202 with pollUrl instead of streaming. Use for long-running inference, client-executed tools, or operations longer than 30 seconds.
Top-level convenience alias for toolConfig.allowedTools. Whitelists which tools can be auto-executed.
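The autoExecute/allowedTools safety rule described above can be enforced client-side before sending the request. A minimal sketch; the per-tool entry shape (name, description, parameter schema) is an assumption about this schema.

```python
def build_tool_config(tools, auto_execute=False, allowed=None):
    """toolConfig builder enforcing the whitelist requirement:
    autoExecute without allowedTools is rejected, mirroring the
    server-side security rule."""
    if auto_execute and not allowed:
        raise ValueError("allowedTools is required when autoExecute is true")
    cfg = {"tools": tools, "autoExecute": auto_execute}
    if allowed:
        cfg["allowedTools"] = list(allowed)
    return cfg
```
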
["get_weather", "generate_image"]
AWS Bedrock guardrails configuration for content filtering and safety.
object
Guardrail identifier from AWS Bedrock
Guardrail version
Enable guardrail trace output
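The three guardrail fields above map onto a small config object. The key names here follow AWS Bedrock's own convention (guardrailIdentifier / guardrailVersion / trace) and are an assumption about this schema's field names.

```python
def build_guardrail_config(guardrail_id, version="DRAFT", trace=False):
    """Guardrail section of the request body (key names assumed,
    modelled on AWS Bedrock's converse API)."""
    return {
        "guardrailIdentifier": guardrail_id,  # identifier from AWS Bedrock
        "guardrailVersion": version,          # guardrail version
        "trace": trace,                       # enable trace output
    }
```
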
Responses
200
Streaming response (text/event-stream, sync mode)
Server-Sent Events stream with chunks of generated text. Format: id, event, data lines separated by newlines.
Example
id: chunk-0
event: start
data: {"requestId":"abc123","model":"amazon.nova-lite-v1:0","streaming":true}

id: chunk-1
event: content
data: {"delta":"Hello","complete":false}

id: chunk-2
event: content
data: {"delta":" there!","complete":false}

id: chunk-3
event: done
data: {"complete":true,"usage":{"inputTokens":8,"outputTokens":15,"totalTokens":23}}

202
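A client can parse the id/event/data framing shown above with a few lines. This is a minimal sketch that assumes one `data:` line per event and blank-line separators, as in the example; a production client should use a proper SSE library.

```python
import json

def parse_sse(stream_text):
    """Parse an SSE text stream into (event, payload) tuples,
    decoding each data: line as JSON."""
    events = []
    current = {}
    for line in stream_text.splitlines() + [""]:
        if not line.strip():  # blank line ends one event
            if "data" in current:
                events.append((current.get("event"), json.loads(current["data"])))
            current = {}
            continue
        field, _, value = line.partition(":")
        current[field.strip()] = value.strip()
    return events
```

Feeding it the example stream yields `start`, two `content` deltas, and a final `done` event carrying the usage totals.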
Async execution started (when async: true in request)
object
Unique request identifier for polling
XkdVWiEfSwMEPrw=
Session ID for conversation continuity
session-1769056496430
Initial execution status
queued
Human-readable status message
Execution started. Poll the status endpoint for updates.
URL to poll for execution status
/ai/chat/executions/XkdVWiEfSwMEPrw%3D
500
Failed to perform streaming inference