
Chat inference via API Gateway (buffered responses) with multimodal support

POST
/api/v3/organizations/{organisation}/ai/chat

Sends requests to the AI API Gateway endpoint, which buffers responses. Supports text, images, videos, and documents via base64 encoding.

Multimodal Support:
- Text: Simple string content
- Images: Base64-encoded PNG, JPEG, GIF, WebP (up to 25MB)
- Videos: Base64-encoded MP4, MOV, WebM, etc. (up to 25MB)
- Documents: Base64-encoded PDF, DOCX, CSV, etc. (up to 25MB)

Supported Models:
- Amazon Nova Lite, Micro, Pro (all support multimodal)
- Claude models (text only)

Usage Tips:
- Use base64 encoding for images/videos under 5-10MB
- Place media before text prompts for best results
- Label multiple media files (e.g., 'Image 1:', 'Image 2:')
- Maximum 25MB total payload size

Response Patterns:
- Text-only: Returns a simple text response when no tools are requested
- Single tool: Returns a toolUse object when the AI requests one tool
- Multiple tools: Returns a toolUse array when the AI requests multiple tools
- Auto-execute sync: Automatically executes the tool and returns the final text response
- Auto-execute async: Returns toolUse with an executionId and status for polling
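For reference, a request body combining an image and a text prompt might look like the sketch below. The top-level fields (messages, modelId, temperature, maxTokens) come from the schema on this page; the shape of the image content block (a format plus base64 source bytes) is an assumption and should be checked against the content-block schema.

{
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "image": {
            "format": "png",
            "source": { "bytes": "<base64-encoded PNG data>" }
          }
        },
        { "text": "Image 1: Summarise what this chart shows." }
      ]
    }
  ],
  "modelId": "amazon.nova-lite-v1:0",
  "temperature": 0.7,
  "maxTokens": 1024
}

Per the usage tips above, the image block is placed before the text prompt and labelled 'Image 1:'.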

Authorizations

Parameters

Path Parameters

organisation
required
string

The organisation ID

Request Body required

Chat request with optional multimodal content blocks

object
messages
required

Array of chat messages. Content can be a simple string or an array of content blocks for multimodal input.

Array<object>
>= 1 items
object
role
required
string
Allowed values: user assistant system
content
required
One of:

Simple text message

string

Array of content blocks for multimodal input (see the multimodal request sketch after the endpoint description)

Array<object>
modelId
required

Model ID. Use Nova models for multimodal support.

string
amazon.nova-lite-v1:0
temperature
number
default: 0.7 <= 2
maxTokens
integer
default: 1024 >= 1 <= 8192
topP
number
<= 1
stream

Ignored in buffered mode; the endpoint always returns a complete response

boolean
systemPrompt

Optional custom system prompt. When tools are enabled, this is prepended with tool usage guidance.

string
stopSequences

Custom stop sequences

Array<string>
<= 4 items
responseFormat

Structured JSON output (Claude 3.5 Sonnet v1/v2, Nova Pro). See the structured-output request sketch after this field list.

object
type
string
Allowed values: json
jsonSchema

JSON Schema defining expected structure

object
toolConfig

Function calling configuration (Claude 3+, Nova Pro)

object
tools
Array<object>
object
toolSpec
object
name
string
description
string
inputSchema
object
json

JSON Schema for function parameters

object
autoExecute

When true, the backend automatically executes tools and feeds results back to the AI. For async tools (e.g., image generation), returns an executionId for polling. Security: use allowedTools to whitelist which tools can auto-execute. See the tool-calling request sketch after this field list.

boolean
allowedTools

Whitelist of tool names that can be auto-executed. Required when autoExecute is true for security. Example: ['get_weather', 'generate_image']

Array<string>
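A sketch of a request using responseFormat for structured JSON output. The type and jsonSchema fields are documented above; the schema contents (city, temperature) are purely illustrative.

{
  "messages": [
    { "role": "user", "content": "Extract the city and temperature from: It is 31 degrees in Perth today." }
  ],
  "modelId": "amazon.nova-pro-v1:0",
  "responseFormat": {
    "type": "json",
    "jsonSchema": {
      "type": "object",
      "properties": {
        "city": { "type": "string" },
        "temperature": { "type": "number" }
      },
      "required": ["city", "temperature"]
    }
  }
}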

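A sketch of a function-calling request with automatic execution. The toolConfig, tools, toolSpec, inputSchema.json, autoExecute, and allowedTools fields are documented above; the get_weather tool and its input schema are illustrative, and the nesting of autoExecute and allowedTools under toolConfig follows the field ordering shown here, so verify it against the live schema.

{
  "messages": [
    { "role": "user", "content": "What's the weather in Sydney right now?" }
  ],
  "modelId": "amazon.nova-pro-v1:0",
  "toolConfig": {
    "tools": [
      {
        "toolSpec": {
          "name": "get_weather",
          "description": "Get current weather for a location",
          "inputSchema": {
            "json": {
              "type": "object",
              "properties": {
                "location": { "type": "string" }
              },
              "required": ["location"]
            }
          }
        }
      }
    ],
    "autoExecute": true,
    "allowedTools": ["get_weather"]
  }
}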
Responses

200

Chat inference completed (buffered response)

object
response

Assistant’s response message. May contain text content and/or tool use requests.

object
role
string
Allowed values: assistant
assistant
content

Text response content

string
I'll help you with that.
toolUse
One of:

Single tool request

object
toolUseId
string
abc123
name
string
get_weather
input
object
{
  "location": "Sydney"
}
executionId

Present for async tools with autoExecute

string
exec_abc123def456
status

Present for async tools with autoExecute

string
Allowed values: pending running complete failed
result

Tool execution result (only present when status='complete' for sync auto-executed tools). For async tools, poll /tools/executions/{executionId}. See the async response sketch after the 200 example below.

object
images

Base64 data URIs for images

Array<string>
s3Urls

Signed S3 URLs for downloads

Array<string>
model

Model used for generation

string
amazon.nova-pro-v1:0
requestId

Unique request identifier

string
req-abc123
finishReason

Why the model stopped generating

string
Allowed values: stop length content_filter tool_use
usage

Token usage information

object
inputTokens

Number of input tokens

integer
25
outputTokens

Number of output tokens

integer
150
totalTokens

Total tokens consumed

integer
175
Example
{
  "response": {
    "role": "assistant",
    "content": "The capital of Australia is Canberra."
  },
  "model": "amazon.nova-lite-v1:0",
  "requestId": "req-abc123",
  "finishReason": "stop",
  "usage": {
    "inputTokens": 12,
    "outputTokens": 8,
    "totalTokens": 20
  }
}
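When autoExecute is enabled for an asynchronous tool, the 200 response instead carries a toolUse object with an executionId to poll. The sketch below is assembled from the documented response fields; the generate_image tool and its input are illustrative.

{
  "response": {
    "role": "assistant",
    "content": "Generating your image now.",
    "toolUse": {
      "toolUseId": "abc123",
      "name": "generate_image",
      "input": { "prompt": "a lighthouse at dusk" },
      "executionId": "exec_abc123def456",
      "status": "pending"
    }
  },
  "model": "amazon.nova-pro-v1:0",
  "requestId": "req-abc123",
  "finishReason": "tool_use",
  "usage": {
    "inputTokens": 25,
    "outputTokens": 150,
    "totalTokens": 175
  }
}

Poll /tools/executions/{executionId} (as noted under result) until status reaches complete or failed; the images and s3Urls described under result are then expected to be available from that endpoint.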

500

Failed to perform chat inference