Chat inference via API Gateway (buffered responses) with multimodal support

POST
/api/v3/organizations/{organisation}/ai/chat

Sends requests to the AI API Gateway endpoint, which buffers responses. Supports text, images, videos, and documents via base64 encoding.

Execution Modes:
- Sync Mode (default): Standard JSON response, waits for completion (200 response)
- Async Mode: Set async: true for long-running tasks with polling (202 response)

Async/Durable Mode (async: true):
- Returns immediately with requestId and pollUrl (HTTP 202)
- Uses AWS Lambda Durable Functions for long-running inference
- Supports client-executed tools via the waiting_callback state
- Poll /ai/chat/executions/{requestId} for status
- Submit client tool results via /ai/chat/callback
- Ideal for complex prompts, large contexts, or client-side tools

Multimodal Support:
- Text: Simple string content
- Images: Base64-encoded PNG, JPEG, GIF, WebP (up to 25MB)
- Videos: Base64-encoded MP4, MOV, WebM, etc. (up to 25MB)
- Documents: Base64-encoded PDF, DOCX, CSV, etc. (up to 25MB)

Supported Models (Multimodal):
- Claude 4.5 Series: Sonnet 4.5, Haiku 4.5, Opus 4.5 (images, up to 20 per request)
- Claude 3.5 Series: Sonnet v1/v2 (images, up to 20 per request)
- Amazon Nova: Lite, Pro, Micro (images, videos, documents)

Usage Tips:
- Use base64 encoding for images/videos under 5-10MB
- Place media before text prompts for best results
- Label multiple media files (e.g., 'Image 1:', 'Image 2:')
- Maximum 25MB total payload size

Response Patterns:
- Text-only: Returns a simple text response when no tools are requested
- Single tool: Returns a toolUse object when the AI requests one tool
- Multiple tools: Returns a toolUse array when the AI requests multiple tools
- Auto-execute sync: Automatically executes the tool and returns the final text response
- Auto-execute async: Returns toolUse with executionId and status for polling
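A minimal sync-mode call can be sketched as follows. The base URL, organisation ID, and bearer-token auth scheme below are placeholders and assumptions, not values defined by this reference; only the path and body fields come from this page.

```python
import json
import urllib.request

# Placeholder deployment details -- assumptions, not defined by this API page.
BASE_URL = "https://api.example.com"
ORG_ID = "my-org"
TOKEN = "YOUR_API_TOKEN"

def build_chat_request(messages, model_id="amazon.nova-lite-v1:0", **extra):
    """Assemble the JSON request body documented below."""
    return {"messages": messages, "modelId": model_id, **extra}

def send_chat(body):
    """POST the buffered (sync-mode) request and return the parsed JSON."""
    req = urllib.request.Request(
        f"{BASE_URL}/api/v3/organizations/{ORG_ID}/ai/chat",
        data=json.dumps(body).encode(),
        headers={"Content-Type": "application/json",
                 "Authorization": f"Bearer {TOKEN}"},  # assumed auth scheme
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Example (network call not executed here):
# reply = send_chat(build_chat_request(
#     [{"role": "user", "content": "What is the capital of Australia?"}],
#     temperature=0.7, maxTokens=4096))
```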

Authorizations

Parameters

Path Parameters

organisation
required
string

The organisation ID

Request Body required

Chat request with optional multimodal content blocks

object
messages
required

Array of chat messages. Content can be a simple string or an array of content blocks for multimodal input.

Array<object>
>= 1 items
object
role
required
string
Allowed values: user assistant system
content
required
One of:

Simple text message

string
modelId
required

Model ID. Use a multimodal-capable model (see the supported model list above) for multimodal input.

string
amazon.nova-lite-v1:0
temperature
number
default: 0.7 <= 2
maxTokens

Max tokens. Claude 4.5 supports up to 64k.

integer
default: 4096 >= 1 <= 65536
topP
number
<= 1
stream

Ignored in buffered mode; the endpoint always returns a complete response.

boolean
systemPrompt

Optional custom system prompt. When tools are enabled, this is prepended with tool usage guidance.

string
stopSequences

Custom stop sequences

Array<string>
<= 4 items
responseFormat

Structured JSON output (Claude 3.5 Sonnet v1/v2, Nova Pro)

object
type
string
Allowed values: json
jsonSchema

JSON Schema defining expected structure

object
toolConfig

Function calling configuration (Claude 3+, Nova Pro)

object
tools
Array<object>
object
toolSpec
object
name
string
description
string
inputSchema
object
json

JSON Schema for function parameters

object
autoExecute

When true, backend automatically executes tools and feeds results back to AI. For async tools (e.g., image generation), returns executionId for polling. Security: Use allowedTools to whitelist which tools can auto-execute.

boolean
allowedTools

Whitelist of tool names that can be auto-executed. Required when autoExecute is true, for security. Example: ['get_weather', 'generate_image']

Array<string>
sessionId

Optional session ID for conversation continuity. Omit to use stateless mode, include to continue an existing session.

string format: uuid
async

Enable async/durable execution mode. When true, returns 202 with pollUrl instead of waiting for completion. Use for long-running inference, client-executed tools, or operations >30 seconds.

boolean
allowedTools

Top-level convenience alias for toolConfig.allowedTools. Whitelists which tools can be auto-executed.

Array<string>
[
"get_weather",
"generate_image"
]
guardrails

AWS Bedrock guardrails configuration for content filtering and safety.

object
guardrailIdentifier

Guardrail identifier from AWS Bedrock

string
guardrailVersion

Guardrail version

string
trace

Enable guardrail trace output

string
Allowed values: enabled disabled
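Putting the fields above together, a request that attaches a base64-encoded image and whitelists one auto-executed tool might look like the sketch below. Note that this page only documents the simple-string content variant, so the content-block shape (`image`/`format`/`source`/`text` keys) is an assumption modelled on Bedrock Converse-style blocks, not something this reference specifies.

```python
import base64

def image_block(png_bytes):
    """Assumed content-block shape for an inline image (not specified on
    this page -- modelled on Bedrock Converse-style content blocks)."""
    return {
        "image": {
            "format": "png",
            "source": {"bytes": base64.b64encode(png_bytes).decode()},
        }
    }

# Media first, then the text prompt, per the usage tips above.
request_body = {
    "modelId": "amazon.nova-lite-v1:0",
    "messages": [{
        "role": "user",
        "content": [
            image_block(b"\x89PNG..."),           # truncated placeholder bytes
            {"text": "Image 1: describe this image."},
        ],
    }],
    "toolConfig": {
        "tools": [{
            "toolSpec": {
                "name": "get_weather",
                "description": "Look up current weather for a location.",
                "inputSchema": {"json": {
                    "type": "object",
                    "properties": {"location": {"type": "string"}},
                    "required": ["location"],
                }},
            }
        }],
        "autoExecute": True,
        # Required whenever autoExecute is true.
        "allowedTools": ["get_weather"],
    },
}
```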

Responses

200

Chat inference completed (buffered response, sync mode)

object
response

Assistant’s response message. May contain text content and/or tool use requests.

object
role
string
Allowed values: assistant
assistant
content

Text response content

string
I'll help you with that.
toolUse
One of:

Single tool request

object
toolUseId
string
abc123
name
string
get_weather
input
object
{
"location": "Sydney"
}
executionId

Present for async tools with autoExecute

string
exec_abc123def456
status

Execution status (pending/running/complete/failed) - present for async tools with autoExecute

string
pending
result

Tool execution result (only present when status='complete' for sync auto-executed tools). For async tools, poll /tools/executions/{executionId}

object
images

Base64 data URIs for images

Array<string>
s3Urls

Signed S3 URLs for downloads

Array<string>
model

Model used for generation

string
amazon.nova-pro-v1:0
requestId

Unique request identifier

string
req-abc123
finishReason

Why the model stopped generating

string
Allowed values: stop length content_filter tool_use
usage

Token usage information

object
inputTokens

Number of input tokens

integer
25
outputTokens

Number of output tokens

integer
150
totalTokens

Total tokens consumed

integer
175
Example
{
"response": {
"role": "assistant",
"content": "The capital of Australia is Canberra."
},
"model": "amazon.nova-lite-v1:0",
"requestId": "req-abc123",
"finishReason": "stop",
"usage": {
"inputTokens": 12,
"outputTokens": 8,
"totalTokens": 20
}
}
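The response patterns listed above (text-only, single tool request, multiple tool requests) can be distinguished with a small dispatcher. This sketch assumes only the parsed 200 body shape shown in the example:

```python
def classify_response(body):
    """Classify a parsed 200 response body.

    Returns ("text", str) when no tools were requested,
    ("tool", dict) for a single toolUse object, or
    ("tools", list) for a toolUse array.
    """
    msg = body["response"]
    tool_use = msg.get("toolUse")
    if tool_use is None:
        return ("text", msg["content"])   # text-only pattern
    if isinstance(tool_use, list):
        return ("tools", tool_use)        # multiple tool requests
    return ("tool", tool_use)             # single tool request

# The documented example body classifies as a text-only response:
example = {
    "response": {"role": "assistant",
                 "content": "The capital of Australia is Canberra."},
    "model": "amazon.nova-lite-v1:0",
    "requestId": "req-abc123",
    "finishReason": "stop",
    "usage": {"inputTokens": 12, "outputTokens": 8, "totalTokens": 20},
}
kind, payload = classify_response(example)
```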

202

Async execution started (when async: true in request)

object
requestId
required

Unique request identifier for polling

string
XkdVWiEfSwMEPrw=
sessionId

Session ID for conversation continuity

string
session-1769056496430
status
required

Initial execution status

string
Allowed values: queued
queued
message

Human-readable status message

string
Execution started. Poll the status endpoint for updates.
pollUrl
required

URL to poll for execution status

string
/ai/chat/executions/XkdVWiEfSwMEPrw%3D
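A client-side polling loop for the async flow might look like this. The fetcher is injected so the loop stays transport-agnostic, and the terminal status names are an assumption based on the pending/running/complete/failed values documented for auto-executed tools above; this page only guarantees the initial `queued` status.

```python
import time

def poll_execution(fetch_status, poll_url, interval=2.0, timeout=300.0):
    """Poll the execution endpoint until a terminal status is reached.

    `fetch_status` is any callable mapping a poll URL to a parsed JSON
    status body (e.g. a GET against /ai/chat/executions/{requestId}).
    The terminal names below are assumed, not specified on this page.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        body = fetch_status(poll_url)
        if body.get("status") in ("complete", "failed"):
            return body
        time.sleep(interval)
    raise TimeoutError(f"execution at {poll_url} did not finish in {timeout}s")

# Example with a fake fetcher that completes on the third poll:
responses = iter([{"status": "queued"},
                  {"status": "running"},
                  {"status": "complete", "response": {"content": "done"}}])
fake_fetch = lambda url: next(responses)
result = poll_execution(fake_fetch,
                        "/ai/chat/executions/XkdVWiEfSwMEPrw%3D",
                        interval=0.0)
```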

500

Failed to perform chat inference