OpenAI-Compatible Endpoint

PromptShuttle exposes an OpenAI-compatible chat completion endpoint. If you already use the OpenAI SDK, you can switch to PromptShuttle by changing the base URL and API key — no other code changes needed.

Endpoint

POST /api/v1/chat/completions

Request

The request body follows the OpenAI Chat Completions format with PromptShuttle extensions.

Required fields

| Field | Type | Description |
|---|---|---|
| `model` | string | Model identifier in `provider/model` format (e.g. `openai/gpt-4o`) |
| `messages` | array | Array of message objects (see Messages below) |

Optional fields

| Field | Type | Description |
|---|---|---|
| `temperature` | float | Sampling temperature (0–2). Lower = more deterministic. |
| `top_p` | float | Nucleus sampling threshold. |
| `max_tokens` | integer | Maximum tokens to generate. |
| `seed` | integer | Seed for deterministic sampling (provider support varies). |
| `stream` | boolean | Enable SSE streaming. Default `false`. See Streaming. |
| `stream_options` | object | Streaming configuration (see below). |
| `tools` | array | Tool definitions in OpenAI format. |
| `tool_choice` | string | Tool selection mode: `"auto"` (default), `"none"`, `"required"`, or a specific tool. |
| `response_format` | object | Structured output format. Use `{ "type": "json_schema", "json_schema": { ... } }`. |
| `user` | string | End-user identifier. Used for per-customer tracking if the `X-Shuttle-Customer-Id` header is not set. |

PromptShuttle extensions

These fields are non-standard and specific to PromptShuttle:

| Field | Type | Description |
|---|---|---|
| `x_log_level` | string | Override logging level: `"Trace"`, `"Debug"`, `"Information"`, `"Warning"`, `"Error"`, `"Critical"` |
| `x_nonce` | string | Opaque cache-bust string. Included in the response-cache hash but never sent to providers. |
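For illustration, the extension fields sit alongside the standard fields at the top level of the request body (all values here are placeholders):

```json
{
  "model": "openai/gpt-4o",
  "messages": [
    { "role": "user", "content": [{ "type": "text", "text": "Ping" }] }
  ],
  "x_log_level": "Debug",
  "x_nonce": "retry-2024-09-01-a"
}
```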

Custom headers

| Header | Description |
|---|---|
| `X-Shuttle-Customer-Id` | End-customer identifier for per-customer usage attribution. |
| `X-Shuttle-Debug-Url` | Attach a debug URL tag to the request for tracing. |

Messages

Each message in the messages array has:

| Field | Type | Values |
|---|---|---|
| `role` | string | `"user"`, `"assistant"`, `"system"` |
| `content` | array | Array of content parts |

Content parts

Each content part has a type field:

Text content:
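A text part, for illustration (the shape follows OpenAI's content-part convention):

```json
{ "type": "text", "text": "Describe this image." }
```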

Image content (URL):
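An image part referencing a URL, for illustration (OpenAI's `image_url` convention):

```json
{
  "type": "image_url",
  "image_url": { "url": "https://example.com/photo.png" }
}
```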

Image content (base64 data URL):
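The same part with inline base64 data, for illustration (the payload is truncated here):

```json
{
  "type": "image_url",
  "image_url": { "url": "data:image/png;base64,iVBORw0KGgo..." }
}
```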

Data URLs are automatically parsed into base64 + media type for providers that require it (e.g. Gemini).

Response

Synchronous response
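The synchronous response follows the OpenAI `chat.completion` object shape. An illustrative sketch (field values are placeholders; exact fields may vary):

```json
{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "model": "openai/gpt-4o",
  "choices": [
    {
      "index": 0,
      "message": { "role": "assistant", "content": "Hello!" },
      "finish_reason": "stop"
    }
  ],
  "usage": { "prompt_tokens": 9, "completion_tokens": 3, "total_tokens": 12 }
}
```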

Streaming response
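Chunks arrive as SSE `data:` lines in the OpenAI `chat.completion.chunk` shape. An illustrative sketch (PromptShuttle also emits its own event types, covered in the Streaming reference):

```
data: {"object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"Hel"}}]}

data: {"object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"lo!"},"finish_reason":"stop"}]}

data: [DONE]
```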

When stream: true, the endpoint returns Server-Sent Events. See Streaming (SSE) for the full event reference.

Stream options

When streaming, you can configure the event stream:

| Field | Type | Default | Description |
|---|---|---|---|
| `include_usage` | boolean | `true` | Emit periodic `usage.update` events. |
| `usage_interval_ms` | integer | `5000` | Interval between usage updates (milliseconds). |
| `include_tool_results` | boolean | `false` | Include full tool results in events (verbose). |
| `include_agent_results` | boolean | `false` | Include full agent results in events (verbose). |
| `heartbeat_interval_ms` | integer | `30000` | Keep-alive heartbeat interval (milliseconds). |
| `event_types` | array | all | Filter to specific event types (supports wildcards). |
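Putting the options together, a streaming request body might look like this (the `message.*` wildcard pattern is illustrative; see the Streaming reference for actual event names):

```json
{
  "model": "openai/gpt-4o",
  "messages": [
    { "role": "user", "content": [{ "type": "text", "text": "Hello!" }] }
  ],
  "stream": true,
  "stream_options": {
    "include_usage": true,
    "usage_interval_ms": 2000,
    "event_types": ["message.*", "usage.update"]
  }
}
```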

Examples

Basic completion
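A minimal sketch using only the Python standard library. The host is a placeholder for your PromptShuttle deployment, and the request is wrapped in a function so you can supply your own API key:

```python
import json
from urllib import request

BASE_URL = "https://promptshuttle.example.com"  # placeholder host

payload = {
    "model": "openai/gpt-4o",
    "messages": [
        {"role": "user", "content": [{"type": "text", "text": "Hello!"}]}
    ],
    "temperature": 0.7,
    "max_tokens": 256,
}

def chat_completion(api_key: str) -> dict:
    """POST the payload and return the parsed chat.completion response."""
    req = request.Request(
        f"{BASE_URL}/api/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with request.urlopen(req) as resp:
        return json.load(resp)
```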

With customer tracking
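The same request with the `X-Shuttle-Customer-Id` header for per-customer attribution. A standard-library sketch; the host, key, and customer ID are placeholders, and the final send is shown as a comment:

```python
import json
from urllib import request

BASE_URL = "https://promptshuttle.example.com"  # placeholder host

headers = {
    "Authorization": "Bearer ps-...",         # placeholder API key
    "Content-Type": "application/json",
    "X-Shuttle-Customer-Id": "acme-corp-42",  # attributes usage to this customer
}

payload = {
    "model": "openai/gpt-4o",
    "messages": [{"role": "user", "content": [{"type": "text", "text": "Hi"}]}],
}

req = request.Request(
    f"{BASE_URL}/api/v1/chat/completions",
    data=json.dumps(payload).encode(),
    headers=headers,
)
# request.urlopen(req) sends the request; omitted here.
```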

Using OpenAI Python SDK
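A minimal sketch with the official `openai` package (`pip install openai`); only the base URL and API key change, as noted above. Both values here are placeholders for your deployment:

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://promptshuttle.example.com/api/v1",  # your deployment
    api_key="ps-...",                                      # your PromptShuttle key
)

completion = client.chat.completions.create(
    model="openai/gpt-4o",
    messages=[{"role": "user", "content": "Say hello."}],
)
print(completion.choices[0].message.content)
```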

With structured output
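An illustrative request body; the `name`/`schema` nesting follows OpenAI's `json_schema` convention:

```json
{
  "model": "openai/gpt-4o",
  "messages": [
    {
      "role": "user",
      "content": [
        { "type": "text", "text": "Extract the city and country from: 'I live in Oslo, Norway.'" }
      ]
    }
  ],
  "response_format": {
    "type": "json_schema",
    "json_schema": {
      "name": "location",
      "schema": {
        "type": "object",
        "properties": {
          "city": { "type": "string" },
          "country": { "type": "string" }
        },
        "required": ["city", "country"]
      }
    }
  }
}
```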

Other endpoints

List models

Returns all supported models with capabilities, pricing, and token limits.

List providers

Returns all configured LLM providers and their supported models.

Direct inference (PromptShuttle native)

A simpler endpoint for direct LLM inference without OpenAI response formatting. Supports the same models and features.

| Field | Type | Description |
|---|---|---|
| `messages` | array | `ChatMessage` array (PromptShuttle format) |
| `model` | string | Model identifier |
| `environment` | string | Environment name for logging |
| `temperature` | float | Sampling temperature |
| `top_p` | float | Nucleus sampling |
| `top_k` | integer | Top-k sampling (supported by some providers) |
| `max_tokens` | integer | Max output tokens |
| `seed` | integer | Deterministic seed |
| `max_thinking_tokens` | integer | Extended thinking budget (reasoning models) |
| `response_schema` | object | JSON Schema for structured outputs |
| `vendor_tools` | array | Provider-native tools (e.g. `web_search`) |
| `tags` | array | Tags for filtering in invocation log |
| `is_debug` | boolean | Enable verbose logging |
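A sketch of a native request body built from the fields above. The exact `ChatMessage` shape is PromptShuttle-specific, so the message shown here is only illustrative:

```json
{
  "model": "openai/gpt-4o",
  "messages": [{ "role": "user", "content": "Summarize this ticket." }],
  "environment": "staging",
  "temperature": 0.2,
  "max_tokens": 512,
  "tags": ["ticket-summarizer"],
  "is_debug": false
}
```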
