# OpenAI-Compatible Endpoint

PromptShuttle exposes an OpenAI-compatible Chat Completions endpoint. If you already use the OpenAI SDK, you can switch to PromptShuttle by changing the base URL and API key; no other code changes are needed.

## Endpoint

```
POST /api/v1/chat/completions
```

## Request

The request body follows the [OpenAI Chat Completions](https://platform.openai.com/docs/api-reference/chat/create) format with PromptShuttle extensions.

### Required fields

| Field      | Type   | Description                                                        |
| ---------- | ------ | ------------------------------------------------------------------ |
| `model`    | string | Model identifier in `provider/model` format (e.g. `openai/gpt-4o`) |
| `messages` | array  | Array of message objects (see [Messages](#messages) below)         |

### Optional fields

| Field             | Type    | Description                                                                                                     |
| ----------------- | ------- | --------------------------------------------------------------------------------------------------------------- |
| `temperature`     | float   | Sampling temperature (0-2). Lower = more deterministic.                                                         |
| `top_p`           | float   | Nucleus sampling threshold.                                                                                     |
| `max_tokens`      | integer | Maximum tokens to generate.                                                                                     |
| `seed`            | integer | Seed for deterministic sampling (provider support varies).                                                      |
| `stream`          | boolean | Enable SSE streaming. Default `false`. See [Streaming](https://docs.promptshuttle.com/api-reference/streaming). |
| `stream_options`  | object  | Streaming configuration (see below).                                                                            |
| `tools`           | array   | Tool definitions in OpenAI format.                                                                              |
| `tool_choice`     | string \| object | Tool selection mode: `"auto"` (default), `"none"`, `"required"`, or an object naming a specific tool.           |
| `response_format` | object  | Structured output format. Use `{ "type": "json_schema", "json_schema": { ... } }`.                              |
| `user`            | string  | End-user identifier. Used for per-customer tracking if `X-Shuttle-Customer-Id` header is not set.               |
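As a sketch of how `tools` and `tool_choice` fit together, the request body below defines one function tool in the OpenAI tool-definition format. The tool itself (`get_weather` and its `city` parameter) is illustrative, not part of the PromptShuttle API:

```python
# Hypothetical request body using `tools` and `tool_choice`.
payload = {
    "model": "openai/gpt-4o",
    "messages": [
        {"role": "user", "content": [{"type": "text", "text": "Weather in Paris?"}]}
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",  # illustrative tool name
                "description": "Look up current weather for a city.",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
    # "auto" lets the model decide whether to call the tool; pass
    # {"type": "function", "function": {"name": "get_weather"}} to force it.
    "tool_choice": "auto",
}
```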

### PromptShuttle extensions

These fields are non-standard and specific to PromptShuttle:

| Field         | Type   | Description                                                                                         |
| ------------- | ------ | --------------------------------------------------------------------------------------------------- |
| `x_log_level` | string | Override logging level: `"Trace"`, `"Debug"`, `"Information"`, `"Warning"`, `"Error"`, `"Critical"` |
| `x_nonce`     | string | Opaque cache-bust string. Included in the response-cache hash but never sent to providers.          |
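For example, the following sketch forces a fresh (non-cached) response for a single request by sending a unique `x_nonce`, with verbose logging enabled. Using a UUID as the nonce is just a convenient choice; any opaque string works:

```python
import uuid

payload = {
    "model": "openai/gpt-4o",
    "messages": [
        {"role": "user", "content": [{"type": "text", "text": "Roll a d20."}]}
    ],
    "x_log_level": "Debug",        # must be one of the levels listed above
    "x_nonce": str(uuid.uuid4()),  # changes the cache hash; never sent to providers
}
```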

### Custom headers

| Header                  | Description                                                 |
| ----------------------- | ----------------------------------------------------------- |
| `X-Shuttle-Customer-Id` | End-customer identifier for per-customer usage attribution. |
| `X-Shuttle-Debug-Url`   | Attach a debug URL tag to the request for tracing.          |

## Messages

Each message in the `messages` array has:

| Field     | Type   | Values                              |
| --------- | ------ | ----------------------------------- |
| `role`    | string | `"user"`, `"assistant"`, `"system"` |
| `content` | string \| array | Plain text, or an array of content parts    |

### Content parts

Each content part has a `type` field:

**Text content:**

```json
{ "type": "text", "text": "Your prompt here" }
```

**Image content (URL):**

```json
{ "type": "image_url", "image_url": { "url": "https://example.com/image.png" } }
```

**Image content (base64 data URL):**

```json
{ "type": "image_url", "image_url": { "url": "data:image/png;base64,iVBOR..." } }
```

Data URLs are automatically split into raw base64 data and a media type for providers that require them separately (e.g. Gemini).
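A minimal sketch of both directions: building a data URL for the image content part, and the kind of splitting the gateway performs for providers that need raw base64 plus a media type. The byte string stands in for real image data:

```python
import base64

png_bytes = b"\x89PNG\r\n\x1a\n"  # stand-in for real image bytes

# Building the content part with a base64 data URL:
data_url = "data:image/png;base64," + base64.b64encode(png_bytes).decode("ascii")
part = {"type": "image_url", "image_url": {"url": data_url}}

# Splitting the data URL back into media type + payload:
header, b64_payload = data_url.split(",", 1)
media_type = header[len("data:"):header.index(";")]
decoded = base64.b64decode(b64_payload)
```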

## Response

### Synchronous response

```json
{
  "id": "request_id",
  "object": "chat.completion",
  "created": 1700000000,
  "model": "gpt-4o",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 9,
    "total_tokens": 21
  }
}
```

### Streaming response

When `stream: true`, the endpoint returns Server-Sent Events. See [Streaming (SSE)](https://docs.promptshuttle.com/api-reference/streaming) for the full event reference.

## Stream options

When streaming, you can configure the event stream:

| Field                   | Type    | Default | Description                                          |
| ----------------------- | ------- | ------- | ---------------------------------------------------- |
| `include_usage`         | boolean | `true`  | Emit periodic `usage.update` events.                 |
| `usage_interval_ms`     | integer | `5000`  | Interval between usage updates (milliseconds).       |
| `include_tool_results`  | boolean | `false` | Include full tool results in events (verbose).       |
| `include_agent_results` | boolean | `false` | Include full agent results in events (verbose).      |
| `heartbeat_interval_ms` | integer | `30000` | Keep-alive heartbeat interval (milliseconds).        |
| `event_types`           | array   | all     | Filter to specific event types (supports wildcards). |
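Putting these together, the sketch below narrows a streaming request's event stream. The `event_types` patterns shown (`"content.*"`, `"usage.update"`) are illustrative; consult the Streaming reference for the actual event names:

```python
# Hypothetical streaming request with tuned stream_options.
payload = {
    "model": "openai/gpt-4o",
    "messages": [
        {"role": "user", "content": [{"type": "text", "text": "Stream a story."}]}
    ],
    "stream": True,
    "stream_options": {
        "include_usage": True,
        "usage_interval_ms": 2000,       # usage updates every 2s instead of 5s
        "heartbeat_interval_ms": 15000,  # tighter keep-alive for strict proxies
        "event_types": ["content.*", "usage.update"],  # illustrative patterns
    },
}
```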

## Examples

### Basic completion

```bash
curl -X POST https://app.promptshuttle.com/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o",
    "messages": [
      {"role": "user", "content": [{"type": "text", "text": "What is the capital of France?"}]}
    ],
    "temperature": 0.3,
    "max_tokens": 100
  }'
```

### With customer tracking

```bash
curl -X POST https://app.promptshuttle.com/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -H "X-Shuttle-Customer-Id: user_456" \
  -d '{
    "model": "anthropic/claude-sonnet-4-20250514",
    "messages": [
      {"role": "system", "content": [{"type": "text", "text": "You are a helpful assistant."}]},
      {"role": "user", "content": [{"type": "text", "text": "Explain quantum computing simply."}]}
    ]
  }'
```

### Using OpenAI Python SDK

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://app.promptshuttle.com/api/v1",
    api_key="YOUR_API_KEY",
)

# Streaming
stream = client.chat.completions.create(
    model="openai/gpt-4o",
    messages=[{"role": "user", "content": "Write a haiku about APIs"}],
    stream=True,
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```

### With structured output

```bash
curl -X POST https://app.promptshuttle.com/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o",
    "messages": [
      {"role": "user", "content": [{"type": "text", "text": "List the planets in our solar system"}]}
    ],
    "response_format": {
      "type": "json_schema",
      "json_schema": {
        "name": "planets",
        "schema": {
          "type": "object",
          "properties": {
            "planets": {
              "type": "array",
              "items": { "type": "string" }
            }
          },
          "required": ["planets"]
        }
      }
    }
  }'
```

## Other endpoints

### List models

```
GET /api/v1/models/descriptors
```

Returns all supported models with capabilities, pricing, and token limits.

### List providers

```
GET /api/v1/providers
```

Returns all configured LLM providers and their supported models.

### Direct inference (PromptShuttle native)

```
POST /api/v1/inference
```

A simpler endpoint for direct LLM inference without OpenAI response formatting. Supports the same models and features.

| Field                 | Type    | Description                                  |
| --------------------- | ------- | -------------------------------------------- |
| `messages`            | array   | ChatMessage array (PromptShuttle format)     |
| `model`               | string  | Model identifier                             |
| `environment`         | string  | Environment name for logging                 |
| `temperature`         | float   | Sampling temperature                         |
| `top_p`               | float   | Nucleus sampling threshold                   |
| `top_k`               | integer | Top-k sampling (supported by some providers) |
| `max_tokens`          | integer | Max output tokens                            |
| `seed`                | integer | Deterministic seed                           |
| `max_thinking_tokens` | integer | Extended thinking budget (reasoning models)  |
| `response_schema`     | object  | JSON Schema for structured outputs           |
| `vendor_tools`        | array   | Provider-native tools (e.g. `web_search`)    |
| `tags`                | array   | Tags for filtering in invocation log         |
| `is_debug`            | boolean | Enable verbose logging                       |
