# OpenAI-Compatible Endpoint

PromptShuttle exposes an OpenAI-compatible Chat Completions endpoint. If you already use the OpenAI SDK, you can switch to PromptShuttle by changing the base URL and API key; no other code changes are needed.

## Endpoint

```
POST /api/v1/chat/completions
```

## Request

The request body follows the [OpenAI Chat Completions](https://platform.openai.com/docs/api-reference/chat/create) format with PromptShuttle extensions.

### Required fields

| Field      | Type   | Description                                                        |
| ---------- | ------ | ------------------------------------------------------------------ |
| `model`    | string | Model identifier in `provider/model` format (e.g. `openai/gpt-4o`) |
| `messages` | array  | Array of message objects (see [Messages](#messages) below)         |

### Optional fields

| Field             | Type    | Description                                                                                                     |
| ----------------- | ------- | --------------------------------------------------------------------------------------------------------------- |
| `temperature`     | float   | Sampling temperature (0-2). Lower = more deterministic.                                                         |
| `top_p`           | float   | Nucleus sampling threshold.                                                                                     |
| `max_tokens`      | integer | Maximum tokens to generate.                                                                                     |
| `seed`            | integer | Seed for deterministic sampling (provider support varies).                                                      |
| `stream`          | boolean | Enable SSE streaming. Default `false`. See [Streaming](https://docs.promptshuttle.com/api-reference/streaming). |
| `stream_options`  | object  | Streaming configuration (see below).                                                                            |
| `tools`           | array   | Tool definitions in OpenAI format.                                                                              |
| `tool_choice`     | string \| object | Tool selection mode: `"auto"` (default), `"none"`, `"required"`, or an object naming a specific tool.           |
| `response_format` | object  | Structured output format. Use `{ "type": "json_schema", "json_schema": { ... } }`.                              |
| `user`            | string  | End-user identifier. Used for per-customer tracking if `X-Shuttle-Customer-Id` header is not set.               |
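As a sketch of how `tools` and `tool_choice` fit together, the request body below defines one function tool in the OpenAI tool-definition format. The tool itself (`get_weather` and its `city` parameter) is illustrative, not part of the PromptShuttle API:

```python
# Hypothetical request body using `tools` and `tool_choice`.
payload = {
    "model": "openai/gpt-4o",
    "messages": [
        {"role": "user", "content": [{"type": "text", "text": "Weather in Paris?"}]}
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",  # illustrative tool name
                "description": "Look up current weather for a city.",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
    # "auto" lets the model decide whether to call the tool; pass
    # {"type": "function", "function": {"name": "get_weather"}} to force it.
    "tool_choice": "auto",
}
```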

### PromptShuttle extensions

These fields are non-standard and specific to PromptShuttle:

| Field         | Type   | Description                                                                                         |
| ------------- | ------ | --------------------------------------------------------------------------------------------------- |
| `x_log_level` | string | Override logging level: `"Trace"`, `"Debug"`, `"Information"`, `"Warning"`, `"Error"`, `"Critical"` |
| `x_nonce`     | string | Opaque cache-bust string. Included in the response-cache hash but never sent to providers.          |
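For example, the following sketch forces a fresh (non-cached) response for a single request by sending a unique `x_nonce`, with verbose logging enabled. Using a UUID as the nonce is just a convenient choice; any opaque string works:

```python
import uuid

payload = {
    "model": "openai/gpt-4o",
    "messages": [
        {"role": "user", "content": [{"type": "text", "text": "Roll a d20."}]}
    ],
    "x_log_level": "Debug",        # must be one of the levels listed above
    "x_nonce": str(uuid.uuid4()),  # changes the cache hash; never sent to providers
}
```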

### Custom headers

| Header                  | Description                                                 |
| ----------------------- | ----------------------------------------------------------- |
| `X-Shuttle-Customer-Id` | End-customer identifier for per-customer usage attribution. |
| `X-Shuttle-Debug-Url`   | Attach a debug URL tag to the request for tracing.          |

## Messages

Each message in the `messages` array has:

| Field     | Type   | Values                              |
| --------- | ------ | ----------------------------------- |
| `role`    | string | `"user"`, `"assistant"`, `"system"` |
| `content` | string \| array | Plain text, or an array of content parts    |

### Content parts

Each content part has a `type` field:

**Text content:**

```json
{ "type": "text", "text": "Your prompt here" }
```

**Image content (URL):**

```json
{ "type": "image_url", "image_url": { "url": "https://example.com/image.png" } }
```

**Image content (base64 data URL):**

```json
{ "type": "image_url", "image_url": { "url": "data:image/png;base64,iVBOR..." } }
```

Data URLs are automatically split into raw base64 data and a media type for providers that require them separately (e.g. Gemini).
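A minimal sketch of both directions: building a data URL for the image content part, and the kind of splitting the gateway performs for providers that need raw base64 plus a media type. The byte string stands in for real image data:

```python
import base64

png_bytes = b"\x89PNG\r\n\x1a\n"  # stand-in for real image bytes

# Building the content part with a base64 data URL:
data_url = "data:image/png;base64," + base64.b64encode(png_bytes).decode("ascii")
part = {"type": "image_url", "image_url": {"url": data_url}}

# Splitting the data URL back into media type + payload:
header, b64_payload = data_url.split(",", 1)
media_type = header[len("data:"):header.index(";")]
decoded = base64.b64decode(b64_payload)
```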

## Response

### Synchronous response

```json
{
  "id": "request_id",
  "object": "chat.completion",
  "created": 1700000000,
  "model": "gpt-4o",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 9,
    "total_tokens": 21
  }
}
```

### Streaming response

When `stream: true`, the endpoint returns Server-Sent Events. See [Streaming (SSE)](https://docs.promptshuttle.com/api-reference/streaming) for the full event reference.

## Stream options

When streaming, you can configure the event stream:

| Field                   | Type    | Default | Description                                          |
| ----------------------- | ------- | ------- | ---------------------------------------------------- |
| `include_usage`         | boolean | `true`  | Emit periodic `usage.update` events.                 |
| `usage_interval_ms`     | integer | `5000`  | Interval between usage updates (milliseconds).       |
| `include_tool_results`  | boolean | `false` | Include full tool results in events (verbose).       |
| `include_agent_results` | boolean | `false` | Include full agent results in events (verbose).      |
| `heartbeat_interval_ms` | integer | `30000` | Keep-alive heartbeat interval (milliseconds).        |
| `event_types`           | array   | all     | Filter to specific event types (supports wildcards). |
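Putting these together, the sketch below narrows a streaming request's event stream. The `event_types` patterns shown (`"content.*"`, `"usage.update"`) are illustrative; consult the Streaming reference for the actual event names:

```python
# Hypothetical streaming request with tuned stream_options.
payload = {
    "model": "openai/gpt-4o",
    "messages": [
        {"role": "user", "content": [{"type": "text", "text": "Stream a story."}]}
    ],
    "stream": True,
    "stream_options": {
        "include_usage": True,
        "usage_interval_ms": 2000,       # usage updates every 2s instead of 5s
        "heartbeat_interval_ms": 15000,  # tighter keep-alive for strict proxies
        "event_types": ["content.*", "usage.update"],  # illustrative patterns
    },
}
```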

## Examples

### Basic completion

```bash
curl -X POST https://app.promptshuttle.com/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o",
    "messages": [
      {"role": "user", "content": [{"type": "text", "text": "What is the capital of France?"}]}
    ],
    "temperature": 0.3,
    "max_tokens": 100
  }'
```

### With customer tracking

```bash
curl -X POST https://app.promptshuttle.com/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -H "X-Shuttle-Customer-Id: user_456" \
  -d '{
    "model": "anthropic/claude-sonnet-4-20250514",
    "messages": [
      {"role": "system", "content": [{"type": "text", "text": "You are a helpful assistant."}]},
      {"role": "user", "content": [{"type": "text", "text": "Explain quantum computing simply."}]}
    ]
  }'
```

### Using OpenAI Python SDK

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://app.promptshuttle.com/api/v1",
    api_key="YOUR_API_KEY",
)

# Streaming
stream = client.chat.completions.create(
    model="openai/gpt-4o",
    messages=[{"role": "user", "content": "Write a haiku about APIs"}],
    stream=True,
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```

### With structured output

```bash
curl -X POST https://app.promptshuttle.com/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o",
    "messages": [
      {"role": "user", "content": [{"type": "text", "text": "List the planets in our solar system"}]}
    ],
    "response_format": {
      "type": "json_schema",
      "json_schema": {
        "name": "planets",
        "schema": {
          "type": "object",
          "properties": {
            "planets": {
              "type": "array",
              "items": { "type": "string" }
            }
          },
          "required": ["planets"]
        }
      }
    }
  }'
```

## Other endpoints

### List models

```
GET /api/v1/models/descriptors
```

Returns all supported models with capabilities, pricing, and token limits.

### List providers

```
GET /api/v1/providers
```

Returns all configured LLM providers and their supported models.

### Direct inference (PromptShuttle native)

```
POST /api/v1/inference
```

A simpler endpoint for direct LLM inference without OpenAI response formatting. Supports the same models and features.

| Field                 | Type    | Description                                  |
| --------------------- | ------- | -------------------------------------------- |
| `messages`            | array   | ChatMessage array (PromptShuttle format)     |
| `model`               | string  | Model identifier                             |
| `environment`         | string  | Environment name for logging                 |
| `temperature`         | float   | Sampling temperature                         |
| `top_p`               | float   | Nucleus sampling threshold                   |
| `top_k`               | integer | Top-k sampling (supported by some providers) |
| `max_tokens`          | integer | Max output tokens                            |
| `seed`                | integer | Deterministic seed                           |
| `max_thinking_tokens` | integer | Extended thinking budget (reasoning models)  |
| `response_schema`     | object  | JSON Schema for structured outputs           |
| `vendor_tools`        | array   | Provider-native tools (e.g. `web_search`)    |
| `tags`                | array   | Tags for filtering in invocation log         |
| `is_debug`            | boolean | Enable verbose logging                       |
