> ## Documentation Index
> Fetch the complete documentation index at: https://novita.ai/docs/llms.txt
> Use this file to discover all available pages before exploring further.

# Interleaved Thinking Supporty

> **Last Updated**: 2025-12-03 <br />
> **Status**: Supported (OpenAI-Compatible)

## 1. Overview

**Interleaved Thinking** is an advanced reasoning framework that enables models to perform explicit reasoning steps between tool calls.

Models with Interleaved Thinking can:

* Reflect on the current environment and tool outputs
* Decide the next action based on updated reasoning
* Maintain a continuous reasoning chain across multiple tool invocations
* Provide transparent, inspectable multi-step thinking through `reasoning_details` or `reasoning_content`

This capability transforms traditional function-calling into **agent-level tool use**, making complex workflows more accurate, reliable, and context-aware.

Novita fully supports Interleaved Thinking for all models that natively expose reasoning streams (e.g., MiniMax-M2 and other OpenAI-compatible reasoning models).

## 2. Key Concepts

### 2.1 Interleaving

Instead of executing a single reasoning phase followed by a tool call, the model performs:

```BASH theme={"system"}
Reason → Tool Call → Observe → Reason → Tool Call → ...
```

This allows the model to adjust its strategy dynamically based on previous tool outputs.

### 2.2 Reasoning Details (`reasoning_details`)

For some models, the content of the model's thinking will be returned in the form of a separate structure:

```JSON theme={"system"}
"reasoning_details": [
  {
    "type": "reasoning.text",
    "format": "openai-responses-v1",
    "text": "Model’s step-by-step reasoning..."
  }
]
```

For such models, Novita supports returning this field in both streaming and non-streaming modes.

### 2.3 Conversation Memory Requirement

To maintain reasoning continuity: You **must append the model’s full response** including `reasoning_details`, `tool_calls`, and `content` to subsequent `messages`.

Failing to preserve the chain may result in:

* Incorrect tool use
* Lost reasoning context
* Repeated or circular tool calls
* Reduced reliability

This requirement mirrors OpenAI’s reasoning APIs.

## 3. API Behavior

### 3.1 Request Format

No changes are required on the user side.
Interleaved Thinking works with the standard OpenAI-compatible Chat Completions API.

### 3.2 Response Format

The model may return the following fields:

* `reasoning_content`: original thinking content
* `reasoning_details`: structured reasoning segments, this field is optioanl
* `tool_calls`: tool invocation plan
* `content`: natural language output

These extend the standard OpenAI format.

## 4. Example Request (MiniMax-M2)

```JSON theme={"system"}
{
  "model": "minimax/minimax-m2",
  "messages": [
    {
      "role": "user",
      "content": "How's the weather in San Francisco?"
    },
    {
      "role": "assistant",
      "name": "MiniMax AI",
      "content": "",
      "tool_calls": [
        {
          "id": "call_function_asqvfevfc8af_1",
          "type": "function",
          "function": {
            "name": "get_weather",
            "arguments": "{\"location\": \"San Francisco, US\"}"
          }
        }
      ],
      "reasoning_details": [
        {
          "type": "reasoning.text",
          "id": "reasoning-text-1",
          "format": "openai-responses-v1",
          "index": 0,
          "text": "The user is asking about the weather in San Francisco..."
        }
      ]
    },
    {
      "role": "tool",
      "tool_call_id": "call_function_asqvfevfc8af_1",
      "content": "24℃, sunny"
    }
  ],
  "stream": true,
  "reasoning_split": true,
  "max_tokens": 1024,
  "temperature": 0.7,
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get weather for a specific location",
        "parameters": {
          "type": "object",
          "properties": {
            "location": { "type": "string" }
          },
          "required": ["location"]
        }
      }
    }
  ]
}
```

## 5. Example Response (Non-Streaming)

```JSON theme={"system"}
{
  "id": "07a4dedfdb1498b045498dfd42497639",
  "object": "chat.completion",
  "created": 1764303147,
  "model": "MiniMax",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "",
        "name": "MiniMax",
        "tool_calls": [
          {
            "index": 0,
            "id": "call_function_9w7wq1j9zmpl_1",
            "type": "function",
            "function": {
              "name": "get_weather",
              "arguments": "{\"location\": \"ShangHai\"}"
            }
          }
        ],
        "reasoning_content": "The user asked for Shanghai weather...",
        "reasoning_details": [
          {
            "type": "reasoning.text",
            "text": "The user is asking about the weather in Shanghai...",
            "id": "reasoning-text-1",
            "format": "openai-responses-v1",
            "index": 0
          }
        ]
      },
      "finish_reason": "tool_calls"
    }
  ]
}
```

## 6. Streaming Response Example

```JSON theme={"system"}
{
  "id": "664c5ad870c1888fbcbd267d9829e354",
  "object": "chat.completion.chunk",
  "created": 1764303504,
  "model": "minimax-m2",
  "choices": [
    {
      "index": 0,
      "delta": {
        "role": "assistant",
        "reasoning_content": "...\n\nThe user has specifically asked...",
        "reasoning_details": [
          {
            "type": "reasoning.text",
            "text": ".\n\nThe user has specifically asked...",
            "id": "reasoning-text-1",
            "format": "openai-responses-v1",
            "index": 0
          }
        ]
      },
      "finish_reason": null
    }
  ]
}
```

## 7. Developer Notes

### 7.1 Models That Support Interleaved Thinking

All models exposing `reasoning_details` through OpenAI-compatible APIs, including:

* MiniMax-M2
* (Upcoming) Novita Reasoning Series
* Other reasoning-enabled partner models

### 7.2 Pricing

Billing is based on reasoning tokens, following the model’s pricing rules.
`reasoning_details` will increase token usage.

### 7.3 Error Handling

You may encounter:

* Missing tool parameters
* Recursive or repeated tool calls
* Incorrect assumptions in the reasoning phase

Ensure your application validates tool arguments and handles model errors gracefully.

## 8. Best Practices

**✓ Always include full model messages in the next request**

Include:

* `content`
* `tool_calls`
* `reasoning_details`

**✓ Enable streaming for long-chain reasoning**

Streaming allows the client to:

* Monitor the reasoning process
* Detect incorrect tool plans early
* Provide faster user feedback

**✓ Combine with server-side state machines for stability**

For production systems, we recommend pairing Interleaved Thinking with deterministic guardrails, e.g.:

* Parameter validators
* Execution sandbox
* Maximum recursion safeguards

## 9. Summary

Interleaved Thinking significantly enhances multi-step reasoning and tool-use reliability:

* Transparent and inspectable step-wise reasoning
* Adaptive planning between tool invocations
* Stronger context retention across long workflows
* Fully compatible with OpenAI-style Chat Completions API

Novita will continue expanding support for advanced reasoning models, bringing agent-level intelligence to the API layer.
