# MOSS TTS v1.5

Text-to-speech API for MOSS TTS v1.5. Requests can be sent as a JSON body or as multipart form data (when supplying reference audio). The response is binary audio: a complete WAV file for non-streaming requests, or a raw PCM stream for streaming requests. When testing with curl, add `--output` to save the audio to a file (for example, `--output moss-tts-local.wav`); otherwise the binary content is printed directly to the terminal.

## Request Headers

<ParamField header="Content-Type" type="string" required={true}>
  Supports: `application/json`, `multipart/form-data`
</ParamField>

<ParamField header="Authorization" type="string" required={true}>
  Uses Bearer authentication, for example: Bearer \{\{API Key}}.
</ParamField>
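
As a minimal sketch of the two headers above, the helper below builds them in Python. The function name `build_headers` and the environment variable `NOVITA_API_KEY` are illustrative assumptions, not part of the API. For multipart requests the sketch omits `Content-Type`, because a `multipart/form-data` header must carry a boundary parameter that HTTP client libraries normally generate themselves.

```python
import os


def build_headers(api_key, multipart=False):
    """Build request headers (hypothetical helper, not part of the API)."""
    headers = {"Authorization": f"Bearer {api_key}"}
    if not multipart:
        # JSON requests declare their content type explicitly.
        headers["Content-Type"] = "application/json"
    # For multipart requests, Content-Type is left to the HTTP client,
    # which adds the required boundary parameter.
    return headers


# NOVITA_API_KEY is an assumed variable name; use whatever your setup provides.
api_key = os.environ.get("NOVITA_API_KEY", "YOUR_API_KEY")
print(sorted(build_headers(api_key)))
```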

## Request Body

<Tabs>
  <Tab title="application/json">
    <ParamField body="input" type="string" required={true}>
      Required. The text to synthesize. Submitting a complete sentence or paragraph in a single request is recommended.
    </ParamField>

    <ParamField body="model" type="string" required={true} default="MOSS-TTS">
      Required. Must be `MOSS-TTS`, which selects MOSS TTS v1.5.

      Allowed values: `MOSS-TTS`
    </ParamField>

    <ParamField body="stream" type="boolean" default={false}>
      Optional. `false` returns a complete WAV file; `true` returns a PCM stream suitable for playback while the audio is still being generated.
    </ParamField>

    <ParamField body="response_format" type="string" default="wav">
      Optional. Use `wav` for non-streaming requests; must be `pcm` when `stream` is `true`.

      Allowed values: `wav`, `pcm`
    </ParamField>
  </Tab>

  <Tab title="multipart/form-data">
    <ParamField body="ref_audio" type="string">
      Reference audio file, uploaded as WAV or MP3.
    </ParamField>

    <ParamField body="request_json" type="string" required={true}>
      Required. A JSON string containing the same fields as the JSON body, e.g. `model`, `input`, `stream`, `response_format`.
    </ParamField>
  </Tab>
</Tabs>
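
The two request shapes above can be sketched in Python as follows. This is an illustration under stated assumptions rather than an official client: the helper names `build_request_body` and `build_multipart_fields` are hypothetical, the empty-text check is a local sanity check and not documented API behavior, and the code only constructs the payloads without sending them.

```python
import json


def build_request_body(text, stream=False):
    """Build the JSON body fields (hypothetical helper)."""
    if not text:
        # Local sanity check only; not documented API behavior.
        raise ValueError("input text must not be empty")
    return {
        "model": "MOSS-TTS",
        "input": text,
        "stream": stream,
        # wav for non-streaming; pcm is required when stream is true.
        "response_format": "pcm" if stream else "wav",
    }


def build_multipart_fields(text, stream=False):
    """Build the text form fields for a multipart/form-data request.

    request_json carries the JSON body fields serialized to a string.
    The ref_audio file part is attached separately by the HTTP client.
    """
    return {"request_json": json.dumps(build_request_body(text, stream))}


body = build_request_body("Hello from MOSS TTS.")
print(json.dumps(body))

fields = build_multipart_fields("Hello from MOSS TTS.", stream=True)
print(fields["request_json"])
```

With a client such as the third-party `requests` library, `body` would typically be passed as `json=body`, while `fields` would go in `data=fields` alongside a `files={"ref_audio": ...}` entry for the reference audio. The endpoint URL is not stated in this section, so it is left out here.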

## Response

Returns binary audio on success. A non-streaming response is a complete WAV file; a streaming response is a sequence of raw PCM chunks. The streaming PCM format is described by the response headers (defaults: 48000 Hz, mono, 16-bit little-endian).

Format: `binary`
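
Because streamed PCM has no container, it usually needs to be wrapped before ordinary audio players can open it. The sketch below uses Python's standard `wave` module to do that, assuming the default format described above (48000 Hz, mono, 16-bit). The function name `pcm_chunks_to_wav` is hypothetical, and since the header names that describe the format are not listed in this section, the values are taken as plain parameters; in practice they should come from the response headers.

```python
import io
import wave


def pcm_chunks_to_wav(chunks, sample_rate=48000, channels=1, sample_width=2):
    """Wrap raw PCM chunks in a WAV container and return the bytes."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(sample_width)  # 2 bytes per sample = 16-bit
        wav_file.setframerate(sample_rate)
        for chunk in chunks:
            # Append each chunk as it arrives; the header sizes are
            # patched when the writer is closed.
            wav_file.writeframesraw(chunk)
    return buffer.getvalue()


# Stand-in for a streamed response: two half-second chunks of silence.
chunks = [b"\x00\x00" * 24000, b"\x00\x00" * 24000]
wav_bytes = pcm_chunks_to_wav(chunks)

with wave.open(io.BytesIO(wav_bytes), "rb") as check:
    print(check.getframerate(), check.getnchannels(), check.getnframes())
```

For live playback, the same chunks could instead be handed to an audio output library as they arrive, without waiting for the full stream.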
