# Fish Audio Text to Speech

For best results, upload reference audio using the [create model](/api-reference/model-apis-fish-audio-voice-cloning) endpoint before calling this one. This improves speech quality and reduces latency.

Fish Audio converts text into speech.

Supported audio formats:

* WAV / PCM
  * Sample Rate: 8kHz, 16kHz, 24kHz, 32kHz, 44.1kHz
  * Default Sample Rate: 44.1kHz
  * 16-bit, mono
* MP3
  * Sample Rate: 32kHz, 44.1kHz
  * Default Sample Rate: 44.1kHz
  * mono
  * Bitrate: 64kbps, 128kbps (default), 192kbps
* Opus
  * Sample Rate: 48kHz
  * Default Sample Rate: 48kHz
  * mono
  * Bitrate: -1000 (auto), 24kbps, 32kbps (default), 48kbps, 64kbps

## Request Headers

* `Content-Type`: Enum: `application/json`
* `Authorization`: Bearer authentication format, for example: `Bearer {{API Key}}`.
* The TTS model to use. Only `s1` is supported.

## Request Body

* Text to be converted to speech.
* Controls randomness in the speech generation. Higher values (e.g., 1.0) make the output more random, while lower values (e.g., 0.1) make it more deterministic. We recommend `0.9` for the `s1` model. Required range: `0 <= x <= 1`
* Controls diversity via nucleus sampling. Lower values (e.g., 0.1) make the output more focused, while higher values (e.g., 1.0) allow more diversity. We recommend `0.9` for the `s1` model. Required range: `0 <= x <= 1`
* References to be used for the speech. This requires MessagePack serialization and overrides `reference_voices` and `reference_texts`.
  * Reference audio file.
  * Reference text corresponding to the audio.
* ID of the reference model to be used for the speech.
* Prosody to be used for the speech.
  * Speech speed control.
  * Speech volume control.
* Chunk length to be used for the speech. Required range: `100 <= x <= 300`
* Whether to normalize the speech. This reduces latency but may reduce performance on numbers and dates.
* Format to be used for the speech. Available options: `wav`, `pcm`, `mp3`, `opus`
* Sample rate to be used for the speech.
* MP3 bitrate to be used for the speech. Available options: `64`, `128`, `192`
* Opus bitrate to be used for the speech. Available options: `-1000`, `24`, `32`, `48`, `64`
* Latency mode to be used for the speech. `balanced` reduces latency but may degrade performance. Available options: `normal`, `balanced`

## Response

The API returns the audio stream directly, in the format specified by the `format` parameter (default: `mp3`).
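
The ranges and option lists documented for the request body can be checked on the client before a request is sent. The following is a minimal sketch of such a check, not an official client. Apart from `format`, every field name used here (`text`, `temperature`, `top_p`, `chunk_length`, `sample_rate`, `mp3_bitrate`, `opus_bitrate`, `latency`) is an assumption made for illustration, as is the helper `build_tts_payload` itself. The defaults of `0.9` follow the recommendation for the `s1` model and are not necessarily the API's own defaults. No endpoint URL is used, and nothing is sent over the network.

```python
from typing import Any, Dict, Optional

# Supported sample rates in Hz per output format, taken from the format list on this page.
SAMPLE_RATES = {
    "wav": (8000, 16000, 24000, 32000, 44100),
    "pcm": (8000, 16000, 24000, 32000, 44100),
    "mp3": (32000, 44100),
    "opus": (48000,),
}
DEFAULT_SAMPLE_RATE = {"wav": 44100, "pcm": 44100, "mp3": 44100, "opus": 48000}
MP3_BITRATES = (64, 128, 192)
OPUS_BITRATES = (-1000, 24, 32, 48, 64)
LATENCY_MODES = ("normal", "balanced")


def _check_range(name: str, value: float, low: float, high: float) -> None:
    if not (low <= value <= high):
        raise ValueError(f"{name} must satisfy {low} <= x <= {high}, got {value}")


def _check_option(name: str, value: Any, options: tuple) -> None:
    if value not in options:
        raise ValueError(f"{name} must be one of {options}, got {value!r}")


def build_tts_payload(
    text: str,
    *,
    audio_format: str = "mp3",
    sample_rate: Optional[int] = None,
    temperature: float = 0.9,  # recommended value for s1
    top_p: float = 0.9,        # recommended value for s1
    chunk_length: Optional[int] = None,
    mp3_bitrate: Optional[int] = None,
    opus_bitrate: Optional[int] = None,
    latency: Optional[str] = None,
) -> Dict[str, Any]:
    """Validate arguments against the documented limits and return a request body.

    All key names other than "format" are hypothetical placeholders.
    """
    _check_option("format", audio_format, tuple(SAMPLE_RATES))
    if sample_rate is None:
        sample_rate = DEFAULT_SAMPLE_RATE[audio_format]
    _check_option("sample_rate", sample_rate, SAMPLE_RATES[audio_format])
    _check_range("temperature", temperature, 0, 1)
    _check_range("top_p", top_p, 0, 1)

    payload: Dict[str, Any] = {
        "text": text,
        "format": audio_format,
        "sample_rate": sample_rate,
        "temperature": temperature,
        "top_p": top_p,
    }
    # Optional fields are only included when the caller sets them.
    if chunk_length is not None:
        _check_range("chunk_length", chunk_length, 100, 300)
        payload["chunk_length"] = chunk_length
    if mp3_bitrate is not None:
        _check_option("mp3_bitrate", mp3_bitrate, MP3_BITRATES)
        payload["mp3_bitrate"] = mp3_bitrate
    if opus_bitrate is not None:
        _check_option("opus_bitrate", opus_bitrate, OPUS_BITRATES)
        payload["opus_bitrate"] = opus_bitrate
    if latency is not None:
        _check_option("latency", latency, LATENCY_MODES)
        payload["latency"] = latency
    return payload


if __name__ == "__main__":
    print(build_tts_payload("Hello, world!", audio_format="opus", opus_bitrate=32))
```

Failing early with a `ValueError` keeps an out-of-range value, such as a `chunk_length` of 50 or an 8kHz sample rate for MP3, from reaching the API at all. The real field names should be taken from the parameter list before this sketch is adapted.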