# MiniMax Quick Voice Cloning

This API clones a voice quickly from a supplied audio file. Typical use cases include IP voice cloning, timbre cloning, and other scenarios that require rapid voice cloning.

A voice cloned through this API is temporary. To retain it permanently, use the voice in any T2A speech synthesis API within 168 hours (7 days); previews generated by this API do not count. Otherwise, the cloned voice is deleted.
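
To make the 168-hour retention window concrete, the following minimal sketch computes the latest time by which a cloned voice should be used in a T2A request. It assumes the window starts when the clone is created, which this page does not state explicitly; the function names are hypothetical and not part of the API.

```python
from datetime import datetime, timedelta, timezone

# 168 hours (7 days), as stated on this page.
RETENTION_WINDOW = timedelta(hours=168)


def retention_deadline(cloned_at: datetime) -> datetime:
    """Latest moment by which the voice should be used in a T2A request.

    Assumption: the window is counted from the time the clone was created.
    """
    return cloned_at + RETENTION_WINDOW


def is_at_risk(cloned_at: datetime, used_in_t2a: bool, now: datetime) -> bool:
    """True if the voice was never used in T2A and the window has passed."""
    return (not used_in_t2a) and now > retention_deadline(cloned_at)


cloned_at = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
print(retention_deadline(cloned_at).isoformat())  # 2025-01-08T12:00:00+00:00
```

A client that stores the creation time next to each `voice_id` could use a check like this to schedule a T2A call before the deadline.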

Notes:

* The uploaded audio file must be in mp3, m4a, or wav format.
* The duration of the uploaded audio must be at least 10 seconds and no more than 5 minutes.
* The uploaded audio file size must not exceed 20 MB.
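
The limits in the notes above can be checked on the client before a request is sent. The sketch below is one possible pre-flight check, not something the API provides: `check_source_audio` is a hypothetical helper, the caller is expected to supply the duration (measured with whatever audio tool is at hand), and "20 MB" is read as 20 × 1024 × 1024 bytes, which is an assumption.

```python
import os

ALLOWED_EXTENSIONS = {".mp3", ".m4a", ".wav"}
MIN_DURATION_S = 10           # at least 10 seconds
MAX_DURATION_S = 5 * 60       # no more than 5 minutes
MAX_SIZE_BYTES = 20 * 1024 * 1024  # assumption: "20 MB" read as 20 MiB


def check_source_audio(filename: str, size_bytes: int, duration_s: float) -> list:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    ext = os.path.splitext(filename)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported format: {ext or '(none)'}")
    if not MIN_DURATION_S <= duration_s <= MAX_DURATION_S:
        problems.append(f"duration {duration_s}s is outside 10 s to 5 min")
    if size_bytes > MAX_SIZE_BYTES:
        problems.append(f"file size {size_bytes} bytes exceeds 20 MB")
    return problems


print(check_source_audio("voice.mp3", 1_000_000, 30.0))  # []
```

The extension check is only a proxy for the real container format; the server remains the authority on what it accepts.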

## Request Headers

<ParamField header="Content-Type" type="string" required={true}>
  Enum: `application/json`
</ParamField>

<ParamField header="Authorization" type="string" required={true}>
  Bearer authentication format, for example: Bearer \{\{API Key}}.
</ParamField>

## Request Body

<ParamField body="audio_url" type="string" required={true}>
  The URL of the audio file to be cloned. Supported formats: mp3, m4a, wav.
</ParamField>

<ParamField body="clone_prompt">
  Voice cloning parameters. Providing this parameter can help improve the similarity and stability of the synthesized voice.

  If this parameter is used, you must also provide a short sample audio clip (less than 8 seconds long) and its transcript. Supported audio formats: mp3, m4a, wav.

  <Expandable title="properties">
    <ParamField body="prompt_audio_url" type="string" required={true}>
      Audio prompt parameter. The URL of the sample audio. Duration must be less than 8 seconds.
    </ParamField>

    <ParamField body="prompt_text" type="string" required={true}>
      Audio prompt parameter. The transcript corresponding to the sample audio. The text must match the audio content, and end with punctuation.
    </ParamField>
  </Expandable>
</ParamField>
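
As an illustration of the `clone_prompt` constraints, the sketch below builds the object and rejects input that violates the documented rules. `build_clone_prompt` is a hypothetical helper, the prompt duration is supplied by the caller, and treating any Unicode punctuation character as a valid ending is an assumption; the page does not say which characters the service accepts.

```python
import unicodedata

MAX_PROMPT_DURATION_S = 8  # sample audio must be shorter than this


def build_clone_prompt(prompt_audio_url: str, prompt_text: str,
                       prompt_duration_s: float) -> dict:
    """Build the clone_prompt object, enforcing the documented constraints."""
    if prompt_duration_s >= MAX_PROMPT_DURATION_S:
        raise ValueError("prompt audio must be shorter than 8 seconds")
    text = prompt_text.strip()
    # Assumption: any character in a Unicode punctuation category ("P*")
    # counts as ending punctuation.
    if not text or not unicodedata.category(text[-1]).startswith("P"):
        raise ValueError("prompt_text must end with punctuation")
    return {"prompt_audio_url": prompt_audio_url, "prompt_text": text}


clone_prompt = build_clone_prompt(
    "https://example.com/prompt.mp3",  # placeholder URL
    "This is a short sample sentence.",
    prompt_duration_s=4.2,
)
```

The returned dictionary can be placed under the `clone_prompt` key of the request body.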

<ParamField body="text" type="string">
  Voice cloning preview parameter. The model will use the cloned voice to synthesize this text and return the result as an audio URL for preview. Maximum 2000 characters. Note: Preview will be charged according to the number of characters, at the same rate as T2A APIs.
</ParamField>

<ParamField body="model" type="string">
  Voice cloning preview parameter. Specifies the speech model to use for preview. Required if the `text` field is provided.<br />
  Options: `speech-02-hd`, `speech-02-turbo`, `speech-2.5-hd-preview`, `speech-2.5-turbo-preview`, `speech-2.6-hd`, `speech-2.6-turbo`, `speech-2.8-hd`, `speech-2.8-turbo`
</ParamField>
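
Because `model` is required only when `text` is present, it can help to assemble the two preview fields together. The following is a minimal sketch with a hypothetical helper name; it counts characters with Python's `len`, and whether the service counts characters the same way is an assumption.

```python
PREVIEW_MODELS = {
    "speech-02-hd", "speech-02-turbo",
    "speech-2.5-hd-preview", "speech-2.5-turbo-preview",
    "speech-2.6-hd", "speech-2.6-turbo",
    "speech-2.8-hd", "speech-2.8-turbo",
}
MAX_PREVIEW_CHARS = 2000


def preview_fields(text=None, model=None) -> dict:
    """Return the preview part of the request body, or {} for no preview."""
    if text is None:
        return {}  # no preview requested, so model is not needed
    if model is None:
        raise ValueError("model is required when text is provided")
    if model not in PREVIEW_MODELS:
        raise ValueError(f"unknown preview model: {model}")
    if len(text) > MAX_PREVIEW_CHARS:
        raise ValueError("preview text exceeds 2000 characters")
    return {"text": text, "model": model}


print(preview_fields("Hello from my cloned voice.", "speech-02-hd"))
```

Omitting both fields skips the preview, which also avoids the per-character preview charge described above.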

<ParamField body="accuracy" type="float">
  Voice cloning parameter. Value range: \[0, 1]. If provided, sets the text validation accuracy threshold. Default is 0.7 if not specified.
</ParamField>

<ParamField body="need_noise_reduction" type="bool">
  Voice cloning parameter. Whether to enable noise reduction. Defaults to false if not specified.
</ParamField>

<ParamField body="need_volume_normalization" type="bool">
  Voice cloning parameter. Whether to enable volume normalization. Defaults to false if not specified.
</ParamField>
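
The three optional cloning parameters above all have server-side defaults, so one reasonable approach is to send only the ones that are set explicitly. The sketch below illustrates that idea; `tuning_fields` is a hypothetical helper, not part of the API.

```python
def tuning_fields(accuracy=None, need_noise_reduction=None,
                  need_volume_normalization=None) -> dict:
    """Return only the optional fields the caller set explicitly.

    Fields left as None are omitted so the documented defaults apply
    (accuracy 0.7, noise reduction and volume normalization off).
    """
    fields = {}
    if accuracy is not None:
        if not 0.0 <= accuracy <= 1.0:
            raise ValueError("accuracy must be within [0, 1]")
        fields["accuracy"] = float(accuracy)
    if need_noise_reduction is not None:
        fields["need_noise_reduction"] = bool(need_noise_reduction)
    if need_volume_normalization is not None:
        fields["need_volume_normalization"] = bool(need_volume_normalization)
    return fields


print(tuning_fields(accuracy=0.8, need_noise_reduction=True))
```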

## Response

<ResponseField name="demo_audio_url" type="string">
  If both the preview text (<code>text</code>) and preview model (<code>model</code>) are provided in the request body, this parameter returns the preview audio as a URL.
</ResponseField>

<ResponseField name="voice_id" type="string">
  The generated <code>voice\_id</code>.
</ResponseField>
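
Since `demo_audio_url` is returned only when a preview was requested, client code should treat it as optional. The sketch below parses a response body under that reading; `parse_clone_response` is a hypothetical helper, and the shape of error responses is not covered on this page, so a missing `voice_id` is simply treated as a failure.

```python
import json


def parse_clone_response(body: str):
    """Return (voice_id, demo_audio_url); the URL is None without a preview."""
    data = json.loads(body)
    voice_id = data.get("voice_id")
    if not voice_id:
        # Assumption: a successful response always carries voice_id.
        raise ValueError("response does not contain a voice_id")
    return voice_id, data.get("demo_audio_url")


voice_id, demo_url = parse_clone_response('{"voice_id": "xxxxxxx"}')
print(voice_id, demo_url)  # xxxxxxx None
```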

## Example

Below is an example of how to use the MiniMax Quick Voice Cloning API to clone a voice and request a preview.

`Request:`

```bash theme={"system"}
curl \
-X POST https://api.novita.ai/v3/minimax-voice-cloning \
-H "Authorization: Bearer $your_api_key" \
-H "Content-Type: application/json" \
-d '{
  "audio_url": "https://example.com/voice.mp3",
  "text": "Audio generation technology is evolving rapidly, enabling the creation of speech, music, and sound effects from text or data inputs. It supports applications in media, accessibility, customer service, and content creation. With improved quality and customization, these tools are increasingly integrated into digital platforms across various industries.",
  "model": "speech-02-hd",
  "need_noise_reduction": true,
  "need_volume_normalization": true
}'
```

`Response:`

```js theme={"system"}
{
  "demo_audio_url": "https://demo.com/audio.mp3", // Audio sample
  "voice_id": "xxxxxxx" // Generated voice_id
}
```
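
For readers who prefer Python to curl, the same request can be sketched with the standard library alone. This is an illustrative client, not an official SDK: `build_request` and `clone_voice` are hypothetical names, the timeout value is arbitrary, and error handling is left to the caller.

```python
import json
import urllib.request

API_URL = "https://api.novita.ai/v3/minimax-voice-cloning"


def build_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Prepare the POST request with the two documented headers."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def clone_voice(api_key: str, payload: dict, timeout: float = 60.0) -> dict:
    """Send the request and decode the JSON response body."""
    request = build_request(api_key, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Payload mirroring the curl example, with a shorter preview text.
payload = {
    "audio_url": "https://example.com/voice.mp3",
    "text": "Audio generation technology is evolving rapidly.",
    "model": "speech-02-hd",
    "need_noise_reduction": True,
    "need_volume_normalization": True,
}
```

Calling `clone_voice(api_key, payload)` with a valid API key should return a dictionary containing `voice_id` and, because `text` and `model` are set, `demo_audio_url`. `urllib.request.urlopen` raises `urllib.error.HTTPError` for non-2xx status codes.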
