MiniMax (Voice Clone and TTS)

Overview

MiniMax provides two speech routes in StoryFlow:

  1. Voice Clone: clone timbre from a reference sample.
  2. TTS: synthesize speech from text using preset or custom voice IDs.

1) Voice Clone

Inputs

  • prompt (required): demo text used to preview the cloned voice
  • reference_audio (required)

Audio requirements (current config)

  • Format: MP3, M4A, or WAV
  • Duration: roughly 10 seconds to 5 minutes
  • File size: up to 20 MB
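
The limits above can be checked before upload. The sketch below is illustrative only: the function name check_reference_audio is hypothetical, the duration is assumed to be measured by the caller, and "20 MB" is assumed to mean 20 * 1024 * 1024 bytes. StoryFlow's own validation may differ.

```python
from pathlib import Path

# Limits taken from the "Audio requirements (current config)" list.
ALLOWED_EXTENSIONS = {".mp3", ".m4a", ".wav"}
MIN_SECONDS = 10.0
MAX_SECONDS = 5 * 60.0
MAX_BYTES = 20 * 1024 * 1024  # assumption: 20 MB is counted in binary megabytes


def check_reference_audio(filename: str, size_bytes: int, duration_seconds: float) -> list[str]:
    """Return a list of problems; an empty list means the sample looks acceptable."""
    problems = []

    extension = Path(filename).suffix.lower()
    if extension not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported format: {extension or '(no extension)'}")

    if not MIN_SECONDS <= duration_seconds <= MAX_SECONDS:
        problems.append(f"duration {duration_seconds:.1f}s is outside 10s to 5min")

    if size_bytes > MAX_BYTES:
        problems.append(f"file is {size_bytes} bytes, above the 20 MB limit")

    return problems
```

Because the documented duration range is approximate ("roughly"), a sample that fails only the duration check by a small margin may still be accepted in practice.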

Common parameters

  • voice_model: speech-2.5-hd-preview
  • accuracy
  • need_noise_reduction
  • need_volume_normalization
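
As a rough illustration of how these inputs and parameters fit together, the sketch below collects them into a plain dictionary. The function build_voice_clone_request is hypothetical, and the value types (a number for accuracy, booleans for the two need_* flags) are assumptions rather than documented behavior.

```python
from typing import Optional


def build_voice_clone_request(
    prompt: str,
    reference_audio: str,
    *,
    voice_model: str = "speech-2.5-hd-preview",
    accuracy: Optional[float] = None,         # assumption: numeric value
    need_noise_reduction: bool = False,       # assumption: boolean flag
    need_volume_normalization: bool = False,  # assumption: boolean flag
) -> dict:
    """Collect Voice Clone settings into one dictionary (hypothetical shape)."""
    # Both inputs are marked as required in the list above.
    if not prompt.strip():
        raise ValueError("prompt is required")
    if not reference_audio:
        raise ValueError("reference_audio is required")

    request = {
        "prompt": prompt,
        "reference_audio": reference_audio,
        "voice_model": voice_model,
        "need_noise_reduction": need_noise_reduction,
        "need_volume_normalization": need_volume_normalization,
    }
    # Only include accuracy when the caller set it explicitly.
    if accuracy is not None:
        request["accuracy"] = accuracy
    return request
```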

2) TTS

Inputs

  • prompt (required): the text to synthesize

Common parameters

  • voice_model: speech-2.6-turbo / speech-2.6-hd
  • voice_id: preset system voice
  • use_custom_voice + custom_voice_id: use a custom voice ID instead of a preset voice
  • emotion
  • speed / vol / pitch
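
The choice between a preset and a custom voice can be sketched as follows. This is a minimal illustration, not StoryFlow's implementation: build_tts_request is a hypothetical helper, and it assumes that use_custom_voice acts as a switch that decides whether voice_id or custom_voice_id is used.

```python
from typing import Optional

TTS_MODELS = ("speech-2.6-turbo", "speech-2.6-hd")


def build_tts_request(
    prompt: str,
    *,
    voice_model: str = "speech-2.6-hd",
    voice_id: Optional[str] = None,
    custom_voice_id: Optional[str] = None,
    emotion: Optional[str] = None,
    speed: Optional[float] = None,
    vol: Optional[float] = None,
    pitch: Optional[float] = None,
) -> dict:
    """Collect TTS settings into one dictionary (hypothetical shape)."""
    if not prompt.strip():
        raise ValueError("prompt is required")
    if voice_model not in TTS_MODELS:
        raise ValueError(f"voice_model must be one of {TTS_MODELS}")

    request = {"prompt": prompt, "voice_model": voice_model}

    # Assumption: a custom voice ID takes priority over a preset voice.
    if custom_voice_id:
        request["use_custom_voice"] = True
        request["custom_voice_id"] = custom_voice_id
    elif voice_id:
        request["use_custom_voice"] = False
        request["voice_id"] = voice_id

    # Optional tuning parameters are passed through only when set.
    optional = {"emotion": emotion, "speed": speed, "vol": vol, "pitch": pitch}
    request.update({name: value for name, value in optional.items() if value is not None})
    return request
```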

Tips

  • Use clean single-speaker samples for cloning.
  • Split long scripts for better pacing control.
  • Keep voice settings consistent across scenes.
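
The last two tips can be combined: split a long script at sentence boundaries, then reuse one settings dictionary for every chunk. The sketch below is one possible approach. The helper split_script, the 300-character default, and the voice_id value "example_voice" are all hypothetical.

```python
import re


def split_script(script: str, max_chars: int = 300) -> list[str]:
    """Group whole sentences into chunks of at most max_chars characters.

    A single sentence longer than max_chars is kept as its own chunk.
    """
    # Split after '.', '!' or '?' when followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", script.strip()) if s]

    chunks = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


# One shared settings dictionary keeps the voice consistent across scenes.
shared_settings = {"voice_model": "speech-2.6-hd", "voice_id": "example_voice"}
scene_requests = [
    {"prompt": chunk, **shared_settings}
    for chunk in split_script("Scene one begins. The door opens. Scene two follows.", max_chars=35)
]
```

Each entry in scene_requests carries the same voice settings and differs only in its prompt, so pacing can be adjusted per chunk without the voice drifting between scenes.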