Fish Audio

Audio & Voice | Free trial | New

What is Fish Audio?

Studio-grade AI text-to-speech and voice cloning with 2M+ voices in 8+ languages. Clone any voice from 10 seconds of audio, control emotions with 60+ tags, and integrate via API with sub-300ms latency.

Key highlights

  • 2M+ community voices with 60+ emotion tags for granular tone control
  • Clone any voice from just 10 seconds of audio, with 8+ languages supported
  • Developer API with sub-300ms streaming latency and pay-as-you-go pricing