espeak-ng

next-generation formant-based speech synthesis

TLDR

Speak text

$ espeak-ng "[Hello world]"

Use specific voice

$ espeak-ng -v [en-gb] "[Hello]"

Read from file

$ espeak-ng -f [document.txt]

Output to WAV

$ espeak-ng -w [output.wav] "[Hello]"

Adjust speaking rate

$ espeak-ng -s [175] "[Hello]"

List voices

$ espeak-ng --voices

Show phoneme mnemonics for text

$ espeak-ng -x "[Hello]"

Speak phoneme input (enclosed in double square brackets)

$ espeak-ng "[[h@l'oU]]"
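
The voice listing printed by --voices can be filtered in a script. The following is a minimal sketch, assuming the listing is a whitespace-separated table with one header row and the language code in the second column. voice_codes is a hypothetical helper name, not part of espeak-ng.

```shell
# voice_codes: hypothetical helper, not part of espeak-ng.
# Reads a voice listing on stdin and prints the language codes.
# Assumes one header row and the language code in column 2.
voice_codes() {
    awk 'NR > 1 && NF >= 2 { print $2 }'
}
```

Typical use, once the function is defined in the current shell:

$ espeak-ng --voices | voice_codes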

DESCRIPTION

espeak-ng (eSpeak NG, short for "Next Generation") is an actively developed fork of eSpeak with additional features and improved voice quality. It is a formant-based speech synthesizer that supports more than 100 languages and accents.
The tool provides text-to-speech for accessibility software, voice assistants, and other applications. Compared with the original eSpeak, it improves pronunciation rules, language coverage, and phoneme handling.
espeak-ng is packaged by most major Linux distributions and is commonly the default synthesizer in speech frameworks such as Speech Dispatcher.

PARAMETERS

WORDS

Text to speak.

-v VOICE

Select voice/language.

-f FILE

Read the text to speak from FILE.

-w FILE

Write the speech to a WAV file instead of playing it.

-s SPEED

Speaking rate in approximate words per minute (default 175).

-p PITCH

Pitch adjustment, from 0 to 99 (default 50).

-x

Write phoneme mnemonics for the text to stdout.

--voices

List voices.

--help

Display help information.
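
The options above can be combined in a single call. The following is a minimal sketch, assuming a POSIX shell. clamp and speak_to_wav are hypothetical helper names, and the 0 to 99 pitch range is the one given in the espeak-ng help text.

```shell
# clamp VALUE MIN MAX: hypothetical helper that limits VALUE to [MIN, MAX].
clamp() {
    if [ "$1" -lt "$2" ]; then
        echo "$2"
    elif [ "$1" -gt "$3" ]; then
        echo "$3"
    else
        echo "$1"
    fi
}

# speak_to_wav VOICE SPEED PITCH FILE TEXT: hypothetical wrapper.
# Uses only the -v, -s, -p and -w options listed above.
speak_to_wav() {
    pitch=$(clamp "$3" 0 99)   # -p accepts 0 to 99
    espeak-ng -v "$1" -s "$2" -p "$pitch" -w "$4" "$5"
}
```

For example, the following call would write hello.wav with the out-of-range pitch clamped to 99:

$ speak_to_wav en-gb 150 120 hello.wav "Hello"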

CAVEATS

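Speech still sounds robotic, because formant synthesis generates the signal from a parametric model rather than from recorded samples. Some advanced SSML features are unsupported. Voice quality varies by language. Output format options are limited: file output is WAV only.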
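
One way around the limited output formats is to pipe the audio into a separate converter. The following is a sketch under stated assumptions: it relies on the --stdout option of espeak-ng, which writes the WAV data to standard output, and on ffmpeg being installed. output_mode and speak_to_file are hypothetical helper names.

```shell
# output_mode FILE: hypothetical helper. Prints "direct" for .wav targets,
# which espeak-ng can write itself, and "convert" for anything else.
output_mode() {
    case "$1" in
        *.wav|*.WAV) echo direct ;;
        *)           echo convert ;;
    esac
}

# speak_to_file FILE TEXT: hypothetical wrapper.
# Assumes ffmpeg is installed and can infer the target format from the
# file extension.
speak_to_file() {
    if [ "$(output_mode "$1")" = direct ]; then
        espeak-ng -w "$1" "$2"
    else
        # --stdout sends the WAV data to standard output instead of playing it.
        espeak-ng --stdout "$2" | ffmpeg -loglevel error -y -i - "$1"
    fi
}
```

Keeping the format decision in its own function means it can be checked without running either tool.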

HISTORY

espeak-ng was forked from Jonathan Duddington's eSpeak by Reece H. Dunn in late 2015 to continue development after the original project became inactive. It is now the actively maintained version and the one that most Linux distributions package.
