LinuxCommandLibrary

flite

Convert text to speech

TLDR

List all available voices

$ flite -lv

Convert a text string to speech
$ flite -t "[string]"

Convert the contents of a file to speech
$ flite -f [path/to/file.txt]

Use the specified voice
$ flite -voice [file://path/to/filename.flitevox|url]

Store output into a wav file
$ flite -voice [file://path/to/filename.flitevox|url] -f [path/to/file.txt] -o [output.wav]

Display version
$ flite --version
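The file and output examples above combine naturally in a loop. The following is a minimal sketch rather than anything shipped with flite: the helper names wav_name_for and speak_dir are hypothetical, and it assumes flite is on the PATH and accepts -f and -o as shown above.

```shell
#!/bin/sh
# Hypothetical helper: derive the output WAV name from an input text file,
# e.g. notes/intro.txt -> notes/intro.wav
wav_name_for() {
    printf '%s\n' "${1%.txt}.wav"
}

# Hypothetical helper: convert every .txt file in a directory to a WAV file
# stored next to it.
speak_dir() {
    dir=$1
    for txt in "$dir"/*.txt; do
        [ -e "$txt" ] || continue    # the directory contains no .txt files
        flite -f "$txt" -o "$(wav_name_for "$txt")"
    done
}
```

After sourcing the file, speak_dir [path/to/directory] writes one WAV file per text file. Selecting a voice would mean adding -voice before -f, as in the examples above.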

SYNOPSIS

flite [options] [-t <text> | -f <file>] [-o <output_file.wav>]
flite -lv

PARAMETERS

-t <text>
    Synthesizes speech from the provided text string.

-f <file>
    Synthesizes speech from text read from the specified input file.

-o <output_file>
    Writes the synthesized speech to the specified WAV file. If omitted, or if the name is play, the audio is played on the system's default audio device. The name none discards the waveform, which is useful for benchmarking.

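-voice <name>
    Selects the voice to be used for synthesis. The name can be a built-in voice or a path or URL to a .flitevox file. Use -lv to list available voices.

-v
    Enables verbose mode.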

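-s <feature>=<value>
    Sets a synthesis feature, guessing the value type (--seti, --setf, and --sets force an int, float, or string). This is not a direct speed flag: speech rate is controlled through the duration_stretch feature, where 1.0 is normal, higher values slow speech down, and lower values speed it up (for example, --setf duration_stretch=1.5).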

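-p <phones>
    Synthesizes the argument as a sequence of phones rather than as text. It does not adjust pitch; pitch is controlled through voice features set with -s or --setf (for example, int_f0_target_mean, in Hz, on voices that support it).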

-lv
    Lists all available voices.

-l
    Loops endlessly, repeating the synthesis.

-h
    Displays help information and exits.
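
The options above can be combined in a small wrapper. This is a sketch under stated assumptions, not part of flite: speak_with is a hypothetical function name, and it assumes a flite build that accepts -voice and the duration_stretch feature via --setf. Options are placed before -t because flite expects them ahead of the input text.

```shell
#!/bin/sh
# Hypothetical wrapper: speak TEXT with an optional voice and duration stretch.
# usage: speak_with TEXT [VOICE] [STRETCH]
speak_with() {
    text=$1
    voice=${2:-}
    stretch=${3:-}

    # Start with the text, then prepend each optional flag so that
    # all options come before -t.
    set -- -t "$text"
    if [ -n "$voice" ]; then
        set -- -voice "$voice" "$@"
    fi
    if [ -n "$stretch" ]; then
        # duration_stretch above 1.0 slows speech; below 1.0 speeds it up.
        set -- --setf "duration_stretch=$stretch" "$@"
    fi

    flite "$@"
}
```

For example, speak_with "Build finished" slt 1.3 would speak more slowly with the voice named slt, assuming that name appears in the output of flite -lv on the system in question.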

DESCRIPTION

flite (Festival-Lite) is a small, fast, run-time text-to-speech (TTS) synthesis engine developed at Carnegie Mellon University (CMU). It is designed for embedded systems and other applications where a full-featured TTS system such as Festival would be too resource-intensive or slow. It converts text, given on the command line or read from a file, into spoken audio. flite prioritizes efficiency and a compact footprint, which suits devices with limited memory and processing power. Its voice quality may not match that of larger, more complex synthesizers, but it generates speech quickly and is easy to drive from scripts and programs.

CAVEATS

flite is designed for speed and a small footprint, which can sometimes come at the cost of the most natural-sounding voice quality compared to larger, more complex TTS systems. It primarily supports English voices, and its prosodic control is more limited than research-grade synthesizers. Performance can vary depending on the specific voice chosen and the underlying system's audio configuration.

TYPICAL USAGE SCENARIOS

Due to its small size and efficiency, flite is commonly used in embedded Linux devices, mobile applications (though less common now with cloud TTS), interactive voice response (IVR) systems, and lightweight command-line utilities requiring quick speech output. It's often bundled with minimalist Linux distributions or custom device firmwares.
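
As an illustration of the lightweight command-line use case, a script can speak a message when flite is present and degrade gracefully when it is not. This is a minimal sketch: announce is a hypothetical function name, and the fallback to standard error is an assumption about what a caller would want.

```shell
#!/bin/sh
# Hypothetical notifier: speak a message if flite is available,
# otherwise print it on standard error.
announce() {
    if command -v flite >/dev/null 2>&1; then
        flite -t "$1"
    else
        printf '%s\n' "$1" >&2
    fi
}
```

A long-running job could then end with announce "Backup complete" without failing on machines where flite is not installed.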

HISTORY

flite (Festival-Lite) was developed by Alan W. Black and others at Carnegie Mellon University (CMU) as a lightweight, faster version of the Festival Speech Synthesis System. The project aimed to create a robust and efficient text-to-speech engine suitable for embedded systems and applications where resource constraints or real-time performance were critical. Its development focused on optimizing the synthesis process to minimize memory usage and CPU cycles, making it a prominent choice for integrating speech capabilities into small devices and applications.

SEE ALSO

festival(1), espeak(1), mbrola(1)
