flite
lightweight text-to-speech synthesis
SYNOPSIS
flite [options] [text]
DESCRIPTION
flite (Festival Lite) is a small, fast text-to-speech synthesis engine developed at Carnegie Mellon University. It converts text to audio using concatenative synthesis, producing speech from recorded fragments.
The engine is designed for embedded systems with limited resources, providing reasonable quality without large runtime requirements. Multiple voices are available with different characteristics.
flite works offline without internet connectivity, making it suitable for accessibility applications and audio generation.
PARAMETERS
TEXT
Text to speak. If it contains a space, it is treated as a literal text string rather than a filename.
-t TEXT
Explicitly set input text string.
-f FILE
Explicitly set input filename.
-o FILE
Output audio to file (WAV format). If omitted or set to "play", audio is played on the default audio device. Set to "none" to discard output.
-p PHONES
Synthesize input as phonemes.
-voice NAME
Voice to use (name, filename, or URL).
-voicedir DIR
Directory containing voice data.
-lv
List available voices.
-ssml
Read input text/file in SSML mode.
-b
Benchmark mode.
-l
Loop endlessly.
-s F=V
Set feature to value (guesses type).
-v
Verbose mode.
--version
Display version number.
--help
Display help information.
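The options above can be combined as in the following sketch. It is only an illustration: the scratch file paths are hypothetical, and the voice name slt is an assumption about which voices are available on the system (check with -lv first). The script does nothing harmful when flite is not installed.

```shell
#!/bin/sh
# Sketch of common flite invocations. Paths below are hypothetical scratch files.
text_file="${TMPDIR:-/tmp}/flite_demo.txt"
wav_file="${TMPDIR:-/tmp}/flite_demo.wav"

# Prepare a small input file for the -f example.
printf 'Hello from flite.\n' > "$text_file"

if command -v flite >/dev/null 2>&1; then
    # List the voices this build knows about.
    flite -lv

    # Speak a literal string, writing a WAV file instead of playing it.
    flite -t "Hello from flite." -o "$wav_file"

    # Read the text from a file and pick a voice by name.
    # "slt" is an assumption; substitute a name printed by -lv.
    flite -voice slt -f "$text_file" -o "$wav_file"

    status="synthesized"
else
    status="flite not found"
fi

echo "$status"
```

Using -t and -f explicitly avoids the guesswork applied to a bare TEXT argument, which is treated as a filename unless it contains a space.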
CAVEATS
Speech sounds less natural than output from neural TTS systems. Only a few voices are available, and output quality varies with the type of text.
HISTORY
flite was developed at Carnegie Mellon University as a lightweight version of the Festival speech synthesis system. It's used in accessibility applications, embedded systems, and offline TTS scenarios.
