piper
A fast, local neural text-to-speech system
TLDR
Output a WAV [f]ile using a text-to-speech [m]odel (assuming a configuration file at model_path + .json)
Output a WAV [f]ile using a [m]odel and specifying its JSON [c]onfig file
Select a particular speaker in a voice with multiple speakers by specifying the speaker's ID number
Stream the output to the mpv media player
Speak twice as fast, with huge gaps between sentences
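The entries above correspond to commands along these lines (a sketch: the -m/-f/-c short forms follow the bracketed letters in the descriptions, model.onnx and the sample text are placeholders, and --output-raw and --sentence-silence are upstream piper options not listed under PARAMETERS, so check your version):

```shell
# WAV file from a model (config assumed at model.onnx.json)
echo 'Hello world' | piper -m model.onnx -f welcome.wav

# Same, but naming the JSON config file explicitly
echo 'Hello world' | piper -m model.onnx -c config.json -f welcome.wav

# Pick speaker 3 in a multi-speaker voice
echo 'Hello world' | piper -m model.onnx -f welcome.wav --speaker 3

# Stream raw PCM straight to mpv (medium voices are mono 16-bit at 22050 Hz)
echo 'Hello world' | piper -m model.onnx --output-raw | \
  mpv --demuxer=rawaudio --demuxer-rawaudio-rate=22050 --demuxer-rawaudio-channels=1 -

# Twice as fast (length scale multiplies duration), long pauses between sentences
echo 'One. Two.' | piper -m model.onnx -f fast.wav --length-scale 0.5 --sentence-silence 2
```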
SYNOPSIS
piper [OPTIONS] --model <MODEL_PATH> [--] [<TEXT>]...
PARAMETERS
--model <MODEL>
Path to .onnx model file (required)
--output_file <OUTPUT>
Output WAV/MP3 file path; default: output.wav
--speaker <ID>
Numeric speaker ID; default: 0
--speaker-name <NAME>
Speaker name for multi-speaker models
--length-scale <SCALE>
Speech rate control; default: 1.0 (a duration multiplier, so higher = slower)
--noise-scale <SCALE>
Phoneme-level variance; default: 0.667
--noise-w <SCALE>
Word-level prosody noise; default: 0.8
--split-sentences
Split input on punctuation; default: on
--no-split-sentences
Disable sentence splitting
--cuda
Enable CUDA acceleration (if available)
--precision <fp32|fp16>
Model precision; default: fp32
--output-format <wav|mp3>
Output format; default: wav
--threads <NUM>
Worker threads; default: 1
-h, --help
Show help
-V, --version
Print version
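The options combine freely; for instance (a sketch using only the flags listed above, where en_US-lessac-medium.onnx and the spoken text are placeholders):

```shell
# MP3 output at fp16 precision on 4 worker threads, with flatter prosody
piper --model en_US-lessac-medium.onnx \
      --noise-scale 0.4 --noise-w 0.5 \
      --precision fp16 --threads 4 \
      --output-format mp3 --output_file speech.mp3 \
      -- "Text to speak"
```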
DESCRIPTION
Piper is a high-quality, lightweight neural text-to-speech (TTS) system designed for low-resource devices like Raspberry Pi, but performant on desktops too.
It uses VITS-based models trained on datasets like LJSpeech, supporting dozens of languages and voices. Piper synthesizes speech entirely on CPU (with optional CUDA), producing natural-sounding audio from text input.
Key features include multi-speaker support, customizable prosody via length/noise scales, sentence splitting, and output in WAV or MP3. Models are compact (~50-150MB), enabling offline use. Ideal for embedded apps, assistants, or accessibility tools.
Unlike cloud TTS services, Piper runs fully offline, which ensures privacy and low latency (<200ms on capable hardware). Voices are expressive, with good intonation. Install via pip (pip install piper-tts) or distro packages; download voices from rhasspy.github.io/piper.
CAVEATS
Requires .onnx model files (download separately); CPU-only by default (slow on low-end hardware); MP3 needs libsndfile; no built-in phonemizer for some languages.
EXAMPLE USAGE
piper --model en_US-lessac-medium.onnx --output_file speech.wav "Hello, world!"
Play the result with: aplay speech.wav
MODEL SOURCES
Download from https://rhasspy.github.io/piper/; voices are available in dozens of languages (en, de, fr, es, etc.).
HISTORY
Developed by the Rhasspy team (2021-2022) as part of its open-source voice assistant work for Home Assistant. Piper succeeded the earlier Larynx project; v1.0 arrived in 2022, built on ONNX Runtime for speed and portability. Development continues with active community contributions, now at v1.2+ with many more voices.


