bark

TLDR

Generate speech from text
$ python -m bark --text "[Hello, how are you?]" --output_filename [output.wav]
Use a specific speaker preset
$ python -m bark --text "[Hello]" --output_filename [output.wav] --history_prompt [v2/en_speaker_6]
Generate with emotions/effects
$ python -m bark --text "[laughs] Oh that's funny! [sighs]" --output_filename [output.wav]
Generate in another language
$ python -m bark --text "[Bonjour le monde]" --output_filename [output.wav] --history_prompt [v2/fr_speaker_1]
Generate with music notation
$ python -m bark --text "[♪ La la la ♪]" --output_filename [output.wav]

SYNOPSIS

python -m bark --text text --output_filename file [options]

DESCRIPTION

Bark is a transformer-based text-to-audio model by Suno AI. Unlike traditional TTS, Bark generates highly expressive speech including laughter, sighs, breathing, crying, and even music.
Special tokens in the text control non-speech sounds: `[laughs]`, `[sighs]`, `[gasps]`, `[clears throat]`, and `[music]`. Musical notation with `♪` symbols can generate singing. Capitalizing words adds emphasis, and `...` adds hesitation.
Speaker presets select voice characteristics. Presets are available for multiple languages: English, German, Spanish, French, Hindi, Italian, Japanese, Korean, Polish, Portuguese, Russian, Turkish, and Chinese.
Install with `pip install suno-bark`. Models are downloaded automatically on first use. GPU (CUDA) is strongly recommended for reasonable generation speed.
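The same generation can also be driven from Python instead of the command line. The following is a minimal sketch, assuming the `bark` package exposes `generate_audio`, `preload_models`, and `SAMPLE_RATE` as shown in the project's README, and that SciPy is installed for writing the WAV file. `speaker_preset` and `synthesize` are hypothetical helper names, and the two-letter language codes and the 0 to 9 index range are assumptions based on the `v2/<lang>_speaker_<n>` naming pattern.

```python
from typing import Optional

# Assumed two-letter codes for the languages listed above.
LANGUAGE_CODES = ("en", "de", "es", "fr", "hi", "it", "ja",
                  "ko", "pl", "pt", "ru", "tr", "zh")


def speaker_preset(lang: str, index: int) -> str:
    """Build a preset name such as 'v2/en_speaker_6' (hypothetical helper)."""
    if lang not in LANGUAGE_CODES:
        raise ValueError(f"unsupported language code: {lang!r}")
    if not 0 <= index <= 9:
        raise ValueError("speaker index must be between 0 and 9")
    return f"v2/{lang}_speaker_{index}"


def synthesize(text: str, path: str, preset: Optional[str] = None) -> None:
    """Generate audio for `text` and write it to `path` as a WAV file."""
    # Imported lazily so speaker_preset() works without Bark installed.
    from bark import SAMPLE_RATE, generate_audio, preload_models
    from scipy.io.wavfile import write as write_wav

    preload_models()  # downloads the models on first use
    audio = generate_audio(text, history_prompt=preset)
    write_wav(path, SAMPLE_RATE, audio)
```

A call such as `synthesize("Hello! [laughs]", "output.wav", speaker_preset("en", 6))` would then correspond to the second TLDR example.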

PARAMETERS

--text TEXT
Input text to synthesize.
--output_filename FILE
Output audio file path (.wav).
--history_prompt PRESET
Speaker voice preset (e.g., v2/en_speaker_0 through v2/en_speaker_9).
--text_temp FLOAT
Text generation temperature (default: 0.7).
--waveform_temp FLOAT
Waveform generation temperature (default: 0.7).
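When Bark is called from a script, the parameters above can be assembled into an argument list for `subprocess`. This is a sketch only: `bark_command` is a hypothetical helper, it uses just the flags documented in this section, and it assumes the interpreter is reachable as `python`.

```python
from typing import List, Optional


def bark_command(
    text: str,
    output_filename: str,
    history_prompt: Optional[str] = None,
    text_temp: Optional[float] = None,
    waveform_temp: Optional[float] = None,
    python: str = "python",
) -> List[str]:
    """Return the argv list for `python -m bark` (hypothetical helper)."""
    cmd = [python, "-m", "bark",
           "--text", text,
           "--output_filename", output_filename]
    # Optional flags are appended only when a value is supplied,
    # so Bark's own defaults apply otherwise.
    if history_prompt is not None:
        cmd += ["--history_prompt", history_prompt]
    if text_temp is not None:
        cmd += ["--text_temp", str(text_temp)]
    if waveform_temp is not None:
        cmd += ["--waveform_temp", str(waveform_temp)]
    return cmd
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`. Passing a list rather than a shell string avoids quoting problems with text that contains brackets or apostrophes.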

CAVEATS

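Generation is slow on CPU; a GPU is strongly recommended. The models are a large download (~5GB). Output quality varies between runs, and the audio may contain unexpected artifacts. Long text should be split into sentences and generated piece by piece, since a single generation yields only a short clip (roughly 13 seconds). Bark is not suitable for real-time synthesis.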
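Splitting long text before generation can be sketched as follows. This is an illustrative approach, not part of Bark: `split_sentences` and `chunk_text` are hypothetical helpers, the punctuation-based split is a naive heuristic (it also breaks after a `...` hesitation), and the 200-character limit is an arbitrary assumption to tune by ear.

```python
import re
from typing import List


def split_sentences(text: str) -> List[str]:
    """Split on whitespace that follows '.', '!' or '?' (naive heuristic)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def chunk_text(text: str, max_chars: int = 200) -> List[str]:
    """Group consecutive sentences into chunks of at most max_chars.

    A single sentence longer than max_chars is kept whole as its own chunk.
    """
    chunks: List[str] = []
    current = ""
    for sentence in split_sentences(text):
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            chunks.append(current)   # close the chunk that is already full
            current = sentence       # start a new chunk with this sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be synthesized separately, using the same `--history_prompt` for every chunk to keep the voice consistent, and the resulting audio files joined afterwards.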

HISTORY

Bark was released by Suno AI in April 2023 as an open-source text-to-audio model. Its ability to generate expressive speech with emotions and non-verbal sounds set it apart from conventional TTS systems. The model quickly gained popularity for creative audio generation.

SEE ALSO

piper(1), tts(1), espeak(1)
