Lyria 3, Google’s family of music generation models, is designed to give you granular control over vocals, instrumentation, and arrangement. We spent weeks testing it against every musical genre and use case we could imagine.
We put together this guide to share exactly what we learned and how you can get the best results.
What you’ll learn in this guide:
- Model overview
- Breakdown of tech specs
- Best practices for effective prompting
- The core prompting framework
- Mastering vocals and lyrics
- Advanced creative workflows
- How Lyria 3 models work with other generative media models
Model overview
Lyria 3 and Lyria 3 Pro are music generation models designed to support your creative workflows. The models excel in three key areas:
- Structural control: Prompt for specific elements like intros, verses, choruses, and bridges to build a complete arrangement.
- High-quality audio: Both models deliver high-fidelity stereo audio.
- Precision control: Dictate structural changes using timed lyrics, descriptive tempo conditioning, and multimodal inputs.
Breakdown of tech specs for Lyria 3 and Lyria 3 Pro
Here is a breakdown of what the models can handle via the API on Vertex AI:
- Track length: Lyria 3 generates 30-second songs, ideal for rapid prototyping and short-form assets. Lyria 3 Pro supports compositions up to three minutes long.
- Vocal support: Both models feature improved realism and expressiveness for vocals, supporting multi-vocal conditioning and generation in eight languages (English, German, Spanish, French, Hindi, Japanese, Korean, and Portuguese).
- Controls and conditioning: Lyria 3 Pro includes advanced controls for timed lyrics and tempo control through natural language descriptions.
- Multimodal inputs: You can generate music using text, PDF files, or up to 10 reference images.
- Trust and safety: All outputs include SynthID watermarking and support the C2PA open standard for cryptographically signed metadata.
For more details, see the Lyria 3 model card.
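As a rough illustration of the limits listed above, here is a minimal Python sketch that checks a planned request before you send it. Everything in it is an assumption made for illustration: the model keys ("lyria-3", "lyria-3-pro"), the function name, and the parameters are hypothetical and are not part of any Vertex AI SDK. It also treats 30 seconds as an upper bound for Lyria 3, which the spec list does not state explicitly.

```python
# Limits copied from the spec list above. The model keys and all names here
# are hypothetical; they are not identifiers from the Vertex AI API.
SUPPORTED_VOCAL_LANGUAGES = {
    "English", "German", "Spanish", "French",
    "Hindi", "Japanese", "Korean", "Portuguese",
}
MAX_REFERENCE_IMAGES = 10
MAX_TRACK_SECONDS = {"lyria-3": 30, "lyria-3-pro": 180}


def check_request(model, seconds, vocal_languages=(), reference_images=0):
    """Return a list of problems; an empty list means the plan fits the stated limits."""
    if model not in MAX_TRACK_SECONDS:
        return [f"unknown model: {model}"]

    problems = []
    limit = MAX_TRACK_SECONDS[model]
    if seconds > limit:
        problems.append(f"{model} supports up to {limit} seconds, got {seconds}")

    # Report any requested vocal language that is not in the supported set.
    unsupported = sorted(set(vocal_languages) - SUPPORTED_VOCAL_LANGUAGES)
    if unsupported:
        problems.append("unsupported vocal languages: " + ", ".join(unsupported))

    if reference_images > MAX_REFERENCE_IMAGES:
        problems.append(
            f"at most {MAX_REFERENCE_IMAGES} reference images, got {reference_images}"
        )
    return problems
```

A check like this only mirrors the numbers in this guide, so treat the model card as the source of truth if the limits change.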
Best practices for effective prompting
Follow these guidelines to help your generated audio match your intent:
- Be descriptive and specific: Use adjectives to create a clear description. The more detail you provide, the better Lyria understands your prompt.
- Reference genres and eras: Clearly state the musical category (for example, Rock or Pop) and stylistic timeframe (for example, the 1950s or the early 90s).
- Specify key instruments: Mention the important instruments driving the track; otherwise, Lyria chooses defaults based on the genre.
- Iterate: If the first result isn’t perfect, refine your prompt by adjusting keywords.
The core prompting framework
A simple list of keywords can generate great songs, but for finer control over the output, use this framework:
[Genre and style] + [Mood] + [Instrumentation] + [Tempo and rhythm] + [Vocal style & language] + [Lyrics]
- Genre and style: Define the primary category, for example, “cinematic orchestral fantasy”.
- Mood: Describe the emotional intent, for example, “tense and suspenseful”.
- Instrumentation: Name the specific instruments, for example, “guitar” or “piano”.
- Tempo and rhythm: Set the speed, pace, and groove using descriptive terms, such as “a fast, energetic pace with a driving beat”.
- Instrumental vs. vocal: Specify “instrumental” to exclude vocals.
- Vocal style & language: Specify gender, tone (e.g., raspy, smooth), delivery (e.g., rapping), and language.
- Lyrics: Either provide a theme for Lyria to generate the words (e.g., “song about a cross-cultural connection”), or provide your exact lyrics in quotes for the model to perform.
Example prompt: “A romantic fusion of classic Bossa Nova and modern R&B. The mood is intimate, warm, and deeply affectionate. Features a gentle acoustic nylon-string guitar, warm electric piano chords, and a crisp, laid-back modern hip-hop drum beat. A slow, swaying tempo. Featuring a vocal duet: a smooth male vocalist singing in English, and a soft, breathy female vocalist singing in French. The lyrics are a beautiful love song about an undeniable, cross-cultural connection.”
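If you generate many prompts, it can help to assemble them from the framework's components so that none get left out. The sketch below is one possible way to do that in Python. The `MusicPrompt` class and its field names are hypothetical helpers invented for this example, not part of any Lyria or Vertex AI SDK. It only builds the prompt text and does not call a model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MusicPrompt:
    """Hypothetical container for the framework components described above."""
    genre_and_style: str
    mood: str
    instrumentation: str
    tempo_and_rhythm: str
    vocal_style_and_language: Optional[str] = None  # None means an instrumental track
    lyrics: Optional[str] = None  # a theme, or exact lyrics in quotes

    def render(self) -> str:
        """Join the components into one prompt string, one sentence per component."""
        parts = [
            self.genre_and_style,
            f"The mood is {self.mood}",
            f"Features {self.instrumentation}",
            self.tempo_and_rhythm,
        ]
        if self.vocal_style_and_language is None:
            if self.lyrics is not None:
                raise ValueError("lyrics need a vocal style; instrumental tracks have none")
            parts.append("Instrumental")
        else:
            parts.append(self.vocal_style_and_language)
            if self.lyrics:
                parts.append(self.lyrics)
        # Normalize each component so it ends with exactly one period.
        return " ".join(p.strip().rstrip(".") + "." for p in parts)


prompt = MusicPrompt(
    genre_and_style="Cinematic orchestral fantasy",
    mood="tense and suspenseful",
    instrumentation="strings and piano",
    tempo_and_rhythm="A fast, energetic pace with a driving beat",
)
print(prompt.render())
```

Keeping each component as its own field also makes iteration easier: you can change only the mood or the tempo between generations and compare the results.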






