Best Veo 3 Prompts for Cinematic Video With Native Audio
Google Veo 3 generates 8-second clips with synced dialogue, sound effects, and ambient sound built in. These copy-paste prompts show you exactly how to write the subject, camera move, lighting, and an explicit audio line so the model nails both picture and sound.
In short: This page contains 21 copy-paste ready prompts, organized into 5 categories with a description and pro tip for each. The first 15 prompts are free instantly, no signup needed. Hand-curated and tested by the AI Academy team.
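The structure used throughout this page (subject, camera move, lighting or look, an optional quoted line, then an explicit audio line) can be sketched as a small template helper. This is a minimal Python sketch under stated assumptions: the function name `build_veo_prompt` and its parameters are hypothetical, and the code only assembles the text you would paste into Veo 3. It does not call any Veo 3 API.

```python
def build_veo_prompt(subject, camera, look, audio_layers, dialogue=None):
    """Assemble a prompt string: subject, camera, look, optional dialogue, audio line.

    All names here are illustrative; nothing below talks to Veo 3.
    """
    parts = [subject, camera, look]
    if dialogue:
        parts.append(dialogue)
    # The explicit audio line goes last and starts with "Audio:".
    parts.append("Audio: " + ", ".join(audio_layers))
    sentences = []
    for part in parts:
        part = part.strip()
        # Close each part with a period unless it already ends a sentence or a quote.
        sentences.append(part if part.endswith((".", "!", "?", '"')) else part + ".")
    return " ".join(sentences)


prompt = build_veo_prompt(
    subject="A close static shot of raindrops trickling down a cold window pane",
    camera="Locked camera, shallow depth of field",
    look="Cool blue exterior contrasting warm interior glow",
    audio_layers=[
        "gentle steady rain pattering on glass",
        "distant muffled thunder",
        "no music or voice",
    ],
)
print(prompt)
```

Keeping the audio layers as a separate list makes it easy to reorder or trim them without touching the visual description.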
Cinematic Scenes with Dialogue
5 prompts
Rain-soaked street confession
1/21
A medium close-up of a young woman in a soaked trench coat standing under a flickering streetlamp at night, rain falling hard around her. Slow dolly-in toward her face as she looks up. Cinematic, shallow depth of field, cool blue and amber color grade, neon reflections on wet asphalt. She says, calmly and quietly: "I'm not running anymore." Audio: steady heavy rain, distant thunder rumble, the soft buzz of the streetlamp, her clear measured voice with natural reverb off the wet street.
A moody single-character moment that leans on Veo 3's lip-sync and ambient audio. The quoted line plus the explicit audio cue keep voice and rain balanced.
Pro tip: Describe the speaker's tone ("calmly and quietly") next to the quoted line and echo it in the audio line ("clear measured voice"), not just the words. Veo 3 uses tone descriptors to set vocal delivery, which helps prevent a flat read.
Two-shot diner conversation
2/21
A warm two-shot of two friends sitting across a booth in a cozy 1970s diner, late afternoon sun through the window. Static camera, eye-level, 35mm look, soft golden light, gentle film grain. The first friend leans in and asks: "So are you actually going to do it?" The second smiles and replies: "I already booked the ticket." Audio: low diner chatter, clinking cutlery, a coffee machine hiss in the background, both voices warm and conversational, no music.
A back-and-forth exchange that tests multi-speaker dialogue. Veo 3 handles short two-person conversations well when each line is clearly attributed.
Pro tip: Keep each spoken line under about eight words. Long monologues in an 8-second clip get rushed or clipped; two short lines pace naturally.
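The length guideline above can be checked before you paste a prompt. The following is a rough Python sketch, not part of any Veo 3 tooling: the helper name `long_dialogue_lines` is hypothetical, and the default limit of 8 words is one reading of "about eight" that you may want to tighten.

```python
import re


def long_dialogue_lines(prompt, max_words=8):
    """Return the quoted lines in a prompt that run longer than max_words words."""
    # Spoken lines in these prompts are wrapped in straight double quotes.
    quoted = re.findall(r'"([^"]+)"', prompt)
    return [line for line in quoted if len(line.split()) > max_words]


prompt = (
    'The first friend leans in and asks: "So are you actually going to do it?" '
    'The second smiles and replies: "I already booked the ticket."'
)
print(long_dialogue_lines(prompt))  # prints []
```

An empty list means every quoted line fits the limit; anything returned is a candidate for trimming or splitting into two shorter lines.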
Astronaut window monologue
3/21
Interior of a quiet spacecraft cabin, an astronaut floating gently beside a round window with Earth glowing blue outside. Slow push-in past their shoulder toward the window. Cinematic sci-fi, soft rim lighting from the cabin instruments, deep shadows, realistic zero-gravity hair and fabric motion. The astronaut whispers in awe: "From up here, it all looks so peaceful." Audio: low hum of life-support systems, faint electronic beeps, the astronaut's soft breathing inside the helmet, hushed reverent voice.
A contemplative sci-fi beat that showcases realistic physics (floating hair, fabric) alongside intimate whispered audio.
Pro tip: Pair a slow push-in with a whisper. Matching camera energy to vocal energy reads as deliberate direction rather than random motion.
Detective reveal close-up
4/21
Extreme close-up of a weathered detective's eyes in a dim interrogation room, a single overhead bulb casting a hard pool of light. Camera slowly racks focus from a photo on the table up to the detective's face. High-contrast noir lighting, deep blacks, subtle haze in the air. The detective mutters: "This must be the key." Audio: a low room tone, the faint buzz of the bulb, a distant clock ticking, gravelly low voice with close-mic intimacy.
A noir reveal that uses a focus rack and tight framing to build tension before the line lands.
Pro tip: Name the focus transition explicitly ("racks focus from X to Y"). Veo 3 follows directed focus pulls, which makes the reveal feel intentional.
Mountaintop reunion
5/21
A wide-to-medium shot of two hikers meeting at a windy mountain summit at golden hour, vast ranges behind them. Handheld camera drifts slightly, then settles. Epic natural light, lens flare from the low sun, crisp cold atmosphere. One hiker shouts over the wind: "You actually made it!" The other laughs: "Told you I would." Audio: strong gusting wind, fabric flapping, footsteps on loose gravel, both voices raised and breathless with genuine warmth.
An outdoor reunion that demands raised, wind-fighting voices. Veo 3 modulates vocal effort when you cue the environment.
Pro tip: When the environment is loud, write the dialogue as raised or shouted. The model adjusts vocal projection to match the ambient noise you describe.
Prompts get you started. Tutorials level you up.
A growing library of 300+ hands-on AI tutorials. New tutorials added every week.
Product & Ad Spots
4 prompts
Sneaker hero spin
6/21
A glossy product shot of a single white-and-orange running sneaker rotating slowly on a matte dark pedestal in a softbox studio. Smooth 360-degree orbit camera, macro detail on the laces and sole texture, clean rim light, subtle reflections on the floor. A confident voiceover says: "Built for the long run." Audio: a subtle whoosh as the camera orbits, a soft cinematic bass swell, crisp confident voiceover, minimal and premium.
A clean studio product spin built for ads. The orbit plus macro detail highlights texture while a short VO carries the tagline.
Pro tip: For ads, write the voiceover as a short tagline, not narration. One punchy line under six words fits an 8-second spot and feels like a real commercial.
Skincare morning ritual
7/21
A bright lifestyle shot of a woman applying a dab of cream from a frosted glass jar in a sunlit bathroom, soft morning light through sheer curtains. Gentle slow-motion, soft focus background, warm clean color grade, dewy skin texture in close-up. Calm voiceover: "Start every morning glowing." Audio: soft ambient morning birdsong outside, a gentle water trickle, light airy background pad, warm reassuring voiceover.
An aspirational beauty spot using slow motion and soft natural light. The ambient morning sound reinforces the calm tone.
Pro tip: Add a tactile detail ("a dab of cream," "dewy skin texture"). Concrete material cues push Veo 3 toward believable product interaction instead of generic hands.
Coffee pour macro
8/21
An extreme macro shot of rich espresso pouring into a clear glass over ice, crema swirling as the liquid hits, on a dark cafe counter. Locked-off camera, shallow depth of field, warm tungsten light, steam rising. A barista's hand sets the glass down at the end. Voiceover: "Bold, every single cup." Audio: the trickle and splash of liquid, ice gently cracking, faint cafe murmur, a smooth low voiceover at the end.
A sensory food-and-beverage macro that sells texture. The pouring and ice-crack SFX do most of the appetite appeal.
Pro tip: Lead the audio line with the hero sound (the pour) before ambient noise. Veo 3 weights audio roughly in the order you list it.
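The ordering advice above can be made mechanical by building the audio line from labeled layers. This is a minimal Python sketch; the helper name `audio_line` and its parameters are hypothetical, and the ordering effect itself is the page's observation about Veo 3, not something the code verifies.

```python
def audio_line(hero, ambient, voice=None):
    """Build an "Audio:" line with the hero sound first, then ambience, then voice."""
    layers = [hero] + list(ambient)
    if voice:
        # Voiceover goes last here, matching the coffee pour prompt.
        layers.append(voice)
    return "Audio: " + ", ".join(layers) + "."


line = audio_line(
    hero="the trickle and splash of liquid",
    ambient=["ice gently cracking", "faint cafe murmur"],
    voice="a smooth low voiceover at the end",
)
print(line)
```

Naming the hero sound as its own argument makes it harder to bury it behind ambient layers by accident.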
Tech gadget unboxing
9/21
A clean overhead shot of two hands opening a minimalist white box to reveal a sleek silver smartwatch on a light wooden desk. Smooth top-down camera with a slight push-in as the lid lifts. Bright even studio light, soft shadows, premium minimal aesthetic. Voiceover: "Everything you need. Nothing you don't." Audio: the soft slide of the box lid, a gentle satisfying click, a light uplifting synth note, crisp modern voiceover.
A top-down unboxing reveal in the style of premium tech ads, with satisfying mechanical SFX.
Pro tip: Top-down with a slight push-in reads as polished commercial framing. Pure top-down can feel static, so always add a small camera move.
Nature & B-Roll
4 prompts
Misty forest dawn aerial
10/21
A sweeping aerial drone shot rising over a dense pine forest blanketed in low morning mist, the sun breaking through the trees on the horizon. Slow ascending crane-style move revealing the valley. Cinematic nature documentary look, soft warm dawn light, volumetric god rays, crisp detail on treetops. Audio: layered birdsong, a gentle breeze through pine needles, distant flowing water, no music, fully natural ambience.
A serene establishing aerial perfect for intros or B-roll. The natural-only audio keeps it documentary-authentic.
Pro tip: Write "no music, fully natural ambience" when you want clean B-roll. Veo 3 will otherwise often add a score you'll have to fight in the edit.
Ocean wave slow motion
11/21
A close slow-motion shot of a single large turquoise wave curling and breaking against dark volcanic rocks, spray catching the late-afternoon sun. Locked tripod with the action moving through frame, high frame rate look, backlit water glowing, fine mist droplets in the air. Audio: the deep roar and crash of the wave, hissing foam receding over rock, distant gulls, raw natural ocean ambience.
A high-impact slow-motion nature shot. The wave crash gives a strong, well-synced single audio event.
Pro tip: For slow motion, say "high frame rate look" rather than a number. Veo 3 responds to the descriptive cue more reliably than to specific fps values.
Desert timelapse-style sweep
12/21
A wide tracking shot gliding low over rolling orange sand dunes at sunset, long shadows rippling across the ridges, a clear gradient sky from amber to deep blue. Smooth steady lateral dolly, anamorphic widescreen feel, warm directional light, fine grains of sand drifting in the wind. Audio: a soft continuous desert wind, faint sand hissing as it moves, deep spacious silence underneath, no dialogue.
An expansive desert B-roll sweep with minimalist, atmospheric audio that emphasizes scale.
Pro tip: Cue "deep spacious silence underneath" to get clean, sparse audio. Naming silence as a layer stops the model from over-filling the soundscape.
Autumn park stroll POV
13/21
A first-person POV walking shot down a tree-lined park path carpeted in red and gold autumn leaves, dappled afternoon sunlight flickering through the branches. Smooth steady walking camera with natural gentle bob, warm seasonal color grade, soft bokeh in the distance. Audio: leaves crunching underfoot with each step, a light rustling breeze, faraway laughter, calm relaxed ambience, no voiceover.
A cozy POV B-roll clip. The footstep crunch synced to the walking motion is a strong native-audio showcase.
Pro tip: When using a POV walking shot, explicitly mention footstep sound. Veo 3 will sync the crunch to the camera bob, which sells the first-person feel.
Characters & Storytelling
4 prompts
Street musician moment
14/21
A medium shot of an older street musician playing an acoustic guitar on a cobblestone square at dusk, a small crowd blurred behind them, warm string lights overhead. Slow arc around the musician, cinematic, soft warm light, shallow depth of field, gentle film grain. The musician looks up and smiles, saying: "Music keeps me young." Audio: a warm fingerpicked acoustic guitar melody, faint crowd murmur, the musician's warm gentle voice, distant city ambience.
A character vignette where the diegetic guitar and the spoken line coexist. Veo 3 can layer performed instrument audio under dialogue.
Pro tip: When a character plays an instrument, describe the music in the audio line as coming from that instrument (diegetic sound), as in "a warm fingerpicked acoustic guitar melody." This keeps it sounding like it comes from the scene, not a backing track.
Child wonder discovery
15/21
A low-angle medium shot of a young child crouching in a green garden, gently cupping their hands around a glowing firefly at twilight. Slow push-in to the child's amazed face. Soft magical lighting, warm bokeh of more fireflies behind, dreamy storybook color grade. The child whispers in wonder: "It's like a tiny star." Audio: soft evening crickets, a faint breeze in the grass, the child's hushed delighted voice, gentle warm ambience.
An emotive storytelling beat that pairs a low angle with a child's whispered line for an intimate, magical feel.
Pro tip: Use a low angle for child characters to put the viewer in their world. The shift in perspective adds emotional point of view automatically.
Chef plating finale
16/21
A close-up of a focused chef's hands placing a final microgreen garnish on an elegant plated dish in a bustling professional kitchen, stainless steel and warm pendant lights behind. Smooth slight tilt-up from the plate to the chef's satisfied face. Cinematic, rich warm tones, shallow depth of field, gentle steam rising. The chef nods and says: "That's the one." Audio: a busy kitchen ambience, pans sizzling, the soft clink of tweezers on the plate, the chef's confident quiet voice.
A craft-focused character moment. The detailed kitchen SFX plus a short confident line builds a satisfying payoff.
Pro tip: Tilt-up from a detail to a face is a reliable storytelling move in 8 seconds. It links the action and the reaction in one continuous beat.
Mechanic garage banter
17/21
A medium shot of a friendly mechanic in coveralls wiping their hands on a rag beside a vintage car in a sunlit garage, tools on a pegboard behind them. Static handheld camera with a slight natural sway, warm daylight from the open garage door, realistic dust motes in the light. The mechanic grins and says: "She'll run like new." Audio: a faint radio playing softly in the background, metal tools clinking, the mechanic's warm easygoing voice, garage room tone.
A grounded, relatable character clip with everyday ambience and a single warm line of dialogue.
Pro tip: A faint background radio is a great way to add realism without competing with dialogue. Keep it explicitly "soft" so it sits under the voice.
ASMR & Sound-Led Clips
4 prompts
Crackling fireplace ASMR
18/21
An extreme close-up of orange embers and flames crackling in a stone fireplace at night, a warm cozy living room blurred behind. Locked-off camera, very shallow depth of field, sparks drifting upward, rich warm amber glow. Audio: a detailed crisp fire crackle and pop, soft hissing of burning wood, a faint settling log, deep cozy room tone, no music and no dialogue, pure ASMR fireplace ambience.
A sound-first relaxation clip where the crackle is the whole point. Veo 3 renders detailed, layered fire audio.
Pro tip: For ASMR, write "no music and no dialogue" and then label the clip "pure ASMR." Stating the constraint in two ways strongly steers Veo 3 toward a clean, voice-free soundscape.
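The exclusion cues used across the B-roll and ASMR prompts ("no music," "no dialogue," a closing label) can be appended consistently with a small helper. This is a hedged Python sketch: `with_exclusions` and its parameters are hypothetical names, and the code only formats text for pasting into Veo 3.

```python
def with_exclusions(audio_layers, music=False, dialogue=False, label=None):
    """Return an "Audio:" line with explicit "no ..." cues and an optional label."""
    layers = list(audio_layers)
    # Collect every element the clip should NOT contain.
    unwanted = [name for name, wanted in (("music", music), ("dialogue", dialogue)) if not wanted]
    if unwanted:
        layers.append(" and ".join("no " + name for name in unwanted))
    if label:
        layers.append(label)
    return "Audio: " + ", ".join(layers) + "."


line = with_exclusions(
    ["a detailed crisp fire crackle and pop", "deep cozy room tone"],
    label="pure ASMR fireplace ambience",
)
print(line)
```

Passing `music=True` or `dialogue=True` drops the matching cue, so the same helper can serve prompts that do want a voiceover or a score.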
Knife on cutting board
19/21
A top-down close-up of a sharp knife slicing cleanly through a ripe red tomato on a wooden cutting board, juice glistening, soft kitchen daylight. Locked overhead camera, macro detail on the blade and tomato flesh, soft natural shadows. Audio: a crisp rhythmic chop with each slice, the gentle squelch of the tomato, the soft tap of the blade on wood, quiet kitchen ambience, satisfying and clean, no music.
A satisfying food-prep ASMR clip. The rhythmic, synced chopping sounds are the hero element.
Pro tip: Describe the rhythm of the sound ("rhythmic chop with each slice"). Cueing rhythm helps Veo 3 sync each audio hit to the visible blade motion.
Rain on window
20/21
A close static shot of raindrops trickling and merging down a cold window pane, a soft-focus warm room behind with a blurred lamp. Locked camera, shallow depth of field, cool blue exterior contrasting warm interior glow, droplets catching light. Audio: gentle steady rain pattering on glass, occasional heavier droplets, distant muffled thunder, a faint warm room hum, deeply calming, no music or voice.
A classic cozy-rain ASMR loop emphasizing the contrast of cool rain and a warm interior.
Pro tip: Contrast a cool exterior with a warm interior in one frame. The visual warmth plus rain audio is what makes these clips feel comforting and shareable.
Sand sifting through hands
21/21
An extreme close-up of fine golden sand slowly pouring and sifting through a pair of open hands on a sunlit beach, individual grains catching the light as they fall. Locked macro camera, very shallow depth of field, warm beach light, soft bokeh of waves behind. Audio: a soft granular hiss of falling sand, a faint distant wave wash, a light gentle breeze, soothing minimal ambience, no music, no dialogue.
A tactile macro ASMR clip where the delicate sand hiss carries the whole sensory experience.
Pro tip: Macro plus a granular, textural sound (sand, sugar, beads) is the ideal ASMR pairing for Veo 3. The closer the framing, the more detailed the audio it renders.