Quick Summary:
- Prompting matters – Gemini gave way better results than ChatGPT when scripting for VEO 3.
- Output quality – Some scenes were impressive, but the overall video fell flat without strong storytelling or visuals.
- Time spent – It took me over 6 hours to create a 1-minute video, primarily due to character design, background tweaking, audio syncing, and render wait times.
- Audio features – The built-in voice (still in beta) wasn’t usable, random captions appeared, and syncing was painful. I used ElevenLabs instead.
- Creative control is limited – Tools like “Flow” sound great, but the results were icky unless I created everything inside Gemini.
- Is it worth it? – Not for casual users or non-creators yet. But if you love experimenting and have patience, there’s future potential.
Recently, I tested Google’s VEO 3, a $200/month tool built into the Gemini app that generates videos from text prompts. I wanted to see how it compared to other text-to-video tools like Sora or Runway. Spoiler: the tech is cool, the process is messy, and my first video flopped, but the learning curve was worth it.
If you’re considering diving into AI video tools, here’s my honest experience to help you decide if it’s right for you (or at least avoid a few common mistakes).
What Is Google VEO?
VEO 3 is Google’s most advanced generative video model to date. It’s built into the Gemini app and lets users create short films and visuals with realistic motion, cinematic lighting, and evolving camera angles, all from a simple text prompt. As of now, it’s available via waitlist, but I got access through my Gemini Advanced subscription.
The tool has seen massive growth since integrating image-to-video and prompting features directly in Gemini, making it a worthy competitor to OpenAI’s Sora (which is still not widely available).
[Source: Technology.org article on VEO 3’s launch and capabilities]
Gemini Outperforms ChatGPT for Prompting VEO
When I tried scripting my prompt inside ChatGPT, the output from VEO wasn’t as strong. I assumed my prompt-writing skills would carry over, but Gemini clearly understands its own sibling better.
The scenes made more sense, and it just “clicked” better. My advice: if you’re using VEO, start your script in Gemini.
Voiceovers: Proceed with Caution
VEO 3 now lets you add AI voice to your videos. It’s… not great yet. The tone wasn’t right, and syncing the voice to the scenes? Even worse. I ended up using ElevenLabs for the voiceover. Much better tone control, but the syncing still ate up a huge chunk of my time.
Tip: If your video has narration, prep your full voiceover before rendering your visuals. Syncing after the fact is brutal.
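If you want a rough idea of how long each scene needs to be before you render anything, here is a minimal planning sketch. It is not part of VEO or Gemini, just a small helper I would write for myself. The function name `plan_scenes` is made up for illustration, and both the speaking pace (150 words per minute) and the clip length (8 seconds) are assumptions you should swap for your own numbers.

```python
import math

WORDS_PER_MINUTE = 150  # assumption: a typical narration pace; adjust to your voice
CLIP_SECONDS = 8        # assumption: length of one generated clip; check your tool


def plan_scenes(scenes, wpm=WORDS_PER_MINUTE, clip_seconds=CLIP_SECONDS):
    """Estimate narration time and clip count for each scene's narration text."""
    plan = []
    for number, text in enumerate(scenes, start=1):
        words = len(text.split())
        seconds = words * 60 / wpm  # words divided by words-per-second
        # Every scene needs at least one clip, even if the narration is very short.
        clips = max(1, math.ceil(seconds / clip_seconds))
        plan.append(
            {
                "scene": number,
                "words": words,
                "seconds": round(seconds, 1),
                "clips": clips,
            }
        )
    return plan


if __name__ == "__main__":
    # Hypothetical two-scene narration script.
    script = [
        "Meet Moxie, our summer intern.",
        "She pairs client videos with music and dreams of playing in Ibiza.",
    ]
    for row in plan_scenes(script):
        print(row)
```

The estimate is only a starting point. Record the real voiceover first, then replace the estimated seconds with the actual length of each audio segment.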
Real Talk: My Video Took 5–6 Hours
From prompt scripting to testing VEO’s output, adjusting scenes, trying voiceovers, and syncing it all… this video took me nearly 6 hours to complete.
That’s not a time win. But here’s the thing: now that I know what works (and what doesn’t), the next one should take under 90 minutes. The key takeaway? VEO is not plug-and-play yet. It requires:
- Understanding how to write effective visual prompts
- Knowing how to structure a story visually
- Having patience for edits and render delays
And if you’re not a seasoned video storyteller (like me), you’ll feel it.
Why This Took Me 6 Hours (Yes, 6)
Let’s talk about why this single video took me over six hours to complete, because it wasn’t just about scripting and hitting “generate.” The real time suck? Getting the character, setting, and voice right. I went through several versions of my AI character “Moxie” (you’ll see the flops embedded below), and the challenge was real. Creating something that felt right visually and matched the tone I was going for wasn’t easy.
I tested Gemini prompts vs. ChatGPT prompts extensively, and it was night and day. Gemini “understood” VEO 3 better and got me closer to the result I was after. I even tried using “Flow,” the feature that lets you upload a starting image, but the image I LOVED was one I had created with ChatGPT, and again, the two didn’t play nice together. I also tried Midjourney for the starting image – TOTAL miss. The vibe was off, the look was off… just, no.
For context, the video idea was to create “Moxie,” a meerkat hired as a summer intern at eJenn Solutions. Moxie is an aspiring DJ with visions of DJing in Ibiza. Her job at eJenn Solutions is to pair our client videos with music. The following image was a flop – the VIDEO images we used were on point, though!

Audio is still in beta in VEO 3, and while the idea is great, the execution isn’t quite there yet. Captions randomly appeared on some outputs, and strangely, they didn’t even match my script. Still scratching my head on that one.
The bulk of my time went into:
- Creating a character that matched my vision
- Getting the background and vibe right
- Troubleshooting the AI voice and syncing
- Waiting for renders when VEO 3 servers were busy
So yes, there’s power here. However, the workflow isn’t yet smooth. It’s like having a high-tech espresso machine with no instructions and no coffee to go with it. Looks great, but you’ll spend hours trying to get something usable.
Final Verdict
VEO 3 is powerful, but it’s not magic. It still requires human creativity, some technical know-how, and a lot of trial and error, especially if you’re trying to produce something with a voiceover and coherent pacing.
Would I use it again and pay $200 each month? Yes, because I see the potential. However, for small business owners seeking a quick “text-to-video” solution to repurpose blog posts or social captions, we’re not there yet. Not unless you have time to tweak.
Here’s my finished video. Yes, I know it’s rough, but it’s messy action forward:
➡️ https://youtu.be/_ZJGzeO1Lqg?si=_sTUi1BNUGMAedEC
Tips for First-Time VEO Users
If you’re giving it a go, here’s what I’d recommend:
- Script directly in Gemini for best results
- Use a tool like ElevenLabs or Descript for voiceovers
- Keep your concept simple; avoid complex narratives until you get the feel
- Study short-video storytelling; YouTube Shorts and Reels are great models
- Block at least 3–5 hours the first time you try it
Why This Matters for Small Business Owners
AI tools like VEO sound magical, and they can be, but they don’t replace your creativity or storytelling just yet. If you’re looking to experiment, do it when you have time to tinker.
Let me know if you’d like a breakdown of VEO compared to other tools, such as Descript, Runway, or Opus, next.
