Start with the twist, because it reframes the whole roundup. For a year, the assumption was that OpenAI's Sora would define AI video. It won't. Per OpenAI's own Help Center, the Sora app and website were retired on April 26, 2026, and the Sora API is set to shut down on September 24, 2026, with the Sora 2 model already marked Legacy. So the headline isn't who beat Sora. It's that Sora left, and Google walked into the gap.
Veo 3.1: the one to beat
Google's Veo 3.1 is the current flagship, and its edge is sound. It generates synchronized native audio (dialogue, effects, and ambience) that is always on and produced jointly with the picture, so you're not dubbing a silent clip after the fact. Clips run 4, 6, or 8 seconds, and Scene Extension chains them into videos a minute or longer. It outputs up to 4K at 24fps, takes reference images for character and scene consistency, and supports first-and-last-frame transitions. You can use it in the Gemini app, in Google's Flow filmmaking tool, and through the Gemini API.
The pricing is reasonable for what you get, and Google notes Veo 3.1 costs the same as Veo 3 did. The Fast tier drops to as little as $0.10 per second and the Lite tier to $0.05, so you can prototype cheap and finish in 4K. Veo's strength on prompt-following and visual quality is the same thread we pull in the multimodal capability ranking, and the always-on audio connects to the wider voice models comparison. It also rides on the same Gemini stack assessed in the Gemini evaluation, which is why it shows up everywhere Google does.
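To make the per-second pricing concrete, here is a minimal Python sketch of the arithmetic. It assumes the floor rates quoted above ($0.10 per second for Fast, $0.05 for Lite) apply to every second generated. The function name and tier labels are illustrative and not part of any Google SDK, and real billing may differ by resolution or plan.

```python
# Per-second rates in US cents, taken from the "as little as" figures above.
# Assumption: these floor rates apply uniformly; real rates may be higher.
RATE_CENTS_PER_SECOND = {"fast": 10, "lite": 5}

# Clip lengths the article lists for Veo 3.1.
ALLOWED_CLIP_SECONDS = (4, 6, 8)


def estimate_cost_cents(clip_seconds: int, clips: int, tier: str) -> int:
    """Return the estimated cost, in cents, of generating `clips` clips."""
    if clip_seconds not in ALLOWED_CLIP_SECONDS:
        raise ValueError(f"clip_seconds must be one of {ALLOWED_CLIP_SECONDS}")
    if clips < 0:
        raise ValueError("clips must be non-negative")
    if tier not in RATE_CENTS_PER_SECOND:
        raise ValueError(f"unknown tier: {tier!r}")
    return clip_seconds * clips * RATE_CENTS_PER_SECOND[tier]


if __name__ == "__main__":
    for tier in RATE_CENTS_PER_SECOND:
        cents = estimate_cost_cents(8, 10, tier)
        print(f"10 x 8s clips on {tier}: ${cents / 100:.2f}")
```

Under these assumptions, ten 8-second drafts come to $8.00 on Fast and $4.00 on Lite, which is the sense in which you can prototype cheap before committing to a final render.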
The rest of the field
Veo leads, but it isn't the only serious tool, and a couple of these beat it on specific axes.
| Tool | Current model | Native audio | Where to use it |
|---|---|---|---|
| Google Veo | Veo 3.1 | Yes, always on | Gemini app, Flow, API |
| Runway | Gen-4.5 | No (video-first) | Runway app and API |
| Kling | Kling 3.0 | Yes | Kling app and API |
| Luma | Ray3.14 | No (video-first) | Dream Machine, API |
| Pika | Pika 2.5 | No (video-first) | pika.art, iOS |
| OpenAI Sora | Being discontinued | — | Retiring through 2026 |
Runway's Gen-4.5, from late 2025, is the strongest challenger on pure visual fidelity, physics, and motion control, and it's the favorite of a lot of working video people for exactly that craft. Kling 3.0, out in early 2026, matches Veo on native audio and pushes clip length toward 15 seconds. Luma's Ray3.14 brings native 1080p across its Dream Machine workflows and is fast and cheap. Pika 2.5 is the friendly consumer app for quick, fun clips. The catch for Runway, Luma, and Pika is sound: they're video-first, so you'll usually add audio yourself.
What none of them can do yet
Be honest with your expectations, because the demos oversell. Every tool here still generates short base clips, from a few seconds up to around 25, that you extend by stitching. Long-form coherence is shaky. On-screen text comes out garbled more often than not. Hands and fine physics still betray the model. And keeping one character consistent across several shots is a real fight, even with reference images. These are clip generators, not film studios, and pretending otherwise is how you waste a budget.
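Because everything is assembled from short clips, it helps to count them before you start. The sketch below is a minimal Python illustration that assumes simple end-to-end stitching with no overlap or trimming, which is not necessarily how any given tool's extension feature works. `clips_needed` is a hypothetical helper name.

```python
import math


def clips_needed(target_seconds: float, clip_seconds: float) -> int:
    """Number of fixed-length clips needed to cover target_seconds.

    Assumes clips are stitched end to end with no overlap or trimming.
    """
    if target_seconds <= 0 or clip_seconds <= 0:
        raise ValueError("durations must be positive")
    return math.ceil(target_seconds / clip_seconds)


if __name__ == "__main__":
    # A one-minute sequence built from 8-second clips, then 15-second clips.
    print(clips_needed(60, 8))   # 8
    print(clips_needed(60, 15))  # 4
```

Each seam between clips is another place where the consistency problems above can show up, so the clip count is a rough proxy for how much assembly work a sequence will take.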
Go with Veo 3.1 as your default: it leads on overall quality, it's the only top tool with always-on synchronized audio plus 4K, and it's available wherever you already use Google's apps. Reach for Runway Gen-4.5 when visual craft and motion control are the whole job, or Kling 3.0 when you want native audio from a non-Google tool. Skip Sora entirely; it's being retired. And treat all of them as sources of short clips you'll assemble, not finished films.