Elon Musk has been hellbent on challenging the top dogs in the AI industry with his models. Grok 4.1 is already a very powerful model. Now xAI's Grok Imagine is #1 in both text-to-video and image-to-video on the Artificial Analysis Video Arena. Like Sora and Veo, it supports native audio generation. It costs $4.20 per minute with audio, which is cheaper than Veo 3.1 ($12/min) and Sora 2 Pro ($30/min).
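To put those per-minute prices in perspective, here is a rough Python sketch that estimates what a short clip would cost on each model. The prices are the ones quoted above. The function names are made up for illustration, and the linear per-second billing is an assumption, since actual billing rules may differ (minimum durations, resolution tiers and so on).

```python
# Per-minute prices in USD (with audio), as quoted in this post.
PRICE_PER_MINUTE = {
    "Grok Imagine": 4.20,
    "Veo 3.1": 12.00,
    "Sora 2 Pro": 30.00,
}


def clip_cost(model: str, seconds: float) -> float:
    """Estimated clip cost, ASSUMING price scales linearly with duration."""
    return PRICE_PER_MINUTE[model] * seconds / 60


def price_ratio(model: str, baseline: str = "Grok Imagine") -> float:
    """How many times more expensive `model` is than `baseline` per minute."""
    return PRICE_PER_MINUTE[model] / PRICE_PER_MINUTE[baseline]


if __name__ == "__main__":
    for model in PRICE_PER_MINUTE:
        print(
            f"{model}: ${clip_cost(model, 10):.2f} per 10-second clip, "
            f"{price_ratio(model):.1f}x the Grok Imagine price"
        )
```

Under that linear assumption, a 10-second clip comes out to well under a dollar on Grok Imagine, while the same clip costs several times more on Veo 3.1 and Sora 2 Pro.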
xAI’s Grok Imagine takes the #1 spot in both Text to Video and Image to Video in the Artificial Analysis Video Arena, surpassing Runway Gen-4.5, Kling 2.5 Turbo, and Veo 3.1!
Grok Imagine is the latest video model from @xAI, and joins an increasing roster of models such as… pic.twitter.com/ciIzSljBll
— Artificial Analysis (@ArtificialAnlys) January 29, 2026
We are still skeptical that Grok can generate videos of the same quality as Sora or Veo, but it has improved a lot and keeps getting better. It is incredibly good at image-to-video generation and at following instructions.

