CineMaster: 3D-Aware Controllable Text to Video Generation

Video generation models keep improving, and recent ones already offer considerable control over camera movement. CineMaster goes further by letting you manipulate both the objects and the camera in 3D space. With this approach, you can generate videos of a man walking in front of another object, cars passing one another, a hot air balloon circling a tower, and more. As the researchers explain:

To achieve this, CineMaster operates in two stages. In the first stage, we design an interactive workflow that allows users to intuitively construct 3D-aware conditional signals by positioning object bounding boxes and defining camera movements within the 3D space. In the second stage, these control signals—comprising rendered depth maps, camera trajectories and object class labels—serve as the guidance for a text-to-video diffusion model, ensuring to generate the user-intended video content.
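To make the first stage more concrete, here is a minimal Python sketch of how 3D boxes and a camera path could be turned into per-frame control signals. It is not CineMaster's code: the names `Box3D`, `Camera`, `render_frame` and `linear_trajectory` are hypothetical, the camera is a simple pinhole that never rotates, and each object is reduced to a 2D box plus its nearest depth rather than a full rendered depth map.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class Box3D:
    """Hypothetical user-placed object: a class label plus an axis-aligned 3D box."""
    label: str
    center: Vec3
    size: Vec3

    def corners(self) -> List[Vec3]:
        cx, cy, cz = self.center
        hx, hy, hz = (s / 2 for s in self.size)
        return [(cx + sx * hx, cy + sy * hy, cz + sz * hz)
                for sx in (-1, 1) for sy in (-1, 1) for sz in (-1, 1)]


@dataclass
class Camera:
    """Assumed pinhole camera looking down +z with no rotation."""
    position: Vec3
    focal: float = 1.0

    def project(self, p: Vec3) -> Optional[Tuple[float, float, float]]:
        x, y, z = (p[i] - self.position[i] for i in range(3))
        if z <= 0:  # point is behind the camera
            return None
        return (self.focal * x / z, self.focal * y / z, z)


def render_frame(boxes: List[Box3D], cam: Camera) -> List[dict]:
    """For each visible box: class label, projected 2D bounds, nearest depth."""
    signals = []
    for box in boxes:
        pts = [q for q in (cam.project(c) for c in box.corners()) if q is not None]
        if not pts:
            continue
        us = [p[0] for p in pts]
        vs = [p[1] for p in pts]
        signals.append({
            "label": box.label,
            "bbox": (min(us), min(vs), max(us), max(vs)),
            "depth": min(p[2] for p in pts),
        })
    return signals


def linear_trajectory(start: Vec3, end: Vec3, n: int) -> List[Camera]:
    """Camera path of n >= 2 poses, evenly spaced between start and end."""
    return [Camera(tuple(s + (e - s) * t / (n - 1) for s, e in zip(start, end)))
            for t in range(n)]


# A balloon five units ahead, with the camera dollying two units toward it.
balloon = Box3D("balloon", center=(0.0, 0.0, 5.0), size=(2.0, 2.0, 2.0))
frames = [render_frame([balloon], cam)
          for cam in linear_trajectory((0.0, 0.0, 0.0), (0.0, 0.0, 2.0), 3)]
```

As the camera moves closer, the balloon's projected box grows and its depth shrinks. A per-frame change of this kind is what the second stage would consume as guidance, alongside the text prompt.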

