Nari Labs Dia Outperforms ElevenLabs, Sesame CSM-1B

Sesame and ElevenLabs offer some realistic voice models, but the Dia-1.6B model from Nari Labs can give them a run for their money. It generates realistic dialogue from a transcript. It also comes with emotion and tone control, so it can produce laughter, coughing, and other nonverbal sounds.

Here is what a prompt looks like when using this model:

[S1] Oh fire! Oh my goodness! What’s the procedure? What do we do, people? The smoke could be coming through an air duct!

[S2] Oh my god! Okay.. it’s happening. Everybody stay calm!

[S1] What’s the procedure…

[S2] Everybody stay fucking calm!!!… Everybody fucking calm down!!!!!

[S1] No! No! If you touch the handle, if it’s hot there might be a fire down the hallway!
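The speaker-tag format used in the prompt above can also be assembled programmatically. Here is a minimal sketch in Python, assuming a two-speaker dialogue tagged `[S1]` and `[S2]` as in the example. The helper name `build_dialogue_prompt` is hypothetical and is not part of Dia's API. It only formats text and never calls the model.

```python
from typing import Iterable, Tuple


def build_dialogue_prompt(turns: Iterable[Tuple[int, str]]) -> str:
    """Join (speaker_number, text) turns into one [S1]/[S2]-tagged transcript.

    Hypothetical helper for illustration only; it assumes two speakers,
    matching the tags shown in the example prompt.
    """
    parts = []
    for speaker, text in turns:
        if speaker not in (1, 2):
            raise ValueError(f"expected speaker 1 or 2, got {speaker}")
        cleaned = " ".join(text.split())  # collapse stray whitespace and newlines
        if not cleaned:
            continue  # skip empty turns
        parts.append(f"[S{speaker}] {cleaned}")
    return " ".join(parts)


prompt = build_dialogue_prompt([
    (1, "What's the procedure?"),
    (2, "Everybody stay calm!"),
    (1, "If the handle is hot, there might be a fire down the hallway!"),
])
print(prompt)
```

The resulting string can be pasted into the demo as a single transcript, with each turn prefixed by its speaker tag.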

You can test this tool on Hugging Face.
