How to Use DeepSeek-R1 671billion Model for Agents

DeepSeek R1 has generated a lot of excitements in the AI industry. While OpenAI is responding with o3 and o3-mini later today, plenty of companies are racing to add support for DeepSeek R1, including Perplexity and Windsurf. Thanks to NVIDIA, you can now try the the 671-billion-parameter DeepSeek-R1 model to build your own agents. Keep in mind, this is the model that is over 400GB if you try to run it locally.

This model is now available as an NVIDIA NIM microservice in preview. It can deliver up to 3,872 tokens per second on a single NVIDIA HGX H200 system. As the company explains:

Delivering real-time answers for R1 requires many GPUs with high compute performance, connected with high-bandwidth and low-latency communication to route prompt tokens to all the experts for inference. Combined with the software optimizations available in the NVIDIA NIM microservice, a single server with eight H200 GPUs connected using NVLink and NVLink Switch can run the full, 671-billion-parameter DeepSeek-R1 model at up to 3,872 tokens per second.

DeepSeek-R1 in Action with NVIDIA NIM Microservices

Watch this video on YouTube

[HT]

What's Hot

MiniMax Music 2.0 AI Model with Lifelike Vocals

Kimi CLI Is Now Available for Coding via Terminal

Google AI Studio Gets New Vibe Coding Experience

MiniMax M2 Open Weight Model Now Just Behind Claude 4.5 Sonnet

ChatGPT Atlas: New OpenAI Browser

Qwen Deep Research Now Can Create Reports and Podcasts

Microsoft Releases a 21-Lesson Course for Generative AI

AGENTS.md: Readme file for AI Agents

Boston Dynamics’ Atlas Using Machine Learning for Autonomous Task Completion

MiniMax M2 Open Weight Model Now Just Behind Claude 4.5 Sonnet

ChatGPT Atlas: New OpenAI Browser

Qwen Deep Research Now Can Create Reports and Podcasts

Most Popular

Prompt Cannon: Run Prompts Across Multiple Models

Dipal D1 2.5K Curved Screen 3D AI Character

GPTARS: GPT Powered TARS Robot

Our Picks

MiniMax Music 2.0 AI Model with Lifelike Vocals

Kimi CLI Is Now Available for Coding via Terminal

Google AI Studio Gets New Vibe Coding Experience

What's Hot

How to Use DeepSeek-R1 671billion Model for Agents

Related Posts