How to Use DeepSeek-R1 671billion Model for Agents

DeepSeek R1 has generated a lot of excitements in the AI industry. While OpenAI is responding with o3 and o3-mini later today, plenty of companies are racing to add support for DeepSeek R1, including Perplexity and Windsurf. Thanks to NVIDIA, you can now try the the 671-billion-parameter DeepSeek-R1 model to build your own agents. Keep in mind, this is the model that is over 400GB if you try to run it locally.

This model is now available as an NVIDIA NIM microservice in preview. It can deliver up to 3,872 tokens per second on a single NVIDIA HGX H200 system. As the company explains:

Delivering real-time answers for R1 requires many GPUs with high compute performance, connected with high-bandwidth and low-latency communication to route prompt tokens to all the experts for inference. Combined with the software optimizations available in the NVIDIA NIM microservice, a single server with eight H200 GPUs connected using NVLink and NVLink Switch can run the full, 671-billion-parameter DeepSeek-R1 model at up to 3,872 tokens per second.

DeepSeek-R1 in Action with NVIDIA NIM Microservices

Watch this video on YouTube

[HT]

What's Hot

Prompt Library Introduced on Bolt: Lets You Save Your Best Prompts

June 2025 release of Visual Studio Code: GitHub Copilot Chat Opensourced, MCP Support Generally Available

Grok 4 & SuperGrok Heavy Announced, Grok 4 Jailbreak Out Already?

June 2025 release of Visual Studio Code: GitHub Copilot Chat Opensourced, MCP Support Generally Available

Grok 4 & SuperGrok Heavy Announced, Grok 4 Jailbreak Out Already?

KANAAN K1 Pro AI Glasses with OpenAI, Meta Support

Leonardo’s AI Video Tool Gets Motion Control

Leonardo’s Omni Editing Announced with FLUX.1 Kontext and GPT-Image-1

OmniHuman-1 Generates Realistic Human Videos

June 2025 release of Visual Studio Code: GitHub Copilot Chat Opensourced, MCP Support Generally Available

Grok 4 & SuperGrok Heavy Announced, Grok 4 Jailbreak Out Already?

KANAAN K1 Pro AI Glasses with OpenAI, Meta Support

Most Popular

Prompt Cannon: Run Prompts Across Multiple Models

GPTARS: GPT Powered TARS Robot

Simple Grok 2 Jailbreak

Our Picks

Prompt Library Introduced on Bolt: Lets You Save Your Best Prompts

June 2025 release of Visual Studio Code: GitHub Copilot Chat Opensourced, MCP Support Generally Available

Grok 4 & SuperGrok Heavy Announced, Grok 4 Jailbreak Out Already?

What's Hot

How to Use DeepSeek-R1 671billion Model for Agents

Related Posts