GPT-4.5 Leads in Elimination Game That Tests Reasoning, Strategy & Deception

Since its release, there has been a lot of discussion about how smart GPT-4.5 really is. On some coding benchmarks it scores lower than o3-mini, and it is very expensive for developers to use. In the Elimination Game, however, which tests LLMs on social reasoning, strategy, and deception, it leads the other models.

The idea is simple: players engage in public and private conversations, form alliances, and vote to eliminate one another round by round. A jury of eliminated players then casts the deciding votes to crown the winner. As for double-crossing, Claude 3.7 Sonnet showed a greater tendency to do it.
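
To make the round structure concrete, here is a minimal toy sketch of the voting and jury flow described above. It is not the benchmark's actual code: the function name `run_elimination_game`, the two-finalist cutoff, the random tie-breaking, and the use of random votes in place of LLM decisions are all assumptions made for illustration, and the public and private conversations are left out entirely.

```python
import random
from collections import Counter


def run_elimination_game(players, finalists=2, seed=0):
    """Toy simulation of the round structure; random votes stand in for LLM choices.

    Hypothetical helper: the finalist count and tie-breaking rule are assumptions,
    not details taken from the real Elimination Game.
    """
    if len(players) <= finalists:
        raise ValueError("need more players than finalists")

    rng = random.Random(seed)
    active = list(players)
    jury = []  # eliminated players, in order of elimination

    # Elimination rounds: every active player votes for someone other than itself.
    while len(active) > finalists:
        votes = Counter()
        for voter in active:
            target = rng.choice([p for p in active if p != voter])
            votes[target] += 1
        top = max(votes.values())
        # Assumption: ties are broken at random.
        eliminated = rng.choice(sorted(p for p, n in votes.items() if n == top))
        active.remove(eliminated)
        jury.append(eliminated)

    # Final: each eliminated player casts one jury vote for a remaining finalist.
    final_votes = Counter(rng.choice(active) for _ in jury)
    top = max(final_votes.values())
    winner = rng.choice(sorted(p for p, n in final_votes.items() if n == top))
    return winner, jury


if __name__ == "__main__":
    winner, jury = run_elimination_game(["A", "B", "C", "D", "E"], seed=1)
    print("winner:", winner, "| jury:", jury)
```

In the real game each vote would come from a model reasoning over the conversation history, which is where alliances and double-crossing enter; the sketch only shows how round-by-round elimination feeds a jury that picks the winner.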

