PUBG maker Krafton, Nvidia launch benchmark for AI game agents

Tech in Asia·2025-06-07 17:00

🔍 In one sentence

Researchers developed Orak, a benchmark for training and evaluating Large Language Model (LLM) agents across multiple popular video games, with the aim of improving how those agents interact with and perform in games.

🏛️ Paper by:

Krafton, Seoul National University, Nvidia, University of Wisconsin-Madison.

Authors: Dongmin Park et al.

🧠 Key discovery

Orak addresses gaps in existing game evaluation methods by offering a comprehensive platform that tests LLMs in complex gameplay across twelve popular video games, unlike prior benchmarks that mostly focused on simpler text-based games.

📊 Surprising results

Key stat: Orak covers 12 diverse games, providing a broader evaluation of LLM capabilities than earlier benchmarks.

Breakthrough: A plug-and-play interface based on the Model Context Protocol (MCP) lets LLMs connect directly to game environments, improving evaluation consistency.

Comparison: Proprietary LLMs like GPT-4o outperformed open-source models, revealing a performance gap in complex game interactions.

📌 Why this matters

This research argues that evaluating LLMs effectively requires complex, realistic environments rather than simplistic ones. That matters for gaming, where LLMs could improve NPC intelligence and support dynamic narratives, enhancing player engagement.

💡 What are the potential applications?

Creating adaptive NPCs that respond to player strategies in real time.

Supporting AI-driven storytelling with dynamic character responses to gameplay.

Assisting game designers in testing and refining mechanics through LLM-simulated player interactions.

⚠️ Limitations

High computational demands for training and running these models may limit access for smaller developers or indie projects.

👉 Bottom line:

Orak advances the evaluation of LLMs in gaming, enabling more interactive and responsive AI experiences in the industry.

📄 Read the full paper: Orak: A Foundational Benchmark for Training and Evaluating LLM Agents on Diverse Video Games

