PUBG maker Krafton, Nvidia launch benchmark for AI game agents
Researchers have developed Orak, a benchmark for training and evaluating Large Language Model (LLM) agents across a range of popular video games, measuring how well the agents interact with and perform in each game.
Institutions: Krafton, Seoul National University, Nvidia, University of Wisconsin-Madison.
Authors: Dongmin Park et al.
Unlike prior benchmarks, which mostly focused on simpler text-based games, Orak offers a comprehensive platform that tests LLM agents on complex gameplay across twelve popular video games, addressing gaps in existing game evaluation methods.
The research argues that effective LLM evaluation requires complex, realistic environments rather than simplistic ones. In gaming applications, such agents could make NPCs more intelligent and enable dynamic narratives, enhancing player engagement.
High computational demands for training and running these models may limit access for smaller developers or indie projects.
Orak advances the evaluation of LLMs in gaming, enabling more interactive and responsive AI experiences in the industry.
📄 Read the full paper: Orak: A Foundational Benchmark for Training and Evaluating LLM Agents on Diverse Video Games
Read the full article on Tech in Asia.