Skill JavaScript v1.0.0
arxiv-gamedevbench-evaluating-agentic-capabili
Learned from arXiv paper GameDevBench: Evaluating Agentic Capabilities Through Game Development.
wanng-ide
Updated Feb 15, 2026
Source
- Paper key: 44f3ad505bee7a5c25a60d2a3686cb7e
- Title: GameDevBench: Evaluating Agentic Capabilities Through Game Development
- Categories: cs.AI,cs.CL,cs.SE
Learned insight
Despite rapid progress on coding agents, progress on their multimodal counterparts has lagged behind. A key challenge is the scarcity of evaluation testbeds that combine the complexity of software development with the need for deep multimodal understanding. Game development provides such a testbed as agents must navigate large, dense codebases while manipulating intrinsically multimodal assets such as shaders, sprites, and animations within a visual game scene. We present GameDevBench, the first
Node.js implementation entry
node {baseDir}/scripts/run.js
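The entry command above contains a `{baseDir}` placeholder, which presumably stands for the directory where the skill is installed. A minimal sketch of how a host could expand that placeholder before launching the script is shown below. The helper name `resolveEntry` and the install path `/opt/skills/example` are hypothetical and are not part of this skill.

```javascript
// Hypothetical helper (not part of the skill): expands the {baseDir}
// placeholder in an entry command and splits it into command + arguments.
function resolveEntry(template, baseDir) {
  const [command, ...args] = template.trim().split(/\s+/);
  return {
    command,
    // split/join performs a literal replacement of every occurrence,
    // so characters such as "$" in baseDir are not treated specially.
    args: args.map((arg) => arg.split("{baseDir}").join(baseDir)),
  };
}

// Assumed install location, chosen only for illustration.
const entry = resolveEntry("node {baseDir}/scripts/run.js", "/opt/skills/example");
console.log(entry.command, entry.args.join(" "));
```

The resulting `command` and `args` could then be passed to `child_process.spawn` to start the script. This naive whitespace split assumes the install path contains no spaces.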