The World’s Most Capable Mobile Agent

BY
ATHARVA GUNDAWAR
AGI reaches 97.4% task success on AndroidWorld benchmark, setting a new global record. (Leaderboard)

Our AGI-0 Agent has achieved state-of-the-art performance on the AndroidWorld leaderboard, surpassing every previous system (as of October 8, 2025).

Why is AndroidWorld significant? 

AndroidWorld measures how well our AGI agent completes useful tasks on a mobile screen by tapping, swiping, and typing as a human would. It is one of the most complete environments for benchmarking autonomous agents on mobile devices.

The benchmark covers 116 diverse tasks across 20 real-world Android apps, such as Calendar, Notes, Maps, VLC, Messages, Settings, Web Browser, and more.

Each task is written in natural language.

How did the AGI-0 model solve AndroidWorld? 

Most top agents rely on both the screenshot and the phone's accessibility tree to understand screen structure.

We built our agent to operate entirely from screenshots, with no accessibility trees or hidden UI metadata. Our agent interprets the raw screenshot through a multimodal vision-language model that is fine-tuned specifically for UI understanding. The model learns visual hierarchies and clickable regions directly from image context, removing the dependency on system metadata. This means it can generalize to any Android version, even custom ROMs or apps where accessibility (A11y) metadata is incomplete.
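To make the screenshot-only idea concrete, here is a minimal Python sketch of what an action interface could look like when the agent sees nothing but pixels. It is an illustration under stated assumptions, not AGI-0's actual implementation: the `Tap`, `Swipe`, and `TypeText` classes, the normalized-coordinate convention, and the choice of `adb shell input` as the execution channel are all hypothetical. The sketch only builds the command arguments; it does not run them.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


# Hypothetical action types: every field is derived from the screenshot alone,
# with no element IDs or accessibility nodes.
@dataclass(frozen=True)
class Tap:
    x: float  # normalized horizontal position in [0, 1]
    y: float  # normalized vertical position in [0, 1]


@dataclass(frozen=True)
class Swipe:
    x1: float
    y1: float
    x2: float
    y2: float
    duration_ms: int = 300


@dataclass(frozen=True)
class TypeText:
    text: str


Action = Union[Tap, Swipe, TypeText]


def to_pixels(nx: float, ny: float, width: int, height: int) -> Tuple[int, int]:
    """Scale normalized coordinates to pixels, clamped to the screen bounds."""
    px = min(max(int(nx * width), 0), width - 1)
    py = min(max(int(ny * height), 0), height - 1)
    return px, py


def to_adb_args(action: Action, width: int, height: int) -> List[str]:
    """Build the argument list for an `adb shell input` command (not executed here)."""
    base = ["adb", "shell", "input"]
    if isinstance(action, Tap):
        x, y = to_pixels(action.x, action.y, width, height)
        return base + ["tap", str(x), str(y)]
    if isinstance(action, Swipe):
        x1, y1 = to_pixels(action.x1, action.y1, width, height)
        x2, y2 = to_pixels(action.x2, action.y2, width, height)
        return base + ["swipe", str(x1), str(y1), str(x2), str(y2),
                       str(action.duration_ms)]
    if isinstance(action, TypeText):
        # `input text` expects spaces to be encoded as %s.
        return base + ["text", action.text.replace(" ", "%s")]
    raise TypeError(f"unsupported action: {action!r}")


# Example: a tap at the horizontal center, a quarter of the way down a 1080x2400 screen.
print(to_adb_args(Tap(0.5, 0.25), 1080, 2400))
```

Because actions are expressed relative to the image, the same predicted action can be replayed on devices with different resolutions, which is one plausible reason a pixel-only design ports well across Android versions.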

When given a new task, the agent builds a clear internal plan before acting. It breaks the user's instruction into smaller goals, explores the interface to find what it needs, and executes each step while tracking progress on an internal to-do list. After every action, it verifies whether the intended change occurred and adjusts its plan if necessary. This tight loop of planning, execution, and self-correction makes it reliable, even across complex mobile workflows.
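The plan, act, verify cycle described above can be sketched in a few lines of Python. This is a toy illustration, not the AGI-0 code: the `Step` structure, the `run_plan` function, and the dictionary standing in for the phone's state are all assumptions made for the example, and "adjusting the plan" is reduced here to a bounded retry, where a real agent would presumably replan.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

State = Dict[str, object]  # stand-in for what the agent observes on screen


@dataclass
class Step:
    goal: str                       # one entry on the internal to-do list
    act: Callable[[State], None]    # performs one UI action
    check: Callable[[State], bool]  # verifies the intended change occurred
    done: bool = False


def run_plan(steps: List[Step], state: State, max_attempts: int = 3) -> bool:
    """Execute steps in order, verifying after every action and retrying on failure."""
    for step in steps:
        for _ in range(max_attempts):
            step.act(state)
            if step.check(state):
                step.done = True  # tick the item off the to-do list
                break
        if not step.done:
            return False  # give up: verification never succeeded
    return True


def open_settings(state: State) -> None:
    state["screen"] = "settings"


def flaky_wifi_toggle(state: State) -> None:
    # Simulates a tap that only registers on the second attempt.
    state["taps"] = int(state["taps"]) + 1
    if state["taps"] >= 2:
        state["wifi"] = True


state: State = {"screen": "home", "wifi": False, "taps": 0}
plan = [
    Step("open Settings", open_settings, lambda s: s["screen"] == "settings"),
    Step("turn on Wi-Fi", flaky_wifi_toggle, lambda s: s["wifi"] is True),
]
succeeded = run_plan(plan, state)
print(succeeded, [(s.goal, s.done) for s in plan])
```

The point of the sketch is the placement of the check: because verification runs after every single action, a missed tap is caught and corrected at the step where it happens instead of silently derailing the rest of the task.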

This combination of planning, memory, and continuous verification creates robustness that is rarely seen in mobile agents.

Join Us + Try the API

We’re building the future of everyday AGI: agents that can use your phone, your laptop, and your browser as fluidly as you do.

If you’re a researcher or engineer who wants to push embodied intelligence forward, join our team.  

If you’re a developer or company that wants to integrate computer control into your own applications, try our API.

👉 Join AGI Inc. Careers  

👉 Use the Agent API