Google Launches Gemini 2.5 Computer Use Model for Advanced Digital Automation

Edited by: Veronika Radoslavskaya

Google officially introduced its specialized artificial intelligence model, Gemini 2.5 Computer Use, on October 7, 2025. The model is engineered to interact directly with user interfaces, carrying out actions such as navigating websites, clicking on-screen elements, and filling in forms.

The technology builds upon the strong visual comprehension and reasoning capabilities already present in Gemini 2.5 Pro, signaling a significant step forward in digital automation. The model operates via a sophisticated, continuous feedback loop: it receives a directive, interprets the current visual state of the interface via a screenshot, formulates the necessary user interface action, executes it, and then repeats this cycle until the assigned objective is fully achieved. This iterative method allows for the creation of agents that can mimic complex human interaction within digital environments.
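The loop described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not Google's implementation or API: the `FakeForm` environment, the `Action` type, and the `scripted_policy` function that stands in for the model are all hypothetical names invented for this example, and the "screenshot" is a plain text description rather than an image.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    """A single UI action (hypothetical type for this sketch)."""
    kind: str          # "type", "click", or "done"
    target: str = ""   # name of the UI element the action applies to
    text: str = ""     # text to enter, used only by "type" actions


class FakeForm:
    """Toy stand-in for a browser page: one text field and a submit button."""

    def __init__(self) -> None:
        self.name = ""
        self.submitted = False

    def screenshot(self) -> str:
        # A real agent would capture an image; here the state is plain text.
        return f"name={self.name!r} submitted={self.submitted}"

    def execute(self, action: Action) -> None:
        if action.kind == "type" and action.target == "name":
            self.name = action.text
        elif action.kind == "click" and action.target == "submit":
            self.submitted = True


def scripted_policy(goal: str, screenshot: str) -> Action:
    """Hypothetical stand-in for the model: picks the next action from the screen state."""
    if "submitted=True" in screenshot:
        return Action("done")
    if "name=''" in screenshot:
        return Action("type", target="name", text="Ada")
    return Action("click", target="submit")


def run_agent(
    goal: str,
    env: FakeForm,
    propose_action: Callable[[str, str], Action],
    max_steps: int = 10,
) -> List[Action]:
    """Observe, decide, act, and repeat until the policy reports completion."""
    history: List[Action] = []
    for _ in range(max_steps):          # cap the loop so it cannot run forever
        observation = env.screenshot()  # 1. capture the current interface state
        action = propose_action(goal, observation)  # 2. choose the next action
        if action.kind == "done":       # 3. stop once the objective is met
            break
        env.execute(action)             # 4. perform the action, then loop again
        history.append(action)
    return history


if __name__ == "__main__":
    form = FakeForm()
    steps = run_agent("Submit the form as Ada", form, scripted_policy)
    print([step.kind for step in steps], form.submitted)
```

In a real deployment, the scripted policy would be replaced by a call to the model and the toy form by a browser automation layer; the step cap is a design choice that prevents an agent from looping indefinitely when it cannot reach its goal.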

The model is a joint effort by Google and Google DeepMind. It is available to developers now through API access on the Google AI Studio and Vertex AI platforms, so they can begin integrating and testing their own digital agents. Industry analysts note that the focus on low-latency performance is crucial, because response time directly affects how smoothly an agent can handle time-sensitive operations. Gemini 2.5 Computer Use has outperformed competing models on benchmarks such as Online-Mind2Web, WebVoyager, and AndroidWorld, and reportedly surpasses Claude Sonnet 4.5 in certain tests.

This release aligns with a broader industry trend toward visual-based control agents capable of operating across diverse software environments. The capability provided by Gemini 2.5 Computer Use offers developers a potent instrument for automating intricate and repetitive digital workflows. Internally, Google has already deployed the model for interface testing, where it demonstrates the ability to recover up to 70% of failures in test runs. By equipping creators with tools that understand the visual language of software, Google facilitates a shift toward intelligently managed digital tasks, potentially redefining productivity by freeing human focus from routine interface manipulation.
