Google Launches Gemini 3, Ushering in the Age of "Deep Think" and Autonomous Agents

18:50, 18 November

Author: Veronika Radoslavskaya


Two years into the generative AI boom, Google has officially released Gemini 3, a model the company says shifts the landscape from chatbots that simply predict text to AI agents that can reason, plan, and act. The release introduces two tiers: Gemini 3 Pro, available immediately, and a more powerful Gemini 3 Deep Think mode, which is designed to tackle complex problems by "thinking" before it responds.

The standout feature of this generation is its focus on "mechanistic reasoning." At launch, Gemini 3 Pro scored 91.9% on the difficult GPQA Diamond benchmark and 37.5% on Humanity's Last Exam (HLE) without using tools, results Google presents as state of the art. According to the company, this capability allows the model to grasp depth and nuance across science and mathematics with a high degree of reliability.

The new Deep Think mode, available soon to Ultra subscribers, pushes these boundaries even further. Designed to tackle the most complex, novel problems, Deep Think achieved a score of 45.1% on ARC-AGI-2, a rigorous benchmark that tests an AI's ability to solve logic puzzles it has never seen before, and 41.0% on HLE. This enhanced mode is built for genuine problem-solving, going beyond standard retrieval and synthesis.

For developers, the launch is accompanied by a new platform called Google Antigravity. This "agent-first" development environment allows software engineers to work alongside AI agents that have direct access to terminals, browsers, and code editors. Instead of just autocompleting a line of code, these agents can autonomously plan, execute, and validate complex software tasks. Google describes this as the ultimate tool for "vibe coding"—a style of programming where developers focus on high-level creative intent while the AI handles the implementation details.

On the consumer side, Gemini 3 combines its multimodal capabilities with a 1 million-token context window, which lets it process vast amounts of data at once, equivalent to over 1,500 pages of text or entire video lectures. This enables it to act as a personalized coach: for example, the model can analyze a video of a user's pickleball match, identify specific flaws in their form, and generate a custom training plan. For students, it can ingest academic papers or long video tutorials and generate interactive study aids like flashcards or visualizations to help them master the material. It can also decipher handwritten recipes and convert them into digital formats.
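To put the context-window figure in perspective, here is a rough back-of-the-envelope sketch. It takes the two numbers quoted above (1 million tokens, 1,500 pages) and derives the implied tokens per page; the function names and the sample document sizes are hypothetical, and real token counts depend on the tokenizer and the text itself.

```python
# Figures quoted in the article.
CONTEXT_WINDOW_TOKENS = 1_000_000
PAGES_QUOTED = 1_500


def tokens_per_page(context_tokens: int, pages: int) -> float:
    """Average tokens per page implied by the two quoted figures."""
    return context_tokens / pages


def fits_in_context(doc_pages: int, per_page: float,
                    context_tokens: int = CONTEXT_WINDOW_TOKENS) -> bool:
    """Rough check: does a document of doc_pages pages fit in the window?

    Assumption: every page costs about per_page tokens, which is only
    an average and will vary with layout, language, and tokenizer.
    """
    return doc_pages * per_page <= context_tokens


per_page = tokens_per_page(CONTEXT_WINDOW_TOKENS, PAGES_QUOTED)
print(f"Implied tokens per page: {per_page:.0f}")
print("1,200-page document fits:", fits_in_context(1_200, per_page))
print("2,000-page document fits:", fits_in_context(2_000, per_page))
```

By this estimate a page works out to roughly 667 tokens, so a document much longer than 1,500 pages would need to be split or summarized before it could be processed in one pass.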

Google also claims dominance on the leaderboards. Gemini 3 Pro has already taken the top spot on LMArena, a crowdsourced benchmarking site where users blindly rate AI models, achieving an Elo score of 1501. The model's immediate ascent continues the run of its predecessor, Gemini 2.5 Pro, which previously held the top position in the highly competitive ranking.
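For readers unfamiliar with Elo-style scores, the sketch below shows how a rating gap translates into an expected head-to-head preference rate under the classic Elo formula. This is an illustration only: LMArena's actual rating method may differ from textbook Elo, and the rival rating of 1451 is a hypothetical value chosen for the example, not a figure from the leaderboard.

```python
def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Classic Elo expected score of A against B (between 0 and 1).

    Assumption: the standard logistic formula with a 400-point scale.
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


GEMINI_3_PRO_RATING = 1501   # score quoted in the article
HYPOTHETICAL_RIVAL = 1451    # made-up rating, for illustration only

p = elo_expected_score(GEMINI_3_PRO_RATING, HYPOTHETICAL_RIVAL)
print(f"Expected preference rate vs. a 1451-rated model: {p:.1%}")
```

Under these assumptions, a 50-point lead corresponds to being preferred in roughly 57% of blind comparisons, which shows that even a clear lead on the leaderboard reflects a modest edge in individual matchups.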

The model is currently rolling out across Google’s ecosystem, including the Gemini app, Vertex AI, and a new "AI Mode" in Google Search that generates interactive simulations on the fly. While the "Deep Think" mode is being held back for final safety checks, the core Gemini 3 Pro model is live today, signaling that Google is ready to put "agentic" AI into the hands of millions.

Gemini

Google DeepMind

Generative AI

Large Language Models (LLMs)

Deep Think
