Alibaba Cloud and Moonshot AI Launch Next-Generation Reasoning Models

18:28, 27 January

Edited by: Veronika Radoslavskaya

🚀 Introducing Qwen3-Max-Thinking, our most capable reasoning model yet. Trained with massive scale and advanced RL, it delivers strong performance across reasoning, knowledge, tool use, and agent capabilities. ✨ Key innovations: ✅ Adaptive tool-use: intelligently leverages

3:13 PM · Jan 26, 2026

In late January 2026, two high-performance flagship models from China arrived within a day of each other: Alibaba Cloud’s Qwen3-Max-Thinking and Moonshot AI’s Kimi K2.5. Both releases reflect a shift toward "reasoning-first" designs built for complex logic and autonomous task execution.

Kimi.ai (@Kimi_Moonshot)

🥝 Meet Kimi K2.5, Open-Source Visual Agentic Intelligence. 🔹 Global SOTA on Agentic Benchmarks: HLE full set (50.2%), BrowseComp (74.9%) 🔹 Open-source SOTA on Vision and Coding: MMMU Pro (78.5%), VideoMMMU (86.6%), SWE-bench Verified (76.8%) 🔹 Code with Taste: turn chats,

5:42 AM · Jan 27, 2026

Alibaba Cloud: Qwen3-Max-Thinking

Released on January 26, 2026, Qwen3-Max-Thinking is a large-scale reasoning model with more than one trillion parameters. It is built for multi-step logical reasoning and advanced technical problem-solving.

Adaptive Tool Use: The model autonomously selects among Search, Memory, and Code Interpreter tools during a conversation, deciding on its own which one it needs to verify facts or perform calculations for the user's query.
Test-Time Scaling (TTS): The model scales compute at inference time, devoting more processing to "thinking" through difficult problems. With this technique, it scored 90.2 on the Arena-Hard v2 benchmark.
Benchmark Performance: Qwen3-Max-Thinking has posted strong results across multiple reasoning benchmarks, underscoring its capability in scientific computing, mathematical logic, and complex coding tasks.
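
As a rough illustration of what scaling compute at inference time can mean, the sketch below uses majority voting over repeated samples, one generic form of the idea. It is not Alibaba's method, which this article does not describe. The `noisy_solver` function is a hypothetical stand-in for a single reasoning pass, and the answer value and accuracy figure are invented for the example.

```python
import random
from collections import Counter


def noisy_solver(rng: random.Random, correct: int = 42, accuracy: float = 0.6) -> int:
    """Hypothetical stand-in for one sampled reasoning pass of a model."""
    if rng.random() < accuracy:
        return correct
    # Wrong answers are scattered, so they rarely agree with one another.
    return rng.randint(0, 9)


def answer_with_budget(n_samples: int, seed: int = 0) -> int:
    """Spend more inference-time compute: sample n_samples passes, then vote."""
    rng = random.Random(seed)
    votes = Counter(noisy_solver(rng) for _ in range(n_samples))
    # The most frequent answer wins.
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    for budget in (1, 5, 101):
        print(budget, answer_with_budget(budget))
```

A single pass is right only about 60% of the time under these assumed numbers, while a large voting budget settles on the consistent answer almost every time. The cost is proportionally more compute per query.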

Moonshot AI: Kimi K2.5

On January 27, 2026, Moonshot AI (backed by Alibaba Group) introduced Kimi K2.5, an open-source, natively multimodal agentic model. This version is optimized for multi-agent coordination and large-scale data processing.

Mixture-of-Experts (MoE) Architecture: The model has one trillion total parameters, but the MoE design activates only 32 billion of them (about 3.2%) for any given token. It was pre-trained on 15 trillion mixed visual and text tokens.
Agent Swarm Mode: K2.5 introduces an "Agent Swarm" capability that coordinates up to 100 specialized sub-agents on a single project. In this mode, the system directs the agents itself, solving complex problems without pre-defined workflows or human intervention.
Agentic Efficiency: The model is designed for enterprise-level automation, with an emphasis on multi-step planning and browser-based research. Moonshot reports scores of 50.2% on the HLE full set and 74.9% on BrowseComp.
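
To show how a Mixture-of-Experts model can hold far more parameters than it uses per token, here is a minimal sketch of generic top-k expert routing. The expert count, the value of k, the gate scores, and the toy experts are all assumptions for illustration, since the article does not describe Kimi K2.5's actual router. Only the 32 billion and one trillion figures come from the text above.

```python
import math
from typing import Callable


def top_k_route(gate_logits: list[float], k: int) -> dict[int, float]:
    """Pick the k highest-scoring experts and softmax-normalize their weights."""
    ranked = sorted(range(len(gate_logits)), key=lambda i: gate_logits[i], reverse=True)
    chosen = ranked[:k]
    peak = max(gate_logits[i] for i in chosen)  # subtract the max for numerical stability
    exps = {i: math.exp(gate_logits[i] - peak) for i in chosen}
    total = sum(exps.values())
    return {i: e / total for i, e in exps.items()}


def moe_forward(x: float, experts: list[Callable[[float], float]],
                gate_logits: list[float], k: int) -> float:
    """Run only the selected experts and mix their outputs by gate weight."""
    weights = top_k_route(gate_logits, k)
    return sum(w * experts[i](x) for i, w in weights.items())


# Share of parameters active per token, from the article's figures.
active_fraction = 32e9 / 1e12

if __name__ == "__main__":
    toy_experts = [lambda x: x, lambda x: 2 * x, lambda x: 3 * x, lambda x: 4 * x]
    print(moe_forward(1.0, toy_experts, [0.1, 2.0, -1.0, 2.0], k=2))
    print(f"{active_fraction:.1%}")
```

Only the experts the gate selects are evaluated, so the compute per token tracks the active parameter count rather than the total.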

Both models are built on trillion-parameter foundations, but their focus differs: Alibaba’s Qwen3-Max-Thinking prioritizes deep, iterative reasoning and autonomous tool selection, whereas Moonshot’s Kimi K2.5 focuses on multimodal agentic coordination and large-scale autonomous workflows.
