DeepSeek Unveils New AI Models with Enhanced Reasoning Capabilities

18:59, 21 January

Edited by: Veronika Nazarova

DeepSeek has released its main DeepSeek-R1-Zero and DeepSeek-R1 models, along with six smaller distilled versions ranging from 1.5 billion to 70 billion parameters. The distilled models are built on open-source architectures such as Qwen and Llama and are trained on data generated by the full R1 model.

The smallest model can run on a laptop, while the full version requires substantial computing power. The release has drawn significant attention from the AI community because many existing open-weight models have struggled to match proprietary models such as OpenAI's o1 on reasoning benchmarks.

Independent AI researcher Simon Willison emphasised the models' unique reasoning abilities, noting that even simple cues trigger extensive reasoning.

The R1 model distinguishes itself by reasoning at inference time: before answering, it works through a query in a way that simulates a human-like thought process. This class of models, termed simulated reasoning (SR), gained prominence after OpenAI released its o1 model family in September 2024.

