Nvidia Licenses Groq LPU Technology in $20 Billion Strategic Agreement

Edited by: Aleksandr Lytviak

NVIDIA Corporation finalized a strategic agreement in December 2025 with AI chip startup Groq, reportedly valued at up to $20 billion in cash. The transaction, reportedly Nvidia's largest deal to date, was structured as a non-exclusive licensing agreement for Groq's inference technology rather than a full acquisition. The deal also included the acqui-hire of key personnel, notably founder Jonathan Ross, who previously co-developed Google's Tensor Processing Unit (TPU) and served as Groq's CEO.

Groq, established in 2016 by Ross and Douglas Wightman, has focused on accelerating AI inference (the real-time execution of trained models), a workload that now attracts a growing share of capital expenditure at major firms. Groq's core innovation is its Language Processing Unit (LPU), which uses a deterministic architecture to address the latency issues common in traditional Graphics Processing Unit (GPU) designs. A key architectural difference is the LPU's reliance on on-die Static Random Access Memory (SRAM) instead of the High Bandwidth Memory (HBM) used by competitors such as Nvidia's GPUs and Google's TPUs.

While a single LPU chip carries far less memory (230 MB of SRAM versus the 80 GB of HBM on an H100), keeping data on-die eliminates round trips to external memory during processing. The result is superior low-latency performance, often cited as a five-to-tenfold improvement on specific inference tasks. This focus on speed and deployment cost-efficiency positions Groq's technology to meet rising demand for real-time AI applications, particularly as hyperscalers develop custom silicon to challenge Nvidia's established market dominance.
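The capacity gap between the two memory approaches can be illustrated with back-of-envelope arithmetic. The sketch below uses the chip figures cited above (230 MB per LPU, 80 GB per H100); the 70-billion-parameter model size and 8-bit weight precision are illustrative assumptions, not figures from this article.

```python
import math

# Per-chip memory capacity (figures cited in the article)
LPU_SRAM_GB = 0.230   # Groq LPU: 230 MB of on-die SRAM
H100_HBM_GB = 80.0    # Nvidia H100: 80 GB of HBM

# Hypothetical workload: a 70B-parameter model with 8-bit weights
model_params_billion = 70
bytes_per_param = 1
model_size_gb = model_params_billion * bytes_per_param  # ~70 GB of weights

# Chips needed just to hold the model weights in memory
h100s_needed = math.ceil(model_size_gb / H100_HBM_GB)
lpus_needed = math.ceil(model_size_gb / LPU_SRAM_GB)

print(f"GPUs needed to hold weights: {h100s_needed}")  # 1
print(f"LPUs needed to hold weights: {lpus_needed}")   # 305
```

Under these assumptions, one H100 can hold the full model while an LPU deployment must tile the weights across hundreds of chips; the trade-off is that SRAM's on-die placement avoids the off-chip memory fetches that dominate latency in HBM-based designs.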

Nvidia's integration of the LPU technology is a direct maneuver to reinforce its standing in the inference segment, complementing its existing GPU offerings which excel at model training. The structure of the deal allows Groq to remain independent under new leadership, with CFO Simon Edwards assuming the CEO role, a model chosen to secure specialized talent and intellectual property while potentially reducing regulatory hurdles associated with full mergers. Industry anticipation centers on the official product debut leveraging this licensed technology, slated for the GPU Technology Conference (GTC) in San Jose, California.

GTC 2026 is scheduled to begin on March 16, 2026, with CEO Jensen Huang's keynote address set for 11:00 a.m. PT that Monday. Analysts, including Vivek Arya from Bank of America Securities, anticipate Nvidia will unveil an inference-only chip at the event, expected to use SRAM memory for enhanced cost efficiency in inference workloads. This new silicon is projected to operate alongside Nvidia's next-generation GPU platforms, such as the 'Vera Rubin' architecture, broadening the company's AI portfolio to manage diverse workloads, from bulk training to instantaneous agentic reasoning.


