Nvidia Enters "Shrimp Farming" Business, All-In on Trillion-Dollar AI Inference Era
Global computing power giant NVIDIA is transforming from a chip manufacturer into an AI factory, betting on the AI inference market. At the NVIDIA GTC 2026 conference (GPU Technology Conference), which opened on March 17, CEO Jensen Huang sharply raised revenue expectations for the next generation of AI chips to a target of $1 trillion, and officially launched the next-generation hardware platform along with supporting products such as a software stack for "shrimp farming."
Industry insiders note that a strong signal from this year's GTC is that the inference era is accelerating. Meanwhile, NVIDIA's new computing architecture is set to drive transformation in adjacent industries such as cooling and packaging materials.
Enhancing AI Inference
At this GTC, NVIDIA emphasized that in the new stage of AI agents, inference will become the core battleground of AI infrastructure competition. The company officially launched the next-generation Vera Rubin computing platform and the Groq 3 LPU (Language Processing Unit) chip.
“In the past, when I mentioned Hopper, I would hold up a chip; but when I mention Vera Rubin, people think of the entire system,” Huang said. He estimates that over the past few years, computing demand has increased by 1 million times, and from 2025 to 2027, this demand growth is expected to bring at least $1 trillion in revenue to the company.
The newly unveiled NVIDIA Vera Rubin platform comprises 7 chips, 5 rack-level systems, and a supercomputer designed for agent-based AI, featuring the new Vera CPU and the BlueField-4S TX storage architecture. Compared with the previous Blackwell platform, the new platform needs only a quarter as many GPUs to train large mixture-of-experts models, and inference throughput per watt rises by up to 10 times.
Huang highlighted the Groq 3 LPU inference chip, filling in details withheld at the February earnings call. The chip stems from NVIDIA's roughly $20 billion acquisition of Groq's core technology assets in December; NVIDIA positions it as an "inference co-processor" for Rubin GPUs, giving it a key role in the company's inference strategy.
Huang stated that in the AI agent era, inference needs are diversifying rapidly. For tasks that demand extremely high interactivity and ultra-short response times, traditional GPU architectures leave much of their performance unused. To address this, NVIDIA introduced the LPU architecture, focused on "extreme low-latency token generation," to work alongside GPUs: Vera Rubin handles the computation-heavy "prefill" stage, while the LPU manages the latency-sensitive "decode" stage. Under this hybrid architecture, system inference throughput and power efficiency can improve by up to 35 times.
"The AI inference era is no longer about peak specifications but about whether we can perform finer-grained heterogeneous optimization around real workloads, so that every unit of compute is used as effectively as possible," said a representative of Cloud Tianli Fei. They emphasized that the inference era demands extreme cost-performance and relies increasingly on heterogeneous computing: by segmenting the computational characteristics of the inference workload, different hardware can handle the tasks it is best suited to, raising overall system efficiency. NVIDIA's approach aligns with this philosophy. Domestic AI chip companies such as Cloud Tianli Fei are pursuing similar inference-architecture innovations around GPNPU designs, prefill-decode (PD) separation, and 3D-stacked memory.
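The prefill/decode split discussed here follows the standard two-phase structure of transformer inference: prefill processes the whole prompt in one compute-bound pass and builds a key/value cache, while decode generates one token at a time against that cache and is latency-bound. A minimal, purely illustrative Python sketch of the control flow (the "model," its token rule, and the cache layout are invented stand-ins, not NVIDIA's or anyone's implementation):

```python
# Illustrative prefill/decode split for LLM-style inference.
# The "model" here is a toy arithmetic rule; real systems compute
# attention over a key/value (KV) cache instead.

def prefill(prompt_tokens):
    """Compute-heavy phase: ingest the entire prompt at once.
    Returns the context state (a stand-in for the KV cache)."""
    return list(prompt_tokens)

def decode(kv_cache, steps):
    """Latency-sensitive phase: emit one token per step,
    appending each new token back into the cache."""
    generated = []
    for _ in range(steps):
        nxt = sum(kv_cache) % 97  # toy next-token rule, purely illustrative
        generated.append(nxt)
        kv_cache.append(nxt)
    return generated

cache = prefill([3, 1, 4, 1, 5])   # one batched pass over the prompt
tokens = decode(cache, 4)          # sequential, one token at a time
```

In a disaggregated deployment of the kind described above, `prefill` would run on throughput-optimized hardware and hand its cache to latency-optimized decode hardware; this sketch only shows why the two phases have different performance profiles.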
Revolutionizing AI Agents
OpenClaw, an open-source autonomous AI agent platform, has sparked a “shrimp farming” craze worldwide. At this GTC, Huang praised OpenClaw, calling it “the next frontier of AI for everyone and the fastest-growing open-source project in history,” marking the start of the personal AI agent creation era.
NVIDIA itself plans to enter the "shrimp farming" scene by launching the NVIDIA Nemo Claw software stack, compatible with the OpenClaw platform, which users can install with a single command and which strengthens the security, trustworthiness, scalability, and usability of AI agents.
The conference also brought deeper software collaborations. NVIDIA announced partnerships with leading industrial-software companies including Cadence, Siemens, and Synopsys, as well as the integration of NVIDIA CUDA-X, the Omniverse software platform, and GPU-accelerated industrial software and tools at companies such as Honda, Jaguar Land Rover, Samsung, SK Hynix, and TSMC, to accelerate industrial design, engineering, and manufacturing.
Huang said, “A new industrial revolution has begun. Physical AI and autonomous AI agents are fundamentally reshaping global design, engineering, and manufacturing. Through close cooperation with software giants, cloud service providers, and OEMs in the global ecosystem, NVIDIA is providing a full-stack accelerated computing platform to enable industries to turn this vision into reality at unprecedented scale and speed.”
On the conference's opening day, NVIDIA's stock rose 1.65% to close at $183.22 per share. The same day, however, the A-share NVIDIA supply chain index retreated, with optical modules leading the decline: Tianfu Communication fell about 10%, Zhongji Xuchuang fell 3.33%, and leading AI PCB maker Shenghong Technology fell about 3%.
Leading the Next-Generation Computing Infrastructure
NVIDIA continues to lead the transformation of the AI industry chain. As NVIDIA's AI Fab architecture grows more complex and power consumption surges, traditional air cooling has reached its physical limits. The newly introduced Rubin cabinet adopts a fully liquid-cooled design, making liquid-cooling components a new staple of next-generation computing infrastructure.
At the conference, Limin Da, a subsidiary of Lingyi Zhizao, appeared as the only mainland Chinese supplier in the manifold (coolant distributor) ecosystem of NVIDIA's Vera Rubin architecture. As key components of the liquid-cooling loop, the manifold and its quick-disconnect connectors directly determine the efficiency and stability of the entire cooling system.
Additionally, NVIDIA’s latest Rubin architecture may also drive a transformation in packaging materials.
“Due to the extreme thermal and signal transmission requirements of the Rubin architecture, the commercialization of glass substrates has been significantly accelerated,” said Lu Bing, an industry analyst at Shenmeng. Under extreme compute density, traditional organic substrates (ABF) face serious physical bottlenecks.
Domestic and international manufacturers are at a critical point transitioning from “technology validation” to “early mass production.” According to forecasts from Yole Group and others, 2026 is the year for glass substrates to enter small-batch commercialization, with the demand for glass materials in HBM (High Bandwidth Memory) and logic chip packaging expected to grow at a compound annual rate of 33%.
Lu Bing pointed out that China has the most complete panel industry chain and a large consumer market. Leveraging this scale advantage, domestic companies are expected to achieve breakthroughs in certain materials and equipment segments (such as laser micro-hole drilling equipment), securing a core position in the AI compute chip supply chain.