Behind the 31.8% surge: has Zhipu made API sales work?

By Archie

Yesterday, Zhipu released its full-year 2025 performance, which is also its first financial report since going public.

For the full year, revenue reached 724 million yuan, up 131.9% year over year; but driven by 3.18 billion yuan in R&D spending, the adjusted net loss also reached 3.18 billion yuan.

Despite losses of that size, the market’s reaction has been extremely positive: today, Zhipu’s stock price rose 31.8%.

One of the most important reasons is that selling APIs appears to have worked.

In 2025, revenue from API sales grew from 48 million yuan in 2024 to 190 million yuan, up 296% year over year. At the same time, Zhipu’s management explicitly said on a conference call that API-service ARR is currently about $250 million and is expected to reach $1 billion by year-end.

More importantly, this is not an isolated case. Other large-model companies are gradually showing a similar trend: Token call volumes are rising, and APIs are becoming one of the most direct paths to monetization.

So how should we read this phenomenon? Let’s work through it with Zhipu’s financial report in hand.

/ 01 /

Base-model growth, powered by API sales

In this financial report from Zhipu, the biggest change worth paying attention to is the shift in its revenue mix.

Cloud deployment has become the core source of growth. Cloud deployment is, in essence, API services. In 2025, this revenue grew from 48 million yuan in 2024 to 190 million yuan, up 296% year over year, and its share of total revenue rose from 15.5% to 26.3%.

The core logic behind API growth is the rise in call volume.

Behind that is the push from OpenClaw. When agents start executing tasks automatically, one request often triggers multiple rounds of calls; Token consumption is amplified many times over, and API call volume rises with it.

An industry consensus is forming around this: once large models can execute long-horizon tasks, a call is no longer a single input-output exchange but a sustained, systematized operating process.

Under this kind of structure, Token itself becomes the most direct—and most certain—billing unit.

In other words, when model capability is strong enough, APIs themselves converge into the clearest business model for large models.

This trend is becoming a shared choice among large-model vendors.

Overseas players moved earlier. Around 80% of Anthropic’s revenue comes from enterprise-grade API services; in essence, a pricing system centered on Token consumption.

Domestically, it’s also rapidly converging to this structure.

Currently, the core revenue of mainstream Chinese foundation-model companies such as Zhipu AI, MiniMax, and Moonshot is gradually shifting toward API calls. MaaS (Model as a Service) has become the main path to capturing growth.

On the conference call, Zhipu’s management reiterated that API-service ARR is about $250 million and is expected to reach $1 billion by year-end. Going forward, the company will place greater emphasis on standardized API services: by 2026, API services and localized deployments are expected to account for roughly half of revenue each, and over the next two to three years the balance will tilt further toward APIs.

Similar changes are also happening at MiniMax.

In 2025, revenue from its open platform and enterprise services reached $25.96 million, up 197.8% year over year, and its share of total revenue rose from 28.6% to 32.8%.

As of February 2026, the company’s ARR has surpassed $150 million, nearly double the $79 million of fiscal 2025. The core driver of that growth is again rising Token consumption, especially from the rollout of coding assistants and agent scenarios.

Goldman Sachs expects MiniMax’s share of revenue from the open platform (APIs) to reach about 40% in 2026.

The consolidation of the business model for large models also makes the way to measure value clearer: shifting from “capability metrics” to “Token measurement.”

/ 02 /

Behind route divergence: two approaches to solving the problem

As AI moves into the application stage, one question becomes specific: as model capability gradually converges, what exactly is the core competitive advantage of large models?

On this question, Zhipu and MiniMax have offered two different solutions.

Zhipu’s logic is to push the model’s ceiling as high as possible.

Zhang Peng proposed the concept of TAC (Token Architecture Capability). In essence, it breaks down into three things: the scale of calls, the quality of calls, and the ability to convert calls into revenue.

Its core judgment is: the quality of intelligence determines pricing power.

Zhipu’s judgment is: “As agents evolve, Tokens will stratify. Low-complexity, standardized Tokens will move toward low pricing, even free; only high-complexity, high-reliability, high-quality Tokens will have sustained pricing power.”

This is already visible in the data. In the first quarter, Zhipu’s API pricing rose 83%, yet demand did not shrink; call volume rose 400%, a sign of demand outstripping supply.

If Zhipu AI’s story is “quality determines pricing power,” then MiniMax is in fact following another logic: a model’s competitiveness comes from path differentiation and efficiency.

MiniMax chose a route that isn’t mainstream—building a fully multimodal model in parallel across four major modalities: text, video, voice, and music. This is not common among today’s large-model vendors.

The core of this route is not “more,” but “wider.”

In Yan Junjie’s view, the value of a platform company in the AI era essentially comes down to: intelligence density × Token throughput.

The significance of multimodality is that, without significantly reducing intelligence density, it amplifies Token throughput. Because what it changes is not the ceiling of capability, but the barrier to use.

When you add interaction methods such as images and voice into the product, users’ comprehension costs and operational barriers drop noticeably, and the user base expands to a broader population—including older adults and children—groups that were previously hard to cover.

This has actually happened once before in mobile internet: from graphical/text information feeds to the explosive rise of short videos—at its core, it’s about achieving a jump in penetration by lowering the interaction threshold.

Now look at efficiency. MiniMax’s other main line is extreme resource utilization efficiency.

In 2025, the company’s R&D spending was $253 million, up 33.8% year over year, which is clearly lower than the 158.9% growth rate of revenue.

By contrast, Zhipu AI’s strategy is closer to “go big, go hard.”

In 2025, Zhipu’s revenue was 724 million yuan against R&D expenses of 3.18 billion yuan, an R&D expense ratio as high as 439%. In the same period, MiniMax’s revenue was 540 million yuan, its R&D expenses were 1.74 billion yuan, and its R&D expense ratio was 323%.

In terms of operating efficiency, every 1 yuan of revenue at Zhipu corresponds to about 4.4 yuan of losses; at MiniMax, about 3.2 yuan. In terms of labor efficiency, Zhipu is at roughly 660,000 yuan per person, while MiniMax reaches 1.26 million.
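As a sanity check, the ratios above can be reproduced from the article’s own figures. A minimal sketch (inputs are the reported revenue and R&D numbers, in millions of yuan):

```python
# Sanity check of the efficiency ratios cited above, using the article's
# reported figures (in millions of yuan).

def expense_ratio(rd_spend: float, revenue: float) -> float:
    """R&D expense as a percentage of revenue."""
    return rd_spend / revenue * 100

zhipu = expense_ratio(3180, 724)    # Zhipu: 3.18B yuan R&D on 724M yuan revenue
minimax = expense_ratio(1740, 540)  # MiniMax: 1.74B yuan R&D on 540M yuan revenue

print(f"Zhipu:   {zhipu:.0f}%")    # prints 439%
print(f"MiniMax: {minimax:.0f}%")  # prints 322%; the article's 323% likely
                                   # comes from unrounded inputs

# Loss per yuan of revenue: adjusted net loss / revenue
print(f"Zhipu loss multiple: {3180 / 724:.1f}")  # prints 4.4
```

The small gap on MiniMax’s ratio (322% vs. the reported 323%) is the kind of rounding noise expected when working back from headline figures.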

Of course, some of these differences come from the business model. MiniMax relies more on product revenue, while Zhipu is still mainly driven by localized deployments.

But even so, the divergence between the two paths remains clear:

One side is pursuing the “intelligence ceiling” and gaining pricing power by improving capability;

The other side is optimizing “efficiency and coverage,” expanding usage scale to amplify Token throughput.

In essence, under the same formula, these are two completely different solutions.

/ 03 /

An oligopoly structure is the biggest certainty for base-model businesses

Leaving valuation aside, this business of model vendors has already begun to show a relatively clear outline.

This base-model business is not like traditional software.

Traditional software is characterized by heavy upfront investment followed by a slow payback, but base models are different: costs rise in steps, while revenue does not necessarily grow in step and could even be squeezed continuously as competition intensifies.

From this perspective, it resembles a structure that is inherently “somewhat fragile.” But interestingly, this structure points to another outcome:

It naturally moves toward oligopoly.

Because only a small number of companies can continuously withstand investment at this scale. In terms of business form, it’s more like batteries or semiconductor foundries: huge upfront investment, but once a firm position is secured, there are few competitors, and the “pie” is large enough.

At the same time, large models have another more subtle aspect: it’s not entirely a “winner-takes-all” market; it’s closer to a tiered market.

For top-tier models, even a 5% advantage in effectiveness can be amplified into a premium of more than 50% in complex, efficiency-sensitive scenarios such as coding: a multiplier effect.
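One mechanism that can produce this kind of amplification: in long-horizon tasks, per-step reliability compounds. A toy illustration with assumed numbers (95% vs. 90% per-step success over ten independent steps; these figures are mine, not the article’s):

```python
# Toy model: end-to-end success of a multi-step agent task, assuming each
# step succeeds independently. Illustrative numbers, not from the report.

def task_success(per_step: float, steps: int) -> float:
    """Probability the whole task succeeds if every step must succeed."""
    return per_step ** steps

strong, weak, steps = 0.95, 0.90, 10
p_strong = task_success(strong, steps)  # ~0.599
p_weak = task_success(weak, steps)      # ~0.349

# A ~5-point per-step edge becomes a ~1.7x end-to-end advantage:
print(f"{p_strong:.3f} vs {p_weak:.3f}, ratio {p_strong / p_weak:.2f}x")
```

Under these assumptions, a modest capability gap at the single-call level turns into a large gap in completed tasks, which is one plausible route to the pricing premium described above.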

But meanwhile, not every task needs the strongest model.

So the market will naturally tier: the top tier captures the premium, the middle tier runs at scale, and the bottom tier handles long-tail demand. Even between different tiers, there may form a kind of “Token flow”—complex tasks move upward, simple tasks move downward.

Even if you can’t achieve global SOTA, reaching SOTA in a specific niche field is still a valid path.

And in this structure, efficiency is also a very critical variable.

Because this industry has almost no network effects and users have extremely low switching costs. That means: as long as a company can build a “90-point” model with a lower price, it can quickly scale up.

The reason is direct: in some scenarios, you don’t actually need the strongest model. When the performance gap is limited, price becomes the decisive factor.

And behind price is cost. This depends not only on technology, but also on differences in a whole set of costs such as computing power and electricity.

Take China as an example: through engineering optimization, large-scale deployment, and lower electricity costs, inference costs can be significantly reduced. This allows models with the same capability to provide Token services at a lower price.

Now, some Chinese model vendors going overseas are, in essence, doing a “Token price-spread” business.
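The “price-spread” framing reduces to simple unit economics: margin is the gap between the selling price of Tokens and the cost of serving them. A minimal sketch with entirely hypothetical numbers (none of these prices appear in the article):

```python
# Toy unit economics of the "Token price-spread" idea: serve Tokens at a
# lower inference cost and undercut incumbent pricing. All dollar figures
# below are hypothetical placeholders, not numbers from the article.

def margin_per_million_tokens(price: float, compute_cost: float,
                              power_cost: float) -> float:
    """Gross margin per million Tokens served."""
    return price - (compute_cost + power_cost)

# Hypothetical: sell at $0.50 per million Tokens; cheaper electricity and
# better utilization push serving cost down to $0.30.
m = margin_per_million_tokens(price=0.50, compute_cost=0.25, power_cost=0.05)
print(f"margin: ${m:.2f} per million tokens")  # margin: $0.20 per million tokens
```

The point of the sketch is only structural: any durable reduction in the cost terms (engineering optimization, scale, electricity) widens the spread at a given market price, which is the business the article describes.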

Admittedly, the revenue growth comes from explosive demand and an oligopolistic market structure. But one shouldn’t be overly optimistic: because of differences in the competitive environment, selling APIs in China still differs from the U.S. in many ways:

For example, the U.S. large-model ecosystem relies more on developer long-tail demand. Enterprise customers and developers are more willing to pay for capability, and model performance can more easily convert directly into premium pricing.

In China, by contrast, calls are concentrated among top customers, including internet platforms and government and enterprise clients. Combined with competition on the supply side, Token premiums are unlikely to persist over the long term.

To a certain extent, in the U.S., base models are closer to a combination of software and platforms; in China, they are more like part of the infrastructure.

From this angle, how far China’s large-model companies’ business model can really go may still need to be observed further.
