Domestic models dominate OpenRouter: surging token usage shows programming and agents are the keys to victory
During the Spring Festival, the usage of domestic large models was extremely active. According to the latest weekly data from OpenRouter, the total token count of the top ten models on the platform is approximately 87 trillion, with Chinese models dominating at 53 trillion, accounting for 61%.
The top three models by token calls that week were all domestic large models: MiniMax M2.5, Kimi K2.5, and GLM-5, with week-over-week changes of +197%, -20%, and +158%, respectively. MiniMax M2.5 surged to the top with 2.45 trillion tokens, followed by Kimi K2.5 with 1.21 trillion; Zhipu's GLM-5 and DeepSeek V3.2 ranked third and fifth.
OpenRouter is the world’s largest large model API aggregation platform, providing developers with a unified API interface to access hundreds of large language models worldwide. Its core features include multi-model invocation, intelligent routing optimization, and transparent performance rankings, aiming to solve the complexity of multi-model integration and vendor lock-in issues.
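The "unified API interface" works by exposing a single OpenAI-compatible chat-completions endpoint for every model on the platform, so switching vendors is a one-string change to the model slug. A minimal sketch of building such a request (the model slug and environment-variable name here are illustrative assumptions, not taken from the article; check openrouter.ai for actual model IDs):

```python
import os

# OpenRouter's OpenAI-compatible chat-completions endpoint.
OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"


def build_request(model: str, prompt: str) -> dict:
    """Build the JSON body for a chat-completions call.

    The same payload shape works for any model on the platform,
    which is the point of a unified API: swapping vendors means
    changing only the model string.
    """
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }


def build_headers(api_key: str) -> dict:
    """Standard bearer-token auth headers."""
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }


if __name__ == "__main__":
    # "minimax/minimax-m2.5" is a hypothetical slug used for illustration.
    body = build_request("minimax/minimax-m2.5", "Refactor this function ...")
    headers = build_headers(os.environ.get("OPENROUTER_API_KEY", "sk-demo"))
    # An actual call would then be, e.g. with the requests library:
    #   requests.post(OPENROUTER_URL, json=body, headers=headers, timeout=60)
    print(body["model"])
```

Because every model sits behind the same request shape, the platform can also layer routing and per-model usage rankings on top, which is where the token statistics cited in this article come from.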
Data from the platform shows that programming and agent capabilities are becoming the two main competitive focus areas for large models.
Recently, overall invocation volume on OpenRouter has surged. The platform confirmed that M2.5 has driven increased demand for long-context requests of 100K to 1M tokens, a typical consumption pattern for agent workflows.
In terms of token calls, the top three domestic large models on this platform all focus on enhancing programming abilities and automating agent tasks, representing a significant application-level breakthrough for domestic large models in early 2026.
Xiyu Technology (MiniMax) released MiniMax M2.5 on February 13, billing it as the world's first production-grade flagship model designed natively for agent scenarios. Within seven days of release its invocation volume exceeded 3.07 trillion tokens; thanks to strong performance in programming and agent workflows and very low cost, it has become a first choice for developers.
Moonshot AI released Kimi K2.5 on January 27. The model adopts a native multimodal architecture and can schedule up to 100 "agent clones" to work in parallel, improving efficiency on complex tasks by 3 to 10 times. It ranks first in multiple subcategories such as programming and tool invocation, with invocation volume far surpassing the Gemini 3 and Claude models. According to The Paper, less than a month after release, Kimi's revenue over 20 days already exceeded its total revenue for all of 2025, driven mainly by a surge in global paid users and API calls, with overseas paid-user numbers growing rapidly.
Zhipu released GLM-5 on February 12. The model’s parameter scale was further expanded, employing sparse attention mechanisms, and was specially designed for complex system engineering and long-range agent tasks. With advantages such as free access and a 200K context window, user growth accelerated after release. Zhipu implemented measures such as limiting sales and price increases for the Coding Plan, and on New Year’s Eve announced a nationwide search for “computing power partners.”
As AI model application scenarios deepen, users are shifting from simple Q&A to complex workflows such as code refactoring, file rewriting, and document generation; together with the spread of agent modes, this is driving a clear "inflation" in token consumption.
While performance continues to improve, domestic models still stand out for cost-effectiveness. Compared with Claude Opus 4.6, MiniMax M2.5 and Zhipu's GLM-5 hold significant price advantages. On input, both MiniMax M2.5 and GLM-5 are priced at $0.30 per million tokens, while Claude Opus 4.6 costs $5, about 16.7 times as much. On output, MiniMax M2.5 is $1.10 per million tokens and GLM-5 is $2.55, while Claude Opus 4.6 reaches $25, roughly 22.7 and 9.8 times their prices, respectively.
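The cost multiples quoted above follow directly from the listed per-million-token prices; a quick arithmetic check (prices copied from the paragraph above):

```python
# Per-million-token prices quoted in the article (USD).
prices = {
    "MiniMax M2.5":    {"input": 0.30, "output": 1.10},
    "GLM-5":           {"input": 0.30, "output": 2.55},
    "Claude Opus 4.6": {"input": 5.00, "output": 25.00},
}

opus = prices["Claude Opus 4.6"]
for name in ("MiniMax M2.5", "GLM-5"):
    p = prices[name]
    # Ratio of Opus pricing to the domestic model's pricing.
    print(f"{name}: input x{opus['input'] / p['input']:.1f}, "
          f"output x{opus['output'] / p['output']:.1f}")
# MiniMax M2.5: input x16.7, output x22.7
# GLM-5: input x16.7, output x9.8
```

The ratios reproduce the article's figures: 16.7x on input for both models, and 22.7x and 9.8x on output for MiniMax M2.5 and GLM-5, respectively.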
The OpenRouter figures do not capture the full token invocation volume of Chinese model vendors. Haitong International Securities data shows that daily token calls for Volcano Engine's large models grew from 20 trillion at the end of 2024 to 63 trillion by the end of 2025; Alibaba Cloud's external clients' daily token calls approached 5 trillion in 2025, with a target of at least 15-20 trillion in 2026, while internal daily calls are planned to rise from 16-17 trillion to 100 trillion. Industry-wide, China's total daily token consumption was about 100 billion at the start of 2024, surpassed 30 trillion by mid-2025, and by February 2026 the combined daily token consumption of mainstream large models had reached approximately 180 trillion.
Dongan Securities’ latest research report states that as domestic models improve their programming and agent capabilities, their invocation volume has increased significantly. Domestic large models in programming and agent fields are expected to further accelerate application deployment and increase token consumption.
Changjiang Securities previously indicated that as programming and multimodal models mature, downstream application scenarios are expected to be truly unlocked, bringing a large demand for high-quality tokens. Referring to overseas AI industry development patterns, there is about a two-year lag from capital expenditure investment to token demand explosion. Domestic major AI companies’ capital expenditure cycles lag about a year behind overseas, starting in the second half of 2024. Therefore, revenue for domestic cloud providers has begun to grow, and the true explosion of token volume is expected to occur in 2026.
(Source: Cailian Press)