Open-source models are catching up, but what exactly are they catching up to?

Open source is catching up, but we need to be clear about what it has caught up to

Z.ai has released GLM-5.1, and Modal rolled it out as a hosted service almost simultaneously. The two layered together are more interesting than either one alone.

The model is a 754B-parameter MoE with 40B active parameters. It scores 58.4% on SWE-Bench Pro, putting it roughly on par with GPT-5.4 and Opus 4.6 on coding tasks. It can run autonomously for a full 8 hours without crashing across thousands of iterations. It currently sits at #10 on BenchLM, and KernelBench shows it running 3.6x faster than prior open-source approaches.

Reactions on social media are split. Bindu Reddy says this is evidence that open source has caught up to closed source; Victor Taelin doubts that "500+ tokens/s" is realistic at FP8 precision and suggests a real deployment would land closer to 200 tps. Both sides have a point: the model can perform, but the marketing numbers are optimistic.

This open-source release differs from previous ones in a few ways:

  • Modal’s free endpoints change the calculus on usability and cost. Z.ai (formerly Zhipu; now publicly listed in Hong Kong) reaches Western developers via Modal, so developers don’t have to worry about geopolitical friction; the $1-per-million-input-token pricing also lowers the price anchor for proprietary services.
  • Context matters for the messaging around inference efficiency. GLM-5.1 uses sparse mixture attention and asynchronous reinforcement learning to keep scaling costs under control. But the “500+ tps” figure depends on infrastructure most people don’t have. The real bottleneck is serving and scheduling, not the model’s paper specifications.
  • It plugs directly into existing toolchains. Compatibility with Claude Code and OpenClaw means it can be swapped straight into existing proprietary workflows. The pressure this puts on Anthropic and OpenAI is mainly on pricing, not a wholesale flattening of capabilities.
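The "swap it straight in" claim can be made concrete with a minimal sketch. This assumes the hosted endpoint exposes an OpenAI-compatible chat-completions API (a common convention for hosted open models, not something the article confirms); the base URL, API key, and model name below are hypothetical placeholders.

```python
import json

def build_chat_request(base_url, api_key, model, messages):
    """Build an OpenAI-compatible chat-completions request.

    Swapping providers is just a matter of changing `base_url`
    and `model`; the payload shape stays the same, which is why
    drop-in replacement inside existing agent toolchains is cheap.
    """
    url = f"{base_url.rstrip('/')}/chat/completions"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = json.dumps({"model": model, "messages": messages})
    return url, headers, body

# Hypothetical hosted endpoint and model name -- the proprietary and
# open-source backends would differ only in these two strings.
messages = [{"role": "user", "content": "Refactor this function."}]
url, headers, body = build_chat_request(
    "https://example-host.example/v1",  # hypothetical hosted GLM endpoint
    "sk-demo",
    "glm-5.1",
    messages,
)
print(url)  # https://example-host.example/v1/chat/completions
```

The point of the sketch is that the integration surface is a URL and a model string, so the switching cost an enterprise pays is mostly validation, not engineering.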

MarkTechPost and Constellation both read this as the “6-month gap” between open and closed source converging. For coding agents specifically, that assessment is likely true. Z.ai uses an MIT license, and second-stage fine-tuning is already on the way.

But don’t take this to mean open source has fully turned the tables. Proprietary models still lead by a lot in safety alignment and multimodal reasoning. What’s being eroded is the moat in the coding-agent scenario: enterprises value deployment cost more for these kinds of tasks, and they’re less sensitive to that marginal difference in capability.

What matters more than the model is the infrastructure

Modal is built on a B200 cluster. It deploys GLM-5.1 with SGLang; in interactive scenarios it can run at 30–75 tokens/s. These seemingly boring engineering details are what truly matter.

Z.ai demonstrates throughput of 21.5k QPS on VectorDBBench (after 600 iterations of optimization). This kind of performance requires Modal’s serverless elasticity for stable delivery; the model alone can’t reach that scale.
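A rough sanity check shows why that QPS figure implies elastic infrastructure rather than a fixed deployment. Little's law relates in-flight concurrency to arrival rate and latency; the 2-second latency below is an illustrative assumption, not a benchmark figure.

```python
def required_concurrency(qps: float, avg_latency_s: float) -> float:
    """Little's law: average in-flight requests = arrival rate x latency."""
    return qps * avg_latency_s

# Illustrative: 21,500 QPS at an assumed 2-second average latency
# means ~43,000 requests in flight at any moment -- a level of
# concurrency a single static cluster cannot hold, which is where
# serverless elasticity earns its keep.
in_flight = required_concurrency(21_500, 2.0)
print(in_flight)  # 43000.0
```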

It also changes how we think about “model releases”: they’re no longer isolated events, but part of an ecosystem strategy. The combination of “open-source models + Western infrastructure” becomes a hedge against being locked into a single lab’s API.

As for GLM-5.1’s boundaries: its coding benchmark scores reach 94.6% of Opus, but a gap in reasoning remains. For specific use cases, a more “balanced” capability profile matters more than peak scores.

Looking ahead: Z.ai’s revenue grew 131% year over year last year. If inference costs fall below $0.50 per million tokens, open source could capture 30–50% of coding-agent deployment share within a year. Changes in U.S. policy may cause some disruption, but the current risk looks low.
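The $0.50-per-million threshold becomes tangible with a back-of-envelope cost model. The per-task token counts and all prices below are assumptions chosen for illustration, not quoted rates from any provider.

```python
def cost_per_task(tokens_in: int, tokens_out: int,
                  price_in_per_m: float, price_out_per_m: float) -> float:
    """Dollar cost of one agent task given per-million-token prices."""
    return (tokens_in * price_in_per_m
            + tokens_out * price_out_per_m) / 1_000_000

# Assumed workload: a long coding-agent task consuming 200k input
# tokens and emitting 20k output tokens. Prices are hypothetical.
open_cost = cost_per_task(200_000, 20_000, 0.50, 1.50)    # cheap open-source hosting
closed_cost = cost_per_task(200_000, 20_000, 3.00, 15.00)  # proprietary-tier pricing
print(open_cost, closed_cost)  # 0.13 0.9
```

Under these assumptions the per-task gap is roughly 7x, which is the kind of spread that makes enterprises tolerant of a marginal capability difference.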

| Viewpoint | Evidence | Industry impact | My take |
| --- | --- | --- | --- |
| Open-source optimists | SWE-Bench Pro 58.4%; 8-hour autonomous runs | Enterprises begin piloting open-source replacements | A bit overstated. The advantage is in integration and usability, not in scores. Modal’s free trials matter more than leaderboard rankings. |
| Proprietary defenders | BenchLM #10; reasoning still below Opus | Closed source continues to lead in safety and multimodality | Pricing mismatch. GLM’s efficiency compresses competitors’ pricing power, and Anthropic has to respond. |
| Infrastructure pragmatists | Modal endpoints; OpenClaw compatibility | Capital concentrates toward serverless platforms | This is the key. Whichever model wins, infrastructure companies benefit. |
| Geopolitical skeptics | Z.ai publicly listed in Hong Kong; MIT license; tensions between China and the U.S. | Model provenance will face more scrutiny | Overestimated for now. More practical to focus on monetization with Western hosting partners. |

Conclusion: This one-two punch confirms one thing: in the coding-agent vertical, open-source capability has essentially caught up. The beneficiaries are the builders who moved first on “infrastructure-agnostic” architectures and the investors behind hosting platforms. Anthropic faces pricing pressure. Enterprises that remain deeply bound to closed-source APIs are paying a premium for a capability edge that shrinks by the day.

Importance: High
Category: Model releases, partnerships, open source

Judgment: For the coding-agent track, this is still a relatively early window. The first beneficiaries are two groups: (1) builders and integrators assembling infrastructure-agnostic workflows; (2) capital backing serverless hosting and inference platforms. Short-term traders have limited edge unless they can catch the cadence of price cuts and traffic migration. Long-term holders need to watch whether the cost curve truly drops below $0.50 per million tokens to validate whether market share can jump.
