Codex (Pro)'s double-quota deal, running until the end of May, is very attractive, and a lot of people are switching on fast mode thinking they can speed things up again 🚀
I checked the Rust source code of the Codex CLI. Its fast option just tells OpenAI "route this request through the priority channel." That's it: the model hasn't changed, no inference has been cut, and the output quality is the same.
It's not acceleration, it's cutting in line 😂
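A minimal sketch of what a flag like that amounts to. This is not the actual Codex CLI code; the function and field names here (`build_request_body`, `service_tier`) are assumptions for illustration. The point is that the fast flag only adds a routing hint to the request payload; the model and prompt are untouched.

```rust
// Hypothetical sketch (NOT the real Codex CLI source): a "fast" flag that
// only changes how the request is routed, not what is computed.
fn build_request_body(model: &str, prompt: &str, fast: bool) -> String {
    // When fast is on, append a priority hint; everything else is identical.
    let tier = if fast { ",\"service_tier\":\"priority\"" } else { "" };
    format!("{{\"model\":\"{}\",\"input\":\"{}\"{}}}", model, prompt, tier)
}

fn main() {
    let normal = build_request_body("some-model", "hi", false);
    let fast = build_request_body("some-model", "hi", true);
    // Same model and prompt in both bodies; only the routing field differs.
    assert!(fast.contains("\"service_tier\":\"priority\""));
    assert!(!normal.contains("service_tier"));
    println!("ok");
}
```

Same inputs, same model string, one extra field: that's why quality and latency-per-token don't change, only your place in the queue.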
So does cutting in line actually make things faster? That depends on how long the line is: during peak hours you might notice a difference; off-peak, almost none.
Claude Code also has a fast mode with the same name, but it's fundamentally different: it runs the same model under a different serving configuration, so output is about 2.5x faster, but tokens cost a full 6x more 💸
One is free queue-jumping, the other is burning money for speed. Truly one word, "fast," with two meanings.
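Quick arithmetic on the Claude Code numbers quoted above (2.5x faster output, 6x token price; figures are from the post, not measured here): per unit of output delivered per unit of time, you end up paying 6 / 2.5 = 2.4x.

```rust
// Worked arithmetic using the multipliers quoted in the post.
fn fast_mode_cost_per_speed(cost_multiplier: f64, speedup: f64) -> f64 {
    // Paying 6x per token but receiving tokens 2.5x sooner means each
    // unit of output-per-time costs cost_multiplier / speedup as much.
    cost_multiplier / speedup
}

fn main() {
    let ratio = fast_mode_cost_per_speed(6.0, 2.5);
    assert!((ratio - 2.4).abs() < 1e-9);
    println!("{}", ratio); // 2.4
}
```

In other words, even after accounting for the speedup, every second saved still costs a hefty premium; that's the "burning money" half of the comparison.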