Anonymous Model HappyHorse Tops AI Video Blind Test; Alibaba's Taotian and Sand.ai Under Scrutiny


According to monitoring by 1M AI News, an anonymous model named HappyHorse-1.0 topped the Video Arena leaderboard on the AI video evaluation platform Artificial Analysis last week, taking first place in both the text-to-video and image-to-video categories (excluding audio) and pushing ByteDance's Seedance 2.0 into second place. In the audio category, Seedance 2.0 still leads by a narrow margin. The model arrived with no press conference, no technical blog, and no company attribution, and no one has yet publicly claimed it.

The Video Arena leaderboard is based on an Elo blind-testing system: users are shown two generated videos and vote for the one they prefer without knowing which model produced either. HappyHorse has been on the leaderboard only a short time, with a sample size of about 3,500 votes, less than half of Seedance 2.0's, which leaves it with a wide confidence interval (±12-13 points). Its lead in the no-audio categories, however (approximately 76 points for text-to-video and about 48 points for image-to-video), far exceeds that margin of error.

Based on the order of languages on the official website (Chinese and Cantonese listed before English) and the name "HappyHorse" referencing 2026 as the Year of the Horse, industry insiders speculate that the model originates from a Chinese team. Two mainstream theories have emerged:

1. Several industry media outlets claim the model comes from the Future Life Lab of Alibaba's Taotian Group, led by Zhang Di, who previously served as Vice President of Technology at Kuaishou, led the development of Keling AI starting in 2024, released the Keling 2.0 Master Edition in April 2025, and returned to Alibaba in November of the same year.
2. User Vigo Zhao conducted a detailed comparison and found that HappyHorse matches multiple benchmark results of daVinci-MagiHuman, which AI video startup Sand.ai open-sourced in March of this year, and that the structure of the two official websites is highly similar.
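The Elo mechanics behind such a leaderboard can be sketched as follows. This is an illustrative toy, not the platform's actual implementation: the K-factor, starting ratings, and function names are all assumptions.

```python
# Toy sketch of Elo-style blind-test scoring: each anonymous pairwise vote
# updates both models' ratings. K-factor and ratings are illustrative only.
def expected_score(r_a, r_b):
    # Probability that model A wins a pairwise vote under the Elo model.
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))

def update(r_a, r_b, a_won, k=16):
    # Move each rating toward the observed outcome, proportionally to surprise.
    e_a = expected_score(r_a, r_b)
    s_a = 1.0 if a_won else 0.0
    delta = k * (s_a - e_a)
    return r_a + delta, r_b - delta

# A 76-point Elo gap corresponds to roughly a 61% head-to-head win rate:
print(round(expected_score(1276, 1200), 3))
```

Under this model, a 76-point lead like HappyHorse's in text-to-video implies it wins about 61% of blind pairwise votes against Seedance 2.0, which is why such a gap can dwarf a ±12-13-point confidence interval even with a modest sample size.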
Sand.ai was founded by Cao Yue, first author of the Swin Transformer paper, and is referred to in the industry as the 'DeepSeek of AI video.' According to HappyHorse's official website, the model has 15 billion parameters and 40 Transformer self-attention layers, uses a Transfusion architecture (which unifies autoregressive text prediction and diffusion-based video and audio generation within a single model), runs 8-step inference, outputs 1080p video with synchronized audio, and supports lip-sync in seven languages: Chinese, English, Japanese, Korean, German, French, and Cantonese. It is fully open-source and permits commercial use.
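The core idea of a Transfusion-style architecture is per-position loss routing over one mixed sequence: discrete text tokens get a next-token cross-entropy loss, while continuous video/audio latents get a denoising (diffusion) loss. The sketch below illustrates only that routing; all numbers, shapes, and names are toy values invented for illustration, not HappyHorse's actual training code.

```python
# Toy sketch of Transfusion-style loss routing: one sequence mixes discrete
# text tokens and continuous video/audio latents; each position is trained
# with the loss matching its modality. All values are illustrative.
import math

def cross_entropy(logits, target):
    # Next-token loss for a discrete text position (numerically stable log-sum-exp).
    m = max(logits)
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_z - logits[target]

def diffusion_mse(pred_noise, true_noise):
    # Denoising loss for a continuous latent position: MSE against true noise.
    return sum((p - t) ** 2 for p, t in zip(pred_noise, true_noise)) / len(pred_noise)

# A toy mixed sequence: ("text", logits, target) or ("latent", predicted, true noise).
sequence = [
    ("text",   [2.0, 0.5, -1.0], 0),
    ("latent", [0.1, -0.2], [0.0, -0.1]),
    ("text",   [0.0, 1.5, 0.3], 1),
]

total = 0.0
for kind, a, b in sequence:
    total += cross_entropy(a, b) if kind == "text" else diffusion_mse(a, b)

avg_loss = total / len(sequence)
print(round(avg_loss, 4))
```

In a real model both heads would sit on top of the same Transformer backbone; the routing shown here is what lets a single network be trained on text and video/audio jointly.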
