- ByteDance froze Seedance 2.0's global rollout on March 16, 2026 after the Motion Picture Association and studios including Disney, Warner Bros., and Paramount sent cease-and-desist letters citing large-scale copyright infringement.
- OpenAI shut the Sora consumer app on April 26, 2026. The API sunsets September 24, 2026. The reason: the economics of running large video generation at consumer scale did not work.
- Alibaba launched HappyHorse 1.1 on June 22, 2026 with zero-drift lip sync, a new video editing modality, and context-aware speech pacing. HappyHorse 1.0 already holds No. 2 globally on the Artificial Analysis text-to-video leaderboard at 1,444 Elo.
- Alibaba Cloud opened two Paris data center availability zones on June 17, 2026 as part of a $52.7 billion global infrastructure investment, giving HappyHorse EU-compliant deployment options for the first time.
- The roughly three-month window between Seedance's global freeze and HappyHorse 1.1's launch is not coincidence. The competitive field for enterprise-grade, globally-deployable AI video has compressed to a much shorter list than it was at the start of 2026.
At the start of 2026, the enterprise AI video generation market had several serious global contenders. By June 22, 2026, two of the most credible ones had exited the international market within 90 days of each other.
OpenAI announced it was discontinuing Sora on March 24, 2026. The consumer app went dark on April 26. The API runs through September 24, 2026, and then it is gone. ByteDance froze Seedance 2.0's planned global rollout on March 16, 2026 after the Motion Picture Association issued a public statement and studios sent cease-and-desist letters. Seedance 2.0 remains active inside China, but the international API launch ByteDance had planned simply did not happen.
On June 22, 2026, Alibaba Cloud launched HappyHorse 1.1 into that gap. This post is about what actually caused the exits, what HappyHorse 1.1 brings technically, and what the compressed competitive field means for teams building AI video workflows.
How the Vacuum Formed
The two exits share no common cause. They are different failure modes that happened to produce the same outcome: fewer viable options for teams that need a global enterprise AI video API.
Sora: The Economics of Scale Did Not Work
Sora's shutdown was an economics problem, not a quality problem. The model could generate convincing video. The issue was what it cost to do so at consumer scale.
Generating video requires significantly more compute than generating text or images. A single high-quality video clip involves rendering hundreds of frames at high resolution, managing temporal coherence across those frames, and doing it faster than a user is willing to wait. The hardware cost per output second is orders of magnitude higher than it is for language generation. Sora's pricing did not recover those costs at the subscription tier OpenAI had set.
OpenAI announced the shutdown on March 24, citing the need to focus on sustainable products. The Sora consumer app closed April 26. The API continues through September 24, 2026: existing integrations get a wind-down window, but no new development is planned. We covered the full breakdown of the shutdown economics in our Sora shutdown analysis.
What makes this notable is that Sora was not a marginal product. It was the most publicly visible AI video model in the world, the model that generated headlines for two years before it launched, and that shaped how people thought about AI-generated video. Its exit tells you something about how hard the unit economics of this category are to get right.
Seedance 2.0: Hollywood Said No
Seedance 2.0's situation is structurally different. ByteDance launched the model inside China on February 12, 2026 to strong reception. The plan was a global rollout in mid-March. That plan ended on March 16.
The proximate trigger: a viral clip generated with Seedance 2.0 showing Brad Pitt and Tom Cruise in a rooftop fight. The clip spread widely. Hollywood's legal teams moved quickly. The Motion Picture Association issued a public statement accusing ByteDance of "large-scale unprotected copyright use." Disney's lawyers reportedly described the model's outputs as a "virtual smash-and-grab of Disney's IP." Warner Bros. Discovery, Paramount Skydance, Netflix, and Sony Pictures all sent separate cease-and-desist letters, according to reporting by TechCrunch and Dataconomy.
US senators Marsha Blackburn and Peter Welch wrote to ByteDance calling Seedance "the most glaring example of copyright infringement from a ByteDance product to date" and demanded an immediate shutdown. ByteDance responded by freezing the international rollout and promising to implement stronger IP safeguards. As of this writing, Seedance 2.0 operates only within China's domestic short-drama market at pricing of 28 to 46 yuan per million tokens. A global API is not available.

The two cases make an interesting contrast. Sora failed because the compute required to run it was more expensive than users would pay for. Seedance 2.0 stalled because it generated outputs that Hollywood could not accept. Both outcomes narrowed the field of globally-available enterprise AI video models significantly.
Who Built HappyHorse 1.0, and Why They Kept It Secret
HappyHorse 1.0 arrived in early April 2026 under a name with no corporate affiliation. It entered the Artificial Analysis leaderboard anonymously, climbed to No. 2 in text-to-video with a 1,444 Elo score, and attracted significant attention from the AI video research community before anyone knew who made it.
On April 10, 2026, Alibaba confirmed it was the creator. The model came from Alibaba's Taotian Future Life Lab, led by Zhang Di, formerly VP of Kuaishou and the head of the Kling AI technology team, who joined Alibaba at the end of 2025, according to CNBC's reporting. The anonymous launch was deliberate: Alibaba wanted benchmark validation before attaching its corporate identity to the product.
HappyHorse 1.0 is a 15-billion-parameter unified Transformer that generates synchronized video and audio in a single forward pass. It outputs 1080p video in approximately 38 seconds on a single H100 GPU. Native lip sync works across seven languages out of the box. The model became available via FAL.ai on April 27, 2026, with early-access pricing and a 10% discount for API integrators, according to the FAL press release.
What HappyHorse 1.1 Adds
HappyHorse 1.1, released June 22, 2026, is an enterprise-focused upgrade. Three capabilities are new relative to 1.0.
First, zero-drift lip sync for dialogue scenes. This addresses one of the most visible failure modes in AI video: avatar or talking-head content where lip movement diverges from the audio over the course of a sentence or paragraph. The 1.1 release specifically targets extended dialogue rather than short clips, which matters for product demos, training content, and anything that runs longer than 10 to 15 seconds.
Second, a new video editing modality. HappyHorse 1.0 supports text-to-video, image-to-video, and reference-video generation, three ways to get video out of the model. Version 1.1 adds a fourth: video editing mode, where the model takes an existing clip as input and modifies it per natural-language instructions. This is the category that tools like Runway and Adobe have been building toward, and its inclusion means HappyHorse 1.1 can now sit inside editing workflows, not just generation workflows.
Third, context-aware speech pacing. Generated voiceover has traditionally produced speech at a fixed or manually-specified pace. Context-aware pacing adapts delivery speed to the content's emotional register: faster for energetic content, slower for explanatory or instructional scenes. For teams generating training videos or product explainers at scale, this reduces the manual voiceover correction pass that often follows AI generation.
HappyHorse 1.1 is available now on Alibaba Cloud Model Studio with full API access. A 40% launch discount applies for the first two weeks, according to VentureBeat's coverage.
Where HappyHorse 1.0 Sits on the Leaderboard
The Artificial Analysis Video Arena uses blind human preference votes to score models on Elo, the same method used in competitive chess. A higher Elo means more head-to-head preference wins against other models in the dataset. HappyHorse 1.0 holds 1,444 Elo in both text-to-video and image-to-video as of June 2026, No. 2 globally, tied with Grok Imagine 1.5 in text-to-video and ahead of Google Veo 3.1 with audio by 69 points in the T2V category.
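To make the 69-point gap concrete, here is a minimal sketch that converts an Elo difference into an expected head-to-head preference rate. It assumes the standard chess-style Elo formula with a 400-point scale; the arena's exact rating method may differ, and the Veo 3.1 rating below is simply derived from the 69-point gap stated above rather than read from the leaderboard.

```python
def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Expected share of head-to-head wins for A over B under standard Elo."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


happyhorse_elo = 1444
# Assumption: derived from the 69-point gap cited in the text.
veo_elo = happyhorse_elo - 69

win_rate = elo_expected_score(happyhorse_elo, veo_elo)
print(f"Expected preference rate: {win_rate:.1%}")  # Expected preference rate: 59.8%
```

Under that assumption, a 69-point lead corresponds to winning roughly six out of ten blind comparisons: a clear edge, but not a rout.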

The leaderboard position matters here for a specific reason. When Sora exited, teams building enterprise video workflows needed a replacement that met a quality bar roughly equivalent to what Sora could do. When Seedance 2.0 froze globally, the same question applied. HappyHorse 1.0 clearing the No. 2 position on a global benchmark, before 1.1 shipped, makes it a credible technical substitute, not just a cheaper alternative.
The Infrastructure Bet Behind This
On June 17, 2026, five days before HappyHorse 1.1 launched, Alibaba Cloud opened its first data centers in France, two availability zones in the Paris region. The France expansion brings Alibaba's global footprint to 105 availability zones across 32 regions, backed by a $52.7 billion global infrastructure investment announced earlier in 2026, according to Data Center Dynamics.
The timing is not accidental. Enterprise AI video in Europe requires data residency options that comply with GDPR. A China-headquartered company offering a model exclusively through Chinese cloud regions has a harder sales motion with European enterprise buyers than one that can say the data stays in France. The Paris region opens that conversation.
This is a meaningful structural difference from where Seedance 2.0 sits. Seedance operates inside China's domestic market. For teams in the EU or US evaluating enterprise AI video options, that is not the same deployment profile as a model with EU-region availability and a compliant data infrastructure story.
What the Competitive Field Looks Like Now
The enterprise AI video API market at the start of 2026 had OpenAI, Google, Runway, Kling, Alibaba, and ByteDance all in various stages of global offering. By June 2026, the list of models with a globally-accessible enterprise API that clears a reasonable quality bar looks like this:

Google Veo 3.1 remains a strong competitor: global API, audio-native, high quality. Runway Gen-4.5 still operates, though without native audio. Grok Imagine 1.5 sits at No. 1 on the leaderboard but is image-to-video only in the current API, limiting it for text-first workflows.
What HappyHorse 1.1 has that the others currently do not: the combination of global API access, audio in a single forward pass, and native multilingual lip sync in one model. That combination is why it occupies a distinct position in the current field rather than being just another capable model.
The Lip Sync Question Is Bigger Than It Looks
Lip sync quality is one of those features that looks like a nice-to-have until you try to use AI-generated video in a professional context. Then it becomes the blocker.
Any video that puts a face speaking to camera (a product demo with a presenter, a training video with an avatar instructor, a customer-facing explainer with a spokesperson) requires lip movements that match the audio. When they do not match, the uncanny valley effect kicks in immediately. Viewers notice in the first five seconds. Trust in the content drops.
HappyHorse 1.0's native lip sync across seven languages was already differentiating. The "zero-drift" improvement in 1.1 specifically addresses extended dialogue, the kind that runs for a minute or two in a training or onboarding video rather than the short clips most models are benchmarked on. That is a different problem from short-clip lip sync, and solving it expands the use case from social content into business communication workflows.
This is the same capability trend that makes talking-head lip sync one of the most in-demand features for enterprise video platforms. Teams building on the AI video generation layer for business use (training, product demos, customer education) need lip sync that holds across an entire scene, not just the first sentence. That is what HappyHorse 1.1 is targeting.
What the Pricing Landscape Looks Like
The compute economics that killed Sora are also the context for every pricing decision in this space. Here is where the main enterprise video generation APIs currently sit on cost per second of output at high resolution:

The reason Sora's economics did not work starts to become clearer when you look at this range. Google Veo 3.1 at approximately $0.40 per second of output is priced for enterprise integrators who can pass that cost through, not for consumer subscription models. A 60-second generated video costs $24 in API fees at that rate. At consumer subscription prices of $20 to $30 per month, the math collapses almost immediately for any meaningful usage volume.
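The arithmetic above can be sketched in a few lines. This uses the approximate $0.40 per second figure and the $20 to $30 subscription range from the text; the function names are illustrative only, and the calculation ignores every cost other than the per-second API fee.

```python
def clip_cost(price_per_second: float, seconds: float) -> float:
    """API fee for one generated clip."""
    return price_per_second * seconds


def seconds_covered(monthly_fee: float, price_per_second: float) -> float:
    """Seconds of generated video a monthly fee pays for at API rates."""
    return monthly_fee / price_per_second


PRICE_PER_SECOND = 0.40  # approximate Veo 3.1 rate cited in the text

one_minute = clip_cost(PRICE_PER_SECOND, 60)
print(f"60-second clip: ${one_minute:.2f}")  # 60-second clip: $24.00

for fee in (20, 30):
    covered = seconds_covered(fee, PRICE_PER_SECOND)
    print(f"${fee}/month covers about {covered:.0f} seconds of output")
```

At these rates a $20 subscription pays for under a minute of output per month, which is the gap the paragraph above describes.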
HappyHorse 1.0's 38-second render time on a single H100 at spot pricing anchors the cost at a significantly lower point than GPU-cluster-intensive alternatives. The 40% launch discount on HappyHorse 1.1 for the first two weeks is a market development move, not a signal of what the long-term price will be, but it does accelerate API adoption among teams evaluating post-Sora alternatives.
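As a rough sketch of how render time translates into hardware cost, the function below multiplies GPU time by an hourly rate. The 38-second figure comes from the text; the hourly rate in the example is a hypothetical placeholder, not a quoted H100 price, and the result is raw GPU cost per render, not an API price or a cost per second of output.

```python
def gpu_cost_per_render(hourly_rate_usd: float, render_seconds: float = 38.0) -> float:
    """Raw GPU cost of one render: hourly rate times the fraction of an hour used."""
    return hourly_rate_usd * render_seconds / 3600.0


# Hypothetical placeholder rate; substitute the price you actually pay per GPU-hour.
example_rate = 2.00
cost = gpu_cost_per_render(example_rate)
print(f"GPU cost per 38-second render at ${example_rate:.2f}/hour: ${cost:.4f}")
```

The useful part is the shape of the calculation: the cost scales linearly with both render time and hourly rate, so a single-GPU, sub-minute render keeps the per-clip hardware cost small relative to multi-GPU alternatives.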
What This Means for Teams Building AI Video Workflows
If your workflow currently routes through Sora's API, you have until September 24, 2026 before that route closes. The practical task is selecting an alternative and testing it at your usage volumes before then.
The three candidates that are genuinely globally available with a production-grade API as of June 2026 are HappyHorse 1.1, Google Veo 3.1, and Runway Gen-4.5. Grok Imagine 1.5 is worth evaluating if your workflow is image-to-video. Each covers different ground on quality, pricing, and capability profile.
The broader argument from the leaderboard data, covered in our analysis of the Grok Imagine 1.5 leaderboard shift, is that the No. 1 spot on the AI video generation leaderboard has changed hands four times in six months. Committing deeply to a single model at the generation layer is a structural risk when the leader changes that often. Teams with routing logic that lets them swap the underlying model absorb those changes without rebuilding the application layer. Teams locked into one provider do not.
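One way to picture that routing logic is a thin layer that tries providers in preference order and falls back when one fails. This is a minimal sketch under stated assumptions: `VideoRequest`, `VideoRouter`, and the stub backends are hypothetical names invented for illustration, and a real integration would wrap each provider's actual SDK or HTTP API behind the same callable interface.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class VideoRequest:
    prompt: str
    duration_seconds: int


# A backend is any callable that takes a request and returns a result reference.
Backend = Callable[[VideoRequest], str]


class VideoRouter:
    """Tries registered backends in preference order, falling back on failure."""

    def __init__(self, preference: List[str]) -> None:
        self._preference = list(preference)
        self._backends: Dict[str, Backend] = {}

    def register(self, name: str, backend: Backend) -> None:
        self._backends[name] = backend

    def generate(self, request: VideoRequest) -> Tuple[str, str]:
        errors = []
        for name in self._preference:
            backend = self._backends.get(name)
            if backend is None:
                continue  # listed in the preference order but not registered
            try:
                return name, backend(request)
            except Exception as exc:  # any provider failure triggers fallback
                errors.append(f"{name}: {exc}")
        raise RuntimeError("all backends failed: " + "; ".join(errors))


# Stub backends standing in for real provider clients (hypothetical).
def sunset_backend(request: VideoRequest) -> str:
    raise ConnectionError("API retired")


def working_backend(request: VideoRequest) -> str:
    return f"video-for:{request.prompt}"


router = VideoRouter(preference=["provider_a", "provider_b"])
router.register("provider_a", sunset_backend)
router.register("provider_b", working_backend)

used, result = router.generate(VideoRequest("product demo intro", 10))
print(used, result)  # provider_b video-for:product demo intro
```

The application code only ever calls `router.generate`, so retiring a provider or promoting a new one is a change to the preference list and one registration, not a rewrite.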
FAL.ai, which is already a primary routing layer for a number of AI video platforms, added HappyHorse 1.0 as an official API partner on April 27, 2026. That means infrastructure that already routes through FAL gains access to HappyHorse 1.1 without separate integration work as the new version rolls out. The supply-layer improvements described in this post flow through to platforms already connected to the FAL routing layer.
For teams not yet thinking at the routing level: the question is worth asking now. The AI video generation statistics for 2026 consistently show usage spreading across providers rather than concentrating on one. The field lost two major players in Q1 2026. It could lose others, or see them change materially, in Q3.
The Copyright Problem Is Not Solved
Seedance 2.0's legal situation carries a warning the industry has not fully priced in. Generating video of real celebrities (recognizable faces, voices, mannerisms) at scale creates legal exposure that does not dissolve because the output is AI-generated. The Hollywood studios that sent ByteDance cease-and-desist letters are not going away.
HappyHorse 1.1's zero-drift lip sync for dialogue addresses a different problem from the one that got Seedance in trouble: it is about sync quality for licensed or original characters, not about replicating celebrities. But the broader question of what AI video models were trained on, and what outputs they can or cannot produce, is going to be a factor for every enterprise buyer evaluating these tools in 2026 and 2027.
The current AI video disclosure law landscape is also evolving. New York's AI disclosure requirements, the EU AI Act's transparency rules, and similar legislation in other jurisdictions add compliance surface area to any enterprise AI video deployment. The teams best positioned for that environment are the ones who have documented their training data provenance and built model selection with legal risk in mind from the start.
The Short Version
The enterprise AI video market contracted meaningfully between March and June 2026. Sora failed because the compute economics of consumer video generation did not close. Seedance 2.0 froze globally because Hollywood moved legally, and ByteDance was not in a position to fight that battle while already under US regulatory scrutiny.
HappyHorse 1.1 launched into that contracted field with a genuine technical story: No. 2 globally on a competitive benchmark before any corporate branding was attached, a single-pass audio architecture that avoids the alignment errors of post-processing, zero-drift lip sync for extended dialogue, and EU infrastructure with France-region availability. It is the only model in the current field that combines global API access with native multilingual lip sync.
Whether Alibaba can hold that position as Grok, Google, and Runway continue shipping updates is a different question. Based on the pace of leaderboard changes in 2026, the answer will probably be visible by September. But for teams evaluating enterprise AI video options right now, HappyHorse 1.1 is on the short list, and the reasons it got there have more to do with its competitors' exits than with any marketing push from Alibaba.
If you need to turn a script or source material into finished produced video today, ngram lets you do that, including avatar-led content with lip sync, built on the same AI generation layer whose advances this post covers.
Frequently Asked Questions
What is HappyHorse 1.1?
HappyHorse 1.1 is Alibaba Cloud's enterprise AI video generation model, launched June 22, 2026. It upgrades HappyHorse 1.0 with zero-drift lip sync for extended dialogue scenes, a new video editing modality (a fourth modality alongside text-to-video, image-to-video, and reference-video), and context-aware speech pacing. It is available via Alibaba Cloud Model Studio with full API access. HappyHorse 1.0 holds No. 2 on the Artificial Analysis text-to-video leaderboard with 1,444 Elo.
Why did Sora shut down?
OpenAI shut down the Sora consumer app on April 26, 2026, citing unsustainable economics. Generating high-quality video at consumer subscription prices does not recover the compute cost of running large video generation models at scale. The Sora API continues through September 24, 2026 for existing integrators, after which it will be discontinued.
Why did ByteDance freeze Seedance 2.0's global launch?
ByteDance froze Seedance 2.0's planned global rollout on March 16, 2026 following a Motion Picture Association statement and cease-and-desist letters from Disney, Warner Bros. Discovery, Paramount Skydance, Netflix, and Sony Pictures, alleging large-scale copyright infringement in the model's training data. A viral clip showing AI-generated footage of real celebrities triggered the legal response. Seedance 2.0 remains active in China's domestic market.
Who is behind HappyHorse?
HappyHorse comes from Alibaba's Taotian Future Life Lab, led by Zhang Di, who was formerly VP of Kuaishou and the head of the Kling AI technology team before joining Alibaba at the end of 2025. Alibaba launched HappyHorse 1.0 anonymously in early April 2026 and revealed its identity on April 10.
Is HappyHorse 1.1 available on FAL.ai?
HappyHorse 1.0 became available on FAL.ai on April 27, 2026 as an official API partner. HappyHorse 1.1 availability on FAL follows the June 22 launch; integrations already running through FAL's routing layer gain access to the updated model without separate integration work.
What does zero-drift lip sync mean?
Zero-drift lip sync means the model maintains accurate synchronization between lip movement and spoken audio across an extended dialogue scene, not just the first few seconds. Standard AI lip sync often drifts over longer clips, with mouth movements that fall out of alignment with the audio as the scene progresses. The zero-drift capability in HappyHorse 1.1 targets this specifically for dialogue scenes, relevant for training videos, product presenters, and any talking-head content that runs longer than a short clip.