Runway shipped Gen-4 on May 3 with frame-by-frame synchronized audio generation — soundscapes, dialogue, and environmental effects without post-production dubbing. The bigger story for power users: new API hooks enable hybrid pipelines that chain Gen-4 with Veo 3.1, Seedance, or other models for sequence-level routing, letting you pick the best engine per shot rather than locking into one ecosystem.
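For anyone wondering what per-shot routing looks like in practice, here is a minimal sketch. Everything in it is an assumption for illustration: the `Shot` fields, the engine labels, and the routing rules are hypothetical placeholders, not Runway's API or a recommendation of which model suits which shot.

```python
from dataclasses import dataclass


@dataclass
class Shot:
    shot_id: str
    needs_dialogue: bool
    style: str  # hypothetical tag, e.g. "live_action" or "anime"


def route_shot(shot: Shot) -> str:
    """Return a placeholder engine label for one shot.

    The rules are arbitrary examples; swap in your own benchmarks.
    """
    if shot.needs_dialogue:
        return "gen-4"      # assumed choice: the item above cites synchronized audio
    if shot.style == "anime":
        return "seedance"   # arbitrary placeholder rule
    return "veo-3.1"        # arbitrary default


def plan_sequence(shots: list[Shot]) -> dict[str, str]:
    """Map every shot id in a sequence to the engine chosen for it."""
    return {shot.shot_id: route_shot(shot) for shot in shots}


if __name__ == "__main__":
    sequence = [
        Shot("s1", needs_dialogue=True, style="live_action"),
        Shot("s2", needs_dialogue=False, style="anime"),
        Shot("s3", needs_dialogue=False, style="live_action"),
    ]
    print(plan_sequence(sequence))
```

The point of keeping the routing table in one function is that swapping engines after a new benchmark is a one-line change, with no pipeline rewrite.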
MiniMax just evolved its Hailuo Video Agent into the full "Media Agent," supporting comprehensive multimodal creation across text, image, video, music, and audio — launched worldwide today. Hailuo 2.3 ships alongside it with significantly improved physics simulation, anime/illustration stylization, and facial micro-expressions, all at the same price as Hailuo 02.
The mystery model that appeared anonymously on Artificial Analysis in early April and climbed to #1 in both text-to-video (Elo 1355) and image-to-video (Elo 1165) has been revealed as a project from Alibaba's ATH AI Innovation Unit. It's a 15B-parameter single-stream Transformer with native 1080p output, integrated audio-video generation, and multi-shot cinematic control. Open-weights status is TBD, so watch this one.
Instagram expanded its original content policy on May 5, nuking reach for accounts that primarily repost or aggregate. Aggregator accounts with millions of followers saw their reach wiped overnight. The algorithm now requires "most" content in a 30-day window to be original, and low-effort edits like watermarks or speed changes don't count. Human-centric, face-to-camera content is outperforming polished AI visuals in the new ranking. AI creators distributing on IG need to rethink their content mix immediately.
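To sanity-check your own content mix against the 30-day rule, a rough audit like the one below may help. It assumes "most" means more than half of posts, which Instagram has not defined precisely here, and the post records and dates are made up for illustration.

```python
from datetime import date, timedelta


def original_share(posts, today, window_days=30):
    """Fraction of posts in the trailing window that are marked original.

    Returns None when there are no posts in the window.
    """
    cutoff = today - timedelta(days=window_days)
    recent = [p for p in posts if cutoff < p["date"] <= today]
    if not recent:
        return None
    return sum(1 for p in recent if p["original"]) / len(recent)


def passes_most_original(posts, today, threshold=0.5):
    """Assumed reading of the rule: strictly more than `threshold` is original."""
    share = original_share(posts, today)
    return share is not None and share > threshold


if __name__ == "__main__":
    # Hypothetical posting history; "original" is your own manual label.
    history = [
        {"date": date(2025, 5, 1), "original": True},
        {"date": date(2025, 4, 20), "original": False},
        {"date": date(2025, 4, 10), "original": True},
        {"date": date(2025, 3, 1), "original": True},  # outside the window
    ]
    today = date(2025, 5, 5)
    print(original_share(history, today), passes_most_original(history, today))
```

Remember that reposts with a watermark or speed change should be labeled as not original when you tag your history.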
xAI's Grok Imagine Video now sits at Elo 1233 for text-to-video on Artificial Analysis, just behind Kling 3.0. As of May 6, Quality Mode is live for enterprise API users, with higher realism, stronger text rendering, and better creative control. The unified API generates video+audio from text or image input with restyling and object editing — worth benchmarking if you're already in the xAI ecosystem.
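To put leaderboard gaps in perspective, the textbook Elo formula converts a rating difference into an expected head-to-head preference rate. This is a sketch under the assumption that the standard 400-point logistic scale applies; Artificial Analysis may compute its ratings differently.

```python
def elo_expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Expected share of pairwise votes won by A over B under standard Elo."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))


if __name__ == "__main__":
    # Text-to-video ratings quoted in this issue: 1355 (current #1) vs 1233 (Grok Imagine Video).
    print(round(elo_expected_score(1355, 1233), 3))
```

Under that assumption, a 122-point gap means the higher-rated model would be preferred in roughly two out of three matchups, so the gap is meaningful but far from a shutout.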
ComfyUI's V3 schema migration is rolling through core nodes — HunyuanVideo, Flux, SD3, upscale models, and LoRA extraction are all converted. Subgraphs are officially stable, letting you package node clusters into reusable LEGO blocks and share them without JSON files. App Mode converts spaghetti graphs into clean UIs with shareable links. Custom node authors: migrate before summer or your nodes will feel dated.
Alibaba's Wan 2.7 launched in late March as API-only with a novel "thinking mode" for video planning. Open weights have not dropped yet, breaking the pattern set by Wan 2.2's Apache 2.0 release. Based on Alibaba's historical cadence, community consensus expects weights by mid-to-late Q2 (i.e., any week now). This would be the biggest open-source video model drop of the year if it lands — keep your VRAM ready.
Lightricks' LTX 2.3 (March release, now widely adopted) remains the top open-weight video generation model: native 4K at 50fps with synchronized stereo audio at 24 kHz, running on a single RTX 4090. The rebuilt encoder delivers sharper textures and facial detail, and the model natively composes for 1080x1920 vertical framing instead of cropping horizontal footage. Commercial use is free below $10M ARR. If you're building local pipelines, this is your baseline.
YouTube's Partner Program remains fully open to AI-generated content, with a standard RPM of $0.03–$0.13 per 1K Shorts views. TikTok now auto-detects unlabeled AI content via C2PA and can reduce its distribution by 5–8% or remove it outright. Cross-platform creators face a compliance patchwork: TikTok mandates built-in AI labels, YouTube requires disclosure for realistic depictions, and Instagram's new system penalizes anything that looks aggregated. Label everything, everywhere.
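For a quick feel of what that Shorts RPM range means in dollars, here is a back-of-the-envelope calculator. The function name is hypothetical, and the default rates are simply the $0.03 and $0.13 figures quoted above; actual payouts vary by niche and region.

```python
def shorts_revenue_range(views: int, rpm_low: float = 0.03, rpm_high: float = 0.13):
    """Estimated (low, high) earnings in USD for a given number of Shorts views.

    RPM is revenue per 1,000 views, so earnings = views / 1000 * RPM.
    """
    thousands = views / 1000
    return thousands * rpm_low, thousands * rpm_high


if __name__ == "__main__":
    low, high = shorts_revenue_range(1_000_000)
    print(f"1M Shorts views: ${low:.2f} to ${high:.2f}")
```

At these rates, a million Shorts views lands somewhere between about $30 and $130, which is why volume and cross-platform distribution matter so much for AI-first channels.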