Meituan LongCat

@Meituan_LongCat • 5,632 subscribers

Videos

🚀 Introducing LongCat-Flash-Thinking-2601, a version built for deep and general agentic thinking.

✨ Highlights:

🤖 Top-Tier Agent Capabilities
🔹 Performance: Top-tier benchmark results (TIR / Agentic Search / Agentic Tool Use); superb generalization ability, outperforming Claude in complex, random tasks
🔹 Env Scaling: Multiple automatically constructed high-quality environments; dense dependency graph
🔹 Multi-Env RL: Extended DORA (our RL infra), supporting large-scale multi-environment agentic training

🛡️ Real-World Robustness
🔹 Performance: Solid performance in messy, uncertain scenarios (Vita-Noise & Tau^2-Noise)
🔹 Noise Analysis: Systematically analyzed real-world noise in agentic scenarios
🔹 Curriculum RL: Increasing noise type & intensity while training

🎯 Heavy Thinking Mode
🔹 Parallel Thinking: Expands breadth via multiple independent reasoning tracks
🔹 Iterative Summarization: Enhances depth by using a summary model to synthesize outputs, supporting iterative reasoning loops

📅 One more thing: 1M-token context via Zigzag Attention is coming soon.

🔍 Try it now:
✅ API access for this version is also available.
Hugging Face:
GitHub:

Meituan LongCat

82,449 views • 6 months ago
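The Heavy Thinking Mode in the LongCat-Flash-Thinking-2601 announcement is described only at a high level: several independent reasoning tracks, then a summary model that synthesizes their outputs, repeated in a loop. The Python sketch below is one possible reading of that description, not LongCat's implementation. The names `heavy_think`, `reason`, `summarize`, `n_tracks`, and `n_rounds` are hypothetical, and the two toy functions stand in for real model calls.

```python
from typing import Callable, List


def heavy_think(
    question: str,
    reason: Callable[[str, int], str],
    summarize: Callable[[str, List[str]], str],
    n_tracks: int = 4,
    n_rounds: int = 2,
) -> str:
    """Run independent reasoning tracks, summarize them, and repeat.

    Assumption: `reason` and `summarize` wrap model calls supplied by the caller.
    """
    context = question
    summary = ""
    for _ in range(n_rounds):
        # Breadth: every track gets the same context but its own track index
        # (for example, a sampling seed). Tracks run sequentially here for
        # simplicity; they do not depend on each other, so they could run in parallel.
        drafts = [reason(context, track) for track in range(n_tracks)]
        # Depth: a summary step synthesizes the drafts into a single output.
        summary = summarize(question, drafts)
        # Feed the summary back so the next round can refine it.
        context = f"{question}\n\nPrevious summary:\n{summary}"
    return summary


# Toy stand-ins for model calls (hypothetical, for illustration only).
def toy_reason(context: str, track: int) -> str:
    return f"track {track}: read {len(context.splitlines())} line(s)"


def toy_summarize(question: str, drafts: List[str]) -> str:
    return f"{len(drafts)} drafts merged"


result = heavy_think("What is 2 + 2?", toy_reason, toy_summarize, n_tracks=3, n_rounds=2)
```

Passing the reasoning and summary steps in as callables keeps the loop independent of any particular model or API, which the announcement does not specify.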

Meet LongCat-Video-Avatar 1.5 🐱, our upgraded, open-source digital human framework. Built for real production, not just short demos.

What's New:
🔹 Upgraded Audio Encoder: Replaces Wav2Vec2 with Whisper-Large, yielding significantly smoother and more natural lip dynamics.
🔹 Production-Ready Stability: Achieves accurate lip-synchronization, full-body temporal stability, and robust long-video generation with strict identity consistency.
🔹 Stylized Domain Generalization: Robustly generalizes to anime, animals, and complex real-world conditions such as multi-person interactions and object handling.
🔹 Efficient 8-Step Inference: Advanced step distillation accelerates inference to 8 NFE (number of function evaluations), balancing cost-effective serving with exceptional visual fidelity.

📊 LongCat-Video-Avatar 1.5 performs strongly in realism, naturalness, and stability, outperforming leading open-source models and closed systems.

🐱 Avatar 1.5 framework is now open source:
🔗 Weights & Code:
🔗 HuggingFace:
🔗 Tech Report:
🔗 Project Page:

Meituan LongCat

30,965 views • 2 months ago

🚀 LongCat-Video Now Open-Source: Text/Image-to-Video + Video Continuation in One Model
🏆 Text/Image-to-Video Performance Hits Open-Source SOTA
🎬 Minutes-Long High-Quality Videos: No Color Drift/Quality Loss (Industry-Standout)
⚙ 13.6B Params | Strong Open-Source DiT-Based Unified Multitask Video Base Model
⚡ C2F Pipeline + Block Sparse Attention: 720p/30fps Video in Minutes
🤗 Open-Source Links:
GitHub:
Hugging Face:
Project Page:

Meituan LongCat

43,748 views • 8 months ago

Meet LongCat-Video-Avatar: a robust audio-driven avatar model that pushes the boundaries of long-form video generation. Compared with the previous InfiniteTalk, LongCat-Video-Avatar delivers far better long-sequence stability and realism.

New highlights:
⚙ Built on the LongCat-Video architecture, now supporting Audio-Text-to-Video (AT2V), Audio-Text-Image-to-Video (ATI2V), and Video Continuation modes.
🎭 Open-source SOTA Realism: Ranked 1st in overall anthropomorphism scores for both single and multi-subject scenarios in EvalTalker evaluations (492 participants, 3 independent raters per video).
♾ High-quality long videos: Cross-Chunk Latent Stitching prevents pixel degradation and error accumulation over time, ensuring seamless stitching quality.
🔒 Long-term consistency: Reference Skip Attention maintains ID consistency while eliminating rigid copy-paste artifacts.
🪄 Supports multi-person and infinite-length video generation.

🔗 Open-sourced Code:
Hugging Face:
Project:
Paper:

Meituan LongCat

28,041 views • 7 months ago
