
OpenBMB
@OpenBMB • 8,298 subscribers
OpenBMB (Open Lab for Big Model Base) aims to build foundation models and systems towards AGI. Connect with us: https://t.co/N9pevTnoOa

1/5 🚀 PilotDeck is now live — the open-source AI agent OS built for all scenarios! Built by TsinghuaNLP × ModelBest × OpenBMB × AI9stars, PilotDeck is here for full memory transparency, intelligent cost routing, and agents that never stop working for you. 🔗 Show Case: 💻 GitHub: One person. One fleet of agents. Ship something real. 🔥🧵
OpenBMB • 280,066 views • 9 days ago

1/5 MiniCPM-V 4.6 (1.3B) is now live 🚀🚀 High-res visual processing, optimized for consumer-grade and mobile hardware. We’ve leveraged the latest LLaVA-UHD v4 technique to cut vision encoding costs by 55%, enabling native edge deployment with extreme efficiency. 🔥 Beats Gemma4-E2B-it and Qwen3.5-0.8B across key multimodal and Artificial Analysis benchmarks — scoring higher than Qwen3.5-0.8B using just 2.5% of its token budget. ⚡ TTFT (75.7ms) 2.2x Faster than Qwen3.5-0.8B even with 3136² high-res images. 🏗️ ~1.5x Token Throughput compared with Qwen3.5-0.8B on a single RTX 4090. Try the model here: 🤗 Hugging Face: 💻 GitHub: 🔭 Modelscope: 🌐 Web Demo: 📱 App Demo:
OpenBMB • 350,749 views • 25 days ago

🚀 VoxCPM 2 is live! 🎉 Another open-source AI #TTS model from China, and one that stands shoulder to shoulder with Qwen3-TTS while bringing everything into a single unified model. After rapid iterations from V1 (zero-shot cloning) to V1.5 (long-form + fine-tuning), #VoxCPM has consistently pushed quality and usability forward. Now, VoxCPM 2 takes it further: 🔹30+ languages: truly global, truly local. 🔹Infinite voice design: type it, hear it, control it. From a whisper to a booming cinematic voice. 🔹Studio-grade audio: 48kHz ultra-high fidelity with emotional depth. 🔹Diffusion-Autoregressive cloning: preserves more acoustic and emotional detail than token-based models like Qwen3-TTS. 💡 Big shoutout to Grok: we used your multi-image video magic for our launch demo. It’s scarily good at keeping visuals consistent across shots. Elon Musk, this one’s for you. 😉 Check the demo & start cloning your dream voice: 🌐 Hugging Face Space: 🤗 Hugging Face Model: 🤖 ModelScope Model: 💻 GitHub: #TTS #AI #VoiceCloning #GrokImagine #ElonMusk #OpenBMB #VoxCPM
OpenBMB • 554,363 views • 2 months ago

🚀🚀 Excited to announce the technical report of MiniCPM-o 4.5! MiniCPM-o 4.5 transitions #AI interaction from traditional turn-based processing to a real-time, native full-duplex stream-based paradigm.
🌊 The Omni-Flow Framework
Instead of traditional VAD-based workarounds, we introduce the Omni-Flow framework. This unified stream paradigm aligns video, audio, and text on a synchronized millisecond timeline.
• Native Full-Duplex: Simultaneous perception and response.
• Proactive Interaction: Natively manages turn-taking without external VAD and supports proactive reminders.
📉 9B Scale, SOTA Performance
MiniCPM-o 4.5 demonstrates SOTA multimodal intelligence at its scale:
• Multimodal Benchmarks: Comparable to #Gemini 2.5 Flash on MMBench EN (87.6) and MathVista (80.1).
• Streaming Evaluation: 54.4% win rate on LiveSports-3K-CC, surpassing specialized models.
💻 The Ultimate Edge AI: Fully Functional Without a Network Connection
We are providing one-click installers for Windows (12 GB VRAM, RTX 5070) and macOS (M1-M5 Max / M5 Pro).
• Local API Support: Deploy your own inference server to integrate native full-duplex into custom apps.
• Free Access: We are offering free community API services for exploration.
• 100% Private: Your data never leaves your machine. Deploy in under 10 minutes. 🛠️👇
👐 Join the Open Future
The weights are open. The protocol is public.
📄 Technical Report: 💻 GitHub: 🤗 HuggingFace: 🌐 Web Demo:
#MiniCPMo #OpenSourceAI #EdgeAI #MachineLearning #ComputerVision #LLM
OpenBMB • 146,678 views • 1 month ago

🥳 Introducing MiniCPM-o 4.5: the first full-duplex omni-modal LLM in the open-source community 🎬🎙️ 🔥 Key Highlights: • Full-duplex Omni-modal Live Streaming: The model can see, listen, and speak simultaneously in a real-time conversation without mutual blocking. • Proactive Interaction: Moves beyond reactive QA to proactive interaction, such as initiating reminders. • Leading Performance: Scoring 77.6 on OpenCompass, it outperforms GPT-4o & Gemini 2.0 Pro in vision-language tasks with 9B params. The best part? You can experience all of the above on your PC! #MiniCPM #OpenSource #MultimodalAI #LLM
OpenBMB • 396,770 views • 4 months ago

From lab to open-source: A new milestone for AI-driven education. 🎓 🤗 We’ve been closely following the MAIC project at Tsinghua University, and we’re thrilled to see it now open-sourced as #OpenMAIC. ✨ This isn't just another chatbot; it takes Multi-Agent orchestration to the next level by building a fully interactive classroom where AI instructors and peers collaborate in real-time. What makes it technically impressive: 🛠️ Complex Orchestration: Leveraging #LangGraph to manage spontaneous interactions—like #AI students "raising hands" during a live lecture. 🧠 Structured Planning: A dedicated "Plan Agent" that transforms raw PDFs into coherent, logically sequenced pedagogical flows. 💻 Beyond Text: A masterclass in GenUI implementation, featuring synchronized TTS, laser pointers, and real-time whiteboard demonstrations. 🥳 If you’re building complex, multi-modal #Agent workflows, this repo is a treasure trove of engineering insights. 🖥️Explore the project: 📰 Read the research:
OpenBMB • 152,347 views • 2 months ago

🤔 The world’s best small models? We immediately compared Mistral-3-8B with our previous-gen model, MiniCPM-4.1 (both in thinking mode). 😂 The findings are compelling: ✅ MiniCPM is still ~2x faster, maintaining a massive speed lead ✅ It remains a full generation ahead in capabilities (excluding math/code) For developers prioritizing efficiency and speed, MiniCPM is undeniably the world's best small model.
OpenBMB • 260,271 views • 6 months ago

MiniCPM-o 4.5: Seeing, Listening, and Speaking — All at Once. 👁️👂🗣️ ✨Beyond traditional turn-taking, we’ve built a Native Full-Duplex engine that allows a 9B model to see, listen, and speak in one concurrent, non-blocking stream. Watch how it masters real-world complexity in real-time: 🔔 Proactive Auditory Interaction: Interrupts itself to alert you when it hears a "Ding!" while reading cards. 🎨 Temporal Flow Tracking: Follows your pen in real-time, narrating and "mind-reading" your drawing as you sketch. 🍎 Omni-Perception: Scans groceries & identifies prices on the fly. ✨Why it’s a category-leader: 📌Performance: Surpasses GPT-4o and Gemini 2.0 Pro on OpenCompass (Avg. 77.6). 📌Architecture: End-to-end fusion of SigLip2, Whisper, and CosyVoice2 on a Qwen3-8B base. 📌Efficiency: Full-duplex live streaming now runs locally on PCs via llama.cpp-omni. The era of "Wait-and-Response" AI is over. Proactive, real-time intelligence is now open-source. 🚀Experience it on Hugging Face: 🔗 #MiniCPM #Omnimodal #FullDuplex #EdgeAI #OpenSource #ComputerVision
OpenBMB • 115,189 views • 3 months ago

Why does a realistic voice matter? 🤔🤔 The same robot that feels creepy can transform into a trusted companion just by having a human-like voice. Think about the movie, Her. 🔥VoxCPM is a new paradigm: continuous, context-aware, and incredibly lifelike. ✅ Small size: 0.5B only ✅ Zero-Shot Voice Cloning ✅ Context-Aware, Expressive Speech Generation Try it 👉 🔗Huggingface | 🔗Github |
OpenBMB • 126,770 views • 8 months ago

🚀 Introducing MiniCPM-V 2.6! 🔥 1. Surpasses GPT-4V in single-image, multi-image, and video understanding 📸🎥 2. Outperforms GPT-4o mini and Gemini 1.5 on OpenCompass 🏆 3. Real-time video analysis on iPad 📱💨 Try out the best on-device multimodal LLM here! 👑 GitHub: Huggingface: #MLLM #MiniCPM
OpenBMB • 196,281 views • 1 year ago

💥 Introducing MiniCPM-o 2.6: an 8B, GPT-4o-level omni model that runs on device ✨ Highlights: ~ Matches GPT-4o-202405 in vision, audio, and multimodal live streaming ~ End-to-end real-time bilingual audio conversation ~ Voice cloning & emotion control ~ Advanced OCR & video understanding ~ Offline iPad-compatible multimodal live streaming 🔗 Try it out: GitHub: HF: Demo:
OpenBMB • 97,542 views • 1 year ago

🚀 Introducing MiniCPM-V 4.5 8B: pushing the boundary of multimodal AI! ~ SOTA VL Capability: Surpasses GPT-4o, Gemini 2.0 Pro, Qwen2.5-VL 72B on OpenCompass! ~ "Eagle Eye" Video: 96x visual token compression for high refresh rate and long video understanding ~ Controllable Hybrid Fast/Deep Thinking ~ Strong OCR & Doc Parsing: Surpasses GPT-4o & Gemini 2.5 on OmniDocBench Get ready for the future of multimodal AI 👉 Huggingface| Github| Gradio| #AI #MiniCPM #GPT #Gemini #OpenBMB #ArtificialIntelligence #MachineLearning
OpenBMB • 24,850 views • 9 months ago

🚀 Introducing AgentCPM-Explore: The First Open-Source 4B-Agent Model to Conquer GAIA & Complex Real-World Tasks!
🤗 Hugging Face: 🔗 GitHub:
✨ Key Highlights:
✅ SOTA Agentic Performance: Sets a new benchmark for 4B-scale agent models, outperforming all peers, surpassing 8B models, and rivaling select 30B+ and closed-source LLMs.
🧠 Deep Research Capability: Excels at long-horizon reasoning and supports 100+ turns of autonomous interaction with multi-source cross-validation, human-like self-correction, and dynamic tool use and strategy adaptation, just like a real researcher!
🔓 Full-Stack Open Source: We’re open-sourcing the entire end-to-end agent stack, not just the model! Empower your own innovations with:
- AgentRL: Asynchronous reinforcement learning framework
- AgentDock: Secure, extensible tool sandbox
- AgentToLeaP: A one-click evaluation platform for agent tool-learning capabilities
- Full training data pipeline & reproducible workflows
#AgentCPM #OpenSourceAI #AgenticAI #AI #GAIA #LLM #OpenBMB #AIAgents #HuggingFace
OpenBMB • 13,996 views • 4 months ago

Introducing MiniCPM 4.1-8B: First Open-Source Reasoning LLM with Trainable Sparse Attention ✅ Strong Reasoning Capability: Surpasses similar-sized models on 15 tasks! ✅ Fast Generation: 3x decoding speedup for reasoning ✅ Efficient Architecture: Trainable sparse attention, frequency-ranked speculative decoding Download Models: Huggingface: Github: Technical Report: #AI #MiniCPM #LLM #OpenBMB #ArtificialIntelligence #MachineLearning
OpenBMB • 19,070 views • 9 months ago