Qwen3.5 is here 🚀 397B params, just 17B active. Native multimodal agents for coding, reasoning, GUI + video. 200+ languages. Open weights. Real scale. The next frontier is open. 🔗

Hugging Models

49,632 subscribers

107,871 views • 3 months ago • via X (Twitter)

News & Politics • Science & Technology

Related Videos

🚨 ALIBABA JUST OPEN-SOURCED THE MOST POWERFUL MODEL IN THE WORLD - Qwen3.5 plus

And it changes the AI race forever. This is not another “bigger model” release. This is a complete architectural reset. 397B parameters → only 17B active at inference. Qwen3.5 plus delivers performance on par with GPT-5.2 and Gemini-3-Pro. A new Hybrid Sparse MoE + Linear Attention architecture + Multi-Token Prediction = near-instant long-context responses and massive speed gains.

Built as a true native multimodal model, it doesn’t just chat, it builds. With a single prompt, “Use the React framework to create a visually impressive Mapbox demo featuring locations in Beijing and Shanghai”, it generates a live interactive 3D map UI, structured components, real layouts, and production-ready frontend logic. This is real agentic, multimodal coding, not a static code block.

MMLU-Pro: 87.8
GPQA: 88.4
IFBench: 76.5
⚡ up to 8× faster inference

And it’s fully open-sourced under Apache-2.0. No locked APIs. No restricted tiers. 201 languages supported, bringing frontier AI to developers, startups, and researchers worldwide. The race is no longer about the biggest model. It’s about the most usable intelligence. Open source just entered the top tier.

RAVI KUMAR SAHU

379,387 views • 3 months ago

1.🚀 Baidu’s ERNIE 4.5 VL-28B-A3B-Thinking — an open-source MoE beast. With just 3B active params, it delivers SOTA multimodal reasoning: visual logic (MathVista 72.6), grounding, tool-calling, and more. The full family spans 12 models (0.3B → 424B). 🔗 Try it here: 💬 What’s your favorite “Thinking Mode” trick? #ERNIE45 #MultimodalAI

Helena AI

53,547 views • 5 months ago

Alibaba just dropped Qwen3.5-397B-A17B and there's a lot to unpack. 397B params, 17B active per forward pass. Sparse MoE done right. But the real story isn't the size; it's the architecture choices.

The MoE Design

Most MoE models feel like bolt-ons. Qwen 3.5's sparse activation is native: only 4.3% of parameters fire per token. That's how you get trillion-parameter-class performance without trillion-parameter inference costs. The 0.8 RMB/million tokens pricing isn't subsidized; it's structurally earned.

Native Multimodal, Not Glued-On

This is a vision-language model from the ground up. Heterogeneous architecture: separate processing pipelines for text, image, video that fuse early. Not a vision encoder slapped onto an LLM. The result: 90.8 on OmniDocBench, 79.0 on MMMU-Pro. Document understanding and visual reasoning without the usual brittleness.

The Context Window Reality

Qwen3.5-Plus (the hosted version) ships with 1M tokens by default. That's not a marketing number; they're actually positioning it for long-document workflows. With built-in adaptive tool use, it's clearly aimed at agentic automation, not just chat.

What Actually Impressed Me

• FP8 native pipeline: ~50% activation memory reduction
• Async RL framework for continuous refinement, with training and inference workloads separated
• 201 languages (up from 119), 250k vocab for better low-resource encoding
• Apache 2.0 license. Full weights on HuggingFace and ModelScope.

The Benchmark Context

76.4 on SWE-bench Verified puts it in the range where it can handle real debugging workflows. 72.9 on BFCL v4 for agentic tool use. 88.4 on GPQA Diamond. These aren't SOTA in isolation, but the breadth is unusual: strong across reasoning, coding, multimodal, and agentic tasks.

The Honest Caveat

I haven't stress-tested the 1M context for needle-in-haystack retrieval yet. And "native multimodal" claims need real-world torture testing: PDFs with tables, charts, mixed layouts. Benchmarks are benchmarks.

Bottom Line

This isn't just another model release. It's a bet on efficient scale: big model capabilities, small active compute, open weights. At 1/18th the cost of Gemini 3 Pro, it's going to force pricing conversations across the board.

Bo Wang

13,221 views • 3 months ago
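
The 4.3% figure in the post above follows directly from the two parameter counts it quotes. A minimal sketch of that arithmetic, assuming the counts are in billions and taken at face value from the posts in this feed; `active_fraction` is an illustrative name, not part of any Qwen or ERNIE tooling:

```python
def active_fraction(total_params_b: float, active_params_b: float) -> float:
    """Return the share of parameters used per token, as a percentage.

    Both arguments are parameter counts in billions, as quoted in the posts.
    """
    if total_params_b <= 0:
        raise ValueError("total_params_b must be positive")
    return 100.0 * active_params_b / total_params_b


# Qwen3.5-397B-A17B: 397B total, 17B active (figures from the post above)
print(f"Qwen3.5: {active_fraction(397, 17):.1f}% active per token")

# ERNIE 4.5 VL-28B-A3B: 28B total, 3B active (figures from an earlier post)
print(f"ERNIE 4.5 VL: {active_fraction(28, 3):.1f}% active per token")
```

The ratio only describes how many weights take part in each forward pass. All 397B parameters still have to be stored, so it says nothing about memory footprint.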

Introducing Helmor The open-source, local-first answer to Conductor. A more refined, faster GUI for orchestrating coding agents. No cloud. One-click import from Conductor. AI made coding faster. Helmor is about finishing the rest of the loop: orchestration, workspaces, review, testing, and merge. We believe the next generation of GUI agent orchestration should be built in the open — by the community.

Caspian 東澔

116,076 views • 1 month ago

Introducing GLM-5V-Turbo: Vision Coding Model
- Native Multimodal Coding: Natively understands multimodal inputs including images, videos, design drafts, and document layouts.
- Balanced Visual and Programming Capabilities: Achieves leading performance across core benchmarks for multimodal coding, tool use, and GUI Agents.
- Deep Adaptation for Claude Code and Claw Scenarios: Works in deep synergy with Agents like Claude Code and OpenClaw.
Try it now: API: Coding Plan trial applications:

Z.ai

1,960,071 views • 2 months ago

$GOOGL launched Gemma 4 which is its newest open model family built for reasoning, multimodal tasks, agentic workflows & efficient on-device use. The release adds long context, code generation & support for 140+ languages as Google keeps expanding its open model ecosystem.

Shay Boloor

72,509 views • 2 months ago

OpenCode + MLX + Qwen3.5-397B-A17B-4bit. Video is 8x, but the goal is showing that it works! This was unimaginable just a few months ago. MLX Team is pushing like crazy and M5 Ultra will do the rest 🚀

Ivan Fioravanti ᯅ

48,610 views • 3 months ago

Everyone's sleeping on MiniMax. Again. They just shipped M3. The first open-weights model to combine frontier coding, 1M context, and native multimodality in one drop. I plugged it into Claude Code this morning. Pasted a design from Dribbble. Watched M3 write production-ready React code in one session. At the agency, I just replaced Opus 4.8 with M3 for 80% of our coding tasks. The output is the same and we are running everything at a fraction of the cost. Open infrastructure is the future.

Prajwal Tomar

12,834 views • 9 days ago

Multimodal AI is here. Agents are getting smarter. And the battle for who owns intelligence is just beginning. We talk: GPT-4o and cinematic AI, owning your personal agent, and our next milestone 👀 If you care about open systems, real ownership, and the agentic future: ▶️ Watch ThinkPod Ep. 7

THINK

13,217 views • 1 year ago

1/4 Introducing Qwen3-Coder-Next: Our latest open-weights model designed specifically to power the next generation of autonomous Coding Agents. Built on Qwen3-Next, this model is engineered to handle complex, long-horizon programming tasks with unprecedented efficiency. High-performance agentic intelligence is now in your hands.

Tongyi Lab

212,645 views • 4 months ago

Local. Open weights. Native 4K. LTX-2 is now a 100% Open Source AI video model, and I tested it on my rig! Installation, VRAM usage, and the prompts I used for this video, below 👇

TechHalla

34,466 views • 5 months ago

This is wild. Google just dropped Gemma 4. Apache 2.0, open weights, frontier models that run on phones, laptops, and desktops👇

Min Choi

93,304 views • 2 months ago

Sam Altman on Open Source AI's Future

"Right now, what everybody wants is just the most capable frontier coding model they can have. The big frontier models, even if we made them open source, are hard to run." #sama

AI Insights

19,138 views • 4 days ago

running Qwen3.5 397B MoE (17B active/token) on 4x DGX Sparks in FP8 (~400GB)

> OpenCode driving
> agent exploring its own config
> probing all 4 Sparks (via ssh) + reporting thermals
> inspecting how vLLM is serving it
> collecting + analyzing its own stats

local AI is awesome

Ahmad

121,691 views • 2 months ago
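
The ~400GB figure quoted above is consistent with a back-of-the-envelope estimate of weight storage alone. A rough sketch, assuming one byte per parameter for FP8, decimal gigabytes, and an even split across the four machines. It ignores KV cache, activations, and runtime overhead, and `weight_memory_gb` is a hypothetical helper rather than anything from vLLM or MLX:

```python
def weight_memory_gb(params_billion: float, bits_per_param: float) -> float:
    """Approximate storage for the weights alone, in decimal gigabytes."""
    total_bytes = params_billion * 1e9 * bits_per_param / 8
    return total_bytes / 1e9


TOTAL_PARAMS_B = 397  # total parameter count quoted in the post
NUM_NODES = 4         # "4x DGX Sparks" from the post

fp8_gb = weight_memory_gb(TOTAL_PARAMS_B, 8)   # FP8: 8 bits per weight
int4_gb = weight_memory_gb(TOTAL_PARAMS_B, 4)  # 4-bit build, as in the MLX post

print(f"FP8 weights:   {fp8_gb:.1f} GB total, {fp8_gb / NUM_NODES:.2f} GB per node")
print(f"4-bit weights: {int4_gb:.1f} GB total")
```

Only 17B parameters are active per token, but every expert must stay resident, which is why the footprint tracks the full 397B rather than the active count.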

Introducing Cosmos 3: Our latest frontier model for Physical AI Cosmos 3 is the world’s first fully open omnimodel with native vision reasoning, world and action generation. Today we’re releasing Super (32B) and Nano (8B) variants.

NVIDIA AI

414,153 views • 12 days ago

Multimodal Real-Time Agents will be the future! 💯 Here is an early demo of what I am currently building! It uses the Live API with Google DeepMind Gemini 2.0 Flash and React. Code will be open-source soon, with a blog post on how to build your own Real-Time Agents! 🚀

Philipp Schmid

27,385 views • 1 year ago

Minimax M3 is excellent at SVG generation, reaching close to Gemini 3.5 Flash levels and beating Opus 4.7 on SVG-Bench. With 1M context, native multimodality, strong agentic/coding ability and open weights coming soon, the closed-source moat is thinning fast. Full Video:

WorldofAI

16,499 views • 12 days ago

The next frontier of Minecraft is about to open! In partnership with NASA, get ready to blast off to the moon next week. 👀 🚀 #Artemis #MinecraftEdu

Minecraft Education

579,213 views • 3 years ago

Robotics: coding agents’ next frontier. So how good are they? We introduce CaP-X: an open-source framework and benchmark for coding agents, where they write code for robot perception and control, execute it on sim and real robots, observe the outcomes, and iteratively improve code reliability. From NVIDIA, Berkeley AI Research, CMU Robotics Institute, Stanford AI Lab 🧵

Max Fu

168,697 views • 2 months ago

📣 Introducing SWE-PolyBench: A new open-source multilingual benchmark for evaluating #AI coding agents SWE-PolyBench is the first benchmark to evaluate AI coding agents' ability to understand complex codebases, helping advance AI performance in the real world. Learn more. 👉

Amazon Web Services

10,866 views • 1 year ago