Загрузка видео...

Не удалось загрузить видео

Возникла проблема при загрузке этого видео. Это может быть связано с временными проблемами сети или видео может быть недоступно.

На главную

Running Minimax M2.1 (MiniMax (official)) with OpenCode (OpenCode) and mlx_lm.server. Works quite well on an M3 Ultra. Once the KV cache is warm the prompt processing is pretty quick. And token generation is very fast.

Awni Hannun

37,077 subscribers

32,329 просмотров • 6 месяцев назад •via X (Twitter)

Образование Новости и политика Наука и технологии

Anya Rossi• Live Now

Private livecam show

Комментарии: 0

Нет доступных комментариев

Здесь появятся комментарии из оригинального поста

Похожие видео

Running four simultaneous OpenCode agents works well with mlx_lm.server continuous batching and MiniMax M2.1 on an M3 Ultra:

Running four simultaneous OpenCode agents works well with mlx_lm.server continuous batching and MiniMax M2.1 on an M3 Ultra:

Awni Hannun

95,050 просмотров • 6 месяцев назад

Running four high-level OpenCode agents + subagents with mlx_lm.server continuous batching and MiniMax M2.5 (6-bit). Fits easily on a 512GB M3 Ultra. Generation is quite fast. But prefill is still slow compared to cloud servers.

Running four high-level OpenCode agents + subagents with mlx_lm.server continuous batching and MiniMax M2.5 (6-bit). Fits easily on a 512GB M3 Ultra. Generation is quite fast. But prefill is still slow compared to cloud servers.

Awni Hannun

25,535 просмотров • 5 месяцев назад

PSA: MiniMax (official) M2.5 is freely available on OpenCode Great opportunity to check out the power of open models for coding

PSA: MiniMax (official) M2.5 is freely available on OpenCode Great opportunity to check out the power of open models for coding

Niels Rogge

49,065 просмотров • 5 месяцев назад

You can use a 100% open source and MUCH cheaper/free alternative to Claude Code and Opus 4.5 OpenCode + MiniMax M2.1 can even build a 3D website using pure vibe coding. Steps are really simple: 1. Install OpenCode using the command 'npm i -g opencode-ai' 2. Get your MiniMax API key here: 3. Configure the MiniMax (official) mode Just type "opencode connect minimax" Coding plans start at $2… 10x cheaper than Claude Code (yes). You can also invite friends so they can get 10% off and you’ll get 10% API credits. (You can also use it locally if you have the config) And you're ready to build and iterate almost endlessly since the model is both way faster and cheaper than Opus in CC.

You can use a 100% open source and MUCH cheaper/free alternative to Claude Code and Opus 4.5 OpenCode + MiniMax M2.1 can even build a 3D website using pure vibe coding. Steps are really simple: 1. Install OpenCode using the command 'npm i -g opencode-ai' 2. Get your MiniMax API key here: 3. Configure the MiniMax (official) mode Just type "opencode connect minimax" Coding plans start at $2… 10x cheaper than Claude Code (yes). You can also invite friends so they can get 10% off and you’ll get 10% API credits. (You can also use it locally if you have the config) And you're ready to build and iterate almost endlessly since the model is both way faster and cheaper than Opus in CC.

Paul Couvert

42,443 просмотров • 6 месяцев назад

I did it! It works! Using GLM-4.7-4bit with mlx_lm.server and opencode to fix real code locally! 🔥 Here single M3 Ultra 512GB, nex step phase will be 2 using Tensor Parallelism and then apply same changes to exo. Prefill is slow on a single machine, but generation is good.

I did it! It works! Using GLM-4.7-4bit with mlx_lm.server and opencode to fix real code locally! 🔥 Here single M3 Ultra 512GB, nex step phase will be 2 using Tensor Parallelism and then apply same changes to exo. Prefill is slow on a single machine, but generation is good.

Ivan Fioravanti ᯅ

44,000 просмотров • 6 месяцев назад

This is MiniMax-M2.5 MLX running in LM Studio on an Apple Mac Studio M3 Ultra 512GB. Fast enough out of the box for hosting OpenClaw, n8n workflows, and Open WebUI for the team.

This is MiniMax-M2.5 MLX running in LM Studio on an Apple Mac Studio M3 Ultra 512GB. Fast enough out of the box for hosting OpenClaw, n8n workflows, and Open WebUI for the team.

Patrick J Kennedy

73,547 просмотров • 5 месяцев назад

ICYMI: CyOps Arena is now live, co-hosted with MiniMax (official). With a $5,000 prize pool and 80% off MiniMax M3 model token pricing, there's never been a better time to start building. New to CyOps? Watch this quick tutorial and get your first project up and running in minutes 👇

ICYMI: CyOps Arena is now live, co-hosted with MiniMax (official). With a $5,000 prize pool and 80% off MiniMax M3 model token pricing, there's never been a better time to start building. New to CyOps? Watch this quick tutorial and get your first project up and running in minutes 👇

Cysic

21,287 просмотров • 1 месяц назад

MLX + OpenCode + Qwen3.5-122B-A10B-4bit on M3 Ultra created a great snake game! Work zero-shot. Video clearly in super fast mode during generation. I generated the prompt using Grok 4.20, it's in the article.

MLX + OpenCode + Qwen3.5-122B-A10B-4bit on M3 Ultra created a great snake game! Work zero-shot. Video clearly in super fast mode during generation. I generated the prompt using Grok 4.20, it's in the article.

Ivan Fioravanti ᯅ

74,659 просмотров • 4 месяцев назад

We ran the same prompt and identical starting context on MiniMax-M2.1 and Claude Sonnet 4.5. MiniMax-M2.1 by MiniMax (official) reached a usable result faster, required fewer structural fixes, and produced a more consistent visual and interaction flow from the first pass. To test this properly, we asked both models to build a complex single-page web animation with real-world visual and physics constraints. Comparison video below 👇

We ran the same prompt and identical starting context on MiniMax-M2.1 and Claude Sonnet 4.5. MiniMax-M2.1 by MiniMax (official) reached a usable result faster, required fewer structural fixes, and produced a more consistent visual and interaction flow from the first pass. To test this properly, we asked both models to build a complex single-page web animation with real-world visual and physics constraints. Comparison video below 👇

GitHub Projects Community

29,252 просмотров • 6 месяцев назад

MiniMax M3 support added to mlx-vlm with MSA implementation! 🚀 Tested on M3 Ultra 512GB running at 24 tps with peak memory ~240GB. Now working on optimizing performance and adding ton of tests 💪 Model is here: PR is here:

MiniMax M3 support added to mlx-vlm with MSA implementation! 🚀 Tested on M3 Ultra 512GB running at 24 tps with peak memory ~240GB. Now working on optimizing performance and adding ton of tests 💪 Model is here: PR is here:

Ivan Fioravanti ᯅ

24,376 просмотров • 1 месяц назад

The new MiniMax M2.1 model is now available in the Blackbox CLI. 3 games built with a similar prompt, here was the result. Get started here

The new MiniMax M2.1 model is now available in the Blackbox CLI. 3 games built with a similar prompt, here was the result. Get started here

BLACKBOX AI

15,658 просмотров • 6 месяцев назад

🎉 Congrats to MiniMax (official) on releasing MiniMax M3! Frontier coding and agentic capabilities, native image and video input, computer use, and a 1M-token context window, all in a single open model. At the heart of M3 is MSA, a new sparse attention architecture: instead of attending densely over the full KV cache, each query scores 128-token KV blocks and runs attention only over the top blocks. That is what makes 1M-token context practical to serve. M3 runs in vLLM with day-0 support, verified on NVIDIA and AMD hardware: ✨ MSA sparse attention with dedicated prefill and decode kernels ✨ 1M-token context serving with prefix caching and chunked prefill ✨ BF16 and MXFP8 checkpoints, with MoE backends for both Hopper and Blackwell ✨ Native multimodal input (image + video) ✨ Tool calling, reasoning parsing, and thinking-mode control for agent workloads Day-0 support like this is a true team effort. Grateful to the teams at MiniMax (official), NVIDIA AI, AI at AMD, and Inferact, and to the vLLM community for making it happen. 🙏 Deep dive into the implementation, kernel work, and deployment recipes: 🔗

🎉 Congrats to MiniMax (official) on releasing MiniMax M3! Frontier coding and agentic capabilities, native image and video input, computer use, and a 1M-token context window, all in a single open model. At the heart of M3 is MSA, a new sparse attention architecture: instead of attending densely over the full KV cache, each query scores 128-token KV blocks and runs attention only over the top blocks. That is what makes 1M-token context practical to serve. M3 runs in vLLM with day-0 support, verified on NVIDIA and AMD hardware: ✨ MSA sparse attention with dedicated prefill and decode kernels ✨ 1M-token context serving with prefix caching and chunked prefill ✨ BF16 and MXFP8 checkpoints, with MoE backends for both Hopper and Blackwell ✨ Native multimodal input (image + video) ✨ Tool calling, reasoning parsing, and thinking-mode control for agent workloads Day-0 support like this is a true team effort. Grateful to the teams at MiniMax (official), NVIDIA AI, AI at AMD, and Inferact, and to the vLLM community for making it happen. 🙏 Deep dive into the implementation, kernel work, and deployment recipes: 🔗

vLLM

40,306 просмотров • 1 месяц назад

🐙 Fall into the Backrooms with one prompt. "Backrooms Dreamcore" is now live on MiniMax Hub's Skill Square. Try and create your own dreamcore space! #MiniMax #Hailuo #MiniMaxHub

🐙 Fall into the Backrooms with one prompt. "Backrooms Dreamcore" is now live on MiniMax Hub's Skill Square. Try and create your own dreamcore space! #MiniMax #Hailuo #MiniMaxHub

Hailuo AI-MiniMax Hub

15,235 просмотров • 12 дней назад

Minimax M3 is excellent at SVG generation, reaching close to Gemini 3.5 Flash levels and beating Opus 4.7 on SVG-Bench. With 1M context, native multimodality, strong agentic/coding ability and open weights coming soon, the closed-source moat is thinning fast. Full Video:

Minimax M3 is excellent at SVG generation, reaching close to Gemini 3.5 Flash levels and beating Opus 4.7 on SVG-Bench. With 1M context, native multimodality, strong agentic/coding ability and open weights coming soon, the closed-source moat is thinning fast. Full Video:

WorldofAI

16,499 просмотров • 1 месяц назад

So I created a simple Expo app for remotely connecting to an OpenCode server (running on my mac) so I can remotely control & prompt from my iPad or phone. Not very polished but it works pretty well and its kinda interesting that you can see the feedback loop live.

So I created a simple Expo app for remotely connecting to an OpenCode server (running on my mac) so I can remotely control & prompt from my iPad or phone. Not very polished but it works pretty well and its kinda interesting that you can see the feedback loop live.

ryan vogel

50,706 просмотров • 7 месяцев назад

MiniMax (official) M2.1 cooked so hard like this is so good looking website with no external assets 🤯 i am sharing prompts as it was very demanded , for you guys made it in a template so just change first line of the prompt and make beautiful site and tag me ❤️

MiniMax (official) M2.1 cooked so hard like this is so good looking website with no external assets 🤯 i am sharing prompts as it was very demanded , for you guys made it in a template so just change first line of the prompt and make beautiful site and tag me ❤️

Chetaslua

18,115 просмотров • 6 месяцев назад

Opus 4.6 vs. Minimax M2.5 Prompt: Build an interactive solar system from scratch. Opus 4.6 tried to create a beautiful UI, but the sun’s shadow ruined it. Minimax M2.5 built a simple version that works beautifully. The winner is: 🥇 Minimax M2.5 🥈 Opus 4.6

Opus 4.6 vs. Minimax M2.5 Prompt: Build an interactive solar system from scratch. Opus 4.6 tried to create a beautiful UI, but the sun’s shadow ruined it. Minimax M2.5 built a simple version that works beautifully. The winner is: 🥇 Minimax M2.5 🥈 Opus 4.6

Okara

80,558 просмотров • 5 месяцев назад

The MiniMax M2 model is mind-blowing! It's open-source. It outperforms Gemini 2.5, Claude 4.1, and Qwen3 across coding and tool-use benchmarks. Right now, it's one of the world's top 5 models in intelligence! And here is the best part: Claude is one of the best models you can use today, and MiniMax M2 costs only 8% of that! It's smaller, faster, and cheaper. Extremely efficient at using tokens. Minimax M2's biggest strength: High agentic capabilities. The model can plan and execute complex multi-tool workflows. It's reliable and very robust at executing long-horizon tool chains. In summary: • Low latency • Very cheap • Excels at agentic tasks • Open-source The model currently powers the MiniMax Agent and is available for a free global trial. You can access MiniMax M2's API here: To access the agent: And here is the MiniMax website: Thanks to the MiniMax team for showing me the ropes and partnering with me on this post.

The MiniMax M2 model is mind-blowing! It's open-source. It outperforms Gemini 2.5, Claude 4.1, and Qwen3 across coding and tool-use benchmarks. Right now, it's one of the world's top 5 models in intelligence! And here is the best part: Claude is one of the best models you can use today, and MiniMax M2 costs only 8% of that! It's smaller, faster, and cheaper. Extremely efficient at using tokens. Minimax M2's biggest strength: High agentic capabilities. The model can plan and execute complex multi-tool workflows. It's reliable and very robust at executing long-horizon tool chains. In summary: • Low latency • Very cheap • Excels at agentic tasks • Open-source The model currently powers the MiniMax Agent and is available for a free global trial. You can access MiniMax M2's API here: To access the agent: And here is the MiniMax website: Thanks to the MiniMax team for showing me the ropes and partnering with me on this post.

Santiago

91,197 просмотров • 8 месяцев назад

OpenCode + MLX + Qwen3.5-397B-A17B-4bit. Video is 8x, but the goal is showing that It works! This is something unimaginable just few months ago. MLX Team is pushing like crazy and M5 Ultra will do the rest 🚀

OpenCode + MLX + Qwen3.5-397B-A17B-4bit. Video is 8x, but the goal is showing that It works! This is something unimaginable just few months ago. MLX Team is pushing like crazy and M5 Ultra will do the rest 🚀

Ivan Fioravanti ᯅ

48,692 просмотров • 5 месяцев назад

MiniMax M2.1 vs Opus 4.5 vs GLM-4.7 built an interactive 3D solar system from scratch. Running on 3 Claude Code in parallel. The winner is: 🥇 GLM-4.7! 🥈 Opus 4.5 (slowest) 🥉 M2.1 (fastest) GLM-4.7 design capabilities are another level vs 4.6!

MiniMax M2.1 vs Opus 4.5 vs GLM-4.7 built an interactive 3D solar system from scratch. Running on 3 Claude Code in parallel. The winner is: 🥇 GLM-4.7! 🥈 Opus 4.5 (slowest) 🥉 M2.1 (fastest) GLM-4.7 design capabilities are another level vs 4.6!

Ivan Fioravanti ᯅ

81,630 просмотров • 6 месяцев назад