Laguna XS.2 from Poolside is a 33B MoE built for agentic coding. Red Hat AI trained a DFlash speculator for it: 0.6B drafter, 8 tokens per pass, no quality loss. FP8, NVFP4, and INT4 checkpoints via LLM Compressor. Models in comments. Speedup with vLLM:
20,411 views • 17 days ago • via X (Twitter)
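
For readers new to speculative decoding, here is a minimal toy sketch of the generic draft-and-verify loop that a setup like this relies on: a small drafter proposes a block of tokens (8 per pass, as in the post), and the large target model keeps only the prefix it agrees with. This is not the DFlash method or vLLM's implementation; `toy_target`, `toy_draft`, and `speculative_decode` are hypothetical names invented for illustration, and greedy (argmax) decoding is assumed. Under that assumption the output matches plain target decoding exactly, which is the sense in which such a scheme can have no quality loss.

```python
from typing import Callable, List, Tuple

Token = int


def toy_target(ctx: List[Token]) -> Token:
    """Hypothetical stand-in for the large model's greedy next token."""
    return (ctx[-1] * 3 + 1) % 17


def toy_draft(ctx: List[Token], k: int) -> List[Token]:
    """Hypothetical small drafter: proposes k tokens, with deliberate errors."""
    seq, block = list(ctx), []
    for _ in range(k):
        nxt = (seq[-1] * 3 + 1) % 17
        if nxt % 5 == 0:  # inject a wrong guess now and then
            nxt = (nxt + 1) % 17
        block.append(nxt)
        seq.append(nxt)
    return block


def greedy_decode(target_next: Callable[[List[Token]], Token],
                  prompt: List[Token], n: int) -> List[Token]:
    """Baseline: one target call per generated token."""
    out = list(prompt)
    for _ in range(n):
        out.append(target_next(out))
    return out


def speculative_decode(target_next: Callable[[List[Token]], Token],
                       draft_block: Callable[[List[Token], int], List[Token]],
                       prompt: List[Token], max_new_tokens: int,
                       block_size: int = 8) -> Tuple[List[Token], int]:
    """Draft a block, keep the prefix the target agrees with, repeat."""
    out = list(prompt)
    verify_passes = 0
    while len(out) - len(prompt) < max_new_tokens:
        proposal = draft_block(out, block_size)
        # A real engine scores every drafted position in one batched target
        # forward pass; this toy emulates that with per-position calls.
        verify_passes += 1
        accepted: List[Token] = []
        for tok in proposal:
            expected = target_next(out + accepted)
            if tok == expected:
                accepted.append(tok)
            else:
                accepted.append(expected)  # target's correction ends the block
                break
        else:
            # Every drafted token matched: the target contributes one more.
            accepted.append(target_next(out + accepted))
        out.extend(accepted)
    return out[:len(prompt) + max_new_tokens], verify_passes


spec_tokens, passes = speculative_decode(toy_target, toy_draft, [1], 32)
baseline = greedy_decode(toy_target, [1], 32)
print(spec_tokens == baseline, passes)
```

The speedup comes from `verify_passes` being smaller than the number of generated tokens: each verification pass yields at least one token and up to `block_size + 1`, so the gain depends on how often the drafter agrees with the target.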
