Brian Li

@Brian_Bo_Li · 2,765 subscribers

Brian is building something new @amilabs. Previous work: LLaVA-OneVision, LMMs-Eval, LMMs-Engine, OneVision-Encoder.

Shorts

Throughout my journey developing multimodal models, I've always wanted a framework that lets me plug and play modality encoders and decoders on top of an auto-regressive LLM. I want to prototype fast, try new architectures, and have my demo files scale effortlessly, with full support for parallelism and optimization. Not just to hack ⚙️, but also to scale 🚀. So we finally built it for ourselves. LMMs-Engine: a lean, efficient framework built to train unified multimodal models at scale. It covers Qwen LLM, VLM, LLaVA-OV, and WanVideo, unified models like Qwen-Omni and BAGEL, Linear-Attn GDN, and research prototypes like RAE and SiT, all under one modular system that integrates diverse datasets and optimization strategies. It is powered by FSDP2 multi-dim parallelism, Ulysses sequence parallelism, Flash-Attention, Liger Kernels, and Native Sparse Attention, with bonus support for the Muon optimizer for all models.

54,648 views
