How do we build multimodal systems that work effectively across the globe? 🌍 Today we release the Aya Vision Technical Report, the detailed recipe behind Aya Vision models, unifying state-of-the-art multilingual capabilities in multimodal and text tasks across 23 languages!
15,561 views • 1 year ago • via X (Twitter)

Our 8B model is best-in-class for its size, outperforming models like Pixtral-12B and Pangea-7B. The compact Aya Vision-32B pushes efficiency further, outperforming models more than 2x larger, like Llama3.2-90B & Molmo-72B, and setting a new Pareto frontier in multilingual multimodal AI. 💪

How do you build strong multimodal models for many languages when high-quality multilingual multimodal data is almost non-existent? We develop a novel synthetic annotation framework that creates rich, human-preferred multimodal data in 23 languages! ✅

Adding vision often degrades text-only skills (catastrophic forgetting!), especially across languages. 📉 Our novel cross-modal model merging technique fuses the original text LLM with the multimodal model, preserving text abilities and boosting multimodal win-rates! 🤝
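To make the merging idea concrete, here is a minimal sketch that assumes the fusion is a plain linear interpolation of the weights shared by the two models. The thread does not spell out the exact recipe, so the function name `merge_weights`, the mixing factor `alpha`, and the parameter names are all hypothetical, chosen only for illustration.

```python
# Hypothetical sketch: linear interpolation of shared weights.
# `alpha` and the parameter names are illustrative assumptions,
# not values or identifiers taken from the technical report.

def merge_weights(text_llm, multimodal_llm, alpha=0.5):
    """Return (1 - alpha) * text + alpha * multimodal for every shared parameter."""
    merged = {}
    for name, mm_values in multimodal_llm.items():
        if name in text_llm:
            txt_values = text_llm[name]
            merged[name] = [
                (1 - alpha) * t + alpha * m
                for t, m in zip(txt_values, mm_values)
            ]
        else:
            # Parameters found only in the multimodal model
            # (for example a vision connector) are copied unchanged.
            merged[name] = list(mm_values)
    return merged


# Toy weights standing in for real checkpoints.
text_llm = {"layer0.weight": [0.0, 2.0]}
multimodal_llm = {"layer0.weight": [1.0, 4.0], "connector.weight": [0.5]}

merged = merge_weights(text_llm, multimodal_llm, alpha=0.25)
print(merged)
```

A smaller `alpha` keeps the merged model closer to the original text LLM, while a larger one keeps it closer to the multimodal model. That trade-off is the intuition behind preserving text abilities while retaining vision skills.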

Current multimodal evals often miss the mark. 🤔 Too rigid, prompt-sensitive, & English-only, they don't capture real-world nuances. We also introduce Aya Vision Bench! 📊 Our new benchmark focuses on human preference across 23 languages & 9 tasks for better MLLM evaluation. 🌍

Putting it all together for Aya Vision: each of our innovations boosts Aya Vision’s performance, enabling SOTA results:
💡 Synthetic data framework → +17.2% win rate (reaching 58.1%)
🤝 Cross-modal merging → +11.9% (reaching 70.0%)
🚀 Scaling to 32B → +9.1% (reaching 79.1%)
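As a quick sanity check, the reported gains add up if they are read as percentage points stacked on top of each other. The short sketch below reproduces that arithmetic. The starting win rate is not stated in the thread; it is derived here as 58.1 minus 17.2, under that percentage-point assumption.

```python
# Gains and resulting win rates (%) as reported in the thread.
steps = [
    ("Synthetic data framework", 17.2, 58.1),
    ("Cross-modal merging", 11.9, 70.0),
    ("Scaling to 32B", 9.1, 79.1),
]

# Implied starting win rate: derived, not stated in the thread.
baseline = round(steps[0][2] - steps[0][1], 1)

running = baseline
for name, gain, reported in steps:
    # Treat each gain as additive percentage points.
    running = round(running + gain, 1)
    assert running == reported, (name, running, reported)
    print(f"{name}: +{gain} points, win rate {running}%")

print(f"Implied starting win rate: {baseline}%")
```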

As promised, the Aya Vision Technical Report showcases our commitment to open science and completes the release of the Aya Vision models and Aya Vision Bench. 🌍 📜 Paper link:

Thank you to all authors: @TheyCallMeMr_, @YiyangNan, @johnamqdang, @aahmadian_, @singhshiviii, Madeline Smith, @bharatvenki, @vshmyhlo, @viraataryabumi, Walter Beller-Morales, Jeremy Pekmez, @TheOneKloud, @acyr_l, @nickfrosst, Phil Blunsom, @aidangomez, @1vnzh…

…@mziizm, Manoj Govindassamy, @commit_xact, @mgalle, @beyzaermis, @ahmetustun89, and @sarahookr.
