New Moondream 2B release! ✨
New features:
- Long-form captioning
- Open vocab tagging
- Better counting, object detection, text understanding
- Faster HF transformers inference
51,735 views • 1 year ago • via X (Twitter)
Comments: 12

Release notes Demo

wen bolting on a diffusion model to an output head and generating ghibli

investigating

this is a really awesome release video btw, i love this format, pretty clean. gonna fit all the new features into moondream-zig :)

possible to integrate xnnpack with zig code? they have good quant matmuls

MLX???

soon!

How good is this for OCR?

It was a big focus for this release, but we're only 10% of the way through OCR pretraining. I'd say it's decent but expect a ton more improvement coming soon!

But can it ghiblify images? 😜

watch this space
