Updated my HF Space for vibe testing smol VLMs on object detection, visual grounding, keypoint detection & counting! 👓 🆕Compare Qwen2.5 VL 3B vs Moondream 2B side-by-side with annotated images & text outputs. Try examples or test your own images! 🏃👇
15,717 views • 11 months ago • via X (Twitter)
10 Comments

📱Space: Models by @Alibaba_Qwen and @moondreamai!

@skalskip92 @vikhyatk @JustinLin610 @onuralpszr you have to see this ^

For Moondream object detection, prompting with just the object name will work better; that's how we train it.

I was unsure whether to use the full prompt or just the object name for the examples. Let me update it to make the comparison fairer 😃

That’s impressive. Playing around with models like that must be a lot of fun.

This is really awesome 🤩

awesome! 👏 very useful work!! 🥳🙏

@pcuenq Vibe testing VLMs, that's really cool! I'm curious, have you explored any blockchain-based applications for object detection or visual grounding? 🤔

I was experimenting with Qwen and I can see it can detect each individual candy. When I ask a little differently it always says "colorful candies", and when I put that into the prompt I get somewhat better results, but when I ask it to return "json" it just becomes one bbox.
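
One way to check whether a JSON reply really collapsed into a single box is to parse it and count the entries. The following is a minimal sketch, not the Space's actual code: the helper name `parse_boxes` is hypothetical, and it assumes the reply contains a JSON list of objects with `bbox_2d` (`[x1, y1, x2, y2]`) and `label` keys, which may not match what a given model or prompt returns.

```python
import json
import re


def parse_boxes(reply: str) -> list[dict]:
    """Extract {"label", "bbox"} dicts from a model's text reply.

    Assumed (not confirmed by the thread) reply shape: a JSON list of
    objects with "bbox_2d" ([x1, y1, x2, y2]) and "label" keys,
    possibly surrounded by extra prose.
    """
    # Take the span from the first "[" to the last "]" so surrounding text is ignored.
    match = re.search(r"\[.*\]", reply, flags=re.DOTALL)
    if match is None:
        return []
    try:
        items = json.loads(match.group(0))
    except json.JSONDecodeError:
        return []

    boxes = []
    for item in items:
        if not isinstance(item, dict):
            continue
        bbox = item.get("bbox_2d")
        # Keep only well-formed boxes: exactly four numeric coordinates.
        if (
            isinstance(bbox, list)
            and len(bbox) == 4
            and all(isinstance(v, (int, float)) for v in bbox)
        ):
            boxes.append(
                {"label": item.get("label", ""), "bbox": [float(v) for v in bbox]}
            )
    return boxes


# Hypothetical reply, made up for illustration only.
reply = (
    'Here are the boxes: [{"bbox_2d": [10, 20, 50, 60], "label": "candy"}, '
    '{"bbox_2d": [70, 20, 110, 60], "label": "candy"}]'
)
boxes = parse_boxes(reply)
print(len(boxes), "boxes found")
```

Counting the parsed entries for the same image under different prompts makes it easy to see when the output drops from one box per candy to a single box.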

This is awesome, thank you so much for that. Also really helps to show the inference time. Now do all the other small-ish VLMs like Molmo, SmolVLM, InternVL, etc 😅

