merve

@mervenoyann
86,495 subscribers

(mer-veh) open-sourceress at @huggingface 🧙🏻‍♀️ DM me for any feedback about HF 🤗 https://t.co/MhrMkGTm7p

Shorts

RF-DETR just landed to Hugging Face transformers 🥵🔥 sota real-time detection & segmentation models by Roboflow 💜 > play with our real-time demo > fine-tune the models on your use case with our tutorials (takes a toaster's VRAM) > or just hand them to your agents 😄

56,363 views

this is the BEST vision language model I have ever tried! Aria is a new model by Rhymes.AI: a 25.3B multimodal model that can take image/video inputs 🤩 They release the model with Apache-2.0 license and fine-tuning scripts as well 👏 I tested it extensively, keep reading to learn more 🧶

176,047 views

Real-time DEtection Transformer (RT-DETR) landed in Hugging Face transformers 🤩 with Apache 2.0 license 😍 do DETRs Beat YOLOs on Real-time Object Detection? keep reading 👀

155,222 views

ViTPose -- best open-source pose estimation model just landed to Hugging Face transformers 🕺🏻💃🏻 See how to use on the next one ⤵️

83,367 views

Meta released LongVU: a new video LM that can handle long videos (great performance, battle-tested by me ⚔) TLDR; 1️⃣ downsample using DINOv2 to eliminate redundant scenes 🦖 2️⃣ fuse rest of the features using DINOv2 and SigLIP 3️⃣ select some tokens, pass to Qwen2/Llama-3.2-3B

49,521 views

OlmOCR is a new drop by Ai2 to parse any PDF 📝🤝 I have fed one of my old master's notes and it did a great job 💗 It is based on Qwen2VL-7B and works out of the box with transformers, has Apache 2.0 license 🔥

36,624 views

New InternVL drop with a sota 78B model with MIT license 🔥 The release comes with seven new vision LMs based on InternViT 300M/6B and Qwen2.5 and InternLM2 in different sizes ✨ the 78B model combines InternViT 6B and Qwen2.5-72B Instruct and can accomplish a variety of tasks 👏

25,314 views

Spaces at Hugging Face is the app store of AI 📱 it's also the MCP store now 🤠 filter thousands of MCPs you can attach to your LLM 🤗

18,307 views

many parts of Hugging Face Hub are actually powered by open machine learning models 🥹 the translation feature is one of them, it uses a very tiny (600M) multilingual translation model by AI at Meta 💗

14,958 views

Aya by Cohere For AI can now see! 👀 C4AI community has built Maya 8B, a new open-source multilingual VLM built on SigLIP and Aya 8B 🌱 works on 8 languages! 🗣️ The authors extend Llava dataset using Aya's translation capabilities with 558k examples! works very well ⬇️

16,226 views

Videos