Can a VLM see without a vision encoder? We trained one for $100, inspired by Gemma 4 12B.

Latency on an M3 Pro MacBook:
112 ms -> 1.1 ms for the image path
30% lower end-to-end (image + LLM)

The architecture is just: patchify the image -> linear projection with pos...
58,819 views • 5 days ago • via X (Twitter)
