Загрузка видео...
Не удалось загрузить видео
[CLIP] by Hand ✍️ The CLIP (Contrastive Language–Image Pre-training) model, a groundbreaking work by OpenAI, redefines the intersection of computer vision and natural language processing. It is the basis of all the multi-modal foundation models we see today. How does CLIP work? Goal: 🟨 Learn a shared embedding space... show more
67,790 просмотров • 2 лет назад •via X (Twitter)
Комментарии: 9

Could language-vision models spark new intuitive human-computer interfaces? What opportunities await?

These are beautiful how do you generate these?

I drew each frame using powerpoint and a drawing tablet. This video has 102 frames. ✍️🏋️

Beautiful! Have you seen posts/articles on CLIP fine-tuning?

CLIP bridges the gap between words & pictures. How could this shape the way we interact with machines in the future? #CLIP #AI #FutureTech

Impressive breakdown of the CLIP model. Your work at the intersection of computer vision and NLP is commendable. At Master of Code, we excel in applying advanced AI like CLIP to revolutionize user interactions.

@threadreaderapp unroll

@ProfTomYeh @Maxin_check Hi! the unroll you asked for: Enjoy :) 🤖

@IntentAGI $OO @moonberg_ai $ZENT @ZentryHQ $MINE @MineProBusiness

