Self-supervised representation learning looks a bit like RL. What if we literally use RL as a SSL method for visual representations? Turns out that it works quite well. In new work by Dibya Ghosh, we show how this can be done:
48,688 views • 11 months ago • via X (Twitter)
Comments: 5

Imagine an MDP where the state is the current crop of the image, an action picks a new crop, and the reward comes from matching textual captions or other (weak or strong) labels. Training a value function for this MDP instantiates a representation learning method.
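To make the crop MDP concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the method from the work itself: the names `Crop`, `random_crop`, `features`, and `td0_step` are hypothetical, the "encoder" is just mean pixel intensity plus a bias term, and the value function is linear and trained with a generic TD(0) update.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Crop:
    """State of the MDP: a square window into the image (hypothetical type)."""
    x: int
    y: int
    size: int


def random_crop(image_size, crop_size, rng):
    """Action: pick a new crop uniformly at random inside the image."""
    x = rng.randrange(image_size - crop_size + 1)
    y = rng.randrange(image_size - crop_size + 1)
    return Crop(x, y, crop_size)


def features(image, crop):
    """Stand-in encoder (assumption): mean intensity of the crop plus a bias."""
    rows = image[crop.y:crop.y + crop.size]
    pixels = [p for row in rows for p in row[crop.x:crop.x + crop.size]]
    return [sum(pixels) / len(pixels), 1.0]


def value(weights, feats):
    """Linear value function V(s) = w . phi(s)."""
    return sum(w * f for w, f in zip(weights, feats))


def td0_step(weights, image, state, next_state, reward, gamma=0.9, lr=0.05):
    """One TD(0) update on the transition (state, reward, next_state)."""
    feats = features(image, state)
    target = reward + gamma * value(weights, features(image, next_state))
    td_error = target - value(weights, feats)
    new_weights = [w + lr * td_error * f for w, f in zip(weights, feats)]
    return new_weights, td_error
```

In a real system the hand-written `features` would be a learned network, and the representation would be whatever that network learns while fitting the value function; the reward passed to `td0_step` is supplied by the caller.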

Reward could come from matching a text label, or provided in a fully unsupervised way via crop consistency. The stronger the reward, the better it works, but even weak rewards like crop consistency lead to improvement. For more, check out the website:
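The two reward types could be sketched as follows. The thread does not define "crop consistency", so the version below is only one plausible reading (an assumption): reward two crops of the same image by the cosine similarity of their embeddings. The function names and the 0/1 label-match reward are likewise hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def crop_consistency_reward(emb_a, emb_b):
    """Unsupervised reward (assumed form): agreement between two crop embeddings."""
    return cosine_similarity(emb_a, emb_b)


def label_match_reward(predicted_label, true_label):
    """Supervised reward (assumed form): 1.0 on an exact label match, else 0.0."""
    return 1.0 if predicted_label == true_label else 0.0
```

Either function returns a scalar that could be fed as the reward into a value-learning update; the label-based one corresponds to the "stronger" signal, the embedding-based one to the fully unsupervised case.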

@its_dibya *an SSL, but overall your grammar and punctuation are top-tier 💯

@its_dibya RL for SSL using semantic rewards? Brilliant method. Scaling beyond COCO might be tough here though—Canada’s R&D can’t keep up with compute demands anymore.

@its_dibya That’s just flattened patching which is something but not really.

