[1/n] Do distinct large models admit a simple map that aligns their embedding spaces? We show that across multimodal contrastive models—trained on different data and architectures—an orthogonal map aligns image embeddings. Strikingly, the same map also aligns text embeddings.
36,956 views • 3 months ago • via X (Twitter)
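
To make the claim concrete, here is a minimal sketch of what "an orthogonal map aligns image embeddings, and the same map also aligns text embeddings" could look like in code. The post does not say how the map is fit, so orthogonal Procrustes (solved by SVD) is an assumption here. The embeddings are synthetic stand-ins rather than outputs of real models, and the helper names `fit_orthogonal_map` and `mean_cosine` are hypothetical.

```python
import numpy as np


def fit_orthogonal_map(source, target):
    """Orthogonal Q minimizing ||source @ Q - target||_F (orthogonal Procrustes)."""
    u, _, vt = np.linalg.svd(source.T @ target)
    return u @ vt


def mean_cosine(a, b):
    """Mean cosine similarity between corresponding rows of a and b."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return float(np.mean(np.sum(a * b, axis=1)))


rng = np.random.default_rng(0)
dim, n = 16, 200

# Assumption: model B's space is a rotated, slightly noisy copy of model A's.
# Real models would only approximately satisfy this, if at all.
true_rotation, _ = np.linalg.qr(rng.normal(size=(dim, dim)))
images_a = rng.normal(size=(n, dim))
texts_a = rng.normal(size=(n, dim))
images_b = images_a @ true_rotation + 0.01 * rng.normal(size=(n, dim))
texts_b = texts_a @ true_rotation + 0.01 * rng.normal(size=(n, dim))

# Fit the map on image embeddings only.
q = fit_orthogonal_map(images_a, images_b)

# Reuse the same map, unchanged, on text embeddings.
image_alignment = mean_cosine(images_a @ q, images_b)
text_alignment = mean_cosine(texts_a @ q, texts_b)
print(f"image alignment: {image_alignment:.4f}")
print(f"text alignment:  {text_alignment:.4f}")
```

In this toy setup the text embeddings align because both modalities were rotated by the same matrix by construction. The post's claim is that something similar holds empirically for independently trained models, which this sketch does not test.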
