"Cameras as Relative Positional Encoding" TLDR: a comparison of ways to condition transformers on cameras: token-level raymaps, attention-level relative pose encodings, and a new relative encoding, Projective Positional Encoding, which uses camera frustums (intrinsics and extrinsics) for relative positional encoding.

17,795 views • 10 months ago • via X (Twitter)
