Foundation Models @Tesla_AI. Prev: PhD at Stanford
Introducing CoT-VLA: Visual Chain-of-Thought Reasoning for Robot Foundation Models! 🤖 By treating next-frame prediction as visual chain-of-thought reasoning, CoT-VLA uses predicted future frames to guide action generation and unlocks large-scale video data for training. #CVPR2025
48,069 views
How do we create realistic models of dressed humans directly from visual data? We introduce PhysAvatar, a framework that estimates the shape, appearance, and physical parameters of dressed human avatars from multi-view videos. Page: (1/6)