
Ryohei Sasaki@engineer
@rsasaki0109 • 9,301 subscribers
Software Engineer at MAP IV (TIER IV group). AI/Robotics/Autonomous Driving/GNSS/LiDAR/IMU/SLAM/Localization/Mapping

VLAExplain: Interpreting Vision-Language-Action (VLA) Models
VLAExplain is an interpretability toolkit that helps users visually understand the inner workings of Vision-Language-Action (VLA) models. Attention analysis is currently supported for the pi05 and unifolm-vla models. For details, see the pi05 and UnifoLM-VLA README files, respectively. Demo of pi05 in action:
Ryohei Sasaki@engineer • 12,774 views • 1 month ago

LEGO-SLAM: Language-Embedded Gaussian Optimization SLAM
LEGO-SLAM running at 15 FPS on a ScanNet scene with language-based loop closing for drift correction. LEGO-SLAM is a 3DGS-based SLAM framework that supports open-vocabulary semantic querying and rendering. It tracks via G-ICP and efficiently builds a map by embedding Gaussians with scene-adaptive 16D language features. Map management is achieved through Language Pruning and Language-Based Loop Detection. The generated map enables open-vocabulary 3D Object Localization.
Ryohei Sasaki@engineer • 14,935 views • 2 months ago