Introducing “FlowMap”, the first self-supervised, differentiable structure-from-motion method that is competitive with conventional SfM like Colmap! IMO this solves a major missing piece for internet-scale training of 3D Deep Learning methods. 1/n

Vincent Sitzmann

19,511 subscribers

128,604 views • 2 years ago • via X (Twitter)

Science & Technology

12 Comments

Vincent Sitzmann • 2 years ago

Work led by our amazing @omcamsmith and @DavidCharatan with the support of the brilliant @_atewari ! 2/n

Vincent Sitzmann • 2 years ago

Structure-from-Motion is the only area of computer vision where a non-deep-learning method, Colmap, remains the state of the art. This has slowed us down: Colmap is used to generate pseudo-ground truth for 3D vision, instead of us finding a self-supervised way! 3/n

Vincent Sitzmann • 2 years ago

FlowMap is a major step towards solving that problem: it is a fully differentiable, self-supervised structure-from-motion method! From only off-the-shelf point tracks / optical flow, FlowMap performs SfM that outperforms Colmap’s on Gaussian Splatting Novel View Synthesis! 4/n

Vincent Sitzmann • 2 years ago

Here are some point clouds reconstructed from FlowMap on popular scenes - it really works very robustly!! 5/n

Vincent Sitzmann • 2 years ago

There are two unique aspects to FlowMap: (1) Depth is the *only* free variable - poses and intrinsics are inferred feed-forward! (2) FlowMap is differentiable with respect to the depth estimator - this enables us to train one fully self-supervised just on video! 6/n

Vincent Sitzmann • 2 years ago

In the limit of having a *perfect* depth estimator and perfect correspondence (for instance from large-scale training), FlowMap solves the Structure-from-Motion problem - poses, intrinsics, and fused multi-view pointcloud - in a single feed-forward pass! 7/n
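To make the "fused multi-view pointcloud" concrete, here is a minimal NumPy sketch of how per-frame depth maps, shared pinhole intrinsics, and camera poses combine into one point cloud. The function names (`unproject`, `fuse_point_cloud`) and the camera-to-world pose convention are assumptions made for illustration; they are not taken from the FlowMap code.

```python
import numpy as np

def unproject(depth, K):
    """Lift an (H, W) depth map to an (H*W, 3) point cloud in camera coordinates."""
    H, W = depth.shape
    ys, xs = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
    pixels = np.stack([xs, ys, np.ones_like(xs)], axis=-1).astype(np.float64)
    rays = pixels @ np.linalg.inv(K).T           # rays with z = 1
    return (rays * depth[..., None]).reshape(-1, 3)

def fuse_point_cloud(depths, K, cam_to_world):
    """Concatenate all frames' points after moving them into world coordinates.

    cam_to_world: one (R, t) pair per frame (assumed convention:
    world_point = R @ camera_point + t).
    """
    clouds = [unproject(d, K) @ R.T + t for d, (R, t) in zip(depths, cam_to_world)]
    return np.concatenate(clouds, axis=0)

# Two tiny frames looking at a plane one unit away, second camera shifted in x.
K = np.array([[50.0, 0.0, 1.0], [0.0, 50.0, 0.5], [0.0, 0.0, 1.0]])
depths = [np.ones((2, 3)), np.ones((2, 3))]
poses = [(np.eye(3), np.zeros(3)), (np.eye(3), np.array([1.0, 0.0, 0.0]))]
cloud = fuse_point_cloud(depths, K, poses)
print(cloud.shape)
```

Every step here is a closed-form function of depth, intrinsics, and pose, which is why a good enough depth and correspondence estimate would make the whole reconstruction a single forward pass.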

Vincent Sitzmann • 2 years ago

We have already released the code, which we’ve spent time organizing for ease of use. It includes the scripts for baselines, figures, and tables, so it will be a breeze for you to reproduce & build on top of it! 8/n

Vincent Sitzmann • 2 years ago

FlowMap minimizes a “camera-induced correspondence loss.” When a camera moves through a static scene, that motion induces correspondences on the image sensor according to the scene’s geometry, the camera motion, and the camera intrinsics, which we supervise with point tracks 9/n
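A rough NumPy sketch of such a loss, for intuition only: each pixel is lifted to 3D with its depth, moved by the relative camera pose, projected back with the intrinsics, and compared to where a tracker says it went. The pinhole model, the shared `K`, the L1 penalty, and the function names are assumptions of this sketch, not details confirmed by the thread.

```python
import numpy as np

def induced_correspondences(depth, K, R, t):
    """Where each pixel of frame i lands in frame j if the scene is static.

    depth: (H, W) depth map of frame i.
    K: (3, 3) pinhole intrinsics, assumed shared by both frames.
    R, t: rotation (3, 3) and translation (3,) from frame-i to frame-j
          camera coordinates.
    Returns (H, W, 2) pixel positions (x, y) in frame j.
    """
    H, W = depth.shape
    ys, xs = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
    pixels = np.stack([xs, ys, np.ones_like(xs)], axis=-1).astype(np.float64)
    rays = pixels @ np.linalg.inv(K).T      # unproject to rays with z = 1
    points_i = rays * depth[..., None]      # 3D points in frame-i coordinates
    points_j = points_i @ R.T + t           # same points in frame-j coordinates
    projected = points_j @ K.T              # homogeneous pixel coordinates
    return projected[..., :2] / projected[..., 2:3]

def correspondence_loss(depth, K, R, t, tracked):
    """Mean L1 distance between induced and tracked pixel positions."""
    return np.abs(induced_correspondences(depth, K, R, t) - tracked).mean()

K = np.array([[100.0, 0.0, 2.0], [0.0, 100.0, 1.5], [0.0, 0.0, 1.0]])
depth = np.full((3, 4), 2.0)                 # fronto-parallel plane, 2 units away
R, t = np.eye(3), np.array([0.1, 0.0, 0.0])
tracked = induced_correspondences(depth, K, R, t)  # stand-in for real point tracks
print(correspondence_loss(depth, K, R, t, tracked))      # zero at the true geometry
print(correspondence_loss(2 * depth, K, R, t, tracked))  # positive for a wrong depth
```

In the toy example the camera slides 0.1 units sideways in front of a plane at depth 2 with focal length 100, so every pixel shifts by 100 · 0.1 / 2 = 5 pixels; doubling the depth halves that shift, and the loss picks up the mismatch.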

Vincent Sitzmann • 2 years ago

However, solving for depth, poses and intrinsics as free variables via gradient descent does not work well (see the paper for why!). Instead, we reparameterize both poses and intrinsics in terms of depth and optical flow, *leaving only depth as a free variable*! 10/n
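One way to picture "pose as a function of depth and flow": unproject two frames' depth maps, pair up the 3D points using the flow, and solve for the rigid motion between them in closed form. The sketch below uses a weighted orthogonal Procrustes (Kabsch) solve in NumPy. Treat the choice of solver and the name `procrustes_pose` as assumptions of this illustration; the thread itself only says poses are reparameterized in terms of depth and flow.

```python
import numpy as np

def procrustes_pose(points_i, points_j, weights=None):
    """Rigid (R, t) minimizing sum_k w_k * ||R @ p_i[k] + t - p_j[k]||^2.

    points_i, points_j: (N, 3) corresponding 3D points, e.g. two unprojected
    depth maps matched by optical flow.
    """
    if weights is None:
        weights = np.ones(len(points_i))
    w = weights / weights.sum()
    mean_i, mean_j = w @ points_i, w @ points_j
    centered_i, centered_j = points_i - mean_i, points_j - mean_j
    cov = (centered_i * w[:, None]).T @ centered_j   # (3, 3) cross-covariance
    U, _, Vt = np.linalg.svd(cov)
    # Flip the last axis if needed so the result is a rotation, not a reflection.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = mean_j - R @ mean_i
    return R, t

# Recover a known motion from noiseless correspondences.
rng = np.random.default_rng(0)
points_i = rng.normal(size=(50, 3))
angle = 0.3
R_true = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                   [np.sin(angle),  np.cos(angle), 0.0],
                   [0.0, 0.0, 1.0]])
t_true = np.array([0.1, -0.2, 0.05])
points_j = points_i @ R_true.T + t_true
R, t = procrustes_pose(points_i, points_j)
print(np.allclose(R, R_true), np.allclose(t, t_true))
```

Because the solve is built from means, a matrix product, and an SVD, an autodiff framework could in principle backpropagate through it to the depths that produced the points, so the pose needs no free parameters of its own.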

Vincent Sitzmann • 2 years ago

Even so, an unnecessary degree of freedom remains: Two identical image patches can have *different* depths! To fix this, we re-parameterize depth via a small monocular depth predictor. 11/n
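A toy illustration of why a predictor removes that degree of freedom: if depth is computed from the image by a function with shared weights, two identical patches cannot disagree. The single 3x3 filter plus softplus below is only a stand-in for a real monocular depth network, and `predict_depth` is a hypothetical name.

```python
import numpy as np

def predict_depth(image, kernel, bias):
    """Toy depth predictor: one shared 3x3 filter followed by softplus.

    image: (H, W) grayscale image. Returns an (H - 2, W - 2) positive depth map.
    """
    H, W = image.shape
    out = np.empty((H - 2, W - 2))
    for y in range(H - 2):
        for x in range(W - 2):
            patch = image[y:y + 3, x:x + 3]
            out[y, x] = np.sum(patch * kernel) + bias
    return np.logaddexp(0.0, out)            # softplus keeps depths positive

rng = np.random.default_rng(0)
image = rng.random((8, 10))
image[4:7, 5:8] = image[0:3, 0:3]            # plant an identical 3x3 patch elsewhere
kernel, bias = rng.normal(size=(3, 3)), 0.5
depth = predict_depth(image, kernel, bias)
print(np.isclose(depth[0, 0], depth[4, 5]))  # the two patches share one depth
```

With a free per-pixel depth map, `depth[0, 0]` and `depth[4, 5]` would be two independent variables; here the only free variables are the ten filter parameters. A real network has a larger receptive field, so the tie between similar patches is softer than in this sketch.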

Vincent Sitzmann • 2 years ago

We can use FlowMap itself to supervise & pre-train the depth estimator! Pre-training leads to better results & faster convergence, but it is not strictly necessary: FlowMap works even without any pre-training! The key is “patch-match” regularization: similar RGB patch → similar depth. 12/n

Vincent Sitzmann • 2 years ago

FlowMap allows us to train *any* 3D computer vision model self-supervised, just on video of static scenes. There are countless cool follow-up directions, from feed-forward SfM to dynamics to multi-view stereo - we can't wait to see what the community will do with it! n/n

Related Videos

Supervised learning has held 3D Vision back for too long. Meet RayZer, a self-supervised 3D model trained with zero 3D labels: ❌ No supervision of camera & geometry ✅ Just RGB images And the wild part? RayZer outperforms supervised methods (as 3D labels from COLMAP are noisy) 🌐 Project: (1/4)

Hanwen Jiang

69,607 views • 1 year ago

(1/N) Will this be the BERT/GPT moment for 3D vision? Finally, unsupervised pre-training for 3D works. Led by Qitao Zhao, we present E-RayZer, a fully self-supervised 3D reconstruction model that: 🔥 Matches or surpasses supervised methods like VGGT 👀 Learns transferable 3D representations, outperforming CroCo, VideoMAE, and DINO 📈 Scales with more unlabeled data A new recipe for scalable 3D foundation models.

Hanwen Jiang

58,093 views • 7 months ago

Spatial reasoning is a major challenge for the foundation models today, even in simple tasks like arranging objects in 3D space. #CVPR2025 Introducing LayoutVLM, a differentiable optimization framework that uses VLM to spatially reason about diverse scene layouts from unlabeled assets and open-ended language instructions 1/n

Fan-Yun Sun

92,572 views • 1 year ago

Spatio-Temporal Reconstruction Model for Large-Scale Outdoor Scenes Contributions: • We propose STORM, the first feed-forward, self-supervised method for fast and accurate reconstruction of dynamic 3D scenes from sparse, multi-timestep, posed camera images. • Our bottom-up framework aggregates and transforms per-frame 3D Gaussian Splats into a cohesive scene representation, enabling self-supervised motion estimation. Furthermore, we introduce motion tokens that capture common motion primitives and regularize motion predictions, facilitating dynamic motion group segmentation without explicit motion or correspondence supervision. • We present several enhancements for in-the-wild scenarios, including sky modeling, camera exposure inconsistency handling, large novel-view extrapolation, and fine-grained human motions reconstruction, making STORM well-suited for real-world applications.

MrNeRF

53,292 views • 1 year ago

Large-scale 3D Scene Generation (all scenes are real-time rendered)!! Physically-grounded generative data without hallucinations is the missing link for robot learning and testing at scale. We introduce a method that directly generates large-scale 3D driving scenes with accurate geometry, allowing for causal view synthesis and generation with object permanence and explicit 3D geometry. This also allows for extreme trajectory extrapolation without failure! We also show that we can build fully data-driven simulators for end-to-end learning with this approach. Project: with the amazing team of Julian Ost, Amogh Joshi , Andrea Ramazzina, Maximilian Bömer, Mario Bijelic.

Felix Heide

27,779 views • 10 months ago

Meet LA-Pose. Our latest model taking Wayve another step towards generalization at scale. LA-Pose employs large-scale self-supervised learning, building strong motion representations for 3D perception from 10.2 million unlabeled driving video snippets, unlike today's strongest approaches that often depend on expensive, carefully curated 3D supervision. With only a lightweight pose head and limited labelled data, LA-Pose achieves: 📷 State-of-the-art camera pose estimation 🌎 Strong zero-shot generalization across diverse driving scenarios 🏷️ Orders of magnitude less labelled data than fully supervised 3D approaches Our full blog post: Explore the full paper here:

Wayve

36,410 views • 2 months ago

The million-dollar question in humanoid robotics is: Can humanoids tap into Internet-scale training data such as online videos due to their human-like physique? Our #CoRL2024 oral paper showed the promise of humanoids learning new skills from single video demonstrations. (1/n)

Yuke Zhu

69,582 views • 1 year ago

DL3DV-10K: A Large-Scale Scene Dataset for Deep Learning-based 3D Vision paper page: We have witnessed significant progress in deep learning-based 3D vision, ranging from neural radiance field (NeRF) based 3D representation learning to applications in novel view synthesis (NVS). However, existing scene-level datasets for deep learning-based 3D vision, limited to either synthetic environments or a narrow selection of real-world scenes, are quite insufficient. This insufficiency not only hinders a comprehensive benchmark of existing methods but also caps what could be explored in deep learning-based 3D analysis. To address this critical gap, we present DL3DV-10K, a large-scale scene dataset, featuring 51.2 million frames from 10,510 videos captured from 65 types of point-of-interest (POI) locations, covering both bounded and unbounded scenes, with different levels of reflection, transparency, and lighting. We conducted a comprehensive benchmark of recent NVS methods on DL3DV-10K, which revealed valuable insights for future research in NVS. In addition, we have obtained encouraging results in a pilot study to learn generalizable NeRF from DL3DV-10K, which manifests the necessity of a large-scale scene-level dataset to forge a path toward a foundation model for learning 3D representation.

AK

49,917 views • 2 years ago

📢Happy to present Convex Splatting, a novel way for 3D reconstruction based on 3D smooth convexes. For the first time, a splatting-based method reaches the quality of NeRF sota methods but with real-time rendering and few primitives!! I expect this to replace Gaussian splatting for 3D in the coming months. CODE RELEASED TODAY! joint work with collaborators from Université de Liège Visual Geometry Group (VGG) , KAUST Computer Vision Lab (IVUL) a thread 🧵 1/n

Abdullah Hamdi

71,661 views • 1 year ago

Introducing DINOv3: a state-of-the-art computer vision model trained with self-supervised learning (SSL) that produces powerful, high-resolution image features. For the first time, a single frozen vision backbone outperforms specialized solutions on multiple long-standing dense prediction tasks. Learn more about DINOv3 here:

AI at Meta

900,376 views • 11 months ago

PointOdyssey: A Large-Scale Synthetic Dataset for Long-Term Point Tracking paper page: We introduce PointOdyssey, a large-scale synthetic dataset, and data generation framework, for the training and evaluation of long-term fine-grained tracking algorithms. Our goal is to advance the state-of-the-art by placing emphasis on long videos with naturalistic motion. Toward the goal of naturalism, we animate deformable characters using real-world motion capture data, we build 3D scenes to match the motion capture environments, and we render camera viewpoints using trajectories mined via structure-from-motion on real videos. We create combinatorial diversity by randomizing character appearance, motion profiles, materials, lighting, 3D assets, and atmospheric effects. Our dataset currently includes 104 videos, averaging 2,000 frames long, with orders of magnitude more correspondence annotations than prior work. We show that existing methods can be trained from scratch in our dataset and outperform the published variants. Finally, we introduce modifications to the PIPs point tracking method, greatly widening its temporal receptive field, which improves its performance on PointOdyssey as well as on two real-world benchmarks.

AK

122,533 views • 3 years ago

📢 SHeaP: Self-Supervised Head Predictor Learned via 2D Gaussians 📢 Given a single input image, we predict accurate 3D head geometry, pose, and expression. Previous works (e.g. DECA, EMOCA) use differentiable mesh rasterization to learn a self-supervised head geometry predictor via a photometric reconstruction loss. We borrow these ideas, but our key insight is to replace the mesh rendering with 2D Gaussian Splatting. This leads to much higher accuracy of the underlying predicted geometry and thus more gradient signal during training. 🌍 🎥 Great work by Liam Schoneveld Davide Davoli Jiapeng Tang

Matthias Niessner

28,559 views • 1 year ago

We are excited to share our #CORL2024 paper (oral) on "Learning Quadruped Locomotion Using Differentiable Simulation" done in collaboration with Sangbae Kim Massachusetts Institute of Technology (MIT). We present a new way to learn to walk in minutes without parallelization, outperforming PPO in sample efficiency! PDF: Video: We present a new framework for learning quadruped locomotion. By leveraging differentiable simulation for policy optimization, our approach achieves fast convergence and stable training, significantly outperforming model-free #ReinforcementLearning methods like PPO in sample efficiency. The key enabler is to combine a high-fidelity, non-differentiable simulator for forward dynamics with a simplified surrogate model for gradient backpropagation. Our framework enables learning quadruped walking in simulation in minutes without parallelization. When augmented with GPU parallelization, our approach allows the quadruped robot to master diverse locomotion skills on challenging terrains in minutes. This work highlights one of the first successful real-world applications of differentiable simulation for quadruped robots, offering a compelling alternative to traditional RL methods. Kudos to Yunlong Song! UZH Science University of Zurich UZH Space Hub UZH IfI European Research Council (ERC) Massachusetts Institute of Technology (MIT)MechE

Davide Scaramuzza

15,533 views • 1 year ago

Introducing Muscle v0 -- infinite degrees of freedom, from Daxo Robotics. A different mountain to climb - with a far more beautiful peak. We built this from the ground up: - Ultra-dexterous - Built for machine learning - Durable and robust More below (1/n)

Tom Zhang

273,756 views • 1 year ago

Multi-Track Timeline Control for Text-Driven 3D Human Motion Generation paper page: Recent advances in generative modeling have led to promising progress on synthesizing 3D human motion from text, with methods that can generate character animations from short prompts and specified durations. However, using a single text prompt as input lacks the fine-grained control needed by animators, such as composing multiple actions and defining precise durations for parts of the motion. To address this, we introduce the new problem of timeline control for text-driven motion synthesis, which provides an intuitive, yet fine-grained, input interface for users. Instead of a single prompt, users can specify a multi-track timeline of multiple prompts organized in temporal intervals that may overlap. This enables specifying the exact timings of each action and composing multiple actions in sequence or at overlapping intervals. To generate composite animations from a multi-track timeline, we propose a new test-time denoising method. This method can be integrated with any pre-trained motion diffusion model to synthesize realistic motions that accurately reflect the timeline. At every step of denoising, our method processes each timeline interval (text prompt) individually, subsequently aggregating the predictions with consideration for the specific body parts engaged in each action. Experimental comparisons and ablations validate that our method produces realistic motions that respect the semantics and timing of given text prompts.

AK

126,585 views • 2 years ago

This video has been scrubbed from the internet, luckily for everyone I recorded it. Chinese Explorers found this on the Eastern Slope of Mt. Ararat in 2009. They Claim it’s Noah’s Ark. Buried Deep Under Rubble & Encased in Ice, is a wooden structure, with rooms & what looks like stables.

Ancient Hypotheses

1,552,027 views • 11 months ago

[Spoiler alert‼️] Agents 3D motion teaser 🔥 Fresh out of our AI factory—MAX is here, and she goes 3D. She moves with intention, reacts naturally, and gestures like a true conversationalist. And, she’s the FIRST & ONLY AI Agent pulling this off. 🎭 Real-time motion meets real-time conversation. 🧠 Solving one of AI’s hardest problems—dynamic, expressive interaction. This is just the beginning. MAX is learning. MAX is evolving. Stay tuned. #AI #ConversationalAI $MAX

Distilled AI

18,160 views • 1 year ago

⚡️📣👇Tremendously excited to share our new Cell article, where we develop TriPath, a method for analyzing 3D pathology samples using weakly supervised AI. Article: TriPath enables 3D computational pathology via 3D multiple instance learning allowing AI models to capture intricate morphological details from pathology volumes. Code: Blog post: Tested on two different imaging modalities, and patient cohorts from two institutions. Our superstar Andrew H. Song put in a monumental effort of leading the study, in a fantastic collaboration with Jonathan Liu at University of Washington . Interesting aspects: - Utilizing the whole tissue volume and leveraging 3D deep learning enable superior risk prediction performance compared to 2D deep learning baselines based on a few sampled tissue sections that emulate standard clinical practice. This indicates TriPath can harness additional information provided by 3D tissue morphology. - The performance is also superior to clinical baselines from a reader study that involved six expert pathologists. - The morphologically heterogeneous tissue volume could lead to opposing patient-level outcome predictions, dependent on which portion of the tissue volume is used. This concurs with current clinical literature warning that tissue sampling bias can lead to misdiagnosis. Some limitations: - While the 3D pathology cohort size is unprecedented, it is smaller than typical 2D pathology cohorts. Further large-scale studies will be required for validation. Nevertheless, we believe that this study will initiate a positive cycle, encouraging academic institutions and pharmaceutical companies to contribute large banks of human tissue blocks with paired clinical outcomes, thus speeding up advancements in 3D computational pathology. 
Concluding insights: We believe that 3D pathology is just around the corner - It has the huge potential to not only augment/improve the current clinical practice centered around 2D examination of human tissue, but also help reveal novel biomarkers for prognosis and therapeutic response. Harvard Medical School Harvard Data Science Initiative Mass General Brigham Broad Institute

Faisal Mahmood

65,520 views • 2 years ago

SW 2026.14.6.7 | FSD V14.3.3 First Drive I’m not sure what to test with this to be honest - I already think the nag is VERY chill for being a supervised piece of software 🤷‍♂️ Let me know if you’d like me to test anything specifically.

Devin Olsen

15,050 views • 1 month ago