Bo Wang

@BoWang87 • 30,647 subscribers

Prof @UofT | Co-Founder & Chief AI Scientist @Xaira_Thera | Building first Virtual Cell | AI & Bio & Healthcare | Co-Inventor of ScGPT, MedSAM, BIOREASON

Shorts

Can’t believe they actually shipped it 😂 A $20, bottlecap-sized 🦞 in your pocket 🔥

1,611,770 views

New Nature paper today: Sony's Ace robot beats 3 of 5 elite table tennis players. Loses to professionals. Human players win points with faster-than-average shots (p<0.001 between won vs returned). Ace wins with ordinary shots. Same speed and spin profile whether it wins or loses the rally (p=0.88). It's playing a completely different sport than the humans are. Trained entirely in simulation. Zero sim-to-real tricks beyond good physics modeling and asymmetric actor-critic (critic sees ground truth, actor sees noisy sensors). Best part: after watching a point, 1992 Olympian Kinjiro Nakamura said, "I didn't think it was possible. But the fact that it was possible... means that there is a possibility that a human could do it too." Code: Paper:

100,049 views

Few people realize how fast AI is already revolutionizing surgery! Medivis's SurgicalAR platform for assisting neurosurgeons during surgery just received FDA clearance in Dec 2025. We keep debating whether AI will replace doctors. Meanwhile surgeons are literally seeing through patients with AR navigation in real time. The right question was never replacement. It was always: what does a surgeon become when they have superhuman perception? We will share a lot more projects in AI & surgery soon from University Health Network! Stay tuned.

41,726 views

🚀 The Segment Anything Model (SAM) has been upgraded to SAM2, featuring an efficient image encoder for segmenting images and videos. But does SAM2 outperform SAM1 in medical image and video segmentation? We're thrilled to present our paper "Segment Anything in Medical Images and Videos: Benchmark and Deployment"! We comprehensively benchmark SAM2 across 11 medical image modalities and videos.
📄 Paper: 💻 Code:
Highlights:
1. SAM2 doesn’t always outperform SAM1 in 2D medical images, but excels in video segmentation, making it more accurate and efficient for 3D images, such as CT and MR scans.
2. MedSAM still outperforms SAM2 on most 2D modalities, but SAM2 surpasses MedSAM for 3D image segmentation in a slice-by-slice approach.
3. Segmentation performance varies with model size; sometimes the smallest model outperforms larger ones.
4. Fine-tuning SAM2 significantly boosts its performance for medical image segmentation.
While SAM2 may struggle with challenging objects that have unclear boundaries or low contrast, it excels in generating good initial segmentation masks for common medical images and videos. However, the official interface doesn’t support medical data formats and has limitations on video length. To address this, we've developed a 3D Slicer Plugin and Gradio API for efficient 3D medical image and video segmentation. We invite you to try them out and provide feedback!
🔧 Deployment:
- 3D Slicer Plugin:
- Gradio API:
Note: Due to GPU limitations, the online API is available for only 12 hours and may be slow. We highly recommend deploying the Gradio API with your own computing resources:
A big shoutout to Jun Ma, who recently joined our UHN AI Hub as Machine Learning Lead, and kudos to all co-authors: Sumin Kim, Feifei Li, Mohammed Baharoon, Reza Asakereh, and Hongwei Lyu! This is true teamwork! Looking forward to collaborating with the community to advance 3D medical image and video segmentation foundation models!
University Health Network U of T Department of Computer Science Department of Laboratory Medicine & Pathobiology Temerty Centre for AI in Medicine (T-CAIREM) Vector Institute
#MedTech #AIinHealthcare #DeepLearning #MedicalImaging #SAM2 #MedSAM #AIResearch

178,481 views

New Cell paper from Bergles lab at Johns Hopkins just built the most comprehensive map of brain myelin ever made — every oligodendrocyte, across the entire mouse brain, across the lifespan. The scale: >10 million cells per brain, terabyte-scale 3D lightsheet volumes, registered to the Allen Brain Atlas across 417 regions from 2 months to 2+ years of age. The technical stack: Custom tissue clearing (CUBIC-L + SHIELD + uRIMS with 40% urea) to preserve endogenous fluorescence. 3D Mask R-CNN for instance segmentation — not just semantic, instance — so it can distinguish individual cells within dense clusters at scale via overlapping sliding windows. Vision Transformer to classify newly-formed vs. mature oligodendrocytes using soma morphology. All cross-referenced against Allen ISH transcriptomics and MICrONS serial EM. What they found: Oligodendrocyte density varies 10,000-fold across brain regions. Left-right hemispheres: r=0.99. Sex: no significant difference. Strain: matters. The brain never stops myelinating. New oligodendrocytes are still being generated in 2-year-old mice. Prefrontal cortex L6 shows the fastest rates of new myelination into old age — the circuits for executive function keep rewiring throughout life. After demyelination, L4 sensory cortex is the most resilient — oligodendrocytes survive at higher rates. The hippocampus loses nearly everything and barely recovers. Degree of injury doesn't predict rate of recovery. These are independent axes. The Alzheimer's result is the most surprising: Dense-core plaques dominate in cortex and hippocampus. Diffuse/small-core plaques dominate in white matter fiber tracts. Old assumption: diffuse plaques are "less toxic." The data says the opposite — small plaques in fiber tracts cause more myelin loss per plaque than dense-core plaques in gray matter. Plaque load and oligodendrocyte loss are essentially uncorrelated (ρ=0.22). The damage is plaque-type and location specific, not load-dependent. 
For MS and AD research: you can't read off white matter injury from gray matter plaque burden. The pathology in fiber tracts is running on different rules. Data: Paper:

24,748 views
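
The overlapping sliding-window tiling mentioned in the post above can be illustrated in a few lines. This is a minimal sketch, not the Bergles lab pipeline: the function names `window_starts` and `tile_volume` and all sizes are hypothetical, chosen only to show how overlapping tiles cover a volume too large to segment in one pass. When the overlap exceeds a cell's diameter, a cell that straddles one tile border still falls entirely inside a neighboring tile.

```python
def window_starts(length, window, overlap):
    """Start offsets of overlapping windows that cover [0, length) on one axis."""
    if not 0 <= overlap < window:
        raise ValueError("overlap must satisfy 0 <= overlap < window")
    if length <= window:
        return [0]  # a single window already covers the whole axis
    stride = window - overlap
    starts = list(range(0, length - window + 1, stride))
    # Add a final window flush with the end if the regular grid stops short.
    if starts[-1] + window < length:
        starts.append(length - window)
    return starts


def tile_volume(shape, window, overlap):
    """All (z, y, x) tile origins for a 3-D volume, same window on every axis."""
    zs, ys, xs = (window_starts(n, window, overlap) for n in shape)
    return [(z, y, x) for z in zs for y in ys for x in xs]


# Hypothetical sizes: a 1000 x 600 x 600 voxel volume, 256-voxel tiles, 32-voxel overlap.
tiles = tile_volume((1000, 600, 600), window=256, overlap=32)
print(len(tiles), "tiles; first:", tiles[0], "last:", tiles[-1])
```

Detections from neighboring tiles would still need to be merged in the overlap regions; that step is omitted here.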

🧬 We have many foundation models or language models for DNA, but can we control them? We introduce Ctrl-DNA: Controllable Cell-Type-Specific Regulatory DNA Design via Constrained RL, a reinforcement learning framework for controllable cis-regulatory sequence generation.
Paper: Code:
🔬 What’s the challenge? Designing regulatory DNA that is both highly expressive in target cell types and inactive in others is essential for synthetic biology, gene therapy, and precision medicine. Yet controlling these trade-offs is challenging due to sparse, sequence-level rewards and biological constraints.
🔥 Why Ctrl-DNA? Ctrl-DNA fine-tunes pre-trained DNA language models using a value-model-free, Lagrangian-guided RL framework, enabling flexible and customizable constraint optimization. Users can define application-specific thresholds across cell types, balancing expression strength with specificity.
✅ Maximize target-cell expression
✅ Constrain off-target activity under user-defined thresholds
✅ Preserve cell-type-specific TF motif structure
Benchmarked on human enhancer and promoter datasets, Ctrl-DNA consistently outperforms prior methods, achieving stronger specificity, higher fitness, and more biologically grounded sequence generation, all with direct control over regulatory trade-offs.
Shoutout to the PhD students Xingyu Chen and Rex Ma for their amazing work leading this project!

30,719 views
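
The Lagrangian-guided constraint handling described in the post above follows a standard pattern in constrained RL: the reward is reduced by each constraint violation weighted by a multiplier, and each multiplier rises while its constraint is violated. The sketch below shows only that generic pattern. It is not Ctrl-DNA's code, and the function names, cell-type labels, thresholds, activity values, and learning rate are all illustrative assumptions.

```python
def lagrangian_reward(target_activity, off_target, thresholds, multipliers):
    """Target reward minus multiplier-weighted constraint violations."""
    penalty = sum(
        multipliers[cell] * (off_target[cell] - thresholds[cell])
        for cell in thresholds
    )
    return target_activity - penalty


def update_multipliers(multipliers, off_target, thresholds, lr=0.1):
    """Dual ascent step: raise a multiplier when its constraint is violated,
    lower it (never below zero) when the constraint is satisfied."""
    return {
        cell: max(0.0, multipliers[cell] + lr * (off_target[cell] - thresholds[cell]))
        for cell in thresholds
    }


# Hypothetical predicted activities for one generated sequence.
thresholds = {"cell_A": 0.3, "cell_B": 0.3}   # user-defined off-target limits
off_target = {"cell_A": 0.5, "cell_B": 0.1}   # cell_A violates, cell_B satisfies
multipliers = {"cell_A": 1.0, "cell_B": 0.0}

reward = lagrangian_reward(0.9, off_target, thresholds, multipliers)
new_multipliers = update_multipliers(multipliers, off_target, thresholds)
print(round(reward, 3), new_multipliers)
```

In a full training loop the shaped reward would drive the policy update, and the multiplier step would run alongside it, so the penalty on a violated cell type keeps growing until generated sequences fall under the threshold.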

Videos

A Brazilian scientist worked in silence for 25 years on something medicine said was impossible: regenerating the spinal cord. Dr. Tatiana Sampaio extracted a protein from placentas that acts as "biological glue" — recreating the conditions that let embryonic neurons connect. Six patients with complete spinal cord injuries regained movement. Bruno Drummond was tetraplegic after a car accident. Two weeks after treatment, he moved his toe. Today he walks, climbs stairs, dances. Her quote when asked why she finally went public: "I no longer have the right to be conservative." 25 years. No social media. No self-promotion. Just the work. This is what real science looks like.

2,222,453 views • 5 months ago

Professor Judea Pearl — the pioneer who invented causal reasoning in AI — says scaling won't save us. "Mathematical limitations that are not crossable by scaling up." The brutal truth: LLMs aren’t learning how the world works. They are learning how we describe the world. This resonates with most biologists: Drug discovery is hitting the same wall. We have mountains of genomic data, but most AI models just find patterns in published papers — not in the raw biology itself. They're learning what scientists think causes disease, not what actually does. Pearl's causal revolution? That's how we move from "this gene correlates with cancer" to "this gene causes cancer" — and finally design drugs that work. Until then, we're building very expensive parrots.

803,112 views • 5 months ago

Yann LeCun just said something that every AI-in-healthcare researcher should sit with. He basically said: If language were enough to understand the world, you could learn medicine by reading books. But you can’t. You need residency. You need to see thousands of normal cases before you recognize the abnormal one. He also points out something wild — all the public text on the internet is on the order of 10¹⁴ bytes. A 4-year-old processes about that much through vision alone. The world is just… higher bandwidth than text. I think this shift — from language models to world models — is going to matter a lot in healthcare. 🫀

418,625 views • 5 months ago

Former Goldman Sachs executive Raoul Pal: knowledge is now worth zero. He's half right. AI didn't make knowledge worthless. It made access to knowledge worthless. The scarcity shifted. From "who knows" to "who can think." Doctors and lawyers aren't paid for memorizing facts—they're paid for judgment under uncertainty. For knowing which fact applies when. For deciding when the rules don't fit. AI has infinite knowledge. It has zero wisdom. Knowledge is free. Taste is expensive.

211,977 views • 5 months ago

Today we’re announcing X-Cell — Xaira’s first step toward a virtual cell. 🧬 A foundation model that predicts how gene expression changes under causal perturbations — across cell types, conditions, and even unseen biology. This is not trained on observational atlases. It is trained on interventions. 🧵👇

173,963 views • 4 months ago

A pulmonologist with 20 years of experience just made the most honest AI video I've seen. He shows what he can do in seconds: "Right middle lobe pneumonia. Left upper lobe consolidation. Bilateral pneumonia. Patient is very sick." Then: "Here comes AI. They pick it up in a second." Then: "I'm going to be applying to McDonald's soon. I hope they have some openings." 😅 This isn't a tech bro making predictions. This is the prediction arriving, narrated by the person it's arriving for.

143,923 views • 4 months ago

Someone just put OpenClaw inside a Unitree G1 humanoid — and it walks. Same agent framework running on your phone. Now integrated with lidar, stereo + RGB cameras, understanding 3D space and temporal context. Works on drones and quadrupeds too. Fully open source. Cool work! Video Source: stash

102,582 views • 4 months ago

“Why can't computers match the biological brain?” (asked by Naveen Rao, CEO of Unconventional AI)
Here's what the numbers actually look like:
🧠 Biological brain:
• 20+ watts
• 86 billion neurons, 100 trillion synapses
• Handles vision, language, memory, emotion simultaneously
• Total energy to "train" over 20 years: ~3,500 kWh
⚡ GPT-4:
• Training alone: ~50,000,000 kWh
• That's 14,000× more energy than your entire brain used in 20 years
• Inference at scale: tens of megawatts, continuously
Biology delivers more general intelligence per watt than anything we've ever built.

92,678 views • 5 months ago
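
The energy figures in the post above can be checked with a few lines of arithmetic. This sketch assumes a constant 20 W draw (the post says "20+ watts") running around the clock for 20 years, and takes the post's ~50,000,000 kWh training estimate at face value.

```python
BRAIN_POWER_W = 20              # assumed constant draw, from the post's "20+ watts"
YEARS = 20
HOURS_PER_YEAR = 24 * 365       # ignoring leap days
GPT4_TRAINING_KWH = 50_000_000  # the post's estimate, not an independently verified figure

brain_kwh = BRAIN_POWER_W * HOURS_PER_YEAR * YEARS / 1000  # watt-hours -> kWh
ratio = GPT4_TRAINING_KWH / brain_kwh

print(f"Brain over {YEARS} years: {brain_kwh:,.0f} kWh")
print(f"GPT-4 training / brain: {ratio:,.0f}x")
```

Under these assumptions the brain total comes to 3,504 kWh and the ratio to a little under 14,300, consistent with the post's rounded ~3,500 kWh and 14,000×.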

A new Nature paper from Johns Hopkins (by Prof. Dingchang Lin) just solved one of the hardest problems in biology: how do you record what every cell in a tissue experienced over time, not just what it looks like right now?
The answer: GEMINI (Granularly Expanding Memory for Intracellular Narrative Integration). It works exactly like tree rings. Cells are genetically engineered to express a computationally designed protein assembly. As the assembly grows inside the cell, it captures cellular activity as fluorescent ring patterns: each ring a timestamp, each ring's properties encoding signal intensity. Look at a cross-section under a microscope and you can read the cell's history backward, with ~15-minute resolution.
The key: cells build the recorder themselves. GEMINI doesn't interfere with normal function; it just quietly writes.
What they demonstrated: In a full tumor xenograft, GEMINI captured every cancer cell's activity history across the entire tumor while it continued to grow normally. For the first time, researchers can look back and see how different regions of the same tumor responded differently to therapy over time: not snapshots, but film. In a mouse brain, GEMINI recorded neural activity dynamics without disrupting behavior, coordination, or memory. It could temporally resolve the history of a brain seizure.
Why this matters: Every tool we have in biology gives you state, meaning what the cell looks like now. Sequencing, imaging, proteomics: all snapshots. GEMINI gives you trajectory. It's the difference between a photograph and a video, applied to every cell in an organ simultaneously.
The team is explicit that AI-based decoding tools will be central to reading GEMINI's output at whole-brain scale. This is the data layer that makes temporal single-cell atlases possible.
Paper:
Congratulations Dingchang Lin

85,108 views • 4 months ago

Nature figured out distributed systems millions of years before we did.
Meet the giant honeybee (Apis dorsata). No hive box, no protection, just thousands of bees exposed on a cliff face or tree branch. Their defense? A biological Mexican wave that makes predators freeze in confusion. This is shimmering. And the science behind it is wild.
The Visual
Picture a dark sheet of bees covering an open comb. Suddenly, a ripple of light flashes across the surface: hundreds of bees flipping their abdomens upward in perfect coordination, creating a wave that propagates in under a second. To a wasp or bird approaching for a meal, it's disorienting. The nest surface seems alive, unpredictable, dangerous.
How It Actually Works
Three distinct "agent types" coordinate this defense:
1. Bucket-Bridging Agents (75% of participants)
The foot soldiers. These bees pass the signal neighbor-to-neighbor like a bucket brigade at a fire. They receive the cue from an adjacent bee, flip their abdomen, and pass it on. Velocity: ~0.32 m/s. Linear, reliable, slow.
2. Chain-Tail Agents (9%)
The end of the line. These bees get activated but don't propagate the signal further. They're the wave's trailing edge.
3. Generator Agents (16%)
Here's where it gets interesting. These bees flip their abdomens before the main wave reaches them. They create "daughter waves" that merge with the parental wave, accelerating the whole process by 41.5% to ~0.51 m/s. Without generators, shimmering would be too slow to matter. With them, the colony responds in real-time to a wasp's flight path.
The "Special Agents" Hypothesis
Early researchers assumed the bees closest to a predator would trigger the wave. Makes sense, right? Wrong. Experiments with tethered wasps revealed something stranger: shimmering starts at specific "trigger centers" clustered around the nest's mouth zone, where foragers enter and exit. These aren't random bees. They're specialized.
The position of trigger cohorts doesn't match the predator's location. Instead, bees in these zones are primed to respond faster, possibly through age or experience. Think of them as sentinels: stationed strategically, not reactively.
The Visual Trigger System
Shimmering isn't automatic. Bees are selective about when to deploy it:
• Contrast matters: Dark objects against bright backgrounds (like a hornet silhouetted against sky) trigger strong responses. Reverse the contrast (light object on dark) and nothing happens.
• Size threshold: Objects smaller than ~4cm don't trigger shimmering. Below a certain visual angle (1.6–3.4 degrees), the threat isn't worth the energy.
• Light dependence: Shimmering peaks in bright daylight. At dawn/dusk, the colony switches to other defenses. The visual system needs illumination to work.
Why This Is Brilliant
Shimmering solves multiple problems simultaneously:
1. Predator deterrence: Wasps see the wave and abort approach. The movement is unpredictable, hard to track, signals a coordinated colony.
2. Internal alarm: The wave propagates mechanoreceptive cues and Nasonov pheromone through the nest, alerting bees to prepare for escalation: mass stinging if the predator persists.
3. Energy efficiency: Not every threat triggers full defense. The visual filtering (size, contrast, light) prevents false alarms.
4. Speed through parallelism: Generator agents create saltatory (jumping) propagation that outpaces simple neighbor-to-neighbor transfer. The colony literally shortcuts information flow.

72,063 views • 5 months ago

Many people still aren’t familiar with Ex Vivo Lung Perfusion (EVLP), one of the most underrated life-saving inventions in medicine.

Developed in Toronto, EVLP keeps donor lungs alive outside the body at physiological conditions, so we can assess, repair, and even improve them before transplant.

The impact:
• Expanded usable donor lungs by ~2–3×
• Reduced primary graft dysfunction
• Enabled longer preservation and transport
• Opened the door to targeted pre-transplant therapies

It turned “unusable” lungs into life-saving organs.

And now we just used AI to build a digital twin of lungs, based on large-scale EVLP data from University Health Network. 🔥

35,070 views • 2 months ago

Welcome to the Lab of the Future! 🧬🤖 Excited to share LUMI-lab, out today in Cell: a self-driving platform that pairs an AI foundation model with a robotic lab to autonomously discover ionizable lipids for lipid nanoparticle (LNP) mRNA delivery.

The core problem: Designing LNPs is hard. The chemical space of ionizable lipids is vast, experimental cycles are slow, and, critically, historical LNP datasets are far too small to train a predictive model from scratch. Most AI approaches in this space hit a wall immediately: not enough data to learn from.

Our solution: lab-in-the-loop foundation model learning. Instead of training on LNP data alone, LUMI starts as a transformer-based foundation model pretrained across broad chemical space, building rich molecular representations before it ever sees a single LNP experiment. Then it enters a closed loop with a robotic synthesis platform: predict → synthesize → assay → update. Each round of real wet-lab experiments fine-tunes the model, which then proposes smarter candidates for the next round. The lab isn't just validating AI predictions; it's actively teaching the model, continuously.

What happened when we let it run: LUMI-lab autonomously synthesized and screened 1,700+ ionizable lipids in human bronchial epithelial cells. The top candidate, LUMI-6, features a brominated lipid tail, a structural motif that had been largely overlooked in LNP design. LUMI found it without being told where to look. When formulated into LNPs and delivered intratracheally to mice, LUMI-6 achieved 20.3% gene editing efficiency in lung epithelial cells. That is a compelling result for one of the hardest-to-reach therapeutic targets, directly relevant to diseases like cystic fibrosis and alpha-1 antitrypsin deficiency.

Why this matters beyond LNPs: This is a proof of concept for a broader thesis: that foundation model pretraining + active learning + robotic experimentation can overcome the data scarcity bottleneck that plagues AI-driven discovery in biology. You don't need a massive domain-specific dataset to start. You need a model that can generalize, a lab that can generate the right data, and a loop that connects them.

Huge congratulations to first authors Yue Xu, Haotian Cui, and Kuan Pang, and to the entire Bowen Li team. Grateful to our collaborators at University Health Network and Leslie Dan Faculty of Pharmacy, and to Princess Margaret Cancer Centre Research.

📄 Paper:
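The predict → synthesize → assay → update loop described above can be illustrated with a toy sketch. Everything here is a stand-in and an assumption: candidates are plain numbers, not lipids, `assay` is a made-up hidden function in place of a wet-lab measurement, and the "model" is a nearest-neighbor lookup in place of a pretrained transformer. It only shows the shape of the loop, not how LUMI-lab is implemented.

```python
import random

def assay(candidate: float) -> float:
    """Hypothetical stand-in for a wet-lab readout: a hidden peak at 0.7."""
    return 1.0 - (candidate - 0.7) ** 2

def predict(candidate: float, measured: dict) -> float:
    """Toy surrogate model: reuse the score of the nearest measured candidate."""
    if not measured:
        return 0.0
    nearest = min(measured, key=lambda c: abs(c - candidate))
    return measured[nearest]

def run_loop(pool, rounds=5, batch_size=4, seed=0):
    """Run a closed loop: rank untested candidates, test a batch, repeat."""
    rng = random.Random(seed)
    measured = {}
    for _ in range(rounds):
        untested = [c for c in pool if c not in measured]
        if not untested:
            break
        # predict: shuffle first so that ties are broken randomly (sort is stable)
        rng.shuffle(untested)
        untested.sort(key=lambda c: predict(c, measured), reverse=True)
        batch = untested[:batch_size]
        # synthesize + assay: one function call stands in for both steps
        for c in batch:
            measured[c] = assay(c)
        # update: the surrogate reads `measured`, so new data changes the next ranking
    best = max(measured, key=measured.get)
    return best, measured

if __name__ == "__main__":
    pool = [i / 100 for i in range(101)]
    best, measured = run_loop(pool)
    print(f"tested {len(measured)} candidates, best so far: {best}")
```

The design point the sketch tries to capture is that each batch of measurements changes which candidates get proposed next, so data collection and model improvement happen in the same loop.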

57,430 views • 4 months ago

🚀 The Segment Anything Model (SAM) has been upgraded to SAM2, featuring an efficient image encoder for segmenting images and videos. But does SAM2 outperform SAM1 in medical image and video segmentation?

We're thrilled to present our paper "Segment Anything in Medical Images and Videos: Benchmark and Deployment"! We comprehensively benchmark SAM2 across 11 medical image modalities and videos.

📄 Paper:
💻 Code:

Highlights:
1. SAM2 doesn’t always outperform SAM1 in 2D medical images, but excels in video segmentation, making it more accurate and efficient for 3D images, such as CT and MR scans.
2. MedSAM still outperforms SAM2 on most 2D modalities, but SAM2 surpasses MedSAM for 3D image segmentation in a slice-by-slice approach.
3. Segmentation performance varies with model size; sometimes the smallest model outperforms larger ones.
4. Fine-tuning SAM2 significantly boosts its performance for medical image segmentation.

While SAM2 may struggle with challenging objects that have unclear boundaries or low contrast, it excels in generating good initial segmentation masks for common medical images and videos. However, the official interface doesn’t support medical data formats and has limitations on video length. To address this, we've developed a 3D Slicer Plugin and Gradio API for efficient 3D medical image and video segmentation. We invite you to try them out and provide feedback!

🔧 Deployment:
- 3D Slicer Plugin:
- Gradio API:
(Note: Due to GPU limitations, the online API is available for only 12 hours and may be slow. We highly recommend deploying the Gradio API with your own computing resources:)

A big shoutout to Jun Ma, who recently joined our UHN AI Hub as Machine Learning Lead, and kudos to all co-authors: Sumin Kim, Feifei Li, Mohammed Baharoon, Reza Asakereh, and Hongwei Lyu! This is true teamwork! Looking forward to collaborating with the community to advance 3D medical image and video segmentation foundation models!

University Health Network U of T Department of Computer Science Department of Laboratory Medicine & Pathobiology Temerty Centre for AI in Medicine (T-CAIREM) Vector Institute #MedTech #AIinHealthcare #DeepLearning #MedicalImaging #SAM2 #MedSAM #AIResearch
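The "slice-by-slice approach" mentioned in the highlights means treating a 3D scan as a stack of 2D images and segmenting each one independently. Below is a minimal sketch of that idea, assuming a volume stored as nested Python lists. The `threshold_segmenter` is a hypothetical placeholder for a real 2D model such as SAM2 or MedSAM and does not reflect their APIs.

```python
from typing import Callable, List

Slice = List[List[float]]  # 2D image: rows of pixel intensities
Mask = List[List[int]]     # 2D binary mask with the same shape

def threshold_segmenter(image: Slice, level: float = 0.5) -> Mask:
    """Placeholder 2D segmenter (hypothetical): marks pixels above a threshold."""
    return [[1 if px > level else 0 for px in row] for row in image]

def segment_volume(volume: List[Slice],
                   segment_2d: Callable[[Slice], Mask]) -> List[Mask]:
    """Apply a 2D segmenter to every slice independently and restack the masks."""
    return [segment_2d(s) for s in volume]

if __name__ == "__main__":
    # Two 2x2 slices standing in for a tiny CT volume.
    volume = [
        [[0.1, 0.9], [0.8, 0.2]],
        [[0.7, 0.3], [0.4, 0.6]],
    ]
    for mask in segment_volume(volume, threshold_segmenter):
        print(mask)
```

Because each slice is handled on its own, this approach uses no information from neighboring slices, which is the main difference from treating the stack as a video.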

178,481 views • 1 year ago

Asked DeepSeek-V4 “everything about virtual cell”. I am genuinely impressed by its speed and accuracy!! 🐋🐋

32,743 views • 3 months ago

🔬 Exciting News! Our manuscript, "scGPT: toward building a foundation model for single-cell multi-omics using generative AI", is now finally published in Nature Methods 🎉 !!!

(Re-)Introducing scGPT: A transformative foundation model engineered for single-cell omics analysis. Developed through the analysis of over 33 million human cells, scGPT sets a new benchmark for application versatility, offering both fine-tuning and zero-shot capabilities.

Since its preprint in May 2023, scGPT has significantly impacted the field, as evidenced by 13K+ installations, 600+ GitHub stars 🌟, and 40+ citations before its official publication! scGPT has been validated by numerous benchmark studies as a leading foundation model in single-cell analysis. Its pre-trained embeddings extend its utility beyond single-cell studies, enhancing a variety of downstream tasks including protein enrichment and genetic perturbation predictions.

Some key updates lately:
---Expanded zero-shot applications for efficient reference mapping and integration, now with CellXGene census integration.
---Advanced perturbation analysis capabilities, including genome-scale perturb-seq data analysis and bulk sequencing data generalization.
---Upgraded scGPT package, offering versatile model loading compatible with PyTorch and flash-attn, for both GPU and CPU.
---Cloud-based scGPT applications for reference mapping, cell annotation, and gene regulatory network inference are available on
---Integration with Hugging Face for easier model training.

Limitations: scGPT is an early foray into foundation models for single-cell omics, facing challenges like limited zero-shot learning in some tasks, pretraining constraints, data quality issues, and evaluation limitations. See our Supplementary Notes for details.

🚀 Future Work?
Short-Term Goals:
1. Releasing a Mouse Model for broader analysis.
2. Developing a comprehensive evaluation suite for foundation models in single-cell analysis.
3. Creating a foundation model for single-cell spatial omics.
4. Enhancing zero-shot capacity by integrating scGPT with RAG (e.g., knowledge graphs).
Long-Term Goals:
1. Expanding scGPT for comprehensive single-cell multi-omics analysis.
2. Developing an in-silico perturbation model for predicting genetic perturbation effects.
3. Merging scGPT with multi-modal genomic sequence models for a deeper understanding of cell biology.

📚 Access the paper on Nature Methods:
🔬 Preprint on bioRxiv:
💻 All our code/data/weights are open source:

Wholehearted congratulations to all the authors, especially the two co-first authors, Haotian (Haotian Cui) and Chloe (ChloeXWang), who are really the emerging superstars in AI and biology!

Vector Institute Peter Munk Cardiac Centre AI U of T Department of Computer Science Department of Laboratory Medicine & Pathobiology University Health Network University of Toronto #scGPT #GenerativeAI #AI4Science #Combio #opensource

199,708 views • 2 years ago

Love seeing Silico (Goodfire) used to probe our EchoJEPA's representations! This is exactly the kind of interpretability work that's been missing for JEPA-style models.

One thing that makes EchoJEPA particularly interesting to interpret: unlike MAE-based approaches, it never reconstructs pixels. The model learns entirely in latent space through masked prediction, so you can't just look at decoder outputs to understand what it captured. Attribution onto a temporally aligned 3D mesh is a much more honest probe of what the representations actually encode.

What we found in building EchoJEPA: training on 18M echo videos across 300K patients, the model learns to disentangle cardiac anatomy from ultrasound noise (speckle, reverberation artifacts) almost entirely through self-supervision. With 1% labeled data it already outperforms supervised baselines trained on 100%. The latent space is doing real anatomical work, but until you can visualize it like this, "real anatomical work" is mostly a claim.

Paper + code:

29,452 views • 2 months ago

🚀 We're thrilled to introduce Orthrus 🧬🐕, a groundbreaking mature RNA foundation model designed to push the boundaries of RNA property prediction!

🔬 What is Orthrus?
Orthrus is a Mamba-based RNA foundation model, pre-trained using a novel self-supervised contrastive learning objective with biologically inspired augmentations. It optimizes the similarity between splicing isoforms and orthologous transcripts, capturing functional and evolutionary relationships to enhance mature RNA property prediction accuracy.

📑 Preprint:
💻 Code:
🌐 Project Page:
📦 Model Weights:

🧠 Why Orthrus?
Decoding the RNA regulatory code is key to understanding biology, but traditional experimental approaches are slow and costly. Existing genomic foundation models rely on techniques like masked language modeling or next-token prediction, which aren't fully aligned with the complexities of genomic data, leading to suboptimal results.

🌟 Orthrus Highlights:
- Biologically-Informed Contrastive Learning 🧪: A novel contrastive learning objective designed specifically for genomics, maximizing similarity between splicing isoforms and orthologous transcripts across species.
- Extensive Pre-training 📊: Trained on splicing annotations from 10 species and orthologous alignments from 400+ mammalian species (Zoonomia Project), with a focus on sequences of high functional importance.
- Superior Representations 🏅: Orthrus outperforms existing genomic models on 5 mRNA property prediction tasks, often surpassing supervised methods with just a simple linear transformation.
- Efficiency in Low-Data Settings 📉: Orthrus excels in low-data regimes, achieving state-of-the-art results with as few as 45 labeled examples for fine-tuning on RNA half-life prediction.

Shoutout to the amazing leading authors Phil (Phil Fradkin) and Ian (Ian Shi)! This work would have been impossible without the outstanding collaboration of Karina (Karin(a) Isaev), Brendan (Brendan Frey), Quaid (Quaid Morris), and Leo J. Lee!

Vector Institute University Health Network U of T Department of Computer Science Temerty Centre for AI in Medicine (T-CAIREM) Department of Laboratory Medicine & Pathobiology
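To make the idea of "maximizing similarity between splicing isoforms and orthologous transcripts" more concrete, here is a sketch of InfoNCE, a widely used contrastive loss. This is an assumption for illustration only: the post does not say which loss Orthrus uses, and the toy vectors below stand in for transcript embeddings. The positive would be an isoform or ortholog of the anchor transcript, and the negatives would be unrelated transcripts.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE loss for one anchor: negative log-softmax of the positive's similarity."""
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    m = max(logits)  # subtract the max before exponentiating for numerical stability
    log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_denominator - logits[0]

if __name__ == "__main__":
    anchor = [1.0, 0.0]
    # Positive aligned with the anchor: low loss.
    print(info_nce(anchor, [1.0, 0.0], [[0.0, 1.0]]))
    # Positive orthogonal to the anchor, negative aligned with it: high loss.
    print(info_nce(anchor, [0.0, 1.0], [[1.0, 0.0]]))
```

Minimizing this loss pulls the anchor toward its positive and pushes it away from the negatives, which is the general mechanism behind contrastive pre-training.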

114,913 views • 1 year ago

🎉 The best way to start the week is to find out that our MedSAM is finally published today in Nature Communications!

Segment anything in medical images
Paper:
arXiv:
Data & Code:

MedSAM is the first promptable foundation model for medical image segmentation.

Highlights:
⭐ Before its formal publication, we have received 220 citations and 1400+ GitHub stars 🙏🙏❤️‍🔥❤️‍🔥❤️‍🔥
📊 We curated a large-scale medical image dataset with 1,570,263 image-mask pairs, covering 10 imaging modalities and over 30 cancer types.
🚀 Built on top of SAM (AI at Meta) with transfer learning, we have significantly enhanced its segmentation performance on medical images.
📈 Comprehensive evaluations on 86 internal validation tasks and 60 external validation tasks demonstrate better accuracy and robustness than modality-wise specialist models.

What is Next? Clinical Translation!!
🍕 Our next goal is to make the model deployable on laptops (CPUs) or other edge devices without reliance on GPUs. We have distilled a lightweight model, LiteMedSAM, offering a speed boost of 10x while maintaining accuracy. Plus, we have integrated it into the 3D Slicer plugin, providing an efficient tool for medical image segmentation.
🌐 To further promote developments in this field, we are organizing a competition at #CVPR2024: Segment Anything in Medical Images on Laptop! An out-of-the-box baseline has been released to reduce the entry barriers. Join us to push the boundary further:

🙏 Massive thanks to AI at Meta for their open-source project SAM and to the many reviewers/users for their invaluable feedback. A huge shoutout to my postdoc Jun Ma for his leadership on this project!!

UHN AI Hub Vector Institute Peter Munk Cardiac Centre AI Department of Laboratory Medicine & Pathobiology U of T Department of Computer Science University of Toronto University Health Network Brad Wouters 🇨🇦 Barry Rubin MD, PhD, FRCSC Shaf Keshavjee

140,208 views • 2 years ago

🚀 Introducing scGPT-spatial! 🧬🌍 A game-changing spatial-omic foundation model, built on the powerful scGPT framework with MoE (mixture of experts) and continually pretrained on a massive 30 million spatial single-cell profiles!

🧠 What’s the challenge?
Spatial transcriptomics is next-level complex: not only must we model single-cell/spot profiles, but we also need to capture intricate spatial relationships while handling diverse sequencing protocols (imaging-based vs. sequencing-based).

🔥 Why scGPT-spatial?
✨ A Spatial-omic Foundation Model with Continual Pretraining – Built on scGPT’s robust initialization, it unlocks spatial context in tissues.
✨ SpatialHuman30M Dataset – The largest curated dataset: 30M profiles from Visium, Visium HD, Xenium, and MERFISH across 821 slides.
✨ Revolutionary MoE Decoders – A cutting-edge Mixture of Experts (MoE) architecture for protocol-aware gene expression decoding.
✨ Spatially-Aware Training Strategy – A neighborhood-based masked reconstruction approach to capture complex cell-type colocalization.
✨ Multi-Modal & Multi-Slide Integration – Seamless clustering & spatial domain identification across slides and modalities.
✨ Cell-Type Deconvolution & Gene Imputation – Unlocks cross-resolution & cross-modality harmonization with fine-tuned embeddings.

📄 Read the preprint:
💻 Explore the code/weights:

#SpatialTranscriptomics #SingleCell #AIResearch #MachineLearning #SpatialData

Huge shoutout to the incredible PhD students Chloe (ChloeXWang) and Haotian (Haotian Cui) for leading this groundbreaking project! 🎉 Massive thanks to our amazing co-authors Andrew, Ronald, and Hani (Hani Goodarzi) from Arc Institute. This work wouldn't have been possible without you! 👏

59,008 views • 1 year ago

Alibaba just dropped Qwen3.5-397B-A17B and there's a lot to unpack. 397B params, 17B active per forward pass. Sparse MoE done right. But the real story isn't the size; it's the architecture choices.

The MoE Design
Most MoE models feel like bolt-ons. Qwen3.5's sparse activation is native: only 4.3% of parameters fire per token. That's how you get trillion-parameter-class performance without trillion-parameter inference costs. The 0.8 RMB/million tokens pricing isn't subsidized; it's structurally earned.

Native Multimodal, Not Glued-On
This is a vision-language model from the ground up. Heterogeneous architecture: separate processing pipelines for text, image, video that fuse early. Not a vision encoder slapped onto an LLM. The result: 90.8 on OmniDocBench, 79.0 on MMMU-Pro. Document understanding and visual reasoning without the usual brittleness.

The Context Window Reality
Qwen3.5-Plus (the hosted version) ships with 1M tokens by default. That's not a marketing number; they're actually positioning it for long-document workflows. With built-in adaptive tool use, it's clearly aimed at agentic automation, not just chat.

What Actually Impressed Me
• FP8 native pipeline: ~50% activation memory reduction
• Async RL framework for continuous refinement, with training and inference workloads separated
• 201 languages (up from 119), 250k vocab for better low-resource encoding
• Apache 2.0 license. Full weights on HuggingFace and ModelScope.

The Benchmark Context
76.4 on SWE-bench Verified puts it in the range where it can handle real debugging workflows. 72.9 on BFCL v4 for agentic tool use. 88.4 on GPQA Diamond. These aren't SOTA in isolation, but the breadth is unusual: strong across reasoning, coding, multimodal, and agentic tasks.

The Honest Caveat
I haven't stress-tested the 1M context for needle-in-haystack retrieval yet. And "native multimodal" claims need real-world torture testing: PDFs with tables, charts, mixed layouts. Benchmarks are benchmarks.

Bottom Line
This isn't just another model release. It's a bet on efficient scale: big model capabilities, small active compute, open weights. At 1/18th the cost of Gemini 3 Pro, it's going to force pricing conversations across the board.
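Two of the numbers in the post can be sanity-checked with simple arithmetic. The sketch below recomputes the 4.3% active-parameter share from the quoted 397B/17B figures, and shows where a roughly 50% activation memory saving would come from if the baseline is a 16-bit format. That baseline is an assumption; the post does not say what FP8 is being compared against. The helper names are made up for illustration.

```python
# Figures quoted in the post: 397B total parameters, 17B active per forward pass.
TOTAL_PARAMS_B = 397
ACTIVE_PARAMS_B = 17

def active_fraction(active_b: float, total_b: float) -> float:
    """Share of parameters used per token, as a fraction of the total."""
    return active_b / total_b

def activation_memory_saving(bytes_low: int, bytes_high: int) -> float:
    """Relative memory saved by storing each activation in fewer bytes."""
    return 1 - bytes_low / bytes_high

frac = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
print(f"active share: {frac:.1%}")  # 17 / 397 rounds to 4.3%

# Assumption: the baseline is a 16-bit format (2 bytes) versus FP8 (1 byte).
saving = activation_memory_saving(bytes_low=1, bytes_high=2)
print(f"FP8 vs 16-bit activations: {saving:.0%} smaller")
```

This only checks the ratios. It says nothing about real-world throughput, which also depends on routing overhead, memory bandwidth, and how the experts are sharded.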

13,221 views • 5 months ago