⚡️🔬📣 Excited to share our new Nature article building and evaluating PathChat, a multimodal generative AI copilot and chatbot for human pathology. Article: Open Access Link: We leverage our previous success in building foundation models for computational pathology, such as UNI and CONCH, and combine it with the advancements of large vision-language models and generative AI to enable PathChat to answer diverse pathology-related queries. We assessed PathChat using both multiple-choice diagnostic questions and open-ended questions. Congratulations to Max Lu, Bowen Chen, @DFKW_MD, Richard J. Chen, and everyone else who contributed to this work. Also see the blog post from Max Lu about this work, which also teases the development and preview of PathChat 2, a successor to PathChat 1 bringing new capabilities and substantially improved performance to the state of the art.

Faisal Mahmood

6,989 subscribers

291,475 views • 2 years ago • via X (Twitter)

Science & Technology, Health & Wellness

10 comments

Bo Wang • 2 years ago

@Nature Congratulations @AI4Pathology ! Your team is on fire 🔥! Look forward to hearing about your research at CVPR!

Aaditya Ura • 2 years ago

@Nature Amazing work! Will it be open source?

Arjun (Raj) Manrai • 2 years ago

@Nature Congrats @AI4Pathology !

Tanishq Mathew Abraham, Ph.D. • 2 years ago

@Nature Congrats to the lab, great work!

Mingyao Li • 2 years ago

@Nature Congratulations, Faisal!! 👍

Sebastian • 2 years ago

@Nature Congrats 🥳

Mehdi Maanaoui • 2 years ago

@Nature .@Ijeb #When ??

Bryan Wong • 2 years ago

@Nature Great work! Are there any plans to open-source PathChat?

Khalid خالد • 2 years ago

@Nature very cool!

John • 2 years ago

@Nature That's fantastic! Congrats on the publication! PathChat sounds like a fascinating tool for pathology. Can't wait to see how it helps advance the field. 🌟 #Innovation #AIInHealthcare

Related videos

We are excited to announce PathChat, a vision-language AI assistant for #Pathology that can analyze histology images and answer diverse pathology-related queries. Co-led by our superstars Max Lu, Bowen Chen, and @DFKW_MD. Preprint: Demo below.

Faisal Mahmood

106,580 views • 2 years ago

⚡️📣 Today we are tremendously excited to announce ModellaAI, the first startup from Mahmood Lab, based on an array of foundation models and generative AI tools, including our recent PathChat article in Nature. ModellaAI will actualize these exciting developments and put them in the hands of pathologists, clinicians, researchers, and trainees. See our announcement below and sign up for the PathChat 2 waitlist. Congratulations to the entire team, and especially Richard J. Chen, Jill Stefanelli, Max Lu, kuanchen, Bowen Chen, Long Le, and everyone else. Stay tuned for exciting additional announcements.

Faisal Mahmood

17,220 views • 2 years ago

⚡️📣👇 Tremendously excited to share our new Cell article, where we develop TriPath, a method for analyzing 3D pathology samples using weakly supervised AI. Article:

TriPath enables 3D computational pathology via 3D multiple instance learning, allowing AI models to capture intricate morphological details from pathology volumes. Code: Blog post:

Tested on two different imaging modalities, and patient cohorts from two institutions. Our superstar Andrew H. Song put in a monumental effort of leading the study, in a fantastic collaboration with Jonathan Liu at University of Washington.

Interesting aspects:
- Utilizing the whole tissue volume and leveraging 3D deep learning enable superior risk prediction performance compared to 2D deep learning baselines based on a few sampled tissue sections that emulate standard clinical practice. This indicates TriPath can harness additional information provided by 3D tissue morphology.
- The performance is also superior to clinical baselines from a reader study that involved six expert pathologists.
- The morphologically heterogeneous tissue volume could lead to opposing patient-level outcome predictions, dependent on which portion of the tissue volume is used. This concurs with current clinical literature warning that tissue sampling bias can lead to misdiagnosis.

Some limitations:
- While the 3D pathology cohort size is unprecedented, it is smaller than typical 2D pathology cohorts. Further large-scale studies will be required for validation. Nevertheless, we believe that this study will initiate a positive cycle, encouraging academic institutions and pharmaceutical companies to contribute large banks of human tissue blocks with paired clinical outcomes, thus speeding up advancements in 3D computational pathology.

Concluding insights: We believe that 3D pathology is just around the corner. It has the huge potential to not only augment/improve the current clinical practice centered around 2D examination of human tissue, but also help reveal novel biomarkers for prognosis and therapeutic response.

Harvard Medical School, Harvard Data Science Initiative, Mass General Brigham, Broad Institute

Faisal Mahmood

65,520 views • 2 years ago

Super excited to share 🧠MLGym 🦾 – the first Gym environment for AI Research Agents 🤖🔬

We introduce MLGym and MLGym-Bench, a new framework and benchmark for evaluating and developing LLM agents on AI research tasks.

The key contributions of our work are:
🕹️ Enables the exploration of different training algorithms for AI Research Agents such as RL
🛠️ Provides a flexible evaluation framework that can accommodate different artifacts such as models, algorithms, or predictions
🤖 Allows researchers to evaluate any model without the need to develop a custom agentic harness
🎯 Introduces 13 diverse open-ended AI Research tasks for evaluating AI Research Agents on a wide range of domains such as computer vision, natural language processing, reinforcement learning, game theory, and logical reasoning
📈 Proposes a new evaluation metric for AI Research Agents

MLGym makes it easy to:
1) Add new tasks
2) Evaluate new models
3) Integrate new agents

Check out a video of the MLGym Agent to see how it performs the full pipeline of idea generation 💡, implementation 👩‍💻, experimentation 👩‍🔬, and iteration 🔄 to improve on ML tasks.

Huge thanks to the exceptionally talented Deepak Nathani who led this work and to all the other amazing collaborators who made this possible 🙏🫶🚀

Roberta Raileanu

104,982 views • 1 year ago

Today is a good day for open science.

As part of our continued commitment to the growth and development of an open ecosystem, today at Meta FAIR we’re announcing four new publicly available AI models and additional research artifacts to inspire innovation in the community and help advance AI in a responsible way. More in the video from Joelle Pineau.

What we’re releasing:
🦎 Meta Chameleon: 7B & 34B language models that support mixed-modal input and text-only outputs.
🪙 Meta Multi-Token Prediction: Pretrained Language Models for code completion using Multi-Token Prediction.
🎼 Meta JASCO: Generative text-to-music models capable of accepting various conditioning inputs for greater controllability. Paper available today with a pretrained model coming soon.
🗣️ Meta AudioSeal: An audio watermarking model that we believe is the first designed specifically for the localized detection of AI-generated speech, available under a commercial license.
📝 Additional RAI artifacts: Including research, data and code to measure and improve the representation of geographical and cultural preferences and diversity in AI systems.

We believe that access to state-of-the-art AI creates opportunities for everyone – not just a small handful of Big Tech companies. We’re excited to share this work and to see how the community learns, iterates and builds using this technology.

Details and access to everything released by FAIR today ➡️

AI at Meta

380,714 views • 2 years ago

Excited to share that I'm joining fal as a Creative Technologist! I'll be building workflows with state-of-the-art generative image, video, and audio models, and producing content for the fal YouTube channel.

Matt Workman

46,142 views • 2 months ago

Open science is how we continue to push technology forward and today at Meta FAIR we’re sharing eight new AI research artifacts including new models, datasets and code to inspire innovation in the community. More in the video from Joelle Pineau.

This work is another important step towards our goal of achieving Advanced Machine Intelligence (AMI).

What we’re releasing:
• Meta Spirit LM: An open source language model for seamless speech and text integration.
• Meta Segment Anything Model 2.1: An updated checkpoint with improved results on visually similar objects, small objects and occlusion handling. Plus a new developer suite to make it easier for developers to build with SAM 2.
• Layer Skip: Inference code and fine-tuned checkpoints demonstrating a new method for enhancing LLM performance.
• SALSA: New code to enable researchers to benchmark AI-based attacks in support of validating security for post-quantum cryptography.
• Meta Lingua: A lightweight and self-contained codebase designed to train language models at scale.
• Meta Open Materials: New open source models and the largest dataset of its kind to accelerate AI-driven discovery of new inorganic materials.
• MEXMA: A new research paper and code for our novel pre-trained cross-lingual sentence encoder with coverage across 80 languages.
• Self-Taught Evaluator: A new method for generating synthetic preference data to train reward models without relying on human annotations.

Access to state-of-the-art AI creates opportunities for everyone. We’re excited to share this work and look forward to seeing the community innovation that results from it.

Details and access to everything released by FAIR today ➡️

AI at Meta

150,222 views • 1 year ago

VITA Towards Open-Source Interactive Omni Multimodal LLM discuss: The remarkable multimodal capabilities and interactive experience of GPT-4o underscore their necessity in practical applications, yet open-source models rarely excel in both areas. In this paper, we introduce VITA, the first-ever open-source Multimodal Large Language Model (MLLM) adept at simultaneous processing and analysis of Video, Image, Text, and Audio modalities, and meanwhile has an advanced multimodal interactive experience. Starting from Mixtral 8x7B as a language foundation, we expand its Chinese vocabulary followed by bilingual instruction tuning. We further endow the language model with visual and audio capabilities through two-stage multi-task learning of multimodal alignment and instruction tuning. VITA demonstrates robust foundational capabilities of multilingual, vision, and audio understanding, as evidenced by its strong performance across a range of both unimodal and multimodal benchmarks. Beyond foundational capabilities, we have made considerable progress in enhancing the natural multimodal human-computer interaction experience. To the best of our knowledge, we are the first to exploit non-awakening interaction and audio interrupt in MLLM. VITA is the first step for the open-source community to explore the seamless integration of multimodal understanding and interaction. While there is still lots of work to be done on VITA to get close to close-source counterparts, we hope that its role as a pioneer can serve as a cornerstone for subsequent research.

AK

23,958 views • 1 year ago

I'm excited to share my work from this summer: OMEGA. We're open-sourcing our experiments, code, and results on training state-of-the-art multimodal models for less than $2K. It's been great leading this project w/ this team for my co-op + pioneering a new AI company from scratch.

npjd

14,743 views • 1 year ago

I held a meeting with the heads of investment funds and business associations, both Ukrainian and international. We discussed how we can expand the capabilities of our defense industry, develop new areas of cooperation, and open export platforms for weapons. We count on continued support for Ukraine. For us, this is one of the new spheres where we can see tangible growth in cooperation and in Ukraine’s defense industry. According to our estimates, in 2026, the production potential for drones and missiles alone will reach 35 billion dollars. Ukraine is ready to share its expertise and technologies and to develop joint production. We are also ready to open export platforms for weapons in Europe, the United States, and other countries, provided there is proper control and protection of our technologies. I thank everyone who supports Ukraine, invests in Ukrainian intelligent strength, and enables us to keep building our joint capabilities together. I also thank the representatives for the meeting and for the proposals voiced. We will certainly work through each of them.

Volodymyr Zelenskyy / Володимир Зеленський

202,920 views • 8 months ago

Tune Studio is an end-to-end platform for developing applications using Large Language Models.

So far, I haven't seen any other platform like this one. You can do everything here:
1. You can curate your data.
2. Use the playground to play with different models and try your ideas.
3. Fine-tune an open-source model on your data.
4. Deploy the model when you are done.

This is awesome for anyone building generative AI applications.

You can use Tune Studio to work with any of the open-source models out there. They were one of the few companies to host Llama 2 and Llama 3 before anyone else.

Here is a link to check it out:

One of their main selling points is that Tune Studio scales! You don't have to worry about serving your model to lots of users.

They also have built-in user management, authentication, on-prem support, user context management, and pretty much everything you need to build generative AI applications.

Thanks to the Tune team for collaborating with me on this post.

We are living through the best years of development tools for AI developers. The field is unstoppable.

Santiago

39,101 views • 2 years ago

Today I’m excited to introduce Copilot Search in Bing. Copilot Search blends the best of traditional and generative search to help you find what you need. Whether it’s a navigational search result, a quick straightforward answer, or a complex query that leads you on a journey of discovery, Bing is your AI-powered search and answer engine. Copilot Search is rolling out today to everyone. To get started, go to and start exploring. This is a meaningful next step in our evolution of search, building on our learnings from Bing Chat, Copilot, and Bing Generative Search to provide our users the best search experience while supporting and building a healthy web ecosystem. Learn more in today’s announcement:

Jordi Ribas

16,729 views • 1 year ago

Firebase AI Logic gives you access to the latest generative AI models from Google: the Gemini models and Imagen models ✨ What are you building with gen AI?

Firebase

20,903 views • 1 year ago

4D LangSplat: 4D Language Gaussian Splatting via Multimodal Large Language Models

Contributions:
• We introduce 4D LangSplat for open-vocabulary 4D spatial-temporal queries. To the best of our knowledge, we are the first to construct 4D language fields with object textual captions generated by MLLMs.
• To model smooth transitions across states for objects in 4D scenes, we propose a status deformable network to capture continuous temporal changes.
• Experimental results show that our method attains state-of-the-art performance for both time-agnostic and time-sensitive open-vocabulary queries.

MrNeRF

10,953 views • 1 year ago

Together with the Ego4D consortium, we recently released Ego-Exo4D: A diverse, large-scale multi-modal, multi-view, video dataset and benchmark. Learn more about the work ➡️ Access the dataset ➡️ This work could help to advance AI models' understanding of complex human skills & enable new applications for AR systems, robotics & more.

AI at Meta

78,059 views • 2 years ago

⚡🎉 We are thrilled to introduce VORTEX, an AI-powered computational framework for predicting 3D Spatial Transcriptomics (ST) using 3D tissue images and minimal 2D ST! 🧬 By combining cutting-edge 3D non-destructive tissue imaging with AI, VORTEX imputes the 3D molecular landscape of large tissue samples in a cost-effective and scalable manner. 🧠💡Our approach: By pretraining on diverse 3D morphology–2D transcriptomic pairs from heterogeneous tissue samples, and then fine-tuning on minimal 2D ST data from a volume of interest, VORTEX leverages both generic tissue-specific and sample-specific morphomolecular correlates to predict 3D ST. Congratulations to our superstar co-leads Cristina Almagro Pérez and Andrew H. Song, this was an exciting collaboration with Jonathan Liu Sizun Jiang Ali Bashashati. Preprint: Demo: Read the excellent blog from our superstar grad student Cristina Almagro Pérez: Also see our previous work on 3D Computational Pathology from Andrew H. Song published in Cell last year: Stay tuned for more to come.

Faisal Mahmood

17,991 views • 1 year ago

Our vision is for AI that uses world models to adapt in new and dynamic environments and efficiently learn new skills. We’re sharing V-JEPA 2, a new world model with state-of-the-art performance in visual understanding and prediction. V-JEPA 2 is a 1.2 billion-parameter model, trained on video, that can enable zero-shot planning in robots—allowing them to plan and execute tasks in unfamiliar environments. Learn more about V-JEPA 2 ➡️ As we continue working toward our goal of achieving advanced machine intelligence (AMI), we’re also releasing three new benchmarks for evaluating how well existing models can reason about the physical world from video. Learn more and download the new benchmarks ➡️

AI at Meta

309,942 views • 1 year ago

Drop 11/14: We have been releasing Sarvam’s models and products - and yes, there is more to come. But today, we are excited to share the real and diverse impact our work is driving at scale. First up, preserving our cultural heritage. We have been working with Ekatra Foundation and Navajivan Trust (founded by Mahatma Gandhi) to digitize Gujarati documents from the early 19th and 20th centuries. Through this collaboration, we built out a whole product - Sarvam Akshar. Powered by the Sarvam Vision model, Akshar delivers state-of-the-art accuracy enabling reliable digitization of complex, real-world documents. Akshar sets the stage for our effort to leverage foundational models to preserve India’s cultural heritage - a direction we will double down on. Read more in our blog:

Pratyush Kumar

60,348 views • 4 months ago

📢We’re excited to share that we’ve raised $100M in seed funding to support LMArena and continue our research on reliable AI. Led by a16z and UC Investments (University of California), we're proud to have the support of those that believe in both the science and the mission. We’re focused on building a neutral, open, community-driven platform that helps the world understand and improve the performance of AI models on real queries from real users. Also, big news is coming next week!👀 We're relaunching LMArena with a whole new look built directly with community feedback from the ground up 🧱 Link in thread.

Arena.ai

435,405 views • 1 year ago

🔬 New Science pod with CuspAI! We are entering a new era where materials science and discovery is transitioning from slow, manual experimentation, to a high-speed search problem powered by generative AI and "physics processing units." Max Welling argues that the foundation of all modern technology—from GPUs to climate solutions—is a materials problem, and that unifying the mathematics of stochastic thermodynamics with generative AI will unlock a new paradigm of automated scientific discovery.

Latent.Space

20,343 views • 4 months ago