“Can we get a new text analysis tool?”
“No, we have Topic Model at home”
Topic Model at home: outputs vague keywords; needs constant parameter fiddling🫠
Is there a better way? We introduce LLooM, a concept induction tool to explore text data in terms of interpretable concepts🧵

Michelle Lam

2,263 subscribers

38,102 views • 2 years ago • via X (Twitter)

Education Science & Technology

11 Comments

Michelle Lam • 2 years ago

Analysts have questions like “How are women in power described?” Vague topics like “women, power, female” aren’t what they’re after—they want to understand data with nuanced concepts like “Criticism of traditional gender roles,” which are central to theory-driven data analysis.

Michelle Lam • 2 years ago

LLooM fills this gap with concept induction, extracting high-level concepts defined by natural language descriptions & explicit inclusion criteria (e.g., “Dismissal of women’s concerns: Does the text dismiss or invalidate women’s concerns or experiences?”)

Michelle Lam • 2 years ago

Our algorithm draws on the ability of large language models (LLMs) to generalize from examples. LLooM samples extracted text and iteratively synthesizes proposed concepts of increasing generality. Once concepts are induced, LLooM can classify the entire dataset.
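A minimal sketch of that sample, synthesize, and classify loop, under stated assumptions: this is not LLooM's actual code or API. The names `Concept`, `induce_concepts`, `classify_all`, `toy_synthesize`, and `toy_matches` are hypothetical, and the two toy functions are simple string heuristics standing in for the LLM calls so the example runs offline.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Concept:
    name: str      # short concept label
    criteria: str  # inclusion criterion phrased as a yes/no question


# Hypothetical interfaces for the two model-backed steps.
Synthesizer = Callable[[List[str]], List[Concept]]
Matcher = Callable[[str, Concept], bool]


def induce_concepts(docs: List[str], synthesize: Synthesizer, rounds: int = 2,
                    sample_size: int = 20, batch_size: int = 5,
                    seed: int = 0) -> List[Concept]:
    """Sample documents, then repeatedly synthesize concepts from batches."""
    rng = random.Random(seed)
    items = rng.sample(docs, min(sample_size, len(docs)))
    concepts: List[Concept] = []
    for _ in range(rounds):
        concepts = []
        for start in range(0, len(items), batch_size):
            concepts.extend(synthesize(items[start:start + batch_size]))
        # Feed the concepts back in as text so the next round works over
        # concepts rather than raw documents.
        items = [f"{c.name}: {c.criteria}" for c in concepts]
    # Drop exact duplicates while preserving order.
    return list(dict.fromkeys(concepts))


def classify_all(docs: List[str], concepts: List[Concept],
                 matches: Matcher) -> Dict[str, List[bool]]:
    """Apply every induced concept to every document in the dataset."""
    return {c.name: [matches(d, c) for d in docs] for c in concepts}


def toy_synthesize(batch: List[str]) -> List[Concept]:
    # Stand-in for an LLM call: names one concept after the longest word.
    words = [w.strip(".,:?!'\"").lower() for text in batch for w in text.split()]
    key = max(words, key=len)
    return [Concept(name=key, criteria=f"Does the text mention '{key}'?")]


def toy_matches(doc: str, concept: Concept) -> bool:
    # Stand-in for an LLM judgment: plain substring match on the name.
    return concept.name in doc.lower()
```

To try it, pass a list of strings to `induce_concepts(docs, toy_synthesize)` and hand the result to `classify_all(docs, concepts, toy_matches)`. Swapping the toy functions for real model calls is where the generalization described above would come from; the toy versions only exercise the control flow.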

Michelle Lam • 2 years ago

We instantiate our algorithm in the LLooM Workbench, an open-source text analysis tool for computational notebooks. Users can explore text data in terms of high-level concepts—from a dataset overview to concept details to document-level scores, highlights, and rationale.

Michelle Lam • 2 years ago

In evals, LLooM exceeds baselines to recover 70-90% of human-annotated, generic topics, and LLooM is significantly better at surfacing specific, nuanced concepts. From content moderation to AI ethics statements, LLooM concepts help us make sense of data:

Michelle Lam • 2 years ago

Going further, expert data analysts used LLooM to uncover novel insights even on familiar datasets. Analysts were particularly excited that LLooM facilitated theory-driven analysis: they could read out patterns and write hypotheses through the language of LLooM concepts.

Michelle Lam • 2 years ago

This work would not be possible without my amazing collaborators and advisors: Janice Teoh, @landay, @jeffrey_heer, and @msbernst! Start using LLooM to explore text data via concepts, or see our #CHI2024 paper to learn more.

dana calacci 🦋 @dana.witchy.business • 2 years ago

hell yeah. we've been using a similar method that's far less general than this for analyzing unstructured reddit data—so seeing some of the same ideas here, but generalized and tested rigorously is awesome

Michelle Lam • 2 years ago

that's great! yes, there's a lot of open ground for text analysis tools to better align with the way researchers want to think about their data—with this release we'd really love to learn how LLooM works for a broader range of domains and research questions!

Sheikh Shafayat • 2 years ago

This looks like a pretty generalizable tool! Thanks for sharing, I was looking for something like this for my class project 👌

Michelle Lam • 2 years ago

thank you! awesome—hope LLooM can be helpful :)
