“Can we get a new text analysis tool?” “No—we have Topic Model at home” Topic Model at home: outputs vague keywords; needs constant parameter fiddling🫠 Is there a better way? We introduce LLooM, a concept induction tool to explore text data in terms of interpretable concepts🧵
38,102 views • 2 years ago • via X (Twitter)
11 Comments

Analysts have questions like “How are women in power described?” Vague topics like “women, power, female” aren’t what they’re after—they want to understand data with nuanced concepts like “Criticism of traditional gender roles,” which are central to theory-driven data analysis.

LLooM fills this gap with concept induction, extracting high-level concepts defined by natural language descriptions & explicit inclusion criteria (e.g., “Dismissal of women’s concerns: Does the text dismiss or invalidate women’s concerns or experiences?”)
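To make the "description plus explicit inclusion criterion" idea concrete, here is a minimal sketch of how such a concept could be represented and turned into a yes/no question for a model. The `Concept` class, its field names, and the prompt wording are hypothetical illustrations, not LLooM's actual API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    """Hypothetical container for an induced concept (not LLooM's real class)."""
    name: str
    criterion: str  # explicit inclusion criterion, phrased as a yes/no question

    def to_prompt(self, text: str) -> str:
        # Build a plain-text question that an LLM (or a human coder) could answer.
        return (
            f"Concept: {self.name}\n"
            f"Criterion: {self.criterion}\n"
            f"Text: {text}\n"
            "Answer yes or no."
        )


concept = Concept(
    name="Dismissal of women's concerns",
    criterion="Does the text dismiss or invalidate women's concerns or experiences?",
)
prompt = concept.to_prompt("She was told she was overreacting.")
print(prompt)
```

Keeping the criterion as an explicit question is what makes the concept checkable: anyone can read it and judge whether a given text belongs.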

Our algorithm draws on the ability of large language models (LLMs) to generalize from examples. LLooM samples extracted text and iteratively synthesizes proposed concepts of increasing generality. Once concepts are induced, LLooM can classify the entire dataset.
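The loop described above (extract, sample, synthesize repeatedly, then classify everything) can be sketched roughly as follows. This is an assumption-laden toy, not the LLooM implementation: `induce_concepts`, `classify`, and the `toy_*` helpers are hypothetical names, and the toy helpers stand in for what would be LLM calls in a real system.

```python
import random
from typing import Callable, List


def induce_concepts(
    docs: List[str],
    extract: Callable[[str], List[str]],
    synthesize: Callable[[List[str]], List[str]],
    sample_size: int = 20,
    rounds: int = 2,
    seed: int = 0,
) -> List[str]:
    """Sample extracted text and repeatedly synthesize more general concepts."""
    rng = random.Random(seed)
    # Pull short excerpts out of every document.
    items = [excerpt for doc in docs for excerpt in extract(doc)]
    for _ in range(rounds):
        # Sample so that each synthesis step sees a bounded number of items.
        if len(items) > sample_size:
            items = rng.sample(items, sample_size)
        # Each round's output becomes the next round's input.
        items = synthesize(items)
    return items


def classify(
    docs: List[str],
    concepts: List[str],
    matches: Callable[[str, str], bool],
) -> List[List[str]]:
    """Label every document with the induced concepts that apply to it."""
    return [[c for c in concepts if matches(doc, c)] for doc in docs]


# Toy stand-ins (assumptions): a real system would call an LLM here.
def toy_extract(doc: str) -> List[str]:
    return [s.strip() for s in doc.split(".") if s.strip()]


def toy_synthesize(items: List[str]) -> List[str]:
    # Collapse each item to its lowercased first word and deduplicate.
    return sorted({item.split()[0].lower() for item in items})


def toy_matches(doc: str, concept: str) -> bool:
    return concept in doc.lower()


docs = [
    "Refunds were slow. Support ignored me.",
    "Support was helpful. Refunds arrived fast.",
]
concepts = induce_concepts(docs, toy_extract, toy_synthesize)
labels = classify(docs, concepts, toy_matches)
print(concepts, labels)
```

Passing the extract, synthesize, and match steps in as functions keeps the control flow visible and lets the toy helpers be swapped for model-backed ones without touching the loop.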

We instantiate our algorithm in the LLooM Workbench, an open-source text analysis tool for computational notebooks. Users can explore text data in terms of high-level concepts—from a dataset overview to concept details to document-level scores, highlights, and rationale.

In evals, LLooM exceeds baselines to recover 70-90% of human-annotated, generic topics, and LLooM is significantly better at surfacing specific, nuanced concepts. From content moderation to AI ethics statements, LLooM concepts help us make sense of data.

Going further, expert data analysts used LLooM to uncover novel insights even on familiar datasets. Analysts were particularly excited that LLooM facilitated theory-driven analysis: they could read out patterns and write hypotheses through the language of LLooM concepts.

This work would not be possible without my amazing collaborators and advisors: Janice Teoh, @landay, @jeffrey_heer, and @msbernst! Start using LLooM to explore text data via concepts, or see our #CHI2024 paper to learn more.

hell yeah. we've been using a similar method that's far less general than this for analyzing unstructured reddit data—so seeing some of the same ideas here, but generalized and tested rigorously is awesome

that's great! yes, there's a lot of open ground for text analysis tools to better align with the way researchers want to think about their data—with this release we'd really love to learn how LLooM works for a broader range of domains and research questions!

This looks like a pretty generalizable tool! Thanks for sharing, I was looking for something like this for my class project 👌

thank you! awesome—hope LLooM can be helpful :)
