For years it’s been an open question — how much is a language model learning and synthesizing information, and how much is it just memorizing and reciting? Introducing OLMoTrace, a new feature in the Ai2 Playground that begins to shed some light. 🔦
177,422 views • 1 year ago • via X (Twitter)
11 Comments

OLMoTrace connects phrases or even whole sentences in the language model’s output back to verbatim matches in its training data. It does this by searching billions of documents and trillions of tokens in real time and highlighting where it finds compelling matches.
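To make the idea concrete, here is a toy sketch of verbatim span matching. It is not OLMoTrace's actual implementation: the real feature searches trillions of tokens in real time, while this version assumes a tiny in-memory corpus, whitespace tokenization, and lowercasing. The function names `build_ngram_index` and `find_verbatim_spans` are hypothetical.

```python
from typing import List, Set, Tuple


def build_ngram_index(documents: List[str], max_n: int) -> Set[Tuple[str, ...]]:
    """Collect every word n-gram (lengths 1..max_n) from a toy corpus."""
    index: Set[Tuple[str, ...]] = set()
    for doc in documents:
        tokens = doc.lower().split()  # assumption: naive whitespace tokens
        for n in range(1, max_n + 1):
            for i in range(len(tokens) - n + 1):
                index.add(tuple(tokens[i:i + n]))
    return index


def find_verbatim_spans(
    output: str, index: Set[Tuple[str, ...]], min_n: int, max_n: int
) -> List[Tuple[int, int, str]]:
    """Greedily find the longest exact match starting at each position.

    Returns (start, end, text) tuples, where start/end are token offsets
    into the model output and end is exclusive.
    """
    tokens = output.lower().split()
    spans = []
    i = 0
    while i < len(tokens):
        best = 0
        # Try the longest candidate first, down to the minimum length.
        for n in range(min(max_n, len(tokens) - i), min_n - 1, -1):
            if tuple(tokens[i:i + n]) in index:
                best = n
                break
        if best:
            spans.append((i, i + best, " ".join(tokens[i:i + best])))
            i += best  # skip past the matched span
        else:
            i += 1
    return spans


corpus = [
    "The quick brown fox jumps over the lazy dog",
    "Seattle is the largest city in Washington state",
]
index = build_ngram_index(corpus, max_n=8)
model_output = "I think Seattle is the largest city in the region"
print(find_verbatim_spans(model_output, index, min_n=3, max_n=8))
# → [(2, 8, 'seattle is the largest city in')]
```

Because the match is exact, changing a single token in the output shortens or removes the highlighted span. The `min_n` threshold here is a stand-in for whatever criteria decide that a match is compelling enough to highlight.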

OLMoTrace is useful for fact checking✅, understanding hallucinations🎃, tracing reasoning capabilities🧠, or just generally helping you see where an LLM's response may have come from.

This new feature is made possible by our commitment to fully open models, with everything freely available: model weights, recipes, code, and training data. Openness, transparency, and traceability are key to establishing trust in AI, and we hope this serves as a step in that direction. 💫

Try OLMoTrace in the Ai2 Playground today:

Learn more about how OLMoTrace works on our blog:

Excellent work! Will OLMoTrace be open-sourced in the future?

It already is! Check out the GitHub repo here: 🙌

Great 👍

Very cool. Since this uses exact string matching with n-grams, any typos when talking to the chatbot will really ruin your chances of finding a match in the training data...

Now I'm trying to visualise how that would work for pictures 😅

