
We introduce representative generative benchmarking: custom eval sets built from your own data that reflect real user queries. Thank you for collaborating! Link to the report in replies.

Chroma

29,469 subscribers

74,443 views • 1 year ago • via X (Twitter)

Science & Technology Education


9 Comments

Chroma • 1 year ago

link to technical report: Grounded in experiments with production data, our method captures performance differences that public benchmarks like MTEB miss.

RTTS • 1 year ago

API testing of interfaces is critical to determine if they meet requirements for functionality, reliability, performance, and security. Check out RTTS - the automated testing experts since 1996. #API #testautomation #integrationtest

Sumuk • 1 year ago

@weights_biases this is super cool! at 🤗Huggingface we introduced a generative open source system but for full LLM evals instead! would be great to collab!

swyx • 1 year ago

@weights_biases you guys somehow made notebooks look good, incredible

Aarush Sah • 1 year ago

@weights_biases YES YES YES YES

ebaad • 1 year ago

@weights_biases Jealous of that @HermanMiller chair, how can I get a job at @trychroma.

LA Bloke • 1 year ago

@weights_biases Perhaps, you should use AI to reformat your message/paper?

Ryan • 1 year ago

@weights_biases This is what I need to do manually at the moment. very curious to see what this is capable of.

Allan Ryan • 1 year ago

@weights_biases Kelly is legit
