We introduce representative generative benchmarking: custom eval sets built from your own data that reflect real user queries. Thank you for collaborating! Link to the report in replies.

74,443 views • 1 year ago • via X (Twitter)

9 comments

Chroma • 1 year ago

link to technical report: Grounded in experiments with production data, our method captures performance differences that public benchmarks like MTEB miss.

RTTS • 1 year ago

API testing of interfaces is critical to determine if they meet requirements for functionality, reliability, performance, and security. Check out RTTS - the automated testing experts since 1996. #API #testautomation #integrationtest

Sumuk • 1 year ago

@weights_biases this is super cool! at 🤗Huggingface we introduced a generative open source system but for full LLM evals instead! would be great to collab!

swyx • 1 year ago

@weights_biases you guys somehow made notebooks look good, incredible

Aarush Sah • 1 year ago

@weights_biases YES YES YES YES

ebaad • 1 year ago

@weights_biases Jealous of that @HermanMiller chair, how can I get a job at @trychroma.

LA Bloke • 1 year ago

@weights_biases Perhaps, you should use AI to reformat your message/paper?

Ryan • 1 year ago

@weights_biases This is what I need to do manually at the moment. very curious to see what this is capable of.

Allan Ryan • 1 year ago

@weights_biases Kelly is legit
