New eval! Code duels for LMs ⚔️

Current evals test LMs on *tasks*: "fix this bug," "write a test."

But we code to achieve *goals*: maximize revenue, cut costs, win users.

Meet CodeClash: LMs compete via their codebases across multi-round tournaments to achieve high-level goals.

102,240 views • 7 months ago • via X (Twitter)

