Foundational model wars over the past 12 months: OpenAI vs Google vs Anthropic vs 01 AI vs Meta vs Cohere vs Alibaba vs Mistral vs Databricks vs Nous Research & 10000+ more

270,801 views • 2 years ago • via X (Twitter)

9 comments

Chief AI Officer • 2 years ago

Want a daily fundraising report in AI? Join 5000+ tracking venture rounds in AI and get access to my free funding database:

Pseudonym 🦅 • 2 years ago

The last 12 months were a long decade to live thru

Florian Laurent • 2 years ago

very cool! @lmsysorg should add it to their leaderboard

Chris • 2 years ago

My hot take: elo 1284 or not, gpt4o sucks at instruction following compared to gpt4 turbo and sometimes its older cousin for most of my use cases. It answers what it "thinks" I want, but doesn't consider what is being said (I think bc it is heavily distilled or "sparse").

RYAN • 2 years ago

Interesting how it appears that openai is holding back and releases just strong enough to be first place. Almost looks like Google collided with them like a pool ball.

Holistech • 2 years ago

Anthropic Claude Opus is in my experience much better than OpenAI ChatGPT-4* in long philosophical and scientific conversations. It's way more knowledgeable and has better conclusions.

BIG Corp CEO • 2 years ago

The winner of the war tells the story! This is one of the things that has @OpenAI and @GoogleAI on the #OGDKTop5 Trending Businesses this week 😎👌🏾

Benoît Roussel • 2 years ago

Cool! How come OpenAI's scores go down? Did the LLMs effectively get worse, or is it just the Elo system?

John Smith • 2 years ago

The elo metric is probably the worst benchmark to judge AI against. It's entirely based on vibes. It's fine if you want to build a chatbot, but not for getting work done. It's purely coincidental that it tends to mirror average benchmarks for specific domains.
