Foundational model wars over the past 12 months: OpenAI vs Google vs Anthropic vs 01 AI vs Meta vs Cohere vs Alibaba vs Mistral vs Databricks vs Nous Research & 10000+ more
270,801 views • 2 years ago • via X (Twitter)
9 Comments

Want a daily fundraising report in AI? Join 5000+ tracking venture rounds in AI and get access to my free funding database:

The last 12 months were a long decade to live thru

very cool! @lmsysorg should add it to their leaderboard

My hot take: elo 1284 or not, gpt4o sucks at instruction following compared to gpt4 turbo and sometimes its older cousin for most of my use cases. It answers what it "thinks" I want, but doesn't consider what is being said (I think bc it is heavily distilled or "sparse").

Interesting how it appears that openai is holding back and releases just strong enough to be first place. Almost looks like Google collided with them like a pool ball.

Anthropic Claude Opus is in my experience much better than OpenAI ChatGPT-4* in long philosophical and scientific conversations. It's way more knowledgeable and draws better conclusions.

The winner of the war tells the story! This is one of the things that has @OpenAI and @GoogleAI on the #OGDKTop5 Trending Businesses this week 😎👌🏾

Cool! How come OpenAI's scores go down? Did the LLMs actually get worse, or is it just the Elo system?

The Elo metric is probably the worst benchmark to judge AI against. It's entirely based on vibes. It's fine if you want to build a chatbot, but not for getting work done. It's purely coincidental that it tends to mirror average benchmarks for specific domains.

