
Omar Shaikh
@oshaikh13 • 2,097 subscribers
member of sociotechnical staff @Stanford
Videos

What’s the point of a “helpful assistant” if you have to always tell it what to do next? In a new paper, we introduce a reasoning model that predicts what you’ll do next over long contexts (LongNAP 💤). We trained it on 1,800 hours of computer use from 20 users. 🧵
Omar Shaikh • 123,679 views • 3 months ago

LLMs sound homogeneous *because* feedback modalities like rankings, principles, and pairs cater to group-level preferences. Asking an individual to rank ~1K outputs or provide accurate principles takes effort. What if we relied on a few demos to elicit annotator preferences?
Omar Shaikh • 52,304 views • 2 years ago