Dan Shipper 📧

@danshipper • 109,862 subscribers

ceo @every | the only subscription you need to stay at the edge of AI

Shorts

BREAKING: Anthropic just dropped Opus 4.8, and it is a MONSTER
We've been testing it for about a week at Every 📧, and our verdict is they could've just called it Opus 5, it's that good.
Here's our vibe check:
- Beats GPT-5.5 on Senior Engineer bench. On our toughest benchmark Opus 4.8 scores a 63, a hair higher than GPT-5.5's score of 62 and a full 30 points higher than Opus 4.7. It tackled a ground-up rewrite of a production codebase, and actually built something that works. HOWEVER: Coding performance varied a lot at different reasoning levels. We recommend using it on xhigh for best results.
- Incredibly good writer. Opus 4.8 scored a 79.6 on our writing benchmark, which measures models on real-world writing tasks we do all of the time, like essay writing, promo email writing, and more. It beats GPT-5.5 by 6 points. It produces well-written prose with fewer "AI-isms". It's also very good at writing in your voice given the right context. HOWEVER: Writing performance also varied with reasoning levels. Medium reasoning had a higher incidence of AI-isms; we found the best results with high.
- Beast at knowledge work. Opus 4.8 is very good at general knowledge work tasks like report creation, research, and more. It produced the best PowerPoint one-shot we've ever seen on our deck generation benchmark.
- Emotionally intelligent, willing to question the frame. I've also found it to be quite good at talking through psychological or interpersonal issues. It has a high EQ, and it's also good at not glazing and helping to expand your perspective. Its thought process feels extremely rich and dynamic.
THE BAD: These days a model is only as good as its harness, and Codex is still a far superior harness to the Claude Desktop app. This has kept me using Codex + GPT-5.5 as my daily driver, but I am flipping back and forth a lot more between Codex and Claude.
Anthropic is back baby!
Read the rest on Every 📧:

346,095 views

codex-native weekend hack project:
1. buy cable to connect MIDI keyboard to computer
2. "hey codex, make a watcher script and a little web app to show me which chords im playing"
3. okay cool, now give me some exercises and help me see how to improve!
literally 5 minutes start to finish, and it works flawlessly

32,664 views

BREAKING: is your inbox a dumpster fire with 45,000 unreads? take it to 0 emails in 5 minutes—safely Declare inbox bankruptcy with Cora. it's reversible, smart, and totally free so you can start fresh with a fresh inbox. declare inbox bankruptcy today:

60,146 views

Videos

codex teaches me to play piano:

Dan Shipper 📧

167,576 views • 21 days ago

BREAKING! Introducing Plus One: A hosted OpenClaw🦞 that lives in your Slack and comes pre-loaded with Every 📧's best tools, skills, and workflows. Set it up in one click, and use your ChatGPT subscription (or any other API key).
Bring your Plus One to work:
Connected to the Every 📧 ecosystem
Plus Ones automatically use Every 📧's agent-native apps, no setup required:
- Cora for searching, sending, and managing email
- Spiral for great writing in your voice
- Proof for agent-native document editing
Custom skills and workflows we use and love
Plus Ones come pre-loaded with skills and workflows we use ourselves at Every 📧: some we've made, and some we think are great.
- Content digest: summarizes the publications you read, starting with Every 📧
- Daily brief: your day's schedule and to-dos sent to you each morning
- Animate: turn any static screenshot into an animation with Remotion
- Frontend: Anthropic's front-end skill (which we use all the time!)
We also make it fast to connect Google, Notion, GitHub, and more to your Plus One. Our goal is to give you a capable AI coworker right away, not a vanilla OpenClaw that you have to teach from scratch.
Why we built Plus One
OpenClaw🦞 has changed the way we work at Every. We effectively have a parallel org chart of AI coworkers, each with a name, a manager, and real responsibilities. Because of them our workflows are completely different (our company is different) and we would never go back.
But getting here has been hard. Claws require a significant amount of manual setup and a dedicated machine, like a Mac Mini, running 24/7 to stay responsive. We have learned that the hard part of Claws is the infrastructure around them: the hosting, the integrations, the skills, and the ongoing care. We’ve made them work great for our team, and we want to share everything we’ve learned with you.
We're letting in 20 people a week to start, and scaling invites quickly from there. Every 📧 subscribers get priority.
Bring your Plus One to work:

Dan Shipper 📧

257,149 views • 2 months ago

BREAKING: GPT-5.5 "Spud" is out and it is a BEAST
We've been testing it at Every 📧 for the last 3 weeks on everything from coding, to writing, to knowledge work.
Here's our day 0 vibe check:
- It's a step change in coding AND it's easy to talk to. It's fast and friendly and quickly became my daily driver. But it's also a coding powerhouse, which is a really rare combination.
- It scored 62/100 on our Senior Engineer benchmark. Opus 4.7 scored only a 33/100. (But GPT-5.5 performed best when using an Opus 4.7 plan). Naveen Naidu used over 900 million tokens during testing, and it let him ship production features for Monologue at both high speed and quality.
- It has serious conceptual clarity. It can hold a complex plan in its head over hours of work, without getting distracted by existing code. This makes it the first model we've tested that can perform well on complex refactors requiring deleting and reimagining a substantial existing codebase.
- It's a very good writer. This is the first OpenAI model in about a year that got our writers at Every 📧 to switch away from Claude. 5.5 has Katie Parrott's seal of approval, not an easy task. Its writing feels more organic and it's better at mimicking a writing style without going overboard.
- It's great for agentic knowledge work. This is the first OpenAI model that manages to be a stellar senior engineer AND useful for everything from spreadsheets to research. It's crazy fast, it's amazing inside of the Codex desktop app, and it got much of our team to switch away from Claude Code and Cowork during the testing period.
However, it's not a perfect model.
- 5.5 still loses to Opus 4.7 on plan quality. Its plans are extremely readable but Opus has better attention to detail and sharper insight.
- 5.5 still loses to Opus 4.7 by a bit on front-end and full-stack product work. Kieran Klaassen found that it wasn't quite as good when full-stack thinking and design are involved. And it's not great at writing Ruby.
- 5.5 is a great vibe coder but if you're vibe coding without a plan it's worse than Opus. Mike Taylor found that Opus is better at reading between the lines on underspecified vibe-coding tasks.
Overall GPT-5.5 is a massive achievement from OpenAI and it deserves a serious look as your daily driver.
Read our full vibe check on Every 📧 here:

Dan Shipper 📧

130,057 views • 1 month ago

In the future, you’ll be able to accomplish a goal by just giving Claude an outcome and a budget.
That’s the direction Anthropic is building in with its new Managed Agents features, announced at this week’s Code with Claude developer event. The basic idea: Claude, wrapped in a computer in the cloud, that you can spin up, scale, and manage as needed. Anthropic is taking on the infrastructure that kills most agent products, and making sure that it scales to meet the needs of agents running 24/7.
On this week’s AI & I from Every 📧, I talk with Angela Jiang, head of product for the Claude platform, and Katelyn Lesse, head of engineering for the Claude platform, about what Anthropic is building and what it takes to make agents reliable in production.
We get into:
- Why the "build a generic harness, hot-swap any model behind it" playbook is already outdated. Angela points to eval data on Memory where the same task performed drastically differently across different harnesses.
- The infrastructure wall every team hits in production, and why Katelyn thinks “my sandbox died and took the agent with it” is the real reason internal agents don't ship.
- Why Anthropic is so bullish on using file systems and skills within Claude, including Angela's argument that those early design choices can compound for years.
This is a must-watch for anyone trying to take an agent past the demo and into production. Watch below!
Timestamps:
How the Claude platform evolved from API to agents: 00:01:48
The primitives that make up Claude Managed Agents: 00:04:09
Why the harness and the model are becoming a single unit: 00:10:37
The infrastructure wall that kills most agent projects in production: 00:18:49
Why team agents need a different shape than individual productivity tools: 00:24:49
How Anthropic's legal team uses an agent to review marketing copy: 00:26:36
Using multi-agent orchestration for advisor strategies, adversarial pairs, and swarms: 00:34:24
How to measure agent success with outcome and budget as the end state: 00:35:50
What the platform looks like a year from now, when Claude writes its own harness: 00:39:11

Dan Shipper 📧

66,017 views • 27 days ago

Three months ago, Codex was trash for knowledge work. Now it's my daily driver. I use it for writing, recruiting, deep engineering work, and everything in between. It even keeps me at inbox 0.
I chatted with Every 📧's head of growth Austin Tedesco on Every 📧's AI & I about what changed, and why he now spends 80% of his working time in the Codex desktop app too.
We get into:
- How Codex went from making Austin feel like an idiot to being the place he goes to get stuff done, including complex tasks like writing go-to-market plans using existing material from Slack, Notion, and meeting transcripts.
- Why Codex’s desktop app, which is faster and more reliable than Claude Desktop/Cowork, is the real differentiator.
- How I source candidates with Codex by having it identify career arcs, not keywords. My go-to move is identifying organizations likely to teach the skills Every needs for a role, and then finding candidates from that pool who have since gone on to work in AI.
This is a must-watch for anyone who's wondering whether it’s finally time to give Codex a try. Watch below!
Timestamps:
How Codex went from a tool for senior engineers to a daily driver for knowledge work: 00:00:57
How Claude Code proved that a great coding agent works for any knowledge work: 00:02:42
Austin's switch to Codex: 00:07:24
How Austin set up Codex with folders, keys, and reviewer agents: 00:13:48
Using Codex to brainstorm automations across Gmail, Slack, and Notion: 00:18:24
How Austin manages the human review step when Codex is drafting communications: 00:22:42
Using Codex to build specialized agents inspired by product executive Claire Vo: 00:28:54
Synthesizing meeting transcripts and Slack threads into a go-to-market plan: 00:31:09
Building a live KPI tracker in Notion that agents can read: 00:40:15
Using Codex for recruiting: 00:44:54

Dan Shipper 📧

55,030 views • 1 month ago

BREAKING NEWS: Anthropic just dropped Claude Opus 4.5!! It is by FAR the best coding model I've ever used.
We've been testing it internally at Every 📧 for the last few days, and it is an absolute paradigm shift for any kind of coding task.
It extends the horizon of what you can vibe code
The current generation of new models (Anthropic’s Sonnet 4.5, Google’s Gemini 3, or OpenAI’s Codex Max 5.1) can all competently build a minimum viable product in one shot, or fix a highly technical bug autonomously. But eventually, if you kept pushing them to vibe code more, they’d start to trip over their own feet: The code would be convoluted and contradictory, and you’d get stuck in endless bugs. We have not found that limit yet with Opus 4.5. It seems to be able to vibe code forever.
Takes working in parallel to a whole new level
Because it's far better at planning and coding, it can work with more autonomy, meaning you can do more in parallel without breaking anything. Kieran Klaassen worked on 11 different projects in six hours and had good results on all of them.
Great at design iteration
Opus 4.5 is incredibly skilled at iterating through a design autonomously using an MCP like Playwright. Previous models would lose the thread after a few cycles, or say a design was done when it wasn't. Opus 4.5 is incredible at autonomously iterating until a design is pixel perfect.
We have a full 4,000 word vibe check on Every 📧 right now with everything we tested:

Dan Shipper 📧

272,434 views • 6 months ago

Andrew Wilkinson has been waking up at 4 a.m. because he can’t stop building with Anthropic’s Opus 4.5.
He started vibe coding a couple of years ago, but it felt like the Palm Treo era of the smartphone: exciting, but not quite there. You could generate an app, but it would get stuck in bug loops or break the moment you pushed it further. Then he tried Opus 4.5 in Claude Code. It felt, he says, like having a “$100,000-a-month payroll of engineers” working for him 24/7.
He’s built practical AI automations into every corner of his work and life, including:
- A relationship counselor app called Deep Personality that consolidates 20 clinically validated personality tests into a 40-minute assessment, then generates a 45-page analysis. When both partners complete it, it maps compatibility and predicts conflicts. Wilkinson says it laid out every fight he and his girlfriend have.
- A custom email client he built by handing Claude Code his Gmail credentials and describing his ideal workflow. It triages emails by priority and sender, handles quick replies via multiple choice, and walks him through complex emails question by question before drafting.
- A personal stylist that texts him four outfit recommendations every morning. It checks the weather, pulls from a spreadsheet of his entire wardrobe (photos converted to CSV by Claude), generates four outfit options rendered as images with Nano Banana 2, and texts him what to wear down to the watch.
- A Lindy agent that acts as an AI referee of sorts: it records his meetings and texts him if it detects psychological red flags like manipulation or gaslighting. The bar is high (he only gets a notification every few months), but when he does, it usually confirms a gut feeling he already had.
Andrew is the cofounder of Tiny, the holding company that owns businesses like AeroPress and Dribbble. Earlier in his career, Andrew was a web designer, and he fits one of my predictions for 2026: Designers, who know how to create great experiences for users, are the unsung group most empowered by this AI moment.
I had him on Every 📧's AI & I to talk about Opus 4.5, what he’s building with it, and how it’s changing the way he thinks about acquiring software businesses at Tiny.
This is a must-watch for anyone who wants to put AI to work in their day-to-day life. Watch below!
Timestamps:
Introduction: 00:01:07
Why Opus 4.5 feels like the iPhone moment for vibe coding: 00:02:48
Why designers have a unique advantage with AI: 00:08:31
How Andrew built a custom email client with Claude Code: 00:14:10
An AI trained on your relationship that predicts your fights: 00:18:13
Using AI meeting notes to make your life better: 00:30:40
Don't inject your opinion into prompts: 00:35:11
Andrew's Claude Code tips and workflows: 00:40:21
Your personal stylist is a prompt away: 00:47:59
How AI is changing the way Andrew invests in software: 00:53:17

Dan Shipper 📧

154,567 views • 4 months ago

We use OpenClaws to do all of our work at Every 📧. We have 25 full-time employees, so we’re one of the few companies in the world that has seen how work changes when everyone has their own personal agent in the company Slack.
I chatted with Every 📧 COO Brandon Gell and Every 📧 head of platform Willie to share what we’ve learned.
We get into:
- Why agents become mirrors of their owners, and how that influences how other people on the team interact with them
- How a parallel AI org chart forms on its own. People have stopped tagging me on Slack with questions about Proof, the document editor I vibe coded, because they know my agent R2-C2 can step in
- The etiquette for human-agent collaboration is being invented in real time. Brandon's rule is that if there's an established process or documented answer, always ask the agent, not their human
- Why everyone is a manager now, and why even experienced managers carry limiting beliefs about what their agents can do
This is a must-watch for anyone trying to understand how AI workers change daily operations, not just in theory, but inside a company that’s half-agent. Watch below!
Timestamps
Introduction:
How Brandon built Zosia, an AI agent to run his household:
Brandon’s “aha” moment:
What happened when everyone on the team got their own agent:
How agents take on their owners' personalities, and why that matters inside an org:
Why it’s important for agents to work in public:
What we’re still figuring out when it comes to agent behavior, including memory gaps, group chat etiquette, and the "ant death spiral" problem:
How we built Plus One, our hosted OpenClaw product:
The cultural shift required to make agents work at scale:

Dan Shipper 📧

67,770 views • 1 month ago

The rules of professional product development are being rewritten in real time.
- PMs and designers can ship software as easily as engineers.
- Software is no longer just built for humans. It’s also built for agents as first-class citizens.
To better understand how we build products in this world, I invited Mike Krieger on Every 📧’s AI & I podcast. Mike cofounded Instagram and is now a member of the technical staff at Anthropic, co-leading Anthropic Labs, their internal incubator for experimental products. He's been at the frontier of two transformative technology waves: mobile/social and now agent-native software.
We discussed:
- How to build a truly agent-native product. The best products today, like Claude Code, allow users to do things that their creators never intended. But that requires hard trade-offs between freedom and safety/reliability for frontier products, an issue that Mike's team is learning how to solve.
- What's different about building now versus building Instagram. At Instagram, it took months to hit dead ends and learn what to cut. Now, that cycle runs in hours.
- The trap of building too much, too fast with agents. You can go from idea to a nearly-shipped product in a day, but that process doesn’t give you the incremental feedback that used to tell you what not to build. The models are great at adding features, but can create a product that lacks coherence.
- How Anthropic Labs structures product teams. New product experiments are led by only two people, usually a product manager or designer paired with an engineer. Mike says bigger teams tend to be too slow because of coordination costs.
- Why you need to throw out your product and start over every three to six months. AI progress means most of your harness will be outdated quickly, and the best teams build this into their product strategy.
And much more! You should watch this one.
Timestamps
Introduction:
What's gotten easier, and what hasn't, about building products in the age of AI:
Why vibe coding creates "indoor trees":
How rewrites have become a normal part of the development process:
What "agent native" product design means:
How Mike's labs team is structured and the cofounder model:
The best signal for a product bet is someone with "break through walls" conviction:
Navigating enterprise customers while keeping pace with rapid AI change:
OpenClaw, personal agents, and the product question defining 2026:

Dan Shipper 📧

58,145 views • 2 months ago

Introducing: How Do You Use ChatGPT? 🚀
It's a weekly show where I interview the most interesting people in the world about how they use ChatGPT in their work and their lives, and show you every detail.
The first episode is with Sahil Lavingia, CEO of Gumroad and Flexile. It's not theoretical: we screen-share through his actual prompts and responses, so you can see how ChatGPT helps him perform better at work and improve his life, one conversation at a time.
We talk about how he's using ChatGPT to:
- Buy a building. He wants to buy a New York City hangout for Gumroad employees and customers, so he asked ChatGPT to research the history of real estate in NYC, suggest which neighborhoods might be best to target, generate questions for brokers, and even detail what the design of a particular property might look like.
- Write tweets. Sahil is a prolific Twitter/X user. He often uses ChatGPT to help him flesh out an idea. He says, “I [start] with a tweet, which is like a thesis, and then I just say, ‘Add three to four paragraphs to make the point compelling—also suggest more examples.’” We explore his precise process for using ChatGPT to help him brainstorm short tweets and longer essays in this episode.
- Pressure-test ideas. For Sahil, ChatGPT is like upgrading his peripheral vision. It lets him see around the corners, ask better questions of himself and other people, and avoid poor decisions. He told me, “I think a lot of people sort of delude themselves into thinking they have [good ideas]… I think that one of the most useful things about [ChatGPT] is it focuses your research on what actually matters.” It’s the ultimate tool to help him think better.
Also in this episode: how ChatGPT could have helped Sahil save $70 million, how he thinks it will improve the most-talented creatives, and why he thinks that, in the age of AI, people have no excuse for not knowing the answer to something anymore.
Watch below!
Timestamps
Intro 0:33
There’s no more excuse for not knowing anymore 2:00
He doesn’t spend as much time on bad ideas 2:50
How ChatGPT will make the top 1% of creative output better 6:15
How it turbocharges research 8:20
How he’s using ChatGPT to buy a building 11:00
How he uses ChatGPT to pressure-test ideas 17:43
How he uses DALL-E to help with interior design 20:50
How ChatGPT could have saved him $70 million 26:00
How he uses ChatGPT in his decision-making 29:50
How he uses ChatGPT for writing 38:00

Dan Shipper 📧

347,001 views • 2 years ago