Before Fable got released (and pulled), Mozilla was quietly testing Claude Mythos against Firefox's 10M-line codebase. The result? Over 400 security bug fixes, including fixes for bugs that had been hiding in the codebase for over a decade. Brian Grinstead, distinguished engineer at Mozilla, walked me through the agentic bug-finding...

claire vo 🖤

55,795 subscribers

126,875 views • 11 hours ago • via X (Twitter)

Education Health & Wellness Science & Technology

Related Videos

Claude Code Unpacked! Visual walkthrough of the entire 500k-line leaked codebase.

What happens when you type a message:
- the agent loop
- 50+ tools
- multi-agent orchestration
- unreleased features

Want to understand the internals or build your own agent harness? Start here:

Akshay 🚀

24,408 views • 2 months ago

The founder of LangChain says both models and harnesses have gotten really good between December and now.

According to Harrison Chase, the core idea of an agent before Christmas was a model running in a loop and calling tools. This had been the north star for 3 years.
- LangChain had this when it launched
- AutoGPT was the same idea
- OpenClaw is kind of a future version of it

Then about a year ago, they started getting really good. Claude Code, Manus, and Deep Research were all launched around the same time. All of them use the same pattern: running in a loop with harnesses (planning tools, file systems, code execution, etc.).

Harness engineering became a thing. Then Opus came out in November and really unlocked it.
- the harness let the model do more and more
- less hardcoded logic
- way more control

Then everyone went on vacation, played around, and realized that the model and the harness finally worked reliably.

Ivan Burazin

33,948 views • 2 months ago
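
The pattern described in the post above (a model running in a loop and calling tools) can be sketched in a few lines. This is only an illustration, not LangChain's or any vendor's implementation: `fake_model`, the `TOOLS` registry, and the `CALL`/`FINAL`/`RESULT` message format are all hypothetical stand-ins for a real LLM call and a real tool protocol.

```python
from typing import Callable, Dict, List

# Hypothetical tool registry: plain Python functions the "model" may call.
TOOLS: Dict[str, Callable[[str], str]] = {
    "add": lambda arg: str(sum(int(x) for x in arg.split("+"))),
    "upper": lambda arg: arg.upper(),
}


def fake_model(history: List[str]) -> str:
    """Stand-in for an LLM call: replies 'CALL <tool> <arg>' or 'FINAL <answer>'."""
    last = history[-1]
    if last.startswith("TASK "):
        return "CALL add 2+3"
    if last.startswith("RESULT "):
        return "FINAL " + last[len("RESULT "):]
    return "FINAL (no answer)"


def run_agent(task: str, model=fake_model, max_steps: int = 5) -> str:
    """Loop: ask the model, run the tool it requests, feed the result back."""
    history = [f"TASK {task}"]
    for _ in range(max_steps):
        reply = model(history)
        history.append(reply)
        if reply.startswith("FINAL "):
            return reply[len("FINAL "):]
        _, tool, arg = reply.split(" ", 2)
        history.append(f"RESULT {TOOLS[tool](arg)}")
    return "(step budget exhausted)"


if __name__ == "__main__":
    print(run_agent("what is 2+3?"))
```

The `max_steps` budget is one small piece of what the post calls the harness: everything around the model that decides when to stop, which tools exist, and what gets fed back in.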

The best *code embedding* model in the market right now was just released: Qodo-Embed-1 — There are two flavors: A lite model with 1.5B parameters and a medium model with 7B parameters (Hugging Face links below). If you want to index a large codebase (supports 10M+ lines of code), this is the model you want. 1. Index your repositories 2. Ask anything (including test and code generation) The models are optimized to answer natural language questions or code-to-code questions. The video here shows the model indexing 90 repositories (!!!!!) and letting the user ask questions about them. The simplest way to use the model is through the Qodo Gen AI extension in Visual Studio Code, Cursor, or JetBrains (see link below).

Santiago

56,584 views • 1 year ago

I've long said that o3 is the best coding model - but if you're using an agent harness - Claude is just better at navigating your codebase. Enter the Repo Prompt pair programmer mode - it's the best of both worlds, as Claude coordinates with o3 to plan and apply edits for you!

eric provencher

93,571 views • 11 months ago

FREE CLAUDE FABLE 5 and almost nobody knows this trick

Anthropic recently released their most capable model ever - the first Mythos-class model available to the public

normally it's $10/$50 per million tokens

but there's a hack: GitLab just added Fable 5 to Duo Agent Platform - across ALL tiers, including free trial

the setup:
1/ register a GitLab account:
2/ start the free GitLab Duo trial
3/ Fable 5 is live through their AI Gateway
4/ done - Mythos-class model, $0

why Fable 5 is worth the hype:
1/ 80.3% SWE-Bench Pro - 11 points ahead of the next best model
2/ multi-DAY autonomous runs without losing the plot
3/ single-pass implementations of systems that took days of iteration
4/ Karpathy called it 'a major-version-bump-deserving step change'

also free on Claude Pro/Max/Team until June 22 - after that it's usage credits

you have a limited window to run the most capable public model on the planet for FREE

kaize

23,734 views • 10 days ago

In the future, you’ll be able to accomplish a goal by just giving Claude an outcome and a budget.

That’s the direction Anthropic is building in with its new Managed Agents features, announced at this week’s Code with Claude developer event. The basic idea: Claude, wrapped in a computer in the cloud, that you can spin up, scale, and manage as needed. Anthropic is taking on the infrastructure that kills most agent products, and making sure that it scales to meet the needs of agents running 24/7.

On this week’s AI & I from Every 📧, I talk with Angela Jiang, head of product for the Claude platform, and Katelyn Lesse, head of engineering for the Claude platform, about what Anthropic is building and what it takes to make agents reliable in production.

We get into:
- Why the "build a generic harness, hot-swap any model behind it" playbook is already outdated. Angela points to eval data on Memory where the same task across different harnesses performed drastically differently.
- The infrastructure wall every team hits in production, and why Katelyn thinks “my sandbox died and took the agent with it” is the real reason internal agents don't ship.
- Why Anthropic is so bullish on using file systems and skills within Claude, including Angela's argument that those early design choices can compound for years.

This is a must-watch for anyone trying to take an agent past the demo and into production. Watch below!

Timestamps:
- How the Claude platform evolved from API to agents: 00:01:48
- The primitives that make up Claude Managed Agents: 00:04:09
- Why the harness and the model are becoming a single unit: 00:10:37
- The infrastructure wall that kills most agent projects in production: 00:18:49
- Why team agents need a different shape than individual productivity tools: 00:24:49
- How Anthropic's legal team uses an agent to review marketing copy: 00:26:36
- Using multi-agent orchestration for advisor strategies, adversarial pairs, and swarms: 00:34:24
- How to measure agent success with outcome and budget as the end state: 00:35:50
- What the platform looks like a year from now, when Claude writes its own harness: 00:39:11

Dan Shipper 📧

66,339 views • 1 month ago

Anthropic engineer James Brady: "Every agent in production lies. We measured it. The good ones lie less, the great ones catch the lie before the user does." In 29 minutes, he walks through the verification stack he built and the patterns the Claude Code team adopted to keep agents honest at scale. Watch the full talk, then save the config below👇

rody

340,927 views • 16 days ago

THIS GUY CONNECTED HIS AI AGENTS TO HIS OBSIDIAN AND BUILT A BRAIN THAT LEARNS ON ITS OWN. HERE'S HOW TO BUILD IT

Obsidian is just markdown files sitting in a folder. That turns out to be the perfect memory for an AI agent, because an agent can read and write those files directly.

He wired his agents into the vault so they pull context from it, do the work, and write what they learned back. The notes aren't the point. The loop is, and it gets sharper every cycle.

How to build it:

1. Point an agent at your vault. The fastest way, no plugins, no API keys: open a terminal and run npx obsidian-mcp /path/to/your/vault. That exposes your Obsidian folder to Claude as a tool it can read, search, and write to. Add it to your Claude Code or Cowork config and restart.
2. Confirm it can see the brain. Ask it: "list the notes in my vault and summarize what's in them." If it reads them back, the connection is live. Now it starts every task with everything the vault already holds instead of from zero.
3. Give each agent one job and a write-back rule. Tell it: "research this, then save what you found as a new note in /brain with links to related notes." One agent researches, one summarizes, one plans. Each writes its output back into the vault.
4. Close the loop. Add one line to every agent's instructions: "read /brain before starting, write your result back when done." Now each task leaves the vault richer, and the next run reads that before it works. It compounds instead of resetting.
5. You only steer. Review what the brain produces, point it at the next thing. The agents handle the reading, writing, and connecting.

The edge isn't better notes. It's a brain that feeds itself, so the work gets sharper every cycle instead of starting over.

Bookmark this

Yarchi

57,678 views • 15 days ago
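
The "read /brain before starting, write your result back when done" rule from the post above can be illustrated without any agent tooling, since a vault is just a folder of markdown files. The sketch below is a simplified stand-in, not the setup from the post: it reads and writes the folder directly instead of going through an MCP server, and `read_brain`, `write_note`, `run_task`, and the `work` callback (standing in for an agent call) are hypothetical names.

```python
from pathlib import Path
from typing import Callable, List


def read_brain(vault: Path) -> str:
    """Concatenate every markdown note under <vault>/brain into one context string."""
    brain = vault / "brain"
    brain.mkdir(parents=True, exist_ok=True)
    notes = sorted(brain.glob("*.md"))
    return "\n\n".join(f"# {p.stem}\n{p.read_text(encoding='utf-8')}" for p in notes)


def write_note(vault: Path, title: str, body: str, related: List[str]) -> Path:
    """Save a result as a new note, linking related notes with [[wikilinks]]."""
    links = " ".join(f"[[{name}]]" for name in related)
    path = vault / "brain" / f"{title}.md"
    path.write_text(f"{body}\n\nRelated: {links}\n", encoding="utf-8")
    return path


def run_task(vault: Path, title: str, work: Callable[[str], str]) -> Path:
    """One cycle of the loop: read the brain, do the work, write the result back."""
    context = read_brain(vault)  # read /brain before starting
    related = [p.stem for p in sorted((vault / "brain").glob("*.md"))]
    result = work(context)  # hypothetical agent call
    return write_note(vault, title, result, related)  # write the result back
```

Each call to `run_task` sees the notes left by earlier calls, which is the compounding effect the post describes. A real setup would also need to cap how much of the vault is loaded as context once it grows.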

Anthropic admitted they built an AI so capable they were scared to release it, and the number that explains why is 250.

Anthropic's CFO Krishna Rao described in this clip what happened when they ran Mythos against an open source codebase that a previous frontier model had already analyzed. The prior model found 22 security vulnerabilities; Mythos found 250, in the same codebase that the previous model had already reviewed and flagged as relatively clean. That number, more than 11 times as many vulnerabilities discovered, is not just a benchmark improvement. It is a signal that there is an entire layer of software infrastructure that humanity has been operating under the assumption was secure, and that assumption may no longer hold.

The UK AI Security Institute independently evaluated Mythos Preview and confirmed what the internal numbers suggested. On expert-level capture-the-flag challenges that no model could complete before April 2025, Mythos succeeded 73% of the time, and it became the first model ever to complete a complex end-to-end attack range from start to finish, autonomously, without human guidance. The World Economic Forum called this a new security-driven era for AI, the Governor of the Bank of England publicly warned that Anthropic may have found a way to unlock the entire cyber-risk landscape, and the European Central Bank began quietly contacting financial institutions to assess their security posture.

The response from Anthropic is what makes this story genuinely important. Rather than shelving the model or publishing it as a standard API release, Rao described a phased approach: restricting access to a controlled group, focusing specifically on how the cyber capabilities can be used defensively rather than offensively, and treating that framework as a template for how to release powerful but dangerous models in the future.

The broader context makes that framing even more significant. AI-generated code is already creating ten times more security vulnerabilities than human-written code, 63% of organizations reported experiencing an AI-driven cyberattack in the past 12 months, and traditional signature-based security tools were built for a threat model that no longer describes the attack surface companies are defending against.

Mythos represents a genuine leap in what autonomous security reasoning can do, and it cuts both ways. The model that can find 250 vulnerabilities in a codebase a prior model rated as mostly clean is also, in the wrong hands, the model that can exploit those 250 vulnerabilities before a human defender has even finished reading the report. Anthropic's phased release strategy is not just a legal or PR decision; it is the most honest signal yet from a frontier lab that safety governance and capability development can no longer be treated as separate workstreams. The question is not whether this technology gets deployed. It is whether the institutions using it defensively stay ahead of the ones who will eventually use it offensively, and whether the labs building it can keep those two timelines from inverting.

Milk Road AI

24,356 views • 1 month ago

🚨 ANTHROPIC JUST REVEALED CLAUDE MYTHOS ABILITIES

Anthropic just formally announced "Claude Mythos Preview" and launched "Project Glasswing" to deploy it for cybersecurity defense. The models are unlocking completely new, autonomous behaviors. This isn't about slightly better benchmark scores. This is about what the model can do.

Here are the direct quotes from Anthropic’s research team (including Dario) on exactly what Mythos is capable of:

• Chaining Exploits: "It has the ability to chain together vulnerabilities... this model is able to create exploits out of three, four, sometimes five vulnerabilities that in sequence give you some kind of very sophisticated end outcome."
• The Professional Standard: "The model that we're experimenting with is, by and large, as good as a professional human at identifying bugs."
• Unprecedented Autonomy: "It's just generally better at pursuing really long-range tasks that are kind of like the tasks that a human security researcher would do throughout the course of an entire day."

The Reality Check: Dario Amodei flat out said: "There's a kind of accelerating exponential... Claude Mythos Preview is a particularly big jump along that point."

Because this model has become so capable at identifying zero-days, they are restricting its release to top tech partners to try to patch the world's software before these capabilities leak out.

The autonomous researcher era has officially arrived. It’s over 💀

Chris

46,077 views • 2 months ago

The Claude Code SDK is now the Claude Agent SDK.

Why? Because we realized the Claude Code agent harness is useful for much more than coding. In fact, we're moving to using it to power most of our own agent loops at Anthropic.

Thariq

206,976 views • 8 months ago

an OpenAI engineer just showed how he gets agents to do his whole job: code, debug and more, using loops

29 minutes from the engineer who coined "harness engineering"

he writes the rules, agents write the code, a reviewer agent loops until it's right

the winners won't have the smartest model, they'll have the best loop around it

watch it, then read the full guide on loops below

Anatoli Kopadze

65,569 views • 15 hours ago
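
The writer-plus-reviewer loop mentioned in the post above ("agents write the code, a reviewer agent loops until it's right") might look roughly like this. It is a toy sketch, not the engineer's actual setup: `writer` and `reviewer` are hypothetical stubs standing in for two agent calls, and the reviewer's single hard-coded check stands in for a real test suite or rule set.

```python
from typing import List, Tuple


def writer(task: str, feedback: List[str]) -> str:
    """Stand-in for a code-writing agent: revises its draft when given feedback."""
    if "add returns wrong result" in feedback:
        return "def add(a, b): return a + b"
    return "def add(a, b): return a - b"  # deliberately buggy first draft


def reviewer(code: str) -> List[str]:
    """Stand-in for a reviewer agent: runs one check and returns a list of issues."""
    namespace: dict = {}
    exec(code, namespace)  # acceptable for this toy; real harnesses sandbox execution
    issues = []
    if namespace["add"](2, 3) != 5:
        issues.append("add returns wrong result")
    return issues


def review_loop(task: str, max_rounds: int = 3) -> Tuple[str, int]:
    """Alternate writer and reviewer until the reviewer has no issues left."""
    feedback: List[str] = []
    for round_no in range(1, max_rounds + 1):
        code = writer(task, feedback)
        issues = reviewer(code)
        if not issues:
            return code, round_no
        feedback.extend(issues)
    raise RuntimeError("reviewer still reports issues after max_rounds")


if __name__ == "__main__":
    final_code, rounds = review_loop("write add(a, b)")
    print(rounds, final_code)
```

The round limit matters as much as the loop itself: without it, a writer that never satisfies the reviewer would run forever.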

Claude Security is now in public beta for Claude Enterprise customers. Claude scans your codebase for vulnerabilities, validates each finding to cut false positives, and suggests patches you can review and approve.

Claude

4,895,685 views • 1 month ago

We just crossed $10M in ARR at Chatbase! 🎉 🎉 And today, we're launching Chatbase as the full harness for customer-facing AI agents. Similar to how Claude Code is a harness for coding agents, Chatbase is the harness for customer experience agents. That means we give the model the context, tools, workflows, guardrails, and human-in-the-loop systems to be the best ambassador for your brand. It goes beyond just solving issues and gives your customers the best experiences across every channel. This is a milestone I have been thinking about and obsessed with since day 1, and I am super excited to bring my vision for customer-facing agents to life with Chatbase. Thank you to every one of our customers and to the amazing Chatbase team for getting us here! Next stop: $100M ARR

Yasser

742,281 views • 1 month ago

🔥 💻 🎥 How to provide o1 Pro with your FULL CODEBASE through the ChatGPT Cursor connection

yesterday I recorded a video connecting o1 Pro to Cursor through the ChatGPT Desktop app

more than one person commented on how limiting it is for o1 Pro to only have access to a single open file in Cursor

here's a walkthrough of how you can provide o1 Pro with your FULL CODEBASE as context instead of just a single file:

1. Write a Python function that concatenates full codebase into a single snapshot in a .txt file (link in comment)
2. Open that "codebase_snapshot.txt" in a separate Cursor pane
3. Go to the ChatGPT app and ask it what context it has access to through Cursor - should say it has two files with one of them being the codebase snapshot

and boom there ya go. o1 Pro has access to everything in your codebase

full walkthrough here 👇

Dan McAteer

99,614 views • 1 year ago
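
Step 1 of the post above refers to a Python function that is only linked, not shown. The sketch below is one plausible version, not the author's script: the `SOURCE_SUFFIXES` and `SKIP_DIRS` sets and the `=====` file separator are assumptions chosen for illustration, and only the output filename `codebase_snapshot.txt` comes from the post.

```python
from pathlib import Path

# Assumed defaults: which extensions count as source, which directories to skip.
SOURCE_SUFFIXES = {".py", ".js", ".ts", ".md"}
SKIP_DIRS = {".git", "node_modules", "__pycache__", ".venv"}


def snapshot_codebase(root: str, out_file: str = "codebase_snapshot.txt") -> int:
    """Concatenate text source files under `root` into one file; return the file count."""
    root_path = Path(root).resolve()
    out_path = Path(out_file).resolve()
    count = 0
    with out_path.open("w", encoding="utf-8") as out:
        for path in sorted(root_path.rglob("*")):
            if not path.is_file() or path.resolve() == out_path:
                continue
            rel = path.relative_to(root_path)
            if any(part in SKIP_DIRS for part in rel.parts):
                continue
            if path.suffix not in SOURCE_SUFFIXES:
                continue
            try:
                text = path.read_text(encoding="utf-8")
            except UnicodeDecodeError:
                continue  # skip files that are not valid UTF-8 text
            # A header per file lets the model tell where each file starts.
            out.write(f"\n===== {rel} =====\n")
            out.write(text)
            count += 1
    return count


if __name__ == "__main__":
    print(snapshot_codebase("."), "files written to codebase_snapshot.txt")
```

For a large repository the snapshot can exceed what a model accepts as context, so the suffix and skip lists are worth tuning before relying on it.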

I tested Genspark's new AI Developer: an L4 agent that builds a working app from an idea. You can choose any model (including Claude), and it runs in your browser and in the app, including planning, code, testing, and fixes. The output looks like it was done by a senior developer. I used a prompt to build a functional newsflash page that creates and curates a newsletter for me by adding a few links! It's revolutionary. Prompt in the comments.

Chubby♨️

235,727 views • 10 months ago

Harrison Chase (LangChain CEO) just walked through four ways to give an agent memory. All four assume the model is still holding the right tokens. It isn't. At token 4,096 the cache ran a silent eviction nobody wrote. The user's name was in that batch. First founder to write the eviction policy ships a 100B agent that remembers a person.

Rohit

108,447 views • 1 month ago

Reviewing AI-generated code is the new bottleneck.

Writing code was slow before. Now, in a few minutes, you can generate virtually unlimited lines of code using AI. But developers are now spending 70% of their time checking that code. That's the new bottleneck.

I've been using Cline, and they have a feature that actually helps with this: You can click a button and have Cline generate an inline explanation for every single line of AI-generated code. Take a look at the attached video.

What it does:
1. You can read the code and the explanation together
2. You don't lose context while reviewing code
3. You understand the "why" behind each change

And here is the best part: It works on any git diff. You can use it to review PRs, understand commits from other people, or catch up on a codebase you haven't touched in a while.

Santiago

64,399 views • 6 months ago

On the Mythos Preview, swyx 🐣 thinks Anthropic is too good at marketing, and may be overdoing it. "I think just let your competence shine through. They're amazing. It's a beautiful model. You don't have to be so dramatic." "You don't have to produce a five-minute polished video of the CEO of every security company on earth saying you're the greatest thing since sliced bread." "There was a little bit too much marketing buzz around the drama of like, 'oh, my Mythos just texted me while I was eating a sandwich.'"

etn.

34,155 views • 2 months ago

I think we just got a demo of Mythos and I’m surprised nobody's talking about it.. 💔 In what might be the first instance the general public has seen of Claude Mythos, Mythos (TBD) just uncovered a critical zero-day vulnerability in Ghost, an open-source platform with over 50,000 stars on GitHub that has never had a critical security flaw in its entire history. It identified a highly complex "blind SQL injection", a flaw so subtle you can't even see the output, only how the server delays its response. When asked to prove the severity of the bug, the model autonomously wrote a custom Python exploit script that successfully navigated the blind injection to extract the admin API key, secret, and password hashes from the database, completely unauthenticated… This is genuinely game-changing because it proves frontier models can now actively discover, reason through, and successfully build exploits for invisible vulnerabilities in enterprise-grade architecture that human developers missed for years. Cybersecurity companies are cooked.

Chris

289,722 views • 2 months ago