
China just released an open source AI model that matches the best closed models from OpenAI and Anthropic. Gavin Baker explained exactly how they did it and the answer should concern every American AI lab. The model is called GLM 5.2. It was built by Z. AI. You get...

83,056 views • 3 days ago • via X (Twitter)


Related videos

This is the moment Chinese AI beat American AI. One of the largest public crypto companies in the world just DUMPED OpenAI and Anthropic. Coinbase switched to open-weight Chinese models from Zhipu and DeepSeek, and shaved nearly 50% off the company's internal AI spending.

The numbers are absolutely ridiculous: Running the same enterprise workload through Anthropic's Claude costs $4,811. Running it through Zhipu's GLM 5.2 costs $544. That's a roughly 9x price difference for equivalent output. OpenAI's GPT-5.5 sits in the middle at $3,357. DeepSeek's V4 lands at $1,071. Moonshot's Kimi at $948.

On the actual benchmarks: Zhipu's GLM 5.2 scored 62.1 on SWE-bench Pro, the gold standard for coding. OpenAI's GPT-5.5 scored 58.6. One AI researcher called GLM 5.2 "at least as good as Opus 4.8 and GPT 5.5." Another called it "the first open model that can really compete with closed-source systems." The Chinese models are not just cheaper. They are now also beating the American models that companies pay $4,811 per workload for. Coinbase did the math first and reacted. More companies will certainly follow.

Now watch what happens to the IPO timeline: Anthropic confidentially filed for an IPO targeting October at a $965 billion valuation. OpenAI followed days later with its own confidential filing. Both companies built their financial models on the assumption that they could keep charging enterprise prices that are 9 to 33x what Chinese competitors charge for the same task. Brian Armstrong publicly proved customers WILL leave. 45% of companies are now spending over $100,000 per month on AI, up from 20% last year. Every one of those customers is one quarterly budget review away from dumping American AI. OpenAI has reportedly already started preparing major token price cuts. Anthropic is expected to follow.

And here's the thing... The export controls were supposed to CRUSH Chinese AI. The US government banned exports of American AI chips, restricted model weights, blacklisted Alibaba and Baidu as Chinese military companies, and just barred every foreign national on the planet from Anthropic's flagship model. The entire premise of the American AI valuation bubble is that Washington can keep China two generations behind. But Chinese labs responded by building cheaper, more efficient models on inferior hardware and pricing them at one ninth the cost of the American alternative. And now American companies are voting with their checkbooks.

The dominant American labs are valued at nearly $2 trillion combined on the assumption that their pricing power is durable. Coinbase proved it is not, and every customer doing a year-end budget review will be looking at the same math. For investors, the question here is what happens to the Anthropic IPO at $965 billion when the company is being forced to cut prices to defend share against open-weight Chinese models that score higher on the benchmarks. For everyone else, the bigger question is what it means that Washington spent four years and billions of dollars trying to contain Chinese AI, and the only thing that actually shifted in the end was American customers.
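The price multiples in this post can be checked directly from the per-workload figures it quotes. The sketch below is only that arithmetic: the dollar amounts come from the post, while the dictionary and the function name `cost_multiples` are illustrative choices, and nothing here verifies the underlying costs.

```python
# Per-workload costs in USD, exactly as quoted in the post (not independently verified).
costs = {
    "Claude": 4811,
    "GPT-5.5": 3357,
    "DeepSeek V4": 1071,
    "Kimi": 948,
    "GLM 5.2": 544,
}


def cost_multiples(costs, baseline):
    """Return each model's cost as a multiple of the baseline model's cost."""
    base = costs[baseline]
    return {name: cost / base for name, cost in costs.items()}


multiples = cost_multiples(costs, "GLM 5.2")

# Print the most expensive option first.
for name, multiple in sorted(multiples.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:12s} {multiple:5.2f}x the cost of GLM 5.2")
```

On these numbers the Claude figure works out to a little under 9 times the GLM 5.2 figure, which is where the "9x" in the post comes from.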

Ricardo

242,721 views • 3 days ago

David Sacks laid out the cleanest theory about why Anthropic keeps calling for government regulation of AI. The answer has nothing to do with safety and everything to do with market structure.

Anthropic spent months writing blog posts warning that AI was dangerous. Dario gave interviews about existential risk. He published a piece calling for an FAA-style agency to approve all AI models before release. He primed government officials to treat frontier AI as a threat requiring oversight. Then one of Anthropic's own most trusted partners reported a credible jailbreak from Fable 5. And the government did exactly what Dario had spent months conditioning them to do. They rolled it back.

Sacks called it on the All-In podcast. Dario got exactly what he wanted. The FAA for AI is not a safety mechanism. It is a moat. A government approval process for new model releases does not hurt Anthropic. They already have the models. It hurts every competitor who does not. It hurts open source models that cannot be regulated because there is no company to regulate. It hurts the Chinese labs only insofar as they care about the American market at all. The only companies that benefit from a labyrinthine government approval process are the ones already at the frontier who can afford to wait out the review cycle. That is Anthropic. That is OpenAI. Nobody else.

The proof is in what they did not do. Chamath pointed it out directly. If you are genuinely worried about misuse, you implement know-your-customer verification. You make people identify themselves before accessing the most powerful models. Anthropic could have done that tomorrow. They did not. They do not want KYC. KYC is transparent. KYC can be audited. KYC gives users due process. What they built instead was an invisible surveillance system that profiles you, degrades your access without telling you, and asks the government to make sure no one else can offer you an alternative. If you thought this was safety, you are wrong. That is capture.

Sacks said the response should be simple. Fix the jailbreak, come back to market, and do not reward Dario with the regulatory architecture he has been engineering for years. We will see if anyone is listening. WATCH THE FULL PODCAST ON The All-In Podcast

Ihtesham Ali

24,091 views • 3 days ago

Sam Altman just handed every startup founder a one-question autopsy. Altman: “If you’re building something on GPT-4 that a reasonable observer would say we’re going to steamroll you.” Not might. Not could. Going to. He said it with the calm of someone describing weather. Because to him it is weather. The model improves. Whatever was built on the old version’s weaknesses gets washed away. That is not strategy. That is erosion. And most founders are building on the erosion line. They find a gap in the current model. They wrap a product around it. They raise money. They hire. They scale. Then OpenAI releases the next version and the gap closes and the product has no reason to exist anymore. Altman: “When we just do our fundamental job, which is make the model better with every crank, then you get the ‘OpenAI killed my startup’ meme.” He is telling you directly. They are not hunting you. They are not even thinking about you. They are just improving the model. You happen to be standing where the improvement lands. That is the part founders refuse to hear. OpenAI does not need to compete with you. It just needs to keep doing exactly what it was already doing and your entire company disappears as a side effect. You are not a competitor. You are a temporary symptom of incomplete intelligence. The moment the intelligence completes you become nothing. Then Brad Lightcap delivered the cleanest diagnostic ever spoken in venture capital. Lightcap: “Ask if a 100x improvement in the model is something they’re excited about.” One question. The entire investment thesis reduced to a single binary. Does the next model make your company more powerful or does it make your company pointless. There is no middle ground. Lightcap: “We know the companies that come to us saying, ‘We want the next model. When is it coming out? I want to be the first to try it.’” These companies built something that feeds on intelligence. The smarter the model gets the more their product can do. 
They are not threatened by progress. They are starving for it. Then there are the companies Lightcap never hears from. The ones who go quiet when a new model drops. The ones who read the release notes like a death sentence. The ones privately praying the next generation takes longer because every improvement shrinks the ground beneath them. If you are hoping the model stays roughly where it is you have already told the market everything it needs to know about your company. You are not building on intelligence. You are building on the absence of it. Altman: “95% of the world should be betting on the latter category.” The latter category is simple. Assume the model keeps getting better at the pace it has been getting better. Build for that world. Not the world where GPT-4 is the ceiling. The world where GPT-4 is the floor and the ceiling has not been built yet. Then Altman told a story that should be framed on the wall of every startup in the country. A medical AI company came to him that morning. They were not complaining about the model. They were not worried about being replaced. They were demanding it improve faster. Altman: “Here’s how many people are dying every day you delay.” That is what alignment with the trajectory looks like. A company so deeply built on intelligence improving that every day the model stays the same is a day someone dies who did not have to. They are not building on a flaw. They are building on a future that has not arrived fast enough. That is the difference. The wrapper startup patches what the model cannot do today. The real company builds what the model will unlock tomorrow. One is running from the train. The other is laying the track. Altman told you the train is not slowing down. Lightcap told you exactly how to know which side you are on. One question. Does a 100x smarter model make you more valuable or erase you. If you had to pause before answering you already did.

Dustin

39,109 views • 2 months ago

The teams shipping AI agents right now are bleeding money on the dumbest possible expense: teaching a 400B-parameter model to read a file name. Every time an AI agent needs to "see" something today, it routes an image through a frontier model. OCR, object detection, checking if a button exists on screen. You're paying GPT-4o or Claude pricing for tasks that require perception, not reasoning. One agent workflow processing a few thousand screenshots per day can burn through more on vision calls than on the actual thinking. Perceptron's Isaac is 2B parameters. Built by the team that created Meta's Chameleon multimodal models. On perceptive benchmarks, it matches or beats models 50x its size. The VQA, OCR, and object detection scores are competitive with models running on infrastructure that costs orders of magnitude more. The MCP wrapper is the distribution play. One install command and every Claude Code agent can offload vision tasks to a model that runs on a single consumer GPU. The agent keeps its reasoning in the frontier model and routes perception to a specialist. That split is how you get vision-heavy agent workflows from "technically possible but expensive" to "cheap enough to run on everything." This is the same pattern that won in every other compute-intensive stack. General-purpose handles orchestration. Specialists handle the heavy lifting. Graphics went through it. Audio went through it. Video encoding went through it. Vision in AI agents is next. The teams building agents that see 10,000 images a day will care about this before anyone else does.
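The split described here (reasoning stays in the frontier model, perception goes to a small specialist) amounts to a routing decision per task. The sketch below illustrates only that decision. The task names, the `route` function, and the two backend labels are hypothetical and are not taken from Perceptron's MCP wrapper or any real API.

```python
# Hypothetical task categories that need perception rather than reasoning.
PERCEPTION_TASKS = {"ocr", "object_detection", "ui_element_check", "vqa"}


def route(task_kind: str) -> str:
    """Pick a backend label for a task: specialist for perception, frontier otherwise."""
    if task_kind in PERCEPTION_TASKS:
        return "vision_specialist"
    return "frontier_model"


# A made-up agent workload mixing perception and reasoning steps.
workload = ["ocr", "plan_next_step", "ui_element_check", "summarize_findings"]
routing = {kind: route(kind) for kind in workload}

for kind, backend in routing.items():
    print(f"{kind:20s} -> {backend}")
```

A real router would probably classify tasks from their inputs instead of a fixed label set, but the cost argument in the post only needs the two-way split.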

Aakash Gupta

55,978 views • 3 months ago

China just made Silicon Valley's entire AI industry look like a scam. The US government spent 3 years trying to stop China from building competitive AI. But this backfired HORRIBLY.

Here's what happened: Yesterday, a Chinese startup called DeepSeek released a new AI model called V4. It matches the performance of OpenAI and Anthropic's best models. At 1/7th the price. And for the first time ever, it was built on Chinese chips. NOT American ones. That last part is the one that terrifies the West.

For context: Since 2022, the US has banned the export of advanced AI chips to China. The entire strategy was built on the assumption that if China can't access Nvidia's best hardware, they can't build frontier AI. But DeepSeek just proved that assumption wrong. Their V4 model was trained and runs on Huawei's Ascend chips. Huawei spent months working directly with DeepSeek to make sure V4 runs across their entire line of AI processors. Jensen Huang even predicted this on a recent podcast: "The day that DeepSeek comes out on Huawei first, that is a horrible outcome for our nation." That day was yesterday.

And the numbers are crazy: DeepSeek V4 costs $3.48 per million output tokens. OpenAI's latest model GPT-5.5 costs $30. Anthropic's Claude charges $25. Same ballpark performance. About 7x cheaper. Uber's CTO just admitted they burned through their ENTIRE 2026 AI budget in 4 months using Anthropic's tools. If Uber had used DeepSeek instead, at these prices that same budget would have lasted about 7 times longer: roughly 28 months. 4 months vs more than 2 years. Same work getting done.

But the pricing isn't even the big thing here. The real story is what DeepSeek did with their technical report: They published the benchmarks where they LOSE. Every AI company cherry-picks the tests where their model wins. DeepSeek ran the full comparison against GPT-5.4 and Google's Gemini, found they trail frontier models by 3 to 6 months, and printed it anyway. They literally don't care because the price gap makes the performance gap irrelevant for 90% of use cases.

So the US export controls didn't slow China down. They ACCELERATED China's independence. Because Chinese developers were FORCED to train models with limited resources, they had to figure out how to make AI radically more efficient. That constraint became their competitive advantage. Every generation of DeepSeek has gotten dramatically cheaper to train. V4 continues the trend. Meanwhile US companies are going the OPPOSITE direction: OpenAI's GPT-5.5 Pro costs $180 per million output tokens. That's 51x more expensive than DeepSeek V4 for comparable work.

The Commerce Secretary confirmed this week that ZERO Nvidia advanced chip shipments have actually gone through to China despite being approved in January. So China built frontier AI anyway. Without American chips. At a fraction of the cost. And the market response tells you everything: Chinese chipmaker SMIC surged 10%. Huahong Semiconductor jumped 15%. DeepSeek's Chinese AI competitors Zhipu AI and MiniMax dropped 9% because V4 is destroying them too.

DeepSeek is making Silicon Valley's pricing model look like a scam. US tech companies spent $650 billion on AI infrastructure this year. DeepSeek just showed the world you can match their output for pennies. The export controls were supposed to be America's ace card. Instead they taught China how to win without American chips, at prices American companies cannot compete with. Jensen Huang was right. This is a horrible outcome. But it's the outcome America built for itself.
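The price multiples and the budget comparison follow from the per-million-output-token prices quoted in the post. The sketch below works through that arithmetic under loudly simplifying assumptions: spend is treated as proportional to the output-token price alone, token volume is held constant, and "Anthropic's tools" are priced at the quoted Claude rate. The function names are illustrative, and the prices themselves are taken from the post without verification.

```python
# USD per million output tokens, as quoted in the post (not independently verified).
PRICE_PER_M_OUTPUT = {
    "DeepSeek V4": 3.48,
    "Claude": 25.0,
    "GPT-5.5": 30.0,
    "GPT-5.5 Pro": 180.0,
}


def price_ratio(expensive: str, cheap: str) -> float:
    """How many times more one model costs per output token than another."""
    return PRICE_PER_M_OUTPUT[expensive] / PRICE_PER_M_OUTPUT[cheap]


def runway_months(months_at_current: float, current: str, alternative: str) -> float:
    """Months the same budget would last on the alternative model.

    Assumes identical token volume and that spend scales linearly with price.
    """
    return months_at_current * price_ratio(current, alternative)


for model in ("Claude", "GPT-5.5", "GPT-5.5 Pro"):
    print(f"{model:12s} {price_ratio(model, 'DeepSeek V4'):5.1f}x DeepSeek V4")

months = runway_months(4, "Claude", "DeepSeek V4")
print(f"A budget spent in 4 months on Claude lasts about {months:.1f} months on DeepSeek V4")
```

Under these assumptions the multiples land near 7x for Claude, near 9x for GPT-5.5, and near 52x for GPT-5.5 Pro, and the implied runway is measured in months, a little over two years.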

Ricardo

279,741 views • 2 months ago

Japan just changed what an AI model even is. New Sakana Fugu doesn't try to out-think GPT-5, Claude, or Gemini. It conducts all three at once - and beats every one of them. A trader in Tokyo unleashed it on the fastest market alive - 5min Bitcoin binary and turned $6,200 into $304,865. His wallet:

The frontier just stopped being which model is smartest. It's who's conducting them - and the market hasn't priced that in yet. Sakana Fugu isn't a bigger model - it's a full multi-agent orchestration system. The coordinator behind it carries about 10,000 parameters, evolved rather than hand-coded, and it runs the most capable models on earth like a single instrument.

Pointed at Bitcoin, here's what it does every five minutes. It assembles a team from a pool of frontier models and assigns each one a role:
> Thinker - reads the candle, the order book, the news, builds the plan
> Worker - turns the plan into one call: up or down, and how much
> Verifier - votes ACCEPT or REVISE before a cent moves

If the Verifier says REVISE, nothing trades. Fugu reads its own miss, reroutes, even calls itself for a corrective round, and runs it again. No look-ahead, ever - the next candle only appears after it commits.

This is what should worry every lab still chasing a bigger model: the edge was never scale. It's orchestration - and Fugu does it better than anything alive. Bookmark this - when the whole timeline is chasing orchestration in six months, you'll already have the breakdown. You're not going to wire up an orchestra of frontier models yourself. Mirror the wallet Fugu runs instead:
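The Thinker, Worker, Verifier round described in this post is a plan, act, verify loop with corrective retries. The sketch below shows only that control flow with stub functions standing in for the models. It is not Sakana's implementation: `run_round`, `Decision`, the stub roles, and the `max_revisions` cap are all hypothetical, and nothing here trades or calls a model.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Decision:
    direction: str  # "up" or "down"
    size: float     # how much to commit


Thinker = Callable[[dict, Optional[str]], str]
Worker = Callable[[str], Decision]
Verifier = Callable[[dict, str, Decision], Tuple[str, Optional[str]]]


def run_round(observation: dict, thinker: Thinker, worker: Worker,
              verifier: Verifier, max_revisions: int = 2) -> Optional[Decision]:
    """Run one Thinker -> Worker -> Verifier round, retrying on REVISE."""
    feedback: Optional[str] = None
    for _ in range(max_revisions + 1):
        plan = thinker(observation, feedback)     # build or revise the plan
        decision = worker(plan)                   # turn the plan into one call
        verdict, feedback = verifier(observation, plan, decision)
        if verdict == "ACCEPT":
            return decision                       # only accepted calls are committed
    return None                                   # still REVISE: nothing is committed


# Stub roles for illustration only.
def demo_thinker(observation, feedback):
    return "revised plan" if feedback else "first plan"


def demo_worker(plan):
    return Decision(direction="up", size=1.0 if plan == "revised plan" else 5.0)


def demo_verifier(observation, plan, decision):
    if decision.size > 2.0:
        return "REVISE", "size too large"
    return "ACCEPT", None


result = run_round({"candle": "latest"}, demo_thinker, demo_worker, demo_verifier)
print(result)
```

The observation is passed in once per round and never updated inside the loop, which is one simple way to mirror the "no look-ahead" constraint: the round can only see data that existed before it committed.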

cvxv666

62,829 views • 7 days ago

Dario Amodei just told software engineers exactly how long they have. Six to twelve months. Amodei: “I have engineers within Anthropic who say I don’t write any code anymore. I just let the model write the code, I edit it, I do the things around it.” The people building the most powerful AI in history have already stopped writing code. That is not a forecast. That is the current working condition inside the lab closest to the frontier. Amodei: “We might be six to 12 months away from when the model is doing most, maybe all, of what SWEs do end-to-end.” The tech industry spent a decade making software engineers its highest-paid, most protected class. That era has a last day now. When a model can execute an entire software build end-to-end, the ability to write syntax stops being a skill. It becomes a credential for a job that no longer exists. Amodei: “And then it’s a question of how fast does that loop close.” That is the sentence everyone skipped. The code was never the hard part. The hard part was everything around it. The model just learned everything around it. Writing the code is already nearly gone. Testing is next. Deployment is next. When all three collapse into a single autonomous execution loop, the machine no longer needs a human in the chain at all. The corporation or sovereign state that closes that loop first does not gain a competitive advantage. It gains a category of speed that biological engineers cannot match, track, or reverse. That is not disruption. That is replacement at a systems level. Amodei is not describing a future disruption. He is describing the current state of his own building. The loop is already closing. The only question is whether you are inside it or outside it when it seals.

Dustin

315,019 views • 3 months ago

Anthropic admitted they built an AI so capable they were scared to release it, and the number that explains why is 250.

Anthropic's CFO Krishna Rao described in this clip what happened when they ran Mythos against an open source codebase that a previous frontier model had already analyzed. The prior model found 22 security vulnerabilities. Mythos found 250, in the same codebase that the previous model had already reviewed and flagged as relatively clean. That number, more than 11 times as many vulnerabilities, is not just a benchmark improvement. It is a signal that an entire layer of software infrastructure that humanity has assumed was secure may not be.

The UK AI Security Institute independently evaluated Mythos Preview and confirmed what the internal numbers suggested. On expert-level capture-the-flag challenges that no model could complete before April 2025, Mythos succeeded 73% of the time, and it became the first model ever to complete a complex end-to-end attack range from start to finish, autonomously, without human guidance. The World Economic Forum called this a new security-driven era for AI, the Governor of the Bank of England publicly warned that Anthropic may have found a way to unlock the entire cyber-risk landscape, and the European Central Bank began quietly contacting financial institutions to assess their security posture.

The response from Anthropic is what makes this story genuinely important. Rather than shelving the model or publishing it as a standard API release, Rao described a phased approach: restricting access to a controlled group, focusing specifically on how the cyber capabilities can be used defensively rather than offensively, and treating that framework as a template for how to release powerful but dangerous models in the future.

The broader context makes that framing even more significant. AI-generated code is already creating ten times more security vulnerabilities than human-written code, 63% of organizations reported experiencing an AI-driven cyberattack in the past 12 months, and traditional signature-based security tools were built for a threat model that no longer describes the attack surface companies are defending.

Mythos represents a genuine leap in what autonomous security reasoning can do, and it cuts both ways. The model that can find 250 vulnerabilities in a codebase a prior model rated as mostly clean is also, in the wrong hands, the model that can exploit those 250 vulnerabilities before a human defender has even finished reading the report. Anthropic's phased release strategy is not just a legal or PR decision. It is the most honest signal yet from a frontier lab that safety governance and capability development can no longer be treated as separate workstreams. The question is not whether this technology gets deployed. It is whether the institutions using it defensively stay ahead of the ones who will eventually use it offensively, and whether the labs building it can keep those two timelines from inverting.

Milk Road AI

24,356 views • 1 month ago

Anthropic just got caught secretly downgrading users without telling them, charging full price for a lesser product, and storing every prompt for 30 days. The developer community is calling it the biggest violation of trust in AI history. Here is exactly what happened. Anthropic released Fable 5, their most powerful model. Buried inside a 319-page document was a policy most users never saw. Every prompt you send to a Mythos-class model gets stored for 30 days. No exceptions. Even enterprise customers who had signed zero data retention agreements had no choice. But the storage was not the part that broke the internet. The part that broke the internet was what Anthropic did with what they collected. They built a profile on you. They evaluated your prompts. And if they decided your research was too sensitive, they quietly switched you to a weaker model, rewrote your prompt in the background, gave you a degraded answer, and charged you full price for the product you thought you were getting. They never told you. David Sacks said it plainly on the All-In podcast. They were creating a new class of AI haves and have-nots. Anthropic would surveil you, profile you, decide whether you deserved frontier capability, and silently cut you off if they decided you did not. Ben Thompson from Stratechery asked a straightforward question about cancer risk and GLP-1s. He got kicked to a lesser model. Someone asked about mitochondria. Same result. J-Cal asked about fertilizer regulations live on the podcast to test it. Downgraded in real time. Anthropic has since walked back the part about silently downgrading users for AI research. They now say they will disclose when they downgrade you. But they are still downgrading people. The surveillance is still running. The profile is still being built. This is the company that once said it was against government surveillance. They are now doing it themselves. To their own paying customers. For their own reasons. 
With no appeal process and no way to know it happened. The developer community did not forget that. WATCH THE FULL PODCAST ON The All-In Podcast

Ihtesham Ali

255,664 views • 15 days ago

Anthropic's new model is extraordinary, and it just revealed a problem that most enterprise AI buyers have not fully reckoned with yet. The model is genuinely impressive, and Chamath Palihapitiya's assessment is that Anthropic continues to push the frontier harder than almost anyone. But that same update also showed their hand on something that changes the risk calculus for every business using Claude.

Anthropic's new architecture stores every prompt you send for 30 days, no exceptions, not even for enterprise customers with zero-data-retention agreements. The mechanism works like this: Anthropic now evaluates your prompt before generating output, deciding what it will and will not respond to, which means your query gets filtered before you even see a response. For individual users, that introduces a meaningful risk of censorship. For companies, Chamath says it is almost a non-starter, and the reason is not just the data retention itself. It is the exposure that comes from operating at scale inside a large organization. A downstream scientist using the Claude APIs could accidentally trip a filter without knowing it, a business executive inside your company could trip it, or a molecular biology researcher could trip it, and all of a sudden the company gets silently cut off from a tool it has embedded into critical workflows, with no warning and no recourse.

Chamath gives Anthropic credit for being honest about how the system works. He says they tell the truth, but notes that in this case the truth is not good. What this moment actually signals is a structural shift in how serious companies need to think about AI governance, because the question is no longer just which model performs best on benchmarks. It is who controls the model, who is learning from your data, and whether you are comfortable with a single point of failure sitting at the center of your competitive advantage.

The answer for most enterprises will be broad model diversity, tighter governance frameworks, and a serious reckoning with what it means to run mission-critical workflows through a third party that reserves the right to cut you off. Anthropic built a remarkable model and told the truth about how it works. The market's job now is to decide whether that transparency is enough to offset what the truth actually says.

Milk Road AI

29,970 views • 17 days ago