The Cost of Intelligence is Heading to Zero | Hyperspace P2P Distributed Cache

We present to you our breakthrough cross-domain work across AI, distributed systems, cryptography, and game theory to solve the primary structural inefficiency at the heart of AI infrastructure: most inference is redundant. Google has reported that only...
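The claim that most inference is redundant can be illustrated with a toy exact-match cache. This is a minimal single-process sketch, not Hyperspace's design: the `InferenceCache` class, the `fake_model` stub, and the whitespace and case normalization are all assumptions made for illustration, and nothing here is peer-to-peer or cryptographic.

```python
import hashlib
from typing import Callable, Dict


class InferenceCache:
    """Toy exact-match cache keyed by a hash of (model, normalized prompt)."""

    def __init__(self, run_model: Callable[[str, str], str]) -> None:
        self._run_model = run_model        # the expensive call we want to avoid repeating
        self._store: Dict[str, str] = {}   # cache key -> cached response
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(model: str, prompt: str) -> str:
        # Assumption: collapsing whitespace and lowercasing is an acceptable
        # notion of "same prompt". A real system would need something more careful.
        normalized = " ".join(prompt.split()).lower()
        return hashlib.sha256(f"{model}\x00{normalized}".encode("utf-8")).hexdigest()

    def infer(self, model: str, prompt: str) -> str:
        key = self._key(model, prompt)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        result = self._run_model(model, prompt)
        self._store[key] = result
        return result


def fake_model(model: str, prompt: str) -> str:
    # Hypothetical stand-in for a real inference call.
    return f"[{model}] answer to: {prompt.strip()}"


cache = InferenceCache(fake_model)
for p in ["What is a TEE?", "what is a  TEE?", "What is a subnet?"]:
    cache.infer("demo-model", p)

print(cache.hits, cache.misses)
```

The second prompt differs from the first only in case and spacing, so it is served from the cache and the model stub runs twice for three requests.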

37,272 views • 3 months ago • via X (Twitter)

Related videos

I pay Claude $20 a month. Most $TAO holders do too. There is a stack you can build in 15 minutes that fixes that completely. It runs on Bittensor. It costs $10. You do not write a single line of code.

Here is how every AI chat product actually works under the hood. Three layers. Always three.

The model. The brain. GPT, Claude, DeepSeek, Kimi, GLM.
The inference layer. The GPU that runs the model when you hit send.
The interface. The chat box you actually look at.

ChatGPT and Claude bundle all three and hand you the result. You cannot change the model. You cannot change the inference. The interface is non-negotiable. Every prompt you type goes to a server run by a private company whose terms of service can quietly change next month.

The anti-ChatGPT move is to pick each layer yourself. This is where $TAO comes in.

Chutes is Subnet 64 on Bittensor. It is the inference layer. Open source models like DeepSeek, Kimi, GLM, and Llama get served by a global network of miner-operated GPUs. Validators score the output quality. The best inference wins the emissions. You hit send. A miner somewhere runs your prompt. You get the answer back. The TAO you hold is in part paying for the GPU you just used.

The basic stack is one URL. chutes.ai/chat No account. No API key. No setup. Switch models mid-conversation. Web search built in. Image generation. File uploads. Free.

The advanced stack is Chutes plus TypingMind. One-time license. No recurring fee. Plugins, agents, custom personas, a prompt library you build over months. Full model switching between Chutes, OpenAI, and Anthropic from the same window. Total cost: $10 a month to Chutes for inference. That $10 buys you $50 in actual usage.

But here is the signal most people missed inside this story. Chutes ran a free tier until February. Then they killed it. Then they raised the minimum to $10 in May. Most people saw that as bad news. It is the opposite. Free things on the internet do not last. Real products do. Chutes is becoming a real product. A subnet that generates actual revenue from actual users paying actual money for actual AI inference. That is what $43 million in Q1 network revenue looks like at the individual subnet level.

And there is one more thing ChatGPT and Claude cannot offer that Chutes already has. Trusted Execution Environments. Your prompt gets encrypted on your device, shipped to a confidential compute GPU, and the lock only breaks inside the chip. The miner running the model physically cannot read your prompt. ChatGPT cannot promise that. Claude cannot promise that. Bittensor already built it.

You are holding a network where the subnets are generating real revenue, shipping real privacy infrastructure, and replacing $20 a month centralised subscriptions with $10 a month decentralised inference. The people who use the product always understand the investment better than the people who only watch the price.
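The three layers described in this post (model, inference, interface) can be sketched as independently swappable parts. This is only an illustration of the idea of picking each layer yourself: `ChatStack`, `make_stub_backend`, and the backend and model names are hypothetical, the backends are local stubs, and no real Chutes, OpenAI, or Anthropic API is called or implied.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

# Inference layer: anything that takes (model_name, prompt) and returns text.
Backend = Callable[[str, str], str]


def make_stub_backend(label: str) -> Backend:
    """Build a hypothetical backend that only echoes which layer handled the call."""
    def run(model: str, prompt: str) -> str:
        return f"{label}:{model}:{prompt}"
    return run


@dataclass
class ChatStack:
    """Interface layer: holds the current model and backend choice."""
    model: str
    backend_name: str
    backends: Dict[str, Backend]

    def switch(self, model: Optional[str] = None,
               backend_name: Optional[str] = None) -> None:
        # Either layer can be changed without touching the other.
        if backend_name is not None:
            if backend_name not in self.backends:
                raise KeyError(f"unknown backend: {backend_name}")
            self.backend_name = backend_name
        if model is not None:
            self.model = model

    def send(self, prompt: str) -> str:
        return self.backends[self.backend_name](self.model, prompt)


stack = ChatStack(
    model="model-a",
    backend_name="network",
    backends={
        "network": make_stub_backend("network"),
        "hosted": make_stub_backend("hosted"),
    },
)

first = stack.send("hello")
stack.switch(model="model-b")          # change the model, keep the backend
second = stack.send("hello")
stack.switch(backend_name="hosted")    # change the backend, keep the model
third = stack.send("hello")
print(first, second, third)
```

Keeping the backend behind a plain callable is one way to let the same chat window move between providers mid-conversation, which is the behaviour the post attributes to the combined setup.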

2xnmore

26,871 views • 1 month ago

70,000 Phones, One AI Agent: The World's Largest Edge AI Fleet Runs on Hermes

We turned 70,000 phones into a shared AI compute network. Any device owner contributes idle compute. Any developer taps distributed inference at a fraction of cloud cost. Not a concept. Not a whitepaper. 70K devices online today.

The problem: orchestrating a shared network of heterogeneous edge devices (different chipsets, different memory, different thermal profiles, different owners) is a coordination nightmare no human team can handle manually. So we gave the network a brain: Nous Research Hermes Agent.

Hermes connects to 16 MCP servers and runs 24/7:

🔬 Research Loop: Tracks every breakthrough in on-device inference: quantization (GPTQ/AWQ/GGUF), speculative decoding on mobile SoCs, federated learning protocols. Auto-imports papers into NotebookLM. 36 research topics, zero manual curation.
🌐 Network Intelligence: Monitors device availability, compute capacity, and workload distribution across the shared fleet. Surfaces bottlenecks before they cascade.
🧬 Tech Tree Optimizer: Maps the full optimization frontier, from KV-cache compression to on-device LoRA to peer-to-peer model sharding. Hermes autonomously identifies which research paths unlock the most network-wide throughput gains.

The result: a self-improving shared compute network. Research compounds daily. The fleet gets smarter without human intervention.

Cloud AI scales with money. We scale with people.

#HermesHackathon Teknium 🪽 Delphi Digital Tommy

Oyster Republic 🦪📲🦞👓

20,703 views • 3 months ago

Chamath said AI is not like the internet. Every new user costs real money. And the infrastructure making it possible was built by everyone. His argument was the clearest case for government ownership of AI labs I have ever heard. And it had nothing to do with Bernie Sanders.

Start with the internet comparison. Google and Facebook became the most profitable companies in human history because of one number. The marginal cost of adding a new user was effectively zero. One more search query cost Google nothing. One more Facebook profile cost Meta nothing. They could serve a billion people and the incremental cost of that billionth person was rounding error. That is the money printer. Infinite scale at zero marginal cost.

AI breaks that model completely. Every single user taxes a GPU. Every query costs electricity. Every response requires memory and compute. The marginal cost of AI is real, significant, and does not disappear at scale. You cannot print money the same way.

Then Chamath made the point that landed hardest. The infrastructure these companies depend on, the power grid, the land, the data centers, the permitting, the national security apparatus that protects their chips from being stolen, none of that was built by Anthropic or OpenAI. It was built by the public. By taxpayers. By decades of government investment in the physical and legal foundation these companies are now running on.

He compared it to the interstate highway system. If the federal government built the roads and two companies transported all the goods on them, a logical question at that point would be how much of that should I own? You are riding on my rails.

His conclusion was direct. If he were running a sovereign wealth fund and had the negotiating leverage of the US government, he would own 75% of these companies when he was done.

The internet had zero marginal cost. That is why the founders captured almost all of the value. AI has real marginal cost and runs on public infrastructure. That changes who has a claim on what gets built.

WATCH THE FULL PODCAST ON The All-In Podcast

Ihtesham Ali

78,878 views • 12 days ago

If intelligence is the log of compute… it starts with a lot of compute! And that’s why we’re scaling our GPU fleet faster than anyone else.

Just last year, we added over 2 gigawatts of new capacity – roughly the output of 2 nuclear power plants. And today we’re going further, announcing the world's most powerful AI datacenter, located in southeastern Wisconsin.

Fairwater is a seamless cluster of hundreds of thousands of NVIDIA GB200s, connected by enough fiber to circle the Earth 4.5 times. It will deliver 10x the performance of the world’s fastest supercomputer today, enabling AI training and inference workloads at a level never before seen.

For AI training workloads, you need compute at exponential scale. That’s why we designed the datacenter, GPU fleet, and network together as one integrated system. This ensures a single job can run from day 1 at exponential scale across thousands of GPUs.

Fairwater uses a liquid-cooled closed-loop system for cooling GPUs that requires zero water for operations after construction. And we’re matching all of the energy that is consumed with renewable sources.

And of course, it is just one of several similar sites we’re lighting up across our 70+ regions. We have multiple identical Fairwater datacenters under construction in other locations across the US, in addition to our AI infrastructure already deployed in over 100 datacenters around the world, powering model training, test-time compute, RL tuning, and real-time inference at global scale.

Too often during times like this, people go with the current and only later wonder, how did we get here? With Fairwater, we're charting a new path: doing the hard engineering work, bringing compute, network, and storage into one highly scaled cluster, and designing closed-loop energy systems to meet real-world computing needs. And partnering with local communities to ensure it's thoughtfully done in a way that is sustainable, creates new jobs, and expands opportunity.

We are thrilled to see this take hold in Wisconsin, and we are just getting started.

Satya Nadella

2,019,532 views • 9 months ago

Today we announced our new Fairwater datacenter in Atlanta, connected with our first Fairwater site in Wisconsin and our broader Azure footprint to create the world’s first AI superfactory.

Fairwater exemplifies our vision for a fungible fleet: infra that can serve any workload, anywhere, on fit-for-purpose accelerators and network paths, with maximum performance and efficiency.

AI workloads have evolved beyond large-scale pre-training. Today, they encompass fine-tuning, reinforcement learning (RL), synthetic data generation, evaluation pipelines, and more. Fairwater is built to support this full lifecycle:

Max density: Fairwater’s two-story design and liquid cooling system let us place racks in three dimensions and pack them with GPUs as densely as possible, minimizing cable runs and improving latency and effective bandwidth.

Fleet: Each Fairwater DC can integrate hundreds of thousands of the latest NVIDIA GPUs into a single coherent cluster. This provides flexible infra that can support the full spectrum of workloads, and ensure no GPU is left unnecessarily idle. And that’s on top of the more than 100,000 GB300s coming online this quarter alone for inference across the rest of our fleet. For us, it’s all about turning every gigawatt into the maximum number of useful tokens. Not every GW is created equal!

Planet-scale: Every Fairwater DC will connect through our continent-spanning AI WAN to prior generations of AI supercomputers, forming a truly fungible pool of compute. This enables developers to scale beyond the capacity of a single site and dynamically land workloads on the right infra for their needs.

Together, these innovations let us bring together different generations of silicon and AI systems across DCs and geos into a single elastic system that scales seamlessly across training and inference workloads. And this elastic AI capacity is all available alongside all the other cloud services (compute, storage, databases, app services) that AI agents and workloads need.

This is what we mean when we talk about building a fungible fleet – a single, unified platform that pushes the limits of performance per watt and per dollar.

Read more:

Satya Nadella

907,214 views • 7 months ago