Uploaded: 2025-11-24T01:58:42.000Z
Duration: PT14.866S
Channel: SemiAnalysis

Something NVIDIA & Google do better than anyone else is software-hardware-system co-design, and not just optimizing hardware for current model architectures, but predicting future ones. Back in early 2022, when NVIDIA started the design process for NVL72, MoE (Mixture of Experts) models were not yet the standard, and dense models were still dominant for frontier models. However, NVIDIA's strong software-hardware co-design culture enabled them to make a calculated bet that MoEs were the future, and they built NVL72 specifically for best MoE performance per TCO (Total Cost of Ownership). Furthermore, back in 2022, disaggregated prefill and wide expert parallelism (wideEP) MoE inference optimizations hadn't been invented yet, but it turns out that these MoE inference optimizations work best on large-scale systems like NVL72. While most other AI chip companies' in-house AI labs focus on training small 5B models that mainly use data parallelism, NVIDIA and Google's in-house AI labs continuously push the boundaries of model architecture and training recipes, such as NVFP4 training. Just like Super Idol & IShowSpeed, there must be a strong partnership between software engineers and hardware engineers to deliver the best systems that maximize performance per TCO.

SemiAnalysis

51,021 views • 7 months ago

MI355 disaggregated serving is competitive to B200 disaggregated serving...

SemiAnalysis

22,416 views • 3 months ago

In late 2023, AMD made its best acquisition to date: NodAI, led by CEO Anush Elangovan. At the time, AMD had a 0% chance of challenging CUDA: while AMD was strong in hardware, it didn't understand software. Since the NodAI acquisition, Anush has driven AMD’s AI software strategy and helped reshape the org around the importance of software and software-hardware co-design. As a result, AMD now has a non-zero chance of breaking the CUDA moat. Had NVIDIA acquired NodAI instead, AMD would almost certainly still be stuck at a 0% chance.

SemiAnalysis

39,741 views • 5 months ago

Jensen announced on stage at #COMPUTEX2026 that we'll be...

CoreWeave

28,191 views • 21 days ago

"People underestimate how strong our AI team is" Tony...

Ritwik Pavan

12,749 views • 1 month ago

🎉 Congratulations Google DeepMind on the launch of Gemma...

NVIDIA AI Developer

13,828 views • 1 year ago

Proud to announce Dobb·E: the next step in home...

Mahi Shafiullah 🏠🤖

164,452 views • 2 years ago

Coding software is a joke compared to building hardware. Because when software fails, it can be fixed quickly. In hardware, one mistake can set you back for months or end you. In hardware, every decision is cash flow. Every tiny error can cost tens of thousands of dollars. You don’t get second chances. You get lessons, paid for in time and money. But when you finally get it right… You don’t just ship a product. You build a legacy. Something people touch, use, and trust for years. That’s the beauty and the pain of building in the real world.

Prakash Dadlani

12,402 views • 7 months ago

This is not a remotely accurate description of what the 2023 AI executive order did? Undersecretary Emil Michael: "If you remember the Biden executive order on AI, which was this crazy executive order that limited the amount of compute any model company could do and was essentially grandfathering in a small number of AI companies that they were gonna designate as the winners, and everyone else was out" It's not true that the EO limited the compute that AI companies could use. What it did do was require companies who were training models above a certain very high compute threshold (10^26 FLOP, or 10^23 FLOP for models trained primarily on biological sequence data) to notify the government and share what testing and red teaming they were doing for certain national security risks. People are free to dislike the Biden AI EO! But it seems good to factually describe what the policy said.

Nathan Calvin

58,745 views • 3 months ago

two weeks in Shenzhen and it’s been an eye-opening experience. i'm here for a month with MIT SCALE and the speed of hardware production is easily 10× faster than sf. from product concept to production, everything is streamlined within a single building. Nearly every component you could need is available immediately through Taobao (the eBay of China) or HQB, and there are 24/7 makerspaces equipped to build anything from micrometer-scale PCBs to full assembly-line robots. What’s striking is that Shenzhen already has much of the hardware and robotics that startups in sf are still trying to build, except these systems have been deployed and operating for years. The manufacturing capability, supply chain depth, and technical execution here are world-class. That said, one gap keeps surfacing: brand design, storytelling, and cohesive user experience. Software polish, UX consistency, and attention to detail often feel secondary. Take this with a grain of salt, but it increasingly feels like the company that pairs Shenzhen-level hardware velocity with strong design sensibility and UX-first thinking will dominate the market.

Miyu Horiuchi

149,403 views • 5 months ago

😱A bridge with a sharp turn has been built...

NEXTA

34,198 views • 11 months ago

World Models are the path for some AI Models in the future. But how can we efficiently train these models to not only see the world the way humans do but to see the world in a new and unique way? By visualizing what are normally sequenced audio patterns, we can derive many more insights. Here we see Paganini in a visual form that can then be described and transcribed into a World Model. We can observe connections in a manner that may not have been clear prior to the digitalization of music and sound in this way. The company with the most valuable potential in building a World Model is Tesla. Not that this type of visualization is being used, but the mechanisms and the technology are in place for the company to thrive in this new form of AI.

Brian Roemmele

57,424 views • 7 months ago

Today we’re introducing Google AI Threat Defense - a comprehensive AI-powered cybersecurity solution designed to help continuously monitor for and stop AI-powered threats before they can impact your business. Here’s how it works:
1. AI Threat Defense uses our cybersecurity platform Wiz to scan and prioritize what applications and systems have the highest security risk.
2. Gemini and other frontier AI models can then autonomously perform continual deep scanning of your applications - starting with those at the highest risk - to identify security vulnerabilities.
3. The capabilities of CodeMender - a new software repair agent - are then used to verify and accelerate the patching of vulnerabilities.
4. And our Wiz autonomous agents continuously test your systems to find unknown vulnerabilities before adversaries do so that you can remediate them before you are attacked.
While other model providers focus on using AI to find and flag vulnerabilities, Google AI Threat Defense actively prioritizes your most critical real-world risks and accelerates their remediation using a variety of models since no single model finds a superset of the vulnerabilities found by all other models.

Thomas Kurian

193,376 views • 25 days ago

Most recent diffusion language model research (that I’ve seen) seems to be using masking as the noising process. It looks like, however, most closed-source models (Google Gemini Diffusion and possibly Inception Labs’ Mercury) use a different noising process, where instead of masking tokens, they replace them with different tokens (either with a random token or a semantically similar token). I wondered how they were getting such high throughput with the latter noising process, since I believed that optimizing inference with KVCache approximation would be more difficult (for various reasons). I visualized this noising process with tiny-diffusion and compared it to normal unmasking, and was very surprised to see how fast the generation “settles” into a reasonable output, and then only slightly refines afterwards, requiring far fewer steps in total. Unmasking (where tokens are never remasked, the typical implementation) is inherently limited in generation speed by the fact that an increase in tokens decoded per step leads to more errors, due to the mismatch between the joint token distribution and the per-token marginal distributions we actually sample from. The token replacement noising process seems to have a much different set of characteristics. Because we sample each token per step, every token makes “progress” towards the final output each iteration (in addition to *potentially* giving other tokens more information in future steps). Generally, masking has outperformed other noising processes, which is probably why most research focused on it (using smaller models). But the paper referred to in the retweet shows that random replacement as a noising process may scale better as model size increases. Big labs might have noticed these results much earlier (due to having drastically more training resources and being able to test larger models), which may explain the discrepancy in the choice of noising process.
I’m gonna test this with larger models, since tiny-diffusion only has 10M parameters.

Nathan Barry

40,331 views • 5 months ago
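
To make the two noising processes contrasted in the post above concrete, here is a minimal Python sketch of the forward (corruption) step only. The names `MASK_ID`, `mask_noise`, and `replace_noise`, the uniform choice of replacement token, and the per-token corruption probability `t` are all illustrative assumptions; nothing here is taken from tiny-diffusion or from any of the models mentioned.

```python
import random

MASK_ID = -1  # hypothetical id reserved for the [MASK] token


def mask_noise(tokens, t, rng):
    """Masking: each token is independently swapped for MASK_ID with probability t."""
    return [MASK_ID if rng.random() < t else tok for tok in tokens]


def replace_noise(tokens, t, vocab_size, rng):
    """Replacement: each token is independently swapped for a uniformly
    random vocabulary id with probability t (it may land on the same id)."""
    return [rng.randrange(vocab_size) if rng.random() < t else tok for tok in tokens]


if __name__ == "__main__":
    rng = random.Random(0)
    clean = [12, 7, 93, 41, 5, 60]
    print("masked:  ", mask_noise(clean, 0.5, rng))
    print("replaced:", replace_noise(clean, 0.5, 100, rng))
```

Under masking, a corrupted position carries no content until it is unmasked. Under replacement, every position always holds a real token, so a denoiser can revise any of them at every step, which is the property the post points to when describing how quickly generation "settles".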

🛠️ What if a robot could invent its own tools and teach itself how to use them? That’s exactly what VLMgineer does: a new framework that lets Vision Language Models (VLMs) design physical tools and the actions to use them, entirely on their own. No templates. No human demonstrations. Just raw, AI-driven creativity.
Why it matters
✅ Co-designs tools and actions together using VLMs, ensuring tight coupling between form and function
✅ Uses VLM-guided evolution (not random search) to refine designs intelligently
✅ Outperforms human-designed tools by 64.7% in task success across 12 RoboToolBench challenges
✅ Produces better-than-everyday tools for real manipulation tasks, measured in success rate and elegance
It builds on the emerging trend of large-model-guided evolutionary design (like Eureka and AlphaEvolve) and brings it into physical robotics. It opens the door to general-purpose, automated hardware design, no strong priors needed. Code & paper:

Ilir Aliu

13,984 views • 6 months ago

NVIDIA AI Released DiffusionRenderer: An AI Model for Editable, Photorealistic 3D Scenes from a Single Video
In a groundbreaking new paper, researchers at NVIDIA, University of Toronto, Vector Institute and the University of Illinois Urbana-Champaign have unveiled a framework that directly tackles this challenge. DiffusionRenderer represents a revolutionary leap forward, moving beyond mere generation to offer a unified solution for understanding and manipulating 3D scenes from a single video. It effectively bridges the gap between generation and editing, unlocking the true creative potential of AI-driven content. DiffusionRenderer treats the “what” (the scene’s properties) and the “how” (the rendering) in one unified framework built on the same powerful video diffusion architecture that underpins models like Stable Video Diffusion. Read full article here: Paper: GitHub Page:

Marktechpost AI Dev News ⚡

104,741 views • 11 months ago

The AI Grand Prix is a one-of-a-kind global competition...

Jeff Miller

51,330 views • 4 months ago

NVIDIA announces the first open humanoid robot reference design...

NVIDIA Robotics

161,581 views • 21 days ago

Ukrainian drones struck Cheboksary, the capital of the Chuvash Republic in Russia, which is around 1,000 km from the Ukrainian border. The target was the VNIIR-Progress factory. An explosion in the factory area can be seen from multiple angles. VNIIR is a leading Russian developer, manufacturer, and supplier of hardware and software solutions for relay protection and automation, automated process control systems, electronic component base (electronic modules), electrical engineering products, and radio-electronic products. They also produce antennas for anti-drone protection. All airports in that region have been closed.

(((Tendar)))

107,466 views • 6 months ago

Get you an architect who understands MEP well enough...

Marilyn Moedinger

28,358 views • 8 months ago

NVIDIA might have just declared war on the cloud GPU business. For years, AI builders had one option: rent compute, pay every month, and watch the bill grow every time usage increased. Now NVIDIA is putting serious AI hardware directly on people's desks: small enough to fit next to a monitor, powerful enough to run workloads that used to require expensive cloud infrastructure. That's why this launch is getting so much attention. The real story isn't the hardware specs. It's the business model shift. Every month, developers send money to cloud providers for inference, testing, fine-tuning and AI applications. The question nobody can answer yet is what happens if enough developers decide they'd rather buy infrastructure once than rent it forever. Because if local AI hardware keeps getting more powerful, the economics start changing very quickly. Cloud providers built empires on renting access to compute. NVIDIA is betting more people will eventually want to own it. And that's a much bigger story than a new piece of hardware sitting on a desk.

beamnxw ./

30,338 views • 20 days ago
