
Galileo
@rungalileo • 1,460 subscribers
The fastest way to ship reliable AI apps - Evaluation, Experimentation, and Observability Platform
Videos

Multi-agent systems offer incredible potential and unprecedented risks. How do you solve for observability, failure mode analysis, and guardrailing in the era of agents?

Today, we’re announcing our Agent Reliability platform to observe, evaluate, guardrail, and improve agents at scale. You can get started with the complete platform for trustworthy agentic AI today for free, and here’s how we’re solving some of the biggest challenges in agent reliability:

- Observability redesigned for agents
Trace views collapse under complex workflows, so we created the Graph View, Timeline View, and Conversation View to offer rich, intuitive visualizations of agent decisions, tool calls, and conversation flows. This multi-dimensional approach enables teams to pinpoint exactly where and why agents deviate or fail.

- Automated Failure Mode Analysis with our new Insights Engine
Our Insights Engine ingests your logs, metrics, and agent code to automatically surface nuanced failure modes and their root causes. But knowing the problem is not enough; you need to know how to fix it. Insights Engine delivers actionable fixes and can even apply them automatically. With adaptive learning, your insights become smarter and more relevant as your agents evolve.

- Evaluating Agents Across Multiple Dimensions
Agentic systems interact across complex pathways, and evaluating their performance requires new metrics that reflect this increasing complexity. To deliver comprehensive agentic measurements, we’ve added more out-of-the-box agent metrics like flow adherence, agent flow, agent efficiency, and more. For specialized domains and unique workflows, custom metrics powered by our new Luna-2 small language models can be rapidly designed and fine-tuned for your specific use case.

- Real-Time Guardrails Powered by Luna-2
As AI agents become more autonomous and complex, failures like hallucinations or unsafe actions increase dramatically. Without real-time guardrails, these errors will hurt your user experience and brand reputation. Our Luna-2 family of small language models is purpose-built to provide low-latency, cost-effective guardrails that actively stop agent errors before they happen. With support for out-of-the-box and custom metrics, Luna-2 enables enterprises to enforce safety, compliance, and reliability at scale.

Enterprises running hundreds of agents and processing hundreds of millions of queries daily already rely on Galileo’s Agent Reliability platform to protect their users, safeguard brand trust, and accelerate innovation.

Agent Reliability is available starting today. Try it for free and experience the new standard in AI reliability. Learn more below 👇
Galileo • 1,276,298 views • 11 months ago

Debugging agents shouldn’t feel like detective work. Today, we’re excited to release two new AI agent interfaces that make agent observability & evaluations even more effective.

🔎 Timeline View – See execution flow and bottlenecks at a glance. No more guessing where your agent gets stuck.
💬 Conversation View – Experience exactly what your users see. Debug from the user's perspective, not just the system's.

Combined with last week's Graph View, you now have three complementary ways to debug your agents:
→ Graph: Visualize decision paths and tool usage
→ Timeline: Spot performance bottlenecks instantly
→ Conversation: See the user experience end-to-end

AI evaluations + observability are crucial to building reliable AI. These interfaces make it simpler to identify blockers and improve your agents faster. See all three views in action, and try it free with the link below 👇
Galileo, now part of Cisco • 499,471 views • 1 year ago

No single vendor will win the AI race, but open ecosystems might. Real velocity in AI comes from interoperability, not lock-in. And AMD just made all of its software open source.

At last week’s Advancing AI 2025, we sat down with Anush Elangovan, AMD’s VP of AI Software, and Sharon Zhou, VP of AI at AMD, to discuss their case for why an open, multi-partner ecosystem will accelerate AI innovation faster than any proprietary alternative.

AMD’s announcements last week double down on this OSS focus and their commitment to AI infrastructure, including:

✅ Open Source Ecosystem: ROCm 7, AMD’s latest open-source AI software stack, introduces kernel-level improvements for GEMM operations, optimized attention mechanisms, and expanded support for distributed inference. The update brings substantial speedups for inference workloads, with average performance increases of 3.2x to 3.8x.
✅ Hardware: The new MI355X GPU delivers up to 40% more tokens per dollar vs. the competition, and the MI350 Series has seen a 35x generational leap in AI inference performance.
✅ Infrastructure Investments: Oracle just committed to zettascale (‼️) clusters with up to 131,072 MI355X GPUs, and AMD showcased their new $10 billion partnership with Saudi Arabian AI firm HUMAIN to build AI infrastructure, including data centers, powered by AMD chips.
✅ Partnership Momentum: 7 out of 10 top AI companies now run production workloads on AMD Instinct accelerators (including Meta, OpenAI, Microsoft & xAI).

By inviting interoperability and contribution at every layer, AMD is enabling developers to build faster, optimize deeper, and deploy with flexibility. Listen to Anush and Sharon’s Chain of Thought Podcast episode with host Conor Bronsdon in the next tweet to get all the details and a deep dive into AMD’s strategy 👇
Galileo • 78,922 views • 1 year ago

How is an open ecosystem powering the next generation of AI for developers? Recording live from the heart of the action at AMD's Advancing AI 2025, Chain of Thought host Conor Bronsdon welcomes AMD’s Anush Elangovan, VP of AI Software, and Sharon Zhou, VP of AI.

Together they unpack AMD's groundbreaking transformation from a hardware giant to a leader in full-stack AI, committed to an open ecosystem. Discover how new MI350 GPUs deliver mind-blowing performance with advanced data types and why ROCm 7 and AMD Developer Cloud offer Day Zero support for frontier models. This relentless pace of hardware and software innovation is reshaping the AI landscape.

Then Conor welcomes Sharon Zhou, VP of AI at AMD, to discuss making AMD's powerful software stack truly accessible and how to drive developer curiosity. Sharon explains strategies for creating a "happy path" for community contributions, fostering engagement through teaching, and listening to developers at every stage. She shares her predictions for the future, including the rise of self-improving AI, the critical role of heterogeneous compute, and the potential of "vibes based feedback" to guide models. This vision for democratizing access to high-performance AI, driven by a deep understanding of the developer journey, promises to unlock the next generation of applications.
00:00 Live from AMD's Advancing AI 2025 Event
00:30 Introduction to Anush Elangovan
01:38 The MI350 GPU Series Unveiled
04:57 CDNA4 Architecture Explained
07:00 The Future of AI Infrastructure
08:32 AMD's Developer Cloud and ROCm 7
11:50 Cultural Shift at AMD
14:48 Open Source and Community Contributions
18:35 Software Longevity and Ecosystem Strategy
22:19 AI Agents and Performance Gains
27:36 AI's Role in Solving Power Challenges
28:11 Thanking Anush
28:42 Introduction to Sharon Zhou
29:45 Sharon's Focus at AMD
30:39 Engaging Developers with AMD's AI Tools
31:24 Listening to the AI Community
33:56 Open Source and AI Development
45:04 Future of AI and Self-Improving Models
48:04 Final Thoughts and Farewell
Galileo • 37,186 views • 1 year ago