How does Exa serve billion-scale vector search? We combine binary quantization, Matryoshka embeddings, SIMD, and IVF into a novel system that can beat alternatives like HNSW. Shreyas gave a talk today at the AI Engineer World's Fair explaining our approach! ⬇️

Exa

57,172 subscribers

85,627 views • 2 years ago • via X (Twitter)
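The post names the ingredients but not how they fit together. As a rough illustration only (this is not Exa's code, and every function name and parameter below is hypothetical), here is a minimal Python sketch of two of them: Matryoshka-style truncation followed by binary quantization, with a Hamming-distance shortlist that is then reranked using the full-precision vectors. The rerank step is an assumption of this sketch. IVF partitioning and hand-written SIMD are left out; NumPy's vectorized bit operations stand in for the latter.

```python
import numpy as np


def binarize(vectors: np.ndarray, dims: int) -> np.ndarray:
    """Keep the first `dims` coordinates (Matryoshka-style truncation),
    then store one sign bit per coordinate, packed 8 bits per byte."""
    truncated = vectors[:, :dims]
    bits = (truncated > 0).astype(np.uint8)
    return np.packbits(bits, axis=1)


def hamming(query_code: np.ndarray, codes: np.ndarray) -> np.ndarray:
    """Count differing bits between one packed code and every row of `codes`."""
    xor = np.bitwise_xor(codes, query_code)
    return np.unpackbits(xor, axis=1).sum(axis=1)


def search(query, vectors, codes, dims, shortlist, k):
    """Shortlist by Hamming distance on the binary codes, then rerank the
    shortlist by dot product on the original float vectors."""
    q_code = binarize(query[None, :], dims)[0]
    dist = hamming(q_code, codes)
    cand = np.argsort(dist, kind="stable")[:shortlist]
    scores = vectors[cand] @ query
    return cand[np.argsort(-scores, kind="stable")[:k]]
```

With 64 retained dimensions, each vector costs 8 bytes in the binary index instead of 1,024 bytes for 256 float32 values, which is the kind of memory saving the post alludes to. How much recall the shortlist loses depends on the embedding model and is not something this sketch measures.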

10 comments

Jeffrey Wang • 2 years ago

@shreyas4_ @aiDotEngineer I wanna be nearest neighbors w/ @shreyas4_

Tigran III • 2 years ago

@shreyas4_ @aiDotEngineer i am still struggling to believe how much cracked engineering talent is coming from that one university. @shreyas4_ what's the secret sauce?

Martyn Strydom 🤸 • 2 years ago

@shreyas4_ @aiDotEngineer Unreal @shreyas4_

Karan☕ • 2 years ago

@shreyas4_ @aiDotEngineer great talk learned a lot of new things, had this question: I think if you use binary quantization, for smaller embeddings you will get poorer results because of lossy compression(already dimension reduction is done and then BQ)

Prashant Dixit • 2 years ago

@shreyas4_ @aiDotEngineer Anyone wants to just give a quick try and Build Matryoshka Embedding based RAG in a min, Give it a try 🙂

sophia • 2 years ago

@shreyas4_ @aiDotEngineer I'm confused why you said 8TB of memory to hold everything in RAM is too expensive. Back of the envelope Hetzner has 24 core/192GB systems for $366/mo. 8TB would be ~$200k/y or ~18k queries/$ @ 100 QPS

Hamish Ogilvy • 2 years ago

@shreyas4_ @aiDotEngineer Nice work. So funny how obsessed people were with HNSW…

omkaar • 2 years ago

@shreyas4_ @aiDotEngineer awesome great job guys

Aarush Sah • 2 years ago

@shreyas4_ @aiDotEngineer i love shreyas shreyas is so cool

agi • 1 year ago

@shreyas4_ @aiDotEngineer love this - great insight for my product
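The back-of-the-envelope estimate in sophia's comment above can be reproduced in a few lines. This is a sketch under stated assumptions, none of which come from the talk: 8 TB is read as 8,000 GB, servers are fully packed with no redundancy, and the fleet serves 100 QPS around the clock. The function and parameter names are made up for illustration.

```python
import math


def ram_fleet_estimate(ram_needed_gb=8_000, ram_per_server_gb=192,
                       price_per_server_month=366, qps=100):
    """Return (servers, yearly cost in dollars, queries per dollar)."""
    # Assumption: whole servers only, no replicas or headroom.
    servers = math.ceil(ram_needed_gb / ram_per_server_gb)
    cost_per_year = servers * price_per_server_month * 12
    # Assumption: constant load, 365 days a year.
    queries_per_year = qps * 60 * 60 * 24 * 365
    return servers, cost_per_year, queries_per_year / cost_per_year


servers, cost_per_year, queries_per_dollar = ram_fleet_estimate()
print(servers, cost_per_year, round(queries_per_dollar))
```

Under these assumptions the result is 42 servers, $184,464 per year, and roughly 17,100 queries per dollar, in the same range as the comment's rounded figures of about $200k per year and 18k queries per dollar.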

Related videos

Our approach to AI infra is simple: build the most fungible and flexible fleet to meet the real world's needs across inference and training as Scott Guthrie shared with Alex Kantrowitz. And we are already doing it at scale today, as we power the biggest AI workloads like Copilot and ChatGPT, APIs that power 3P products & enterprise workloads and high scale training.

Satya Nadella

114,478 views • 9 months ago

Tokenization -- turning text into a sequence of integers -- is a key part of generative AI, and most API providers charge per million tokens. How does tokenization work? Learn the details of tokenization and RAG optimization in Retrieval Optimization: From Tokenization to Vector Quantization, created in collaboration with Qdrant and taught by its Developer Relations Lead, Kacper Łukawski.

This course focuses on Retrieval augmented generation (RAG), which has two steps: First, a retriever finds relevant information; then, the generator uses what’s retrieved as context to produce a response. You’ll learn to optimize the first step (the retriever) by understanding how tokenization works and how it impacts the relevance of your search. In addition, you will also learn to measure and improve retrieval quality, speed, and memory. In detail, you’ll:

- Learn about the internal workings of the embedding models and how your text turns into vectors.
- Understand how several tokenizers, such as Byte-Pair Encoding, WordPiece, Unigram, and SentencePiece work.
- Explore common challenges with tokenizers, such as unknown tokens, domain-specific identifiers, and numerical values, that can negatively affect your vector search.
- Understand how to measure the quality of your search across relevance, ranking, and score-related metrics.
- Understand how the main parameters in "HNSW", a graph-based algorithm, affect the relevance and speed of vector search, and how to tune its parameters.
- Experiment with the three major quantization methods – product, scalar, and binary – and learn how they impact memory requirements, search quality, and speed.

By the end of this course, you’ll have a solid understanding of how tokenization functions and how to optimize vector search in your RAG systems. Please sign up here!

Andrew Ng

146,313 views • 1 year ago

My pitch to Sam Altman was simple - "We want to serve a billion people" For that scale, we rebuilt the insurance infrastructure with AI and Bitcoin. Today, meanwhile | Bitcoin Life Insurance can close quarterly books in 2.5 hours. And customers can get a fully underwritten policy in a few days. w/ @TBPN

Zac Townsend

10,738 views • 4 months ago

How does Aaron Rodgers want to see the Steelers learn from last week’s loss and respond? “I don’t like getting too binary, but winning. That’s a good response, but we can’t get attached to the binary system that our league is judged on necessarily because it’s a 17-game season.”

Brooke Pryor

423,867 views • 10 months ago

INTELLIGENT TASKS ARE A STEPPING STONE TO AGI Today, we are launching ChatLLM Tasks. We have hooked up tools like web search, email, and web scrappers to mini-agents that can be triggered on a schedule. These tasks combine intelligent tool use with crons! We think this is the first step towards AGI. The next step is to connect our AI engineer to create more complex tasks. AGI STEP ONE - DONE!

Bindu Reddy

18,486 views • 1 year ago

🚨 Treasury Secretary Scott Bessent gave a FASCINATING insight into Trump's tariff strategy with regard to China today. BESSENT: "At the end of the day... we can probably reach a deal with our allies... and then we can approach China as a group." Helps to explain the 90-day pause with other nations & lowered reciprocal tariffs but the increased tariffs on China. ⬇️

Kayleigh McEnany

150,051 views • 1 year ago

We've raised $6.5M to kill vector databases. Every system today retrieves context the same way: vector search that stores everything as flat embeddings and returns whatever "feels" closest. Similar, sure. Relevant? Almost never. Embeddings can’t tell a Q3 renewal clause from a Q1 termination notice if the language is close enough. A friend of mine asked his AI about a contract last week, and it returned a detailed, perfectly crafted answer pulled from a completely different client’s file. Once you’re dealing with 10M+ documents, these mix-ups happen all the time. VectorDB accuracy goes to shit. We built HydraDB for exactly this. HydraDB builds an ontology-first context graph over your data, maps relationships between entities, understands the 'why' behind documents, and tracks how information evolves over time. So when you ask about 'Apple,' it knows you mean the company you're serving as a customer. Not the fruit. Even when a vector DB's similarity score says 0.94. More below ⬇️

Nishkarsh

3,863,047 views • 4 months ago

New short course on Building Applications with Vector Databases, taught by Pinecone’s Tim Tully! At the heart of a vector database is the ability to store a collection of vectors and then query against that, meaning input a new vector and find similar ones. This is useful for many AI applications. In this course, you'll learn how to use vector databases to build: (i) Semantic Search: Create a text search tool that goes beyond keyword matching, and instead focuses on the meaning of content. (ii) RAG (retrieval augmented generation): Enhance your LLM output by incorporating context from sources the model wasn't trained on. (iii) Recommender System: Combine semantic search and RAG to recommend topics, and demonstrate it with a news article recommender. (iv) Hybrid Search: Build an application that finds items using both images and descriptive text -- by combining both sparse and dense vector representations of the data -- using an eCommerce dataset as an example. (v) Image Similarity: Use image vector embeddings to create an app to compare facial features, using a database of public figures to determine the likeness between them. (vi) Anomaly Detection: Build an anomaly detection app that identifies unusual patterns in network communication logs. I hope you’ll enjoy learning how to build all these types of applications! Please sign up here:

Andrew Ng

137,073 views • 2 years ago

Sergey Brin says AI is a bigger breakthrough than the internet the internet, like money, was a shared system that helped people trade and build global networks. but it didn’t test the limits of the universe. AI does — because we don’t know how far intelligence can go

Haider.

112,590 views • 1 year ago

A QUANT’S APPROACH: I always love sitting down with Clifford Asness, who cofounded AQR and grew it into a $240 billion quantitative investing giant. We talk about how to think about value, the edge you can get from alternative data and whether we are in a bubble. (Hint: he believes we aren’t!!) Here’s our full interview where we nerd out for about an hour:

Sonali Basak

424,637 views • 17 days ago

Introducing The AI CUDA Engineer: An agentic AI system that automates the production of highly optimized CUDA kernels. The AI CUDA Engineer can produce highly optimized CUDA kernels, reaching 10-100x speedup over common machine learning operations in PyTorch. Our system is also able to produce highly optimized CUDA kernels that are much faster than existing CUDA kernels commonly used in production. We believe that fundamentally, AI systems can and should be as resource-efficient as the human brain, and that the best path to achieve this efficiency is to use AI to make AI more efficient! We are excited to publish our paper, The AI CUDA Engineer: Agentic CUDA Kernel Discovery, Optimization and Composition. We also release a dataset of over 17,000 verified CUDA kernels produced by The AI CUDA Engineer. Paper: Kernel Archive Webpage: HuggingFace Dataset: The AI CUDA Engineer utilizes evolutionary LLM-driven code optimization to autonomously improve the runtime of machine learning operations. Our system is not only able to convert PyTorch code into CUDA kernels, but through the use of evolution, it can also optimize the runtime performance of CUDA kernels, fuse multiple operations, and even discover novel solutions for writing efficient CUDA operations by learning from past innovations! We believe The AI CUDA Engineer opens a new era of AI-driven acceleration of AI and automated inference time optimization. We (Robert Lange, Aaditya Prasad 🇺🇸, sssss, Maxence Faldor, Yujin Tang, hardmaru) are excited to continue Sakana AI's mission of leveraging AI to improve AI.

Sakana AI

1,159,053 views • 1 year ago

In our latest episode of Tech Talks, we explore the future of agentic game creation at Roblox. Discover how Roblox Assistant is evolving beyond a prompt tool into an AI-native system that can plan, execute, and verify complex game development tasks alongside creators.

Roblox

86,415 views • 3 months ago

Filesystems vs Vector search is the new MCP vs CLI. Claude uses agentic search. And Dens Sumesh at mintlify like filesystems too. but Retrieval is still used by Notion, Cursor, and others. We hit the cafes of SF to see what people want. - The debate is hot. Filesystems won by 1 point - filesystems feel very simple and intuitive. - RAG requires embedding, vector search and other stuff filesystems won. Introducing SMFS - Supermemory Filesystem we brought the best of these worlds into one single product - it's a filesystem, but agent can also do semantic search using grep. live today. Try it! Works with any sandbox, you mount and sync with cloud, it has a sync engine built in, all filetypes supported - even images and videos can be grepped. So, what do you choose - Filesystems, or vector search? we have both!

Dhravya Shah

153,119 views • 2 months ago

Building AI infrastructure at yotta-scale requires more than raw performance; it demands an open, modular rack design that can evolve across product generations, combining leadership compute engines with high-speed networking to connect thousands of accelerators into a single, unified system. Take a look at the Helios. Our rack-scale AI platform powered by MI455X GPUs and EPYC “Venice” CPUs and designed for trillion-parameter training.

AMD

20,832 views • 6 months ago

Researchers built a new RAG approach that:

- does not need a vector DB.
- does not embed data.
- involves no chunking.
- performs no similarity search.

And it hit 98.7% accuracy on a financial benchmark (SOTA).

Here's the core problem with RAG that this new approach solves: Traditional RAG chunks documents, embeds them into vectors, and retrieves based on semantic similarity. But similarity ≠ relevance. When you ask "What were the debt trends in 2023?", a vector search returns chunks that look similar. But the actual answer might be buried in some Appendix, referenced on some page, in a section that shares zero semantic overlap with your query. Traditional RAG would likely never find it.

PageIndex (open-source) solves this. Instead of chunking and embedding, PageIndex builds a hierarchical tree structure from your documents, like an intelligent table of contents. Then it uses reasoning to traverse that tree. For instance, the model doesn't ask: "What text looks similar to this query?" Instead, it asks: "Based on this document's structure, where would a human expert look for this answer?"

That's a fundamentally different approach with:

- No arbitrary chunking that breaks context.
- No vector DB infrastructure to maintain.
- Traceable retrieval to see exactly why it chose a specific section.
- The ability to see in-document references ("see Table 5.3") the way a human would.

But here's the deeper issue that it solves. Vector search treats every query as independent. But documents have structure and logic, like sections that reference other sections and context that builds across pages. PageIndex respects that structure instead of flattening it into embeddings.

Do note that this approach may not make sense in every use case since traditional vector search is still fast, simple, and works well for many applications. But for professional documents that require domain expertise and multi-step reasoning, this tree-based, reasoning-first approach shines.

For instance, PageIndex achieved 98.7% accuracy on FinanceBench, significantly outperforming traditional vector-based RAG systems on complex financial document analysis. Everything is fully open-source, so you can see the full implementation in GitHub and try it yourself. I have shared the GitHub repo in the replies!

Avi Chawla

972,347 views • 5 months ago

New short course: "Building Multimodal Search and RAG", by Weaviate AI Database's Sebastia(N_) Witalec ✊🏽✊🏾✊🏿. Contrastive learning is used to train models to map vectors into an embedding space by pulling similar concepts closer together and pushing dissimilar concepts away from each other. This technique is also used to train multimodal embedding models that capture semantic similarity across different modalities like text, images, and audio. These multimodal embeddings can be used to build multimodal search and RAG systems. In this course, you'll learn how contrastive learning works, and how to add multimodality to RAG – so your models can draw on diverse, relevant context to answer questions. For example, a query about a financial report might synthesize information from text snippets, graphs, tables, and slides. You will also learn how visual instruction tuning lets you integrate image understanding into language models, and build a multi-vector recommender system using Weaviate’s open-source vector database. Please sign up here:

Andrew Ng

104,371 views • 2 years ago

Microsoft CEO, Satya Nadella: “At Microsoft, more than 60% of our code is already written by AI agents. We are planning to scale it to 2–20 million agents, all running in a loop.” In this 1-hour talk, Microsoft’s CEO discusses the future of agentic AI with the LinkedIn founder. Watch it today, then read how to build a self-improving agentic system in the article below.

Movez

132,254 views • 7 days ago

Mati Staniszewski on why ElevenLabs is betting on a cascaded approach for voice agents: “Our approach, as you think about voice agents, conversational agents, is effectively a cascaded approach. You use transcription or speech-to-text, an LLM, text-to-speech, and orchestrate all of that together. Then you have speech-to-speech, which goes directly from speech, and there’s a speech response on the other side. Today, we are optimizing heavily on a cascaded approach. As we work with a lot of the businesses and enterprises, they will need that visibility into what happens. They will want to execute certain tasks on top of that. They want good visibility into each of the steps and great accuracy of all the models. But beyond that, they can abstract away what’s the LLM layer, what’s the intelligence layer, and the integrations are easier in that system. That’s where we are betting a lot of the research work on how you can make that great, and we think we can make that great.” John Collison Mati Staniszewski ElevenLabs

Stripe

17,118 views • 3 months ago

Today marks a pivotal moment for Isomorphic Labs. We have secured $2.1 Billion in our second external funding round, led by Thrive Capital. They are joined at the table by Alphabet, GV and new investors MGX, Temasek, CapitalG and the UK Sovereign AI Fund. This milestone accelerates our ability to build the pioneering novel AI models that power our AI drug design engine (IsoDDE) and deploy them at scale: delivering scientific breakthroughs with a precision previously thought impossible, accelerating and expanding our pipeline of therapeutic programs toward the clinic. All with the ultimate goal of delivering life-changing new medicines to patients. Moving forward, we will scale our drug candidate pipelines across multiple therapeutic areas, expand our global footprint, and push the boundaries of frontier AI research to power our drug design engine. Deeply grateful to everyone sharing our vision to solve all disease with AI. Let’s build the future of medicine. Read the full announcement here:

Isomorphic Labs

278,801 views • 2 months ago

This Tuesday, we announced Quack, our new protocol that turns DuckDB into a client-server database. Watch Hannes' talk, recorded at the AI Council, where he explained how Quack works in practice, what stack it's built on, how it performs, and what our long-term ambitions are.

DuckDB

11,069 views • 2 months ago