Let's build `Auto-RAG`, where we let the LLM pull the data it needs from different sources. 🔎 The user asks a question. 🤔 The LLM decides whether to search its knowledge, memory, or the internet, or to make an API call. ✍️ The LLM answers with the retrieved context. Code:

174,580 views • 2 years ago • via X (Twitter)
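
The three steps in the post (question, source selection, answer with context) can be sketched roughly as below. This is a minimal illustration and not the project's code: the four source functions, `choose_tool`, and `answer` are hypothetical names, and the keyword rule in `choose_tool` only stands in for the LLM's decision so that the sketch runs without a model.

```python
from typing import Callable, Dict

# Hypothetical stand-ins for the four sources named in the post.
def search_knowledge(query: str) -> str:
    return f"[knowledge base] notes matching '{query}'"

def search_memory(query: str) -> str:
    return f"[memory] earlier chat turns about '{query}'"

def search_web(query: str) -> str:
    return f"[web] search results for '{query}'"

def call_api(query: str) -> str:
    return f"[api] response for '{query}'"

TOOLS: Dict[str, Callable[[str], str]] = {
    "knowledge": search_knowledge,
    "memory": search_memory,
    "web": search_web,
    "api": call_api,
}

def choose_tool(question: str) -> str:
    """Placeholder for the LLM's decision step (assumption).

    A real system would ask the model which source to use; this
    keyword rule exists only so the example is self-contained.
    """
    q = question.lower()
    if "earlier" in q or "remember" in q:
        return "memory"
    if "latest" in q or "today" in q:
        return "web"
    if "price" in q or "weather" in q:
        return "api"
    return "knowledge"

def answer(question: str) -> str:
    tool = choose_tool(question)          # step 2: pick a source
    context = TOOLS[tool](question)       # fetch context from that source
    # Step 3: a real system would send question + context back to the LLM.
    return f"Answer to '{question}' using {context}"

if __name__ == "__main__":
    print(answer("What did I say earlier about Groq?"))
```

Keeping the sources in a plain dictionary means a new one (for example another web search backend) can be added without touching the routing logic.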

10 comments

Vexxter • 2 years ago

Is there any way to run a local quantized LLM via Ollama in this? Amazing project, btw!

Ashpreet Bedi • 2 years ago

@XPhyxer1 absolutely, the Hermes2-llama3 might work well here :)

Jordan A. Metzner • 2 years ago

Just read the README. Any plans for Groq on Llama 3?

Ashpreet Bedi • 2 years ago

@mrjmetz on it!

Ameriki Singh 🈳 • 2 years ago

Would love to see Groq and Llama 3 on it

Emma.Ai • 2 years ago

wow, can't wait to try this out

CoinCollector • 2 years ago

Ashpreet is coooooking

0xba0e7f9d • 2 years ago

🧑‍🚀 This is an awesome demo!

Aws Abdo, Ph.D. • 2 years ago

This work on automating retrieval and generation tasks is incredibly helpful. Thank you! #MachineLearning #DataScience

Petamber • 2 years ago

You can also try Brave’s Search API for web search

Related videos

New short course: LLMs as Operating Systems: Agent Memory, created with Letta, and taught by its founders Charles Packer and Sarah Wooders.

An LLM's input context window has limited space. Using a longer input context also costs more and results in slower processing. So, managing what's stored in this context window is important.

In the innovative paper MemGPT: Towards LLMs as Operating Systems, its authors (which include the instructors) proposed using an LLM agent to manage this context window. Their system uses a large persistent memory that stores everything that could be included in the input context, and an agent decides what is actually included.

Take the example of building a chatbot that needs to remember what's been said earlier in a conversation (perhaps over many days of interaction with a user). As the conversation's length grows, the memory management agent will move information from the input context to a persistent searchable database; summarize information to keep relevant facts in the input context; and restore relevant conversation elements from further back in time. This allows a chatbot to keep what's currently most relevant in its input context memory to generate the next response.

When I read the original MemGPT paper, I thought it was an innovative technique for handling memory for LLMs. The open-source Letta framework, which we'll use in this course, makes MemGPT easy to implement. It adds memory to your LLM agents and gives them transparent long-term memory.

In detail, you’ll learn:
- How to build an agent that can edit its own limited input context memory, using tools and multi-step reasoning
- What is a memory hierarchy (an idea from computer operating systems, which use a cache to speed up memory access), and how these ideas apply to managing the LLM input context (where the input context window is a "cache" storing the most relevant information; and an agent decides what to move in and out of this to/from a larger persistent storage system)
- How to implement multi-agent collaboration by letting different agents share blocks of memory

This course will give you a sophisticated understanding of memory management for LLMs, which is important for chatbots having long conversations, and for complex agentic workflows. Please sign up here!

Andrew Ng

200,752 views • 1 year ago
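
The memory hierarchy described in the course announcement (a small input context acting as a "cache" in front of a larger persistent store) can be illustrated with a toy sketch. It does not use the Letta API: `MemoryManager` and its fields are hypothetical, a fixed oldest-first eviction rule stands in for the agent's decisions, and truncating a message stands in for LLM summarization.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class MemoryManager:
    """Toy two-level memory: a bounded context plus a searchable archive."""
    max_context: int = 3                               # messages allowed in the "cache"
    context: List[str] = field(default_factory=list)   # what the LLM would see
    archive: List[str] = field(default_factory=list)   # persistent store
    summary_notes: List[str] = field(default_factory=list)

    def add(self, message: str) -> None:
        """Append a message, moving the oldest ones out when the context is full."""
        self.context.append(message)
        while len(self.context) > self.max_context:
            evicted = self.context.pop(0)
            self.archive.append(evicted)
            # Assumption: a real system would ask an LLM for a summary;
            # here the first 30 characters serve as a crude note.
            self.summary_notes.append(evicted[:30])

    def recall(self, keyword: str) -> List[str]:
        """Return archived messages containing the keyword (case-insensitive)."""
        return [m for m in self.archive if keyword.lower() in m.lower()]

if __name__ == "__main__":
    memory = MemoryManager(max_context=2)
    for text in ["My name is Ada", "I like tea", "I live in Oslo"]:
        memory.add(text)
    print(memory.context)          # the two most recent messages
    print(memory.recall("name"))   # restored from the archive
```

In the system the announcement describes, an agent chooses what to move, summarize, and restore; the fixed rule here only shows the data flow between the two levels.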