📢 Excited to release GoEx ⚡️, a runtime for LLM-generated actions like code, API calls, and more. Featuring "post-facto validation" for assessing LLM actions after execution 🔍. Key to our approach are the "undo" 🔄 and "damage confinement" abstractions for managing unintended actions & risks. This paves the way for fully autonomous LLM...

57,977 views • 2 years ago • via X (Twitter)

6 Comments

Shishir Patil • 2 years ago

🌐 Pioneering a future where LLMs empower microservices & apps, evolving from mere data retrievers 🧵 to autonomous decision-makers within our digital world 🧙 Wondering about the safety and correctness of these interactions 🤔? Our latest vision paper explores these questions, laying out design principles for the next step in LLM-powered applications 💯

Shishir Patil • 2 years ago

We study the inherent challenges in relying on LLMs: their unpredictability, the essential trust mechanisms for their decision-making, and the hurdles in recognizing and resolving failures. Our system, GoEx, presents abstractions and policies to overcome these for RESTful APIs and for operations on databases and filesystems! An exhilarating collaboration with @tianjun_zhang @vivianfxng Noppapon C @_royh021 Aaron Hao @profjoeyg @ralucaadapopa Ion Stoica from @UCBerkeley and @martin_casado from @a16z
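For database operations, one way to picture "undo" combined with post-facto validation is a transaction that is committed only after the result has been inspected. The sketch below is an illustration under stated assumptions, not GoEx's actual implementation: `run_with_undo`, `approve`, and the `users` table are hypothetical names, and it uses Python's standard `sqlite3` module with its default transaction handling.

```python
import sqlite3
from typing import Callable


def run_with_undo(conn: sqlite3.Connection, sql: str,
                  approve: Callable[[sqlite3.Connection], bool]) -> bool:
    """Run an LLM-generated statement, then keep or undo it.

    Hypothetical helper: `approve` stands in for post-facto validation
    (a human or an automated check that inspects the resulting state).
    """
    conn.execute(sql)      # INSERT/UPDATE/DELETE opens an implicit transaction
    if approve(conn):      # inspect the state *after* execution
        conn.commit()      # accept the action
        return True
    conn.rollback()        # undo the action
    return False


def count_users(conn: sqlite3.Connection) -> int:
    return conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]


# Illustrative data only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("ada",), ("lin",)])
conn.commit()

# An overly broad statement is executed, judged afterwards, and rolled back.
kept = run_with_undo(conn, "DELETE FROM users", lambda c: count_users(c) > 0)
```

Here the validation rule ("the table must not end up empty") rejects the delete, so the rollback restores both rows. Actions that have no transactional equivalent, such as most external API calls, would need a different undo mechanism.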

Jeff Schneider • 2 years ago

in your API inventory, what % had an undo function?

darya • 2 years ago

@vivianfxng hi

Davanum Srinivas • 2 years ago

cc @ibuildthecloud

Tereza Tizkova • 2 years ago

I read your paper and it's the first time I've seen this approach. Based on the code, you just define a "reverse tool" for each tool the LLM can use, is that correct? For the types of actions you cannot reverse, I suggest running the LLM output in a safe sandboxed environment, e.g. using the @e2b_dev Code Interpreter SDK. There, the actions that the LLM agent "decides" to do are isolated in a separate sandbox instance.
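The "reverse tool" idea described in this comment can be sketched as a registry that pairs each tool with its inverse and keeps a call history. This is a minimal illustration of the pattern as the comment describes it, not code taken from GoEx; `ReversibleToolbox`, `send_message`, and `delete_message` are hypothetical names, and the "inbox" is just an in-memory list.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Tuple


@dataclass
class ReversibleToolbox:
    """Hypothetical registry pairing every tool with a reverse tool."""
    tools: Dict[str, Tuple[Callable[..., Any], Callable[..., Any]]] = field(default_factory=dict)
    history: List[Tuple[str, tuple]] = field(default_factory=list)

    def register(self, name: str, forward: Callable[..., Any],
                 reverse: Callable[..., Any]) -> None:
        self.tools[name] = (forward, reverse)

    def call(self, name: str, *args: Any) -> Any:
        forward, _ = self.tools[name]
        result = forward(*args)
        self.history.append((name, args))  # remember what to undo
        return result

    def undo_last(self) -> Any:
        name, args = self.history.pop()    # most recent action first
        _, reverse = self.tools[name]
        return reverse(*args)


# Illustrative tool pair acting on an in-memory inbox.
inbox: List[str] = []


def send_message(text: str) -> None:
    inbox.append(text)


def delete_message(text: str) -> None:
    inbox.remove(text)


toolbox = ReversibleToolbox()
toolbox.register("send_message", send_message, delete_message)
toolbox.call("send_message", "hello")  # the inbox now holds "hello"
toolbox.undo_last()                    # the reverse tool empties it again
```

A tool without a meaningful inverse simply cannot be registered in this scheme, which is where isolating the action in a sandbox becomes the fallback.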
