As AI agents face increasingly long and complex tasks, decomposing them into subtasks becomes ever more appealing. But how do we discover such temporal structure? Hierarchical RL provides a natural formalism, yet many questions remain open. Here's our overview of the field🧵
36,008 views • 11 months ago • via X (Twitter)
Comments: 11

Humans constantly leverage temporal structure: we actuate muscles each millisecond, yet our plans can span days, months and even years. Computers are built on this same principle. How will AI agents discover and use such structure? What is "good" structure in the first place?

In this 80+ page manuscript, we cover the rich, diverse and decades-old literature studying temporal structure discovery in AI. When and in what way should we expect these methods to benefit agents? What are the trade-offs involved?

We cover methods that learn: (1) directly from experience, (2) through offline datasets and (3) with foundation models (LLMs). We present each method through the fundamental challenges of decision making, namely: (a) exploration, (b) credit assignment and (c) transferability.

We often get bogged down by differences in formalisms (goal-conditioned RL, options, feudal RL, skills …). We unite these core ideas through a single perspective. We believe hierarchical RL is fundamentally about the algorithm through which we discover temporal structure.

We hope this work provides a good introduction to the field. Finding temporal structure is challenging, so we carefully lay out some of the most pressing open questions. We also identify domains that are particularly promising, e.g., open-ended systems.

This work was done over the course of many friendly virtual calls with @akhil_bagaria and @RayZiyan41307, and under the thoughtful guidance of researchers who have spent decades working on these problems, namely George Konidaris, Doina Precup and @MarlosCMachado.

We are looking to keep improving this manuscript, so please share your feedback!

Always been fascinated by how HRL tackles the problem of breaking complex tasks into manageable steps. The field has huge potential imo, but yeah, still feels like we’re just scratching the surface of what’s possible.

Very interesting work @MartinKlissarov !!!

Great work!! Thanks for the much needed unified overview - looking forward to reading it.

Thanks for the kind words Harsh!

