A deep conversation with Nikolay Savinov, the Gemini long context pre-training co-lead… We go from the basics to what is needed to scale to infinite context to long context best practices for devs:

252,322 views • 1 year ago • via X (Twitter)

Comments: 11

Logan Kilpatrick • 1 year ago

YouTube link:

SecBriefs | Making Cybersecurity Simple • 1 year ago

🤝 Interviewing with HR or non-technical managers? 🌟 Employers value clarity and confidence. Cybersecurity Dictionary for Everyone teaches you to explain cybersecurity concepts in a way everyone can understand. Build connections that matter! 🌟

Sir Mr Meow Meow • 1 year ago

Please tell Google to further research memory-informed inference and semantic continuity solutions 🥺🙏 Don't want to be stuck in scaffolding hell for 2+ years lol. Maybe something like that might be possible using a setup of Larimar-like autoencoders at varying intervals to act like context pipes per horizon, though... plus timestamps (or a TTL or time signature to assist with a sense of time) hmm 🤔🧐 [not quite the same as memory layers like Google's Titans, which is more like long-term repeated adaptation or continual skill acquisition]. Basically, because models are frozen, they baton-pass the chat history, but that message passing loses nuance, and the telephone game results in gradual loss of context :'( Hence hella scaffolding agents... ugh

Lisan al Gaib • 1 year ago

@SavinovNikolay my favorite part of the conversation:

Ansh Alt • 1 year ago

@SavinovNikolay Logan we need confirmation, is it gonna be a new ultra subscription or a new ultra model.... Please don't do us advanced users dirty. Pleaseeeee 🥺

Independent Quick Take • 1 year ago

@SavinovNikolay Hey is there a well formatted transcript anywhere? Video format isn't really my learning style but I would love to check this out.

Logan Kilpatrick • 1 year ago

@SavinovNikolay Take the YT link, paste into AI Studio or NotebookLM

clay • 1 year ago

@SavinovNikolay Posted on YouTube as well?

Vignesh • 1 year ago

@SavinovNikolay Tell about ultra model

Raf • 1 year ago

@SavinovNikolay Smelling the Gemini 2M context coming soon 😹

BenIt Pro • 1 year ago

@SavinovNikolay Amazing! Now we must try to also compress this intelligence into the smallest and most efficient model as well!

Related videos

Gemini-1.5 Pro has its spotlight stolen today, and people are poking fun at Sora vs Google memes. Well, I think it's the biggest boost in LLM capability so far in 2024. v1.5's 10M token context (1) excels at retrieval; (2) generalizes zero-shot to extremely long instructions like full tutorials and codebases; and (3) works across modalities such as text, audio, and video.

Here's a stunning example: v1.5 learns to translate from English to Kalamang purely in context, following a full linguistic manual at inference time. Kalamang is a language spoken by fewer than 200 speakers in western New Guinea. Gemini has never seen this language during training and is only provided with 500 pages of linguistic documentation, a dictionary, and ~400 parallel sentences in context. It basically acquires a sophisticated new skill in the neural activations, instead of gradient finetuning.

I talked about the Myth of Context Length many times before: don't get too excited by claims of 1M or even 1B context tokens. LSTMs already achieved literally infinite context length 25 yrs ago! What truly matters is how well the model actually uses the context to solve real-world problems, and Gemini-1.5 has surpassed the SOTA with flying colors.

The paper is also well-written with lots of solid quantitative analysis on in-context memorization and generalization. Paper: “Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context”

Congrats to Jeff Dean, Oriol Vinyals, Sundar Pichai, and team!

Jim Fan

278,458 views • 2 years ago

New short course: LLMs as Operating Systems: Agent Memory, created with Letta, and taught by its founders Charles Packer and Sarah Wooders. An LLM's input context window has limited space. Using a longer input context also costs more and results in slower processing. So, managing what's stored in this context window is important. In the innovative paper MemGPT: Towards LLMs as Operating Systems, its authors (which include the instructors) proposed using an LLM agent to manage this context window. Their system uses a large persistent memory that stores everything that could be included in the input context, and an agent decides what is actually included. Take the example of building a chatbot that needs to remember what's been said earlier in a conversation (perhaps over many days of interaction with a user). As the conversation's length grows, the memory management agent will move information from the input context to a persistent searchable database; summarize information to keep relevant facts in the input context; and restore relevant conversation elements from further back in time. This allows a chatbot to keep what's currently most relevant in its input context memory to generate the next response. When I read the original MemGPT paper, I thought it was an innovative technique for handling memory for LLMs. The open-source Letta framework, which we'll use in this course, makes MemGPT easy to implement. It adds memory to your LLM agents and gives them transparent long-term memory. 
In detail, you'll learn:
- How to build an agent that can edit its own limited input context memory, using tools and multi-step reasoning
- What a memory hierarchy is (an idea from computer operating systems, which use a cache to speed up memory access), and how this idea applies to managing the LLM input context (where the input context window is a "cache" storing the most relevant information, and an agent decides what to move in and out of it to/from a larger persistent storage system)
- How to implement multi-agent collaboration by letting different agents share blocks of memory

This course will give you a sophisticated understanding of memory management for LLMs, which is important for chatbots having long conversations, and for complex agentic workflows. Please sign up here!
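The memory hierarchy described in the post can be illustrated with a toy sketch. This is not the MemGPT or Letta implementation and does not use their APIs: the class `ContextManager`, its fields, and the `max_messages` budget are hypothetical names made up for illustration. A fixed eviction rule and a keyword search stand in for the decisions that, in the system the post describes, an LLM agent would make, and the "summary" is a crude truncation rather than a model-written one.

```python
from dataclasses import dataclass, field


@dataclass
class ContextManager:
    """Toy memory hierarchy: a small in-context buffer backed by an archive.

    Hypothetical illustration only; not the MemGPT/Letta API.
    """
    max_messages: int = 4                               # assumed context budget, in messages
    context: list[str] = field(default_factory=list)    # what the model would see
    archive: list[str] = field(default_factory=list)    # persistent, searchable store
    summary: str = ""                                   # running summary of evicted messages

    def add(self, message: str) -> None:
        """Append a message, moving the oldest ones to the archive when over budget."""
        self.context.append(message)
        while len(self.context) > self.max_messages:
            evicted = self.context.pop(0)
            self.archive.append(evicted)
            # Stand-in for an LLM-written summary: keep the first three words.
            self.summary += " ".join(evicted.split()[:3]) + "; "

    def recall(self, query: str) -> list[str]:
        """Return archived messages containing the query (naive keyword search)."""
        return [m for m in self.archive if query.lower() in m.lower()]

    def prompt(self, query: str = "") -> list[str]:
        """Assemble the input context: summary, recalled items, then recent messages."""
        parts = []
        if self.summary:
            parts.append(f"[summary] {self.summary.strip()}")
        if query:
            parts.extend(f"[recalled] {m}" for m in self.recall(query))
        return parts + self.context


if __name__ == "__main__":
    mem = ContextManager(max_messages=2)
    for msg in ["My name is Ada", "I like tea",
                "The weather is nice", "What should I drink?"]:
        mem.add(msg)
    # Older messages now live in the archive; a relevant one is restored on demand.
    for line in mem.prompt(query="tea"):
        print(line)
```

With a budget of two messages, the first two messages are moved to the archive, and the query "tea" restores only the relevant one alongside the summary and the recent messages. A real system would measure the budget in tokens and let the agent choose what to evict, summarize, and restore.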

Andrew Ng

200,752 views • 1 year ago