A deep conversation with Nikolay Savinov, the Gemini long context pre-training co-lead… We go from the basics to what is needed to scale to infinite context to long context best practices for devs:

Logan Kilpatrick

333,150 subscribers

252,322 views • 1 year ago • via X (Twitter)

Science & Technology • Education

Comments: 11

Logan Kilpatrick • 1 year ago

YouTube link:

SecBriefs | Making Cybersecurity Simple • 1 year ago

🤝 Interviewing with HR or non-technical managers? 🌟 Employers value clarity and confidence. Cybersecurity Dictionary for Everyone teaches you to explain cybersecurity concepts in a way everyone can understand. Build connections that matter! 🌟

Sir Mr Meow Meow • 1 year ago

Please tell google to further research memory informed inferences and Semantic Continuity solutions 🥺🙏 Dont want to be stuck in scaffolding hell for 2+ years lol Maybe something like might be possible with using a set up of Larimar-like Autoencoders at varying intervals to act like context pipes per horizon tho... + timestamps (or ttl or time signature to assist w sense of time) hmm 🤔🧐 [not quite the same as memory layers like Google's titan which is more like long term repeated adaptation or continual skill acquisition] Basically bc models are frozen they baton pass the chat history, but that message passing loses nuance and the telephone game results in gradual loss of context :'( Hence hella scaffolding agents.... uhg

Lisan al Gaib • 1 year ago

@SavinovNikolay my favorite part of the conversation:

Ansh Alt • 1 year ago

@SavinovNikolay Logan we need confirmation, is it gonna be a new ultra subscription or a new ultra model.... Please don't do us advanced users dirty. Pleaseeeee 🥺

Independent Quick Take • 1 year ago

@SavinovNikolay Hey is there a well formatted transcript anywhere? Video format isn't really my learning style but I would love to check this out.

Logan Kilpatrick • 1 year ago

@SavinovNikolay Take the YT link, paste into AI Studio or NotebookLM

clay • 1 year ago

@SavinovNikolay Posted on YouTube as well?

Vignesh • 1 year ago

@SavinovNikolay Tell about ultra model

Raf • 1 year ago

@SavinovNikolay Smelling the Gemini 2M context coming soon 😹

BenIt Pro • 1 year ago

@SavinovNikolay Amazing! Now we must try to also compress this intelligence into the smallest and most efficient model as well!

Related videos

Context caching with Gemini is so good! Here I am caching the entire Gemini Cookbook (around 400k tokens) as an insanely long prompt to create the best Gemini app developer on the planet. Watch Gemini answer any coding questions related to its own APIs.

Pietro Schirano

107,369 views • 2 years ago

It's not only about how long your context is, but how well you use it. Great to see Gemini 2.5 models dominating MRCR and other benchmarks on long context! See 2.5 Pro tackle a complex coding task by reasoning over an entire repo (>500k tokens). Performance and effective use of the (loooong) context windows are what really matter!

Oriol Vinyals

27,008 views • 1 year ago

Introducing HydraDB. The graph native context infrastructure for agents. Purpose built to deliver precise context & observability into why agents act the way they do. We've always believed graphs are the best way to manage AI context, but they've been too expensive to scale or impractical for storing full context. Until now. HydraDB combines in memory, NVMe, and object storage into a single graph layer, making context delivery faster, cheaper, and more precise. We want context delivery to be extremely fast, 1000x cheap, and highly precise. Give your agents a brain.

Nishkarsh

2,289,922 views • 29 days ago

Scale enterprise agents with Gemini 3 Pro and @eigent_ai 🚀 This integration combines Gemini 3 Pro’s long context window with Eigent’s infrastructure to support reliable, high-performance workflows. Use it to build agents for complex data analysis and automated decision-making.

Google AI Developers

43,495 views • 5 months ago

Gemini-1.5 Pro has its spotlight stolen today, and people are poking fun at Sora vs Google memes. Well, I think it's the biggest boost in LLM capability so far in 2024. v1.5's 10M token context (1) excels at retrieval; (2) generalizes zero-shot to extremely long instructions like full tutorials and codebases; and (3) works across modalities such as text, audio, and video. Here's a stunning example: v1.5 learns to translate from English to Kalamang purely in context, following a full linguistic manual at inference time. Kalamang is a language spoken by fewer than 200 speakers in western New Guinea. Gemini has never seen this language during training and is only provided with 500 pages of linguistic documentation, a dictionary, and ~400 parallel sentences in context. It basically acquires a sophisticated new skill in the neural activations, instead of gradient finetuning. I talked about the Myth of Context Length many times before: don't get too excited by claims of 1M or even 1B context tokens. LSTMs already achieved literally infinite context length 25 yrs ago! What truly matters is how well the model actually uses the context to solve real-world problems, and Gemini-1.5 has surpassed the SOTA with flying colors. The paper is also well-written with lots of solid quantitative analysis on in-context memorization and generalization. Paper: “Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context” Congrats to Jeff Dean Oriol Vinyals Sundar Pichai and team!

Jim Fan

278,458 views • 2 years ago

Flow matters Practices go from instructional to open environments Today we went drills that emphasize context of COD to open reactive environments #football #strengthandconditioning #workout #lift #training #lifting #speed #strengthtraining #sprint #play #agility

Joseph Guarascio

129,841 views • 1 year ago

New ways for developers to build with Gemini → 😎 Gemini 1.5 Flash joins 1.5 Pro in public preview via the Gemini API in Google AI Studio 🥳 A new Context Caching feature in the Gemini API 🤗 A preview of our 2 million context window #GoogleIO

Google for Developers

17,467 views • 2 years ago

The best way to learn AI is to build with agents. To help with that, we've launched hands-on labs and a new series on Agentic Engineering. First topic: Agent Skills. Next in the pipeline: planning, context engineering, multi-agent systems, long-running agents,.. Go build!

elvis

31,802 views • 1 month ago

A conversation on the optimal reward for coding agents, infinite context models, and real-time RL

Cursor

318,436 views • 1 year ago

Long video generation usually results in context increasing/scaling during chunk/frame-wise rollout. Considering context scaling may require context selection, we thus introduce the idea of MoE into long context modelling and propose Mixture of Contexts. All previous context/memory is considered while the chosen ones are computed in a data-driven manner. You can easily enjoy 7x compute savings.

Ceyuan Yang

22,141 views • 10 months ago

Cline v3.13.3 is here with new ways to manage context and costs:
- /smol slash command to compress long conversations 🤏
- Gemini 2.5 Pro prompt caching (Cline, OpenRouter)
- MCP server download counts
- UI tooltips for easier navigation
Here's the breakdown ↓

Cline

27,415 views • 1 year ago

Question we get constantly: "How does Cline handle context limits in long-running tasks?" Here's how users can manage their context: /newtask: Creates a detailed handoff summary and starts fresh context. Like handing off work to a new engineer with full background. more 👇

Cline

21,739 views • 11 months ago

New short course: LLMs as Operating Systems: Agent Memory, created with Letta, and taught by its founders Charles Packer and Sarah Wooders.

An LLM's input context window has limited space. Using a longer input context also costs more and results in slower processing. So, managing what's stored in this context window is important. In the innovative paper MemGPT: Towards LLMs as Operating Systems, its authors (which include the instructors) proposed using an LLM agent to manage this context window. Their system uses a large persistent memory that stores everything that could be included in the input context, and an agent decides what is actually included.

Take the example of building a chatbot that needs to remember what's been said earlier in a conversation (perhaps over many days of interaction with a user). As the conversation's length grows, the memory management agent will move information from the input context to a persistent searchable database; summarize information to keep relevant facts in the input context; and restore relevant conversation elements from further back in time. This allows a chatbot to keep what's currently most relevant in its input context memory to generate the next response.

When I read the original MemGPT paper, I thought it was an innovative technique for handling memory for LLMs. The open-source Letta framework, which we'll use in this course, makes MemGPT easy to implement. It adds memory to your LLM agents and gives them transparent long-term memory.

In detail, you’ll learn:
- How to build an agent that can edit its own limited input context memory, using tools and multi-step reasoning
- What is a memory hierarchy (an idea from computer operating systems, which use a cache to speed up memory access), and how these ideas apply to managing the LLM input context (where the input context window is a "cache" storing the most relevant information; and an agent decides what to move in and out of this to/from a larger persistent storage system)
- How to implement multi-agent collaboration by letting different agents share blocks of memory

This course will give you a sophisticated understanding of memory management for LLMs, which is important for chatbots having long conversations, and for complex agentic workflows. Please sign up here!

Andrew Ng

200,752 views • 1 year ago

CONTEXT: Here’s the clip that led to this WILD moment 🤣

Stag

197,227 views • 3 months ago

DeepMind's Nikolay Savinov says 10M-token context windows will transform how AI works. AI will ingest entire codebases at once, becoming "totally unrivaled… the new tool for every coder in the world." 100M is coming too -- and with it, reasoning across systems we can't yet imagine.

vitrupo

261,606 views • 1 year ago

Today I’m proud to launch Context, a CLI and open-source corpus for the python ecosystem - 4M quality embeddings of the top 1218 Python Libraries. $ pip install fleet-context or download to get the entire python universe. Here’s why this is game-changing for both Devs and LLMs

Nicolai Ouporov

310,777 views • 2 years ago

Meanwhile Fire Point co-founder Denis Shtillerman, the company behind the Flamingo missile, posted a teaser video. "A short video. No context. Context to follow" he said.

WarTranslated

38,039 views • 4 months ago

The Context Cloud for agents is here. Introducing the new supermemory. A new way to do context, for your agents Everything you need for context - Memory, RAG, Filesystems, Profiles, are now available as blocks you can compose to build something for your use case

Dhravya Shah

88,380 views • 1 month ago

Today we entered the Gemini 3 era, our next step on the path toward AGI. ⚡ Gemini 3 is our most intelligent model that combines capabilities like multimodality, long context and reasoning, so you can bring any idea to life. Explore more of what you can do and build with Gemini 3 🧵⬇️

Google

90,789 views • 7 months ago