Problem: AI coding performance dips once context window usage exceeds about 50%. Solution: combine Cline's context window awareness with the `new_task` tool and `.clinerules` to create a workflow that autonomously hands off tasks before hitting limits, ensuring persistent memory. A guide: 🧵

Cline

52,457 subscribers

62,196 views • 1 year ago • via X (Twitter)

11 Comments

Cline • 1 year ago

Large context windows aren't a silver bullet. Models can still struggle or "forget" past ~50% usage, degrading performance. Plus, manually re-explaining project context each time you restart is a major workflow killer.

Cline • 1 year ago

The first piece is awareness: Cline is aware of its own context window usage (visible in `environment_details`). It knows how much "memory" is being used relative to the model's limit (e.g., 105k/200k tokens = 53%).
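
As a rough illustration of the arithmetic in that example, here is a minimal Python sketch. The helper name `context_usage` is hypothetical and not part of Cline; it only shows how a used/limit pair turns into the usage figure quoted above (105k of 200k tokens is 52.5%, which the thread rounds to 53%).

```python
def context_usage(used_tokens: int, limit_tokens: int) -> float:
    """Return the fraction of the context window in use (hypothetical helper)."""
    if limit_tokens <= 0:
        raise ValueError("limit_tokens must be positive")
    return used_tokens / limit_tokens


# The example from the thread: 105k tokens used out of a 200k-token window.
usage = context_usage(105_000, 200_000)
print(f"{usage:.1%} of the context window is in use")
```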

Cline • 1 year ago

The second piece is the `new_task` tool. This allows Cline to cleanly end the current session and immediately start a fresh one, crucially preloading it with specific context you define (summaries, next steps, file states, etc.).
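
To make the idea of preloaded context concrete, here is a small Python sketch of how a handoff could be packaged as plain text. The `Handoff` class, its fields, and the output layout are all assumptions made for illustration; this is not Cline's actual `new_task` format, only one way to gather the summaries, next steps, and file states mentioned above.

```python
from dataclasses import dataclass, field


@dataclass
class Handoff:
    """Hypothetical container for what the next session should start with."""
    summary: str
    next_steps: list[str] = field(default_factory=list)
    file_states: dict[str, str] = field(default_factory=dict)

    def to_context(self) -> str:
        """Render the handoff as plain text to preload into a fresh session."""
        lines = ["Summary:", self.summary, "", "Next steps:"]
        lines += [f"- {step}" for step in self.next_steps]
        lines += ["", "File states:"]
        lines += [f"- {path}: {state}" for path, state in self.file_states.items()]
        return "\n".join(lines)


# Illustrative values only; a real handoff would describe the actual task.
handoff = Handoff(
    summary="Refactored the auth module; tests pass.",
    next_steps=["Add rate limiting", "Update the README"],
    file_states={"auth.py": "refactored", "README.md": "outdated"},
)
print(handoff.to_context())
```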

Cline • 1 year ago

The magic happens when you combine these in `.clinerules`. You define the trigger (e.g., "if context usage exceeds 50%, propose a handoff") and exactly what context Cline should package using `new_task`. This creates an automated, proactive context management workflow.
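
In Cline the trigger is written as a natural-language rule in `.clinerules`, not as code. The Python sketch below only mirrors the decision that such a rule describes, so the logic is easy to see. The function name `should_propose_handoff` and the default threshold of 0.5 are assumptions chosen to match the 50% example.

```python
def should_propose_handoff(used_tokens: int, limit_tokens: int,
                           threshold: float = 0.5) -> bool:
    """Return True once usage reaches the threshold (hypothetical rule check)."""
    if limit_tokens <= 0:
        raise ValueError("limit_tokens must be positive")
    return used_tokens / limit_tokens >= threshold


# Walk through a session as it grows inside a 200k-token window.
for used in (60_000, 95_000, 105_000):
    if should_propose_handoff(used, 200_000):
        action = "propose handoff"
    else:
        action = "keep working"
    print(f"{used} / 200000 tokens -> {action}")
```

Keeping the threshold as a parameter reflects that the 50% figure is a rule of thumb from the thread; a project could pick a different trigger point in its own rules.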

Cline • 1 year ago

The outcome? Cline intelligently manages its own context before performance degrades. No more manual resets or tedious re-explaining. For complex, multi-session tasks, it feels like working with an agent that has persistent memory.

Cline • 1 year ago

Ready to build workflows that beat context limits? Learn how to implement this with `.clinerules` and the `new_task` tool in our docs:

Cline • 1 year ago

Try Cline today 👇

Jonathan Chang • 1 year ago

this is very cool. To solve this same issue, I created a time travel tool that lets the agent partially clear the conversation and summarize it. I think it allows a more flexible way to manage context and makes continuation more seamless.

Dexter • 1 year ago

Just curious: how are you measuring context windows? Simply token count input from the user?

Dan • 1 year ago

cline just keeps consistently delivering. no wonder it's my go-to.

Related Videos

LLMs need focus, because attention isn't enough. (thread on the Focus Chain, Cline's link to persistent context)

Cline

26,570 views • 11 months ago

1/ The Model Context Protocol isn't just another dev tool - it's letting AI assistants break free from chat windows to directly manage your Git repos, run tests, and maintain project memory. Here's why MCP is a game-changer 🧵

Cline

50,058 views • 1 year ago

Question we get constantly: "How does Cline handle context limits in long-running tasks?" Here's how users can manage their context: /newtask: Creates a detailed handoff summary and starts fresh context. Like handing off work to a new engineer with full background. more 👇

Cline

21,739 views • 1 year ago

Devs are using Memory Bank as a project architecture and planning tool before they write any code. "I prefer starting in Cline, using Memory Bank to build out the context files as a roadmap, and letting Cline take the wheel." How to use Memory Bank for project planning 🧵

Cline

96,628 views • 1 year ago

AI coding agents hit a wall when codebases get massive. Even with 2M token context windows, a 10M line codebase needs 100M tokens. The real bottleneck isn't just ingesting code - it's getting models to actually pay attention to all that context effectively.

Garry Tan

976,531 views • 1 year ago

FYI: You can use Cline as an orchestrator to spawn Claude Code subagents that run tasks in parallel. "spawn 3 subagents to investigate cline's repo and have them create a single report on API providers, cline's tools, and how cline uses the browser"

Cline

33,047 views • 1 year ago

Tired of your AI assistant getting amnesia? Our community built something fascinating: 🧠 Memory Bank. It's like giving your AI a perfectly organized brain. Through custom instructions, Cline maintains a living documentation system that rebuilds context after every reset. The secret sauce? .clinerules - a learning journal where Cline captures patterns, preferences, and project intelligence. No more repeating yourself. Watch it remember everything across sessions👇

Cline

36,232 views • 1 year ago

Introducing PAI 1.0: The First Version of Personal Super Intelligence PAI is the world's first context-aware, widget based AI agent that floats over your screen to read your context in real time and proactively suggest next steps, ensuring safety by executing actions only with your explicit approval. The floating AI interface designed to eliminate the prompt inequality and barrier. Structurally eliminating the "navigate to a chat window, write a prompt, explain your context" workflow that existing AI services have treated as a given for years.

Eon (Yulheon) Seung

130,825 views • 1 month ago

Developers: MCP is about to become extremely relevant as AI needs to connect with real-world data. The problem? Most don't know how to build MCP plugins yet. Here's how to build them with Cline in 3 simple steps using our .clinerules protocol. Thread 🧵

Developers: MCP is about to become extremely relevant as AI needs to connect with real-world data. The problem? Most don't know how to build MCP plugins yet. Here's how to build them with Cline in 3 simple steps using our .clinerules protocol. Thread 🧵

Cline

189,177 views • 1 year ago

New short course: LLMs as Operating Systems: Agent Memory, created with Letta, and taught by its founders Charles Packer and Sarah Wooders. An LLM's input context window has limited space. Using a longer input context also costs more and results in slower processing. So, managing what's stored in this context window is important. In the innovative paper MemGPT: Towards LLMs as Operating Systems, its authors (which include the instructors) proposed using an LLM agent to manage this context window. Their system uses a large persistent memory that stores everything that could be included in the input context, and an agent decides what is actually included. Take the example of building a chatbot that needs to remember what's been said earlier in a conversation (perhaps over many days of interaction with a user). As the conversation's length grows, the memory management agent will move information from the input context to a persistent searchable database; summarize information to keep relevant facts in the input context; and restore relevant conversation elements from further back in time. This allows a chatbot to keep what's currently most relevant in its input context memory to generate the next response. When I read the original MemGPT paper, I thought it was an innovative technique for handling memory for LLMs. The open-source Letta framework, which we'll use in this course, makes MemGPT easy to implement. It adds memory to your LLM agents and gives them transparent long-term memory. 
In detail, you’ll learn:
- How to build an agent that can edit its own limited input context memory, using tools and multi-step reasoning
- What is a memory hierarchy (an idea from computer operating systems, which use a cache to speed up memory access), and how these ideas apply to managing the LLM input context (where the input context window is a "cache" storing the most relevant information; and an agent decides what to move in and out of this to/from a larger persistent storage system)
- How to implement multi-agent collaboration by letting different agents share blocks of memory
This course will give you a sophisticated understanding of memory management for LLMs, which is important for chatbots having long conversations, and for complex agentic workflows. Please sign up here!

Andrew Ng

200,788 views • 1 year ago

New in Cline 3.7.0: The .clinerules folder -- a better way to organize your project rules. Here's how you can implement a modular rules system that transforms how Cline understands your projects: 🧵

Cline

67,268 views • 1 year ago

In this episode, Beyang and Thorsten discuss strategies for effective agentic coding, including the 101 of how it's different from coding with chat LLMs, the key constraint of the context window, how and where subagents can help, and the new oracle subagent which combines multiple LLMs. 00:38 Intros 03:20 How coding with agents is very different from coding with prior AI tools 10:31 Example: fix a simple issue 14:13 Example: debugging an issue with an MCP server 21:50 Example: unifying two build scripts 25:09 How the context window is a key constraint 31:01 Why it's best to focus on one thing at a time 33:09 Subagents and context windows 33:49 The codebase search subagent 38:33 General-purpose subagents 44:05 When to use subagents 46:44 The oracle subagent and o3 51:32 Multi-model agents

Amp — Research Preview

24,534 views • 1 year ago

🚨 KEEP CALM AND AI - AUTONOMOUS AGENTS WITH INFINITE MEMORY We are launching the ability to create arbitrary agents that run on schedule and have access to a persistent and infinite memory The agents will be able to store, retrieve, and update information across sessions and perform repetitive tasks that require persistent memory Another step towards AGI and automating white-collar work 🚀🚀🚀

Bindu Reddy

12,626 views • 5 months ago

CHINA JUST DROPPED AN AI CODING MODEL WITH A 1M CONTEXT WINDOW. And I connected it to Claude Code to see what it could actually do. Meet GLM-X Preview On paper, a few things immediately stood out: → 1M context window → Agentic coding capabilities → Works inside Claude Code → Designed for large-scale coding and reasoning workflows But specs don't matter much if the model can't deliver in practice. So I gave it a real-world task. THE TEST One prompt: > Build a modern AI lead generation dashboard using React and Tailwind CSS. Requirements: → Dark mode → Analytics dashboard → Lead table → Email outreach section → Responsive design → Production-ready component structure Instead of generating a few snippets, it planned the architecture, generated the dashboard components, created the Tailwind configuration, and walked through the implementation requirements. What impressed me most wasn't the code itself. It was how well it maintained context throughout the workflow. That's where a 1M context window starts becoming useful. Less time re-explaining requirements. Less context loss. More room for complex projects. The AI coding race is getting very interesting. And it's no longer just GPT, Claude, and Gemini competing for attention. Results from my test below 👇

Md Riyazuddin

31,199 views • 1 month ago

Kanban isn't a project management tool. It's a prompt-to-code pipeline. 20 copy-paste prompts that decompose into linked tasks, fan out parallel agents, and ship committed code. All 20: npm i -g cline 🧵

Cline

34,257 views • 3 months ago

"Cline for Slides" We made a .clinerules file for making slides with Slidev -- now Cline can build your decks. Here's how we turned a YouTube video into a deck: 🧵

Cline

28,327 views • 1 year ago

Claude Code With UNLIMITED Memory! Solves Claude's Memory Problem! It’s called Claude-Mem, and it lets Claude remember your work across sessions. ⚡ Slash token usage by up to 95% every time you start a session. 🔧 Unlock the ability to make 20× more tool calls before hitting limits. My Video:

WorldofAI

28,687 views • 5 months ago