
ollama
@ollama • 159,525 subscribers
https://t.co/1JpLwJ9Bdv
Shorts
Videos

Ollama 0.17 makes it much simpler to use open models with OpenClaw🦞 Try it with: ollama launch openclaw Tutorial post in 🧵
ollama201,293 просмотров • 3 месяцев назад

🤯 Wow! In one prompt Qwen3-Coder-Next generated a fully working flappy birds game in HTML. (0:05) Claude Code with Qwen3-Coder-Next (0:26) Shows the game running Run it fully locally: ollama pull qwen3-coder-next Ollama's cloud if you can't run it locally: ollama pull qwen3-coder-next:cloud Try launching it with Claude Code using ollama launch (link to play 🧵) So cool! Qwen Tongyi Lab Junyang Lin
ollama117,339 просмотров • 3 месяцев назад

Ollama now supports subagents and web search in Claude Code! Subagents can run tasks in parallel, such as file search, code exploration, and research, each in their own context. No MCP servers to configure or API keys required. Try it with any model on Ollama's cloud: ollama launch claude --model minimax-m2.5:cloud
ollama83,979 просмотров • 3 месяцев назад

Ollama v0.8 is here! Now it can stream responses with tool calling! Example of Ollama doing web search:
ollama148,171 просмотров • 1 год назад

Ollama 0.2 is here! Concurrency is now enabled by default. This unlocks 2 major features: Parallel requests Ollama can now serve multiple requests at the same time, using only a little bit of additional memory for each request. This enables use cases such as: - Handling multiple chat sessions at the same time - Hosting code completion LLMs for your team - Processing different parts of a document simultaneously - Running multiple agents at the same time Run multiple models Ollama now supports loading different models at the same time. This improves several use cases: - Retrieval Augmented Generation (RAG): both the embedding and text completion models can be loaded into memory simultaneously. - Agents: multiple versions of an agent can now run simultaneously - Running large and small models side-by-side Models are automatically loaded and unloaded based on requests and how much GPU memory is available.
ollama219,337 просмотров • 1 год назад

Ollama can now think! 🤔🤔🤔 For thinking models, and especially useful for very thoughtful models like DeepSeek-R1-0528, Ollama can separate the thoughts and the response. Thinking can also be disabled! This is useful for getting a direct response. This works across Ollama's CLI, API, and Python/JavaScript libraries. 🧵 blog post 👇👇👇
ollama106,183 просмотров • 1 год назад
Больше нет контента для загрузки
