
ollama
@ollama • 159,525 subscribers
https://t.co/1JpLwJ9Bdv
Shorts
Videos

🤯 Wow! In one prompt Qwen3-Coder-Next generated a fully working flappy birds game in HTML. (0:05) Claude Code with Qwen3-Coder-Next (0:26) Shows the game running Run it fully locally: ollama pull qwen3-coder-next Ollama's cloud if you can't run it locally: ollama pull qwen3-coder-next:cloud Try launching it with Claude Code using ollama launch (link to play 🧵) So cool! Qwen Tongyi Lab Junyang Lin
ollama117,339 Aufrufe • vor 3 Monaten

Ollama now supports subagents and web search in Claude Code! Subagents can run tasks in parallel, such as file search, code exploration, and research, each in their own context. No MCP servers to configure or API keys required. Try it with any model on Ollama's cloud: ollama launch claude --model minimax-m2.5:cloud
ollama83,979 Aufrufe • vor 3 Monaten

Ollama v0.8 is here! Now it can stream responses with tool calling! Example of Ollama doing web search:
ollama148,171 Aufrufe • vor 1 Jahr

Ollama 0.2 is here! Concurrency is now enabled by default. This unlocks 2 major features: Parallel requests Ollama can now serve multiple requests at the same time, using only a little bit of additional memory for each request. This enables use cases such as: - Handling multiple chat sessions at the same time - Hosting code completion LLMs for your team - Processing different parts of a document simultaneously - Running multiple agents at the same time Run multiple models Ollama now supports loading different models at the same time. This improves several use cases: - Retrieval Augmented Generation (RAG): both the embedding and text completion models can be loaded into memory simultaneously. - Agents: multiple versions of an agent can now run simultaneously - Running large and small models side-by-side Models are automatically loaded and unloaded based on requests and how much GPU memory is available.
ollama219,337 Aufrufe • vor 1 Jahr

Ollama can now think! 🤔🤔🤔 For thinking models, and especially useful for very thoughtful models like DeepSeek-R1-0528, Ollama can separate the thoughts and the response. Thinking can also be disabled! This is useful for getting a direct response. This works across Ollama's CLI, API, and Python/JavaScript libraries. 🧵 blog post 👇👇👇
ollama106,183 Aufrufe • vor 1 Jahr
Keine weiteren Inhalte verfügbar
