Introducing DeepThought-8B: Transparent reasoning model built on LLaMA-3.1 with test-time compute scaling.
- JSON-structured thought chains & controllable inference paths.
- ~16GB VRAM, competitive w/ 70B models.
- Open model weights, and inference scripts.

Ruliad

1,351 subscribers

219,315 views • 1 year ago • via X (Twitter)

Education, News & Politics, Science & Technology

Related videos

Llama 2: Now on Hugging Chat 🤗🦙 Try out the 70B Chat model for free with super fast inference, web search, and powered by open-source tools! 👉

Hugging Face

403,519 views • 2 years ago

`transformers` + `torchao` quantization + `torch.compile` for faster inference speed and less memory usage 🔥 Demo of "meta-llama/Meta-Llama-3.1-8B-Instruct" quantized in 4-bit weight-only :

Marc Sun

24,515 views • 1 year ago

starting the week with a true groundbreaking work 💥 Large Language Diffusion Models the first billion-parameter scale diffusion model competitive with its pairs (8B model comparable to LLaMA 3 8B) it gets rid of the michael scott syndrome on existing LLMs

apolinario 🌐

11,833 views • 1 year ago

First came pre-training scaling; then came inference-time scaling. Now comes judge-time scaling. Despite progress in AI through scaled inference-time compute, AI remains unreliable in open-ended, non-verifiable domains. The key limitation is not generation—it is evaluation. Therefore, the next big leap for AI comes from better judging. In service of this future, today we release Verdict, a library for scaling judge-time compute.

Leonard Tang

111,298 views • 1 year ago

Introducing 𝗦𝘂𝗽𝗲𝗿 𝗝𝗦𝗢𝗡 𝗠𝗼𝗱𝗲, a framework for low latency structured output generation from LLMs. Generate JSON up to 𝟮𝟬𝘅 𝗳𝗮𝘀𝘁𝗲𝗿 from OpenAI and open source models. ❌ No need to threaten the model, tip the AI, etc ❌ Built with Alex Derhacobian 🔧 🧵👇

Varun Shenoy

166,119 views • 2 years ago

Introducing ✨ Aya Vision ✨ - an open-weights model to connect our world through language and vision Aya Vision adds breakthrough multimodal capabilities to our state-of-the-art multilingual 8B and 32B models. 🌿

Cohere Labs

206,502 views • 1 year ago

You can now run inference directly on the Llama 4 Hugging Face model page – powered by Together AI!

Together AI

21,489 views • 1 year ago

Laika AI x Inference Labs Excited to announce our partnership with Inference Labs We're providing our real-time RAG & AI model API to Inference Labs, powering their verification infrastructure with live blockchain data. Inference Labs delivers open-source, trustless verification for AI agent outputs, so you can trust what you see—without relying on centralized gatekeepers.

Laika AI

13,727 views • 1 year ago

NVIDIA Nemotron 3 Nano Omni, a new multimodal reasoning model, is now live on Jetson AI Lab and unifies vision, audio, and language into a single reasoning loop. 🙌 Power your NemoClaws by running this model with Ollama, vLLM and other inference frameworks on NVIDIA Jetson hardware. Try it ➡️

NVIDIA Robotics

15,828 views • 1 month ago

The easiest way to use this new model is through HuggingChat with the link below. Just create a free account and select the model “nvidia/Llama-3.1-Nemotron-70B-Instruct-HF”. And you're ready to start chatting!

Paul Couvert

81,620 views • 1 year ago

Llama 3.1 Nemotron 70B is the latest model from NVIDIA, released only a few hours ago. Initial testing shows the model outperforms GPT-4o and Sonnet 3.5 on several benchmarks. Try it on Akash Chat for free:

Akash Network

38,472 views • 1 year ago

Introducing Alpamayo 1.5. Based on community feedback, we’ve updated our 10B-parameter chain-of-thought reasoning VLA model to be a more interactive and steerable engine for autonomous vehicle development. Built on the Cosmos-Reason2 VLM backbone, this release adds support for navigation guidance and flexible camera configurations while providing new post-training scripts for model adaptation. 🤗 Learn more:

NVIDIA DRIVE

50,693 views • 2 months ago

MolmoAct2 is landing in LeRobot! Ai2's open Action Reasoning Model combines a Molmo2-ER vision-language backbone with a flow-matching continuous action expert to predict robot action chunks from images, language instructions, and proprioceptive state. An open robot foundation model built for real-world control, with strong out-of-the-box performance and easy fine-tuning in LeRobot. Pick-and-place inference running on NVIDIA DGX Spark! Blog: Paper: Thanks to Ai2 Jiafei Duan Haoquan Fang

LeRobot

24,456 views • 22 days ago

PRIMA.CPP Speeding Up 70B-Scale LLM Inference on Low-Resource Everyday Home Clusters

AK

48,241 views • 1 year ago

🤖 ExoBrain is a compute device built for large-scale AI collaboration, supporting model training, inference, and long-running Agents With a standardized architecture and continuous execution capability, it delivers efficient, reliable compute for AI systems ⚡ From one-time calls to continuous operation, ExoBrain enables AI to execute persistently, produce ongoing outputs, and generate long-term value #ExoBrain #AICompute #AIAgents #AGI

ExoBrain

21,529 views • 3 months ago

We just shipped support for tool use and JSON mode for DeepSeek R1 Distil-Llama 70B on Groq Inc. 🛠️ The coolest part is that our API now has a `reasoning_format` parameter that lets you control how the model outputs its thought process.

Hatice Ozen

43,225 views • 1 year ago

Llama 3.3 70B is live on AkashChat. The latest state-of-the-art multilingual AI model released by Meta is as performant as Llama 3.1 405B at a fraction of the size. Try it today:

Akash Network

16,463 views • 1 year ago

🔥🔥🔥We’ve been listening to your feedback! Our latest world model HY-World 1.5 just got a major upgrade to make world generation more accessible than ever: 🛠️ Open Training Code: Fully customizable code for building and training your own models. ⚡ Accelerated Inference: Turbocharged speed and optimized VRAM for real-time interaction. 📉 Lite 5B Model: A new lightweight model that fits into small-VRAM GPUs. 🙌 Zero Waitlist: Our online app is now fully open to everyone—no application required. This is just the beginning. HY-World is building the future of spatial intelligence—open, accessible, and community-driven. 🕹️ Play now: ⭐ GitHub:

Tencent Hy

20,581 views • 5 months ago

Enterprise wants privacy. Builders want flexibility. Users want speed. Nesa gives all three in one place. Encrypted inference. Low hardware requirements. Instant model routing. Test the new models in an environment that protects you. ⬟

Nesa

52,336 views • 6 months ago

First steps for a specialized DeepSeek v4 Flash inference engine focused on inference quality / stability at different quantizations, with networked API that is batching capable. This is the 2 bit quants model running on my M3 Max 128GB.

antirez

14,159 views • 1 month ago