🔥 Another win for open source AI: SNOWFLAKE ARCTIC. 1] Mixture of Experts model with 128 (!) experts 2] open weights, open code 3] the company is sharing a "cookbook" and data, and teaching people how to build their own world-class MoE models 4] very cheap model to train (16x...

Wes Roth

35,717 subscribers

10,233 views • 2 years ago • via X (Twitter)

1 comment

Saquib Mehmood • 2 years ago

Well, I am excited to hear this. Let's see how it works out.

Related videos

LETS GOO! Generate full songs (4 min) with vocals in less than 10 seconds - open weights model! 🔥 It's crazy how much you can achieve from open models as of today! - going right for Suno and likes! VAE + Base model combined is < 2.5GB Open weights on the hub and a space to play around with it! 🤯

Vaibhav (VB) Srivastav

63,641 views • 1 year ago

this is why open-source wins every time.. open weights allow you to invent new model capabilities, train + generate locally (crucial for NDA'd and personal data), push the limits of what's currently possible + lean into a huge community of experts for advice and inspiration 🖤

Ingi Erlingsson 🪄

16,819 views • 1 month ago

Introducing 0GM-1.0-35B-A3B. Our first proprietary AI model. Mixture of Experts (MoE), 35B parameters, 3B active per token. Trained on our own decentralized GPU network. Open source under Apache 2.0.

0G Labs (Home of Infinite AI)

48,649 views • 1 month ago

Introducing "Building with Llama 4." This short course is created with Meta AI at Meta, and taught by Amit Sangani, Director of Partner Engineering for Meta’s AI team. Meta’s new Llama 4 has added three new models and introduced the Mixture-of-Experts (MoE) architecture to its family of open-weight models, making them more efficient to serve. In this course, you’ll work with two of the three new models introduced in Llama 4. First is Maverick, a 400B parameter model, with 128 experts and 17B active parameters. Second is Scout, a 109B parameter model with 16 experts and 17B active parameters. Maverick and Scout support long context windows of up to a million tokens and 10M tokens, respectively. The latter is enough to support directly inputting even fairly large GitHub repos for analysis! In hands-on lessons, you’ll build apps using Llama 4’s new multimodal capabilities including reasoning across multiple images and image grounding, in which you can identify elements in images. You’ll also use the official Llama API, work with Llama 4’s long-context abilities, and learn about Llama’s newest open-source tools: its prompt optimization tool that automatically improves system prompts and synthetic data kit that generates high-quality datasets for fine-tuning. If you need an open model, Llama is a great option, and the Llama 4 family is an important part of any GenAI developer's toolkit. Through this course, you’ll learn to call Llama 4 via API, use its optimization tools, and build features that span text, images, and large context. Please sign up here:

Andrew Ng

67,587 views • 1 year ago

🆕 How to run (and finetune) open source AI models with a simple API! In 5 mins, I go over how to: ◆ Generate text with DeepSeek R1 & Llama 3 ◆ Generate code with Qwen on LlamaCoder ◆ Generate images with Flux on BlinkShot ◆ Finetune a model on your own data & run it

Hassan

30,236 views • 1 year ago

Exciting News from Open-Sora! 🚀 They've just made the ENTIRE suite of their video-generation model open source! Dive into the world of cutting-edge AI with access to model weights, comprehensive training source code, and detailed architecture insights. Start building your dream video-generation model today! Check it out 👉

Yang You

245,732 views • 2 years ago

Jensen Huang open-sourced NVIDIA's flagship AI model, its weights, its data, AND how they created it. "We open sourced the models," Huang says. "We open sourced the weights." "We open sourced the data." "We open sourced how we created it." Four layers of openness in one model release. "Open source is fundamentally necessary for many industries to join the AI revolution." "NVIDIA has the scale and the motivation to build and continue to build these AI models for as long as we shall live." The chip seller benefits when every model is open. NVIDIA is the chip seller. P.S. I made a playbook breaking down 100+ most powerful decision making mental models used by history's greatest thinkers. 5,000+ downloads. 113 five-star reviews. Grab a free copy here: If you're new here, follow GeniusThinking for content on the greatest minds in economics, psychology, and history. — Jensen Huang ( NVIDIA ), NVIDIA CEO, on Lex Fridman's ( Lex Fridman ) podcast

GeniusThinking

14,769 views • 17 days ago

1. Meta’s open-sourced multisensory model: Meta is back (again!) with yet another exciting open-source project. Introducing ImageBind, a new AI research model that understands and combines text, audio, visual, movement, thermal, AND depth data.

Rowan Cheung

173,984 views • 3 years ago

1/ Introducing Lynx - the leading hallucination detection model 🚀👀 - Beats GPT-4o on hallucination tasks - Open source, open weights, open data - Excels in real-world domains like medicine and finance We are excited to launch Lynx with Day 1 integration partners: NVIDIA, MongoDB, and Nomic 🔥

PatronusAI

81,446 views • 1 year ago

New short course: Prompt Engineering with Llama 2, built in collaboration with Meta AI at Meta, and taught by Amit Sangani! Meta's Llama 2 has been game-changing for AI. Building with open source lets you control your own data, scrutinize errors, update (or not) the models as you please, and work alongside the global community advancing open models. Llama isn't a single model, it's a collection of models. In this course, you'll: - Learn the differences between different Llama 2 flavors, and when to use each. - Prompt the Llama chat models -- you'll also see how Llama's instruction tags work -- so they can help you with day-to-day tasks, like writing or summarization. - Use advanced prompting, like few-shot prompting for classification, and chain-of-thought prompting for solving logic problems. - Use specialized models in the Llama collection for specific tasks, like Code Llama to help you write, analyze, and improve code, and Llama Guard, which checks prompts and model responses for harmful content. The course also touches on how to run Llama 2 locally on your own computer. I hope you’ll take this course and try out these powerful, open models!

Andrew Ng

162,798 views • 2 years ago

Misha Laskin and Reflection have raised $2 billion to build America's next top open source AI model. This week Misha comes on the pod to talk DeepSeek, open weights and AI freedom for all. Thanks, as always, to Brex and @e1ventures for backing the Core Memory pod.

Ashlee Vance

14,471 views • 7 months ago

ELON: CHINA HAS THE BEST OPEN SOURCE AI MODELS RIGHT NOW “The best open source models are generally from China, which is bizarre. I think the second best one, or maybe it’s better than second best, is Grok 2.5. The open source model is actually very good, and we will continue to open source our models.” Source: The All-In Podcast

Mario Nawfal

1,163,582 views • 7 months ago

The future of AI is open-source. And Ollama is the easiest way to build AI applications with open-source LLMs. Here's how to build a free, private RAG app using open-source tools. We'll use: - Ollama for LLMs and embedding models - PostgreSQL for data storage and retrieval - pgai Vectorizer for embedding creation and sync (I use Nomic for embeddings and tinyllama as my LLM, but you can substitute any models on Ollama)

Avthar

34,261 views • 1 year ago

1/ Introducing Glider - the smallest model to beat GPT-4o-mini on eval tasks ⚡🚀 - Open source, open weights, open code - Explainable evaluations by nature - Trained on 183 criteria and 685 domains Try it out for free at 🔥

PatronusAI

14,842 views • 1 year ago

“don’t train your own model” is common ai advice. it's wrong. your token bill's the proof. today, we’re excited to launch castform into open preview. castform is the easiest way for you to train your own model, on your own data. open-weights models are performant and much cheaper. when trained on your task & proprietary data, they beat closed models. the thing standing between you and that was weeks of plumbing & years of ml expertise. with castform, model training is as simple as prompt engineering. bring your agent traces or raw corpora. castform turns them into training data, picks the right algorithmic recipes, manages gpus, and gives you an ide to watch and chat with your model as it learns. see what you can build with castform👇

girish

447,223 views • 12 days ago

OpenAI's upcoming open model is a mixture-of-experts model. They take their proprietary training process and train a model with an architecture that only contains publicly well-known model architecture layers and similar optimizations. This allows OpenAI to keep their internal model architecture a secret while they open-weight an SOTA model. Thus, none of their long-context optimizations needed for actual real-world production serving are in the open model architecture. OpenAI has created multiple different sizes of open-weights distilled o3, including a 120B-param model that fits on a single node and a 20B-param model that fits on a single chip.

SemiAnalysis

80,891 views • 10 months ago

“We have a crisis of open source models in the Western world. Outside of China, there are no good open source models. We don't even have any in the US now. The talent, the capital and the focus to be best-in-class at pre-training, mid-training, post-training is an extremely scarce skillset. The answer for a lot of countries may just be to take an open weights model from China, post-train or fine-tune their own version, and have that be what they start from.” (Everett Randle) Why does the West have such poor open source models, and is it a national security threat to rely on open-source Chinese models? Demis Hassabis, Yann LeCun, Aidan Gomez, Lilian Weng, Jan Leike

Harry Stebbings

61,925 views • 4 days ago

NVIDIA just released a new open source transcription model, Nemotron Speech ASR, designed from the ground up for low-latency use cases like voice agents. Here's a voice agent built with this new model. 24ms transcription finalization and total voice-to-voice inference time under 500ms. This agent actually uses *three* NVIDIA open source models: - Nemotron Speech ASR - Nemotron 3 Nano 30GB in a 4-bit quant (released in December) - A preview checkpoint of the upcoming Magpie text-to-speech model These models are all truly open source: weights, training data, training code, and inference code. This is a big deal! Jensen said in the CES keynote yesterday that he expects open source models to catch up to proprietary models this year in a number of categories. NVIDIA is putting their weight behind making this happen. (As Alan Kay said, the best way to predict the future is to invent it.) The code for this agent is open source too, of course. You can deploy it to production with Modal and Pipecat AI cloud, or run locally on an NVIDIA DGX Spark or RTX 5090.

kwindla

274,298 views • 5 months ago

New from Meta FAIR: Code World Model (CWM), a 32B-parameter research model designed to explore how world models can transform code generation and reasoning about code. We believe in advancing research in world modeling and are sharing CWM under a research license to help empower the community to build upon our work. ➡️ Read the technical report: ➡️Download the open weights: ➡️Download the code:

AI at Meta

313,236 views • 9 months ago

Introducing Open TTS Tracker! 🗣️ *sound on* A one-stop shop to track all open access/source TTS models! Ranging from XTTS to Pheme, OpenVoice to VITS, and more... ⚡ For each model, we compile: 1. Source code 2. Checkpoints 3. License 4. Fine-tuning code 5. Languages supported 6. Paper 7. Demo Help us make it more complete! Let's make 2024 the year of open TTS models! ❤️

Vaibhav (VB) Srivastav

68,037 views • 2 years ago