正在加载视频...

视频加载失败

加载此视频时出现问题。这可能是由于临时网络问题，或视频可能不可用。

Starting today, open source is leading the way. Introducing Llama 3.1: Our most capable models yet. Today we’re releasing a collection of new Llama 3.1 models including our long awaited 405B. These models deliver improved reasoning capabilities, a larger 128K token context window and improved support for 8 languages... among other improvements. Llama 3.1 405B rivals leading closed source models on state-of-the-art capabilities across a range of tasks in general knowledge, steerability, math, tool use and multilingual translation. The models are available to download now directly from Meta or Hugging Face. With today’s release the ecosystem is also ready to go with 25+ partners rolling out our latest models — including Amazon Web Services, NVIDIA, Databricks, Groq Inc, Dell Technologies, Microsoft Azure and Google Cloud ready on day one. More details in the full announcement ➡️ Download Llama 3.1 models ➡️ With these releases we’re setting the stage for unprecedented new opportunities and we can’t wait to see the innovation our newest models will unlock across all levels of the AI community.show more

AI at Meta

824,487 subscribers

1,268,811 次观看 • 2 年前 •via X (Twitter)

教育科学技术

Anya Rossi• Live Now

Private livecam show

8 条评论

AI at Meta 的头像

AI at Meta2 年前

Training a model as large and capable as Llama 3.1 405B was no simple task. The model was trained on over 15 trillion tokens over the course of several months requiring over 16K @NVIDIA H100 GPUs — making it the first Llama model ever trained at this scale. We also used the 405B parameter model to improve the post-training quality of our smaller models.

AI at Meta 的头像

AI at Meta2 年前

With Llama 3.1, we evaluated performance on >150 benchmark datasets spanning a wide range of languages — in addition to extensive human evaluations in real-world scenarios. These results show that the 405B competes with leading closed source models like GPT-4, Claude 2 and Gemini Ultra across a range of tasks. Our upgraded Llama 3.1 8B & 70B models are also best-in-class, outperforming other models at their size while also delivering a better balance of helpfulness and safety than their predecessors. These smaller models support the same improved 128K token context window, multilinguality, improved reasoning and state-of-the-art tool use to enable more advanced use cases.

AI at Meta 的头像

AI at Meta2 年前

We’ve also updated our license to allow developers to use the outputs from Llama models — including 405B — to improve other models for the first time. We’re excited about how this will enable new advancements in the field through synthetic data generation and model distillation workflows, capabilities that have never been achieved at this scale in open source.

AI at Meta 的头像

AI at Meta2 年前

As Mark Zuckerberg shared in an open letter this morning: we believe that open source will ensure that more people around the world have access to the benefits and opportunities of AI, that power isn't concentrated in the hands of a small few, and that the technology can be deployed more evenly and safely across society. That’s why we continue to take steps on the path for open source AI to become the industry standard. Read the letter ⬇️

Vaibhav (VB) Srivastav 的头像

Vaibhav (VB) Srivastav2 年前

Congratulations on the release @AIatMeta! Thanks for your unwavering support for Open Source 🤗 I put down some notes from the release below!

AI at Meta 的头像

AI at Meta2 年前

Open source AI is the path forward. ❤️

Luis Ceze 的头像

Luis Ceze2 年前

Fantastic to partner with Meta on this! Thank you Meta! And big thank you to the incredible team at OctoAI putting the models on the platform at launch! 🚀🙏🐙

Prime Intellect 的头像

Prime Intellect2 年前

Awesome research and progress towards open source AGI!!

相关视频

You can now try Llama 3.1 405B for free (link below)! This is the largest open-source model out there, and for the first time, an open model is competitive with closed models. This time around, Meta did something new: Llama 3.1 has a license that allows developers to use it to enhance other models. For the first time, you can distill Llama 3.1 405B's capabilities into a smaller, more practical model for your use case. First, here is the link where you can play with Llama 3.1 for free: The model is hosted in Tune Studio, an end-to-end platform for developing applications using Large Language Models. They are sponsoring this post. Take a look at the attached video. It will show you how you can fine-tune a simple model using Llama 3.1 without leaving the platform: 1. You can create an empty dataset 2. Use the playground to generate and record interactions with Llama 3.1 3. Modify the dataset directly using the playground 4. Export the data and fine-tune a smaller model Fast and easy! As long as you have a web browser, you can start experimenting with fine-tuning and Llama 3.1. That's all it takes!

You can now try Llama 3.1 405B for free (link below)! This is the largest open-source model out there, and for the first time, an open model is competitive with closed models. This time around, Meta did something new: Llama 3.1 has a license that allows developers to use it to enhance other models. For the first time, you can distill Llama 3.1 405B's capabilities into a smaller, more practical model for your use case. First, here is the link where you can play with Llama 3.1 for free: The model is hosted in Tune Studio, an end-to-end platform for developing applications using Large Language Models. They are sponsoring this post. Take a look at the attached video. It will show you how you can fine-tune a simple model using Llama 3.1 without leaving the platform: 1. You can create an empty dataset 2. Use the playground to generate and record interactions with Llama 3.1 3. Modify the dataset directly using the playground 4. Export the data and fine-tune a smaller model Fast and easy! As long as you have a web browser, you can start experimenting with fine-tuning and Llama 3.1. That's all it takes!

Santiago

55,609 次观看 • 1 年前

This is the fastest I've seen Llama 3.3 running anywhere! Llama 3.3 70B running at 652 t/s is lightning fast. And if you want Llama 3.1, here are the speeds I was able to get: • Llama 3.1 8B: 1006 t/s • Llama 3.1 70B: 709 t/s • Llama 3.1 405B: 206 t/s (You can access all of these models for free! See the link below.) This speed is incredible, but the interesting part is what's happening behind the scenes: These models aren't running on a GPU! In this video, I'm using the SambaNova cloud to access these models. They built a custom chip (SN40L) optimized for AI workflows. A single SN40L chip can hold hundreds of models (trillions of parameters) in memory! The speed alone is a huge deal, but the big advantage is for agentic workflows running multiple specialized models. A GPU can only host a single model and switch (unload and load) to a different model if necessary. An SN40L, on the other hand, can host every model at once, making it much faster. Here is the video where you can see how fast these chips are:

This is the fastest I've seen Llama 3.3 running anywhere! Llama 3.3 70B running at 652 t/s is lightning fast. And if you want Llama 3.1, here are the speeds I was able to get: • Llama 3.1 8B: 1006 t/s • Llama 3.1 70B: 709 t/s • Llama 3.1 405B: 206 t/s (You can access all of these models for free! See the link below.) This speed is incredible, but the interesting part is what's happening behind the scenes: These models aren't running on a GPU! In this video, I'm using the SambaNova cloud to access these models. They built a custom chip (SN40L) optimized for AI workflows. A single SN40L chip can hold hundreds of models (trillions of parameters) in memory! The speed alone is a huge deal, but the big advantage is for agentic workflows running multiple specialized models. A GPU can only host a single model and switch (unload and load) to a different model if necessary. An SN40L, on the other hand, can host every model at once, making it much faster. Here is the video where you can see how fast these chips are:

Santiago

97,505 次观看 • 1 年前

"Introducing Multimodal Llama 3.2": As promised two weeks ago, here's the short course on Meta's latest open model! This short course is created with Meta and taught by Amit Sangani, Director of AI Partner Engineering at Meta. Meta’s Llama family of models is leading the way in open models, allowing anyone to download, customize, fine-tune, or build new applications on top of them. Learn about the vision capabilities of the Llama 3.2, and use it for image classification, prompting, tokenization, tool-calling. You'll also learn about the open-source Llama stack, which gives building blocks for many different stages of the LLM application life cycle. In detail, you’ll: - Learn what are the features of Meta's four newest models, and when to use which Llama model. - Learn best practices for multimodal prompting, with applications to advanced image reasoning, illustrated by many examples: Understanding errors on a car dashboard, adding up the total of photographed restaurant receipts, grading written math homework. - Use different roles—system, user, assistant, ipython—in the Llama 3.1 and 3.2 models and the prompt format that identifies those roles. - Understand how Llama uses the tiktoken tokenizer, and how it has expanded to a 128k vocabulary size that improves encoding efficiency and multilingual support. - Learn how to prompt Llama to call built-in and custom tools (functions) with examples for web search and solving math equations. - Learn about Llama Stack, a standardized interface for common toolchain components like fine-tuning or synthetic data generation, useful for building agentic applications. By the end of this course, you’ll be equipped to build out new applications with the new Llama 3.2. Thank you to Ahmad Al-Dahle, Amit Sangani, and the whole AI at Meta team AI at Meta for all the hard work on Llama 3.2 — we’re excited to make these open models even more accessible to more developers with this new course! Please sign up here!

"Introducing Multimodal Llama 3.2": As promised two weeks ago, here's the short course on Meta's latest open model! This short course is created with Meta and taught by Amit Sangani, Director of AI Partner Engineering at Meta. Meta’s Llama family of models is leading the way in open models, allowing anyone to download, customize, fine-tune, or build new applications on top of them. Learn about the vision capabilities of the Llama 3.2, and use it for image classification, prompting, tokenization, tool-calling. You'll also learn about the open-source Llama stack, which gives building blocks for many different stages of the LLM application life cycle. In detail, you’ll: - Learn what are the features of Meta's four newest models, and when to use which Llama model. - Learn best practices for multimodal prompting, with applications to advanced image reasoning, illustrated by many examples: Understanding errors on a car dashboard, adding up the total of photographed restaurant receipts, grading written math homework. - Use different roles—system, user, assistant, ipython—in the Llama 3.1 and 3.2 models and the prompt format that identifies those roles. - Understand how Llama uses the tiktoken tokenizer, and how it has expanded to a 128k vocabulary size that improves encoding efficiency and multilingual support. - Learn how to prompt Llama to call built-in and custom tools (functions) with examples for web search and solving math equations. - Learn about Llama Stack, a standardized interface for common toolchain components like fine-tuning or synthetic data generation, useful for building agentic applications. By the end of this course, you’ll be equipped to build out new applications with the new Llama 3.2. Thank you to Ahmad Al-Dahle, Amit Sangani, and the whole AI at Meta team AI at Meta for all the hard work on Llama 3.2 — we’re excited to make these open models even more accessible to more developers with this new course! Please sign up here!

Andrew Ng

131,755 次观看 • 1 年前

New short course: Prompt Engineering with Llama 2, built in collaboration with Meta AI at Meta, and taught by Amit Sangani! Meta's Llama 2 has been game-changing for AI. Building with open source lets you control your own data, scrutinize errors, update (or not) the models as you please, and work alongside the global community advancing open models. Llama isn't a single model, it's a collection of models. In this course, you'll: - Learn the differences between different Llama 2 flavors, and when to use each. - Prompt the Llama chat models -- you'll also see how Llama's instruction tags work -- so they can help you with day-to-day tasks, like writing or summarization. - Use advanced prompting, like few-shot prompting for classification, and chain-of-thought prompting for solving logic problems. - Use specialized models in the Llama collection for specific tasks, like Code Llama to help you write, analyze, and improve code, and Llama Guard, which checks prompts and model responses for harmful content. The course also touches on how to run Llama 2 locally on your own computer. I hope you’ll take this course and try out these powerful, open models!

New short course: Prompt Engineering with Llama 2, built in collaboration with Meta AI at Meta, and taught by Amit Sangani! Meta's Llama 2 has been game-changing for AI. Building with open source lets you control your own data, scrutinize errors, update (or not) the models as you please, and work alongside the global community advancing open models. Llama isn't a single model, it's a collection of models. In this course, you'll: - Learn the differences between different Llama 2 flavors, and when to use each. - Prompt the Llama chat models -- you'll also see how Llama's instruction tags work -- so they can help you with day-to-day tasks, like writing or summarization. - Use advanced prompting, like few-shot prompting for classification, and chain-of-thought prompting for solving logic problems. - Use specialized models in the Llama collection for specific tasks, like Code Llama to help you write, analyze, and improve code, and Llama Guard, which checks prompts and model responses for harmful content. The course also touches on how to run Llama 2 locally on your own computer. I hope you’ll take this course and try out these powerful, open models!

Andrew Ng

162,833 次观看 • 2 年前

DeepSeek R1 671B is ridiculously good, but running at 158 tokens per second is something out of this world. A model *this good* running *this fast* is 3 years ahead of this time. The secret: DeepSeek in this short video isn't running on GPUs. It's running on custom-made chips! This is running on SambaNova and their SN40L chips, which are designed and optimized for running AI workflows. SambaNova is sponsoring my work. We designed GPUs to run games and later realized we could use them for AI. This is very different from creating a chip specifically for AI from the ground up. Just think about this for a second: One single SNL40 chip can simultaneously hold 100+ models (trillions of parameters) in memory! An agentic workflow that uses multiple models simultaneously won't need to swap models from memory even once! By the way, the Llama family of models *screams* on the SambaNova Cloud. Here are some numbers from my latest tests: • Llama 3.3 70B: 367.32 t/s • Llama 3.2 1B: 2381.82 t/s • Llama 3.2 3B: 1335.71 t/s • Llama 3.1 8B: 929.58 t/s • Llama 3.1 70B: 369.16 t/s • Llama 3.1 405B: 103.37 t/s You can access all of these models for free!

DeepSeek R1 671B is ridiculously good, but running at 158 tokens per second is something out of this world. A model this good running this fast is 3 years ahead of this time. The secret: DeepSeek in this short video isn't running on GPUs. It's running on custom-made chips! This is running on SambaNova and their SN40L chips, which are designed and optimized for running AI workflows. SambaNova is sponsoring my work. We designed GPUs to run games and later realized we could use them for AI. This is very different from creating a chip specifically for AI from the ground up. Just think about this for a second: One single SNL40 chip can simultaneously hold 100+ models (trillions of parameters) in memory! An agentic workflow that uses multiple models simultaneously won't need to swap models from memory even once! By the way, the Llama family of models screams on the SambaNova Cloud. Here are some numbers from my latest tests: • Llama 3.3 70B: 367.32 t/s • Llama 3.2 1B: 2381.82 t/s • Llama 3.2 3B: 1335.71 t/s • Llama 3.1 8B: 929.58 t/s • Llama 3.1 70B: 369.16 t/s • Llama 3.1 405B: 103.37 t/s You can access all of these models for free!

Santiago

130,934 次观看 • 1 年前

Exciting update for AI developers! The Hugging Face Hub is now more natively integrated into Google Cloud Vertex AI Model Garden. Search through thousands of open Generative AI models from Hugging Face models & deploy them with one click to Vertex AI or GKE. 🤯 What's new: 🔎 Browse and search thousands of Hugging Face models directly within the Vertex AI Model Garden and filter based on what is currently trending in the community. 🚀 Accelerate your AI projects by leveraging readily available one-click deploy your model to Vertex AI or Google Kubernetes Engine (GKE). ⭐️ Featuring popular open models from BlackForestLabsAI - Unofficial FLUX.1, AI at Meta Llama 3.1, Mistral AI, Google DeepMind Gemma, and countless others. Get Started:

Exciting update for AI developers! The Hugging Face Hub is now more natively integrated into Google Cloud Vertex AI Model Garden. Search through thousands of open Generative AI models from Hugging Face models & deploy them with one click to Vertex AI or GKE. 🤯 What's new: 🔎 Browse and search thousands of Hugging Face models directly within the Vertex AI Model Garden and filter based on what is currently trending in the community. 🚀 Accelerate your AI projects by leveraging readily available one-click deploy your model to Vertex AI or Google Kubernetes Engine (GKE). ⭐️ Featuring popular open models from BlackForestLabsAI - Unofficial FLUX.1, AI at Meta Llama 3.1, Mistral AI, Google DeepMind Gemma, and countless others. Get Started:

Philipp Schmid

34,754 次观看 • 1 年前

Our Llama-3.1-Nemotron-70B-Instruct model is a leading model on the 🏆 Arena Hard benchmark (85) from Arena. Arena Hard uses a data pipeline to build high-quality benchmarks from live data in Chatbot Arena, and is known for its predictive ability of Chatbot Arena Elo score as well as separability between helpful and less helpful models. Use our customized model Llama-3.1-Nemotron-70B to improve the helpfulness of LLM generated responses in your applications. 📥 Try on our API catalog: 📥 On GitHub: 📥 Or on Hugging Face:

Our Llama-3.1-Nemotron-70B-Instruct model is a leading model on the 🏆 Arena Hard benchmark (85) from Arena. Arena Hard uses a data pipeline to build high-quality benchmarks from live data in Chatbot Arena, and is known for its predictive ability of Chatbot Arena Elo score as well as separability between helpful and less helpful models. Use our customized model Llama-3.1-Nemotron-70B to improve the helpfulness of LLM generated responses in your applications. 📥 Try on our API catalog: 📥 On GitHub: 📥 Or on Hugging Face:

NVIDIA AI Developer

140,756 次观看 • 1 年前

As we wrap up 2024, we're sharing an update on our progress with Llama and the impact it’s having around the world. Read the full update here ➡️ A few highlights from 2024 📈 Llama has been downloaded over 650M times, doubling in just three months. 🌏 License approvals for Llama have more than doubled globally, with significant growth in emerging markets. 🤗 There are now over 85,000 Llama derivative models on Hugging Face alone — a 5x increase from the start of the year. ❤️ We’re continuing to see Llama being used across the industry with new examples and innovation from Block, Accenture and LinkedIn — among others. Open source AI is shaping the future. As we look to 2025, the pace of innovation will only increase as we work to make Llama the industry standard for building on AI.

As we wrap up 2024, we're sharing an update on our progress with Llama and the impact it’s having around the world. Read the full update here ➡️ A few highlights from 2024 📈 Llama has been downloaded over 650M times, doubling in just three months. 🌏 License approvals for Llama have more than doubled globally, with significant growth in emerging markets. 🤗 There are now over 85,000 Llama derivative models on Hugging Face alone — a 5x increase from the start of the year. ❤️ We’re continuing to see Llama being used across the industry with new examples and innovation from Block, Accenture and LinkedIn — among others. Open source AI is shaping the future. As we look to 2025, the pace of innovation will only increase as we work to make Llama the industry standard for building on AI.

AI at Meta

109,224 次观看 • 1 年前

🚨 Big news for AI innovation: Claude Opus 4 and Claude Sonnet 4, Anthropic's most advanced models, are now available in Amazon Bedrock. These powerful models offer hybrid reasoning, 200K token context windows, and are designed for AI agents. From financial analysis to high-quality writing, to enhanced reasoning, coding, agentic capabilities and more—all with the enterprise-grade security of Amazon Web Services.

🚨 Big news for AI innovation: Claude Opus 4 and Claude Sonnet 4, Anthropic's most advanced models, are now available in Amazon Bedrock. These powerful models offer hybrid reasoning, 200K token context windows, and are designed for AI agents. From financial analysis to high-quality writing, to enhanced reasoning, coding, agentic capabilities and more—all with the enterprise-grade security of Amazon Web Services.

Amazon

123,030 次观看 • 1 年前

Today is a good day for open science. As part of our continued commitment to the growth and development of an open ecosystem, today at Meta FAIR we’re announcing four new publicly available AI models and additional research artifacts to inspire innovation in the community and help advance AI in a responsible way. More in the video from Joelle Pineau. What we’re releasing: 🦎 Meta Chameleon 7B & 34B language models that support mixed-modal input and text-only outputs. 🪙 Meta Multi-Token Prediction Pretrained Language Models for code completion using Multi-Token Prediction. 🎼 Meta JASCO Generative text-to-music models capable of accepting various conditioning inputs for greater controllability. Paper available today with a pretrained model coming soon. 🗣️ Meta AudioSeal An audio watermarking model that we believe is the first designed specifically for the localized detection of AI-generated speech, available under a commercial license. 📝 Additional RAI artifacts Including research, data and code to measure and improve the representation of geographical and cultural preferences and diversity in AI systems. We believe that access to state-of-the-art AI creates opportunities for everyone – not just a small handful of Big Tech companies. We’re excited to share this work and to see how the community learns, iterates and builds using this technology. Details and access to everything released by FAIR today ➡️

Today is a good day for open science. As part of our continued commitment to the growth and development of an open ecosystem, today at Meta FAIR we’re announcing four new publicly available AI models and additional research artifacts to inspire innovation in the community and help advance AI in a responsible way. More in the video from Joelle Pineau. What we’re releasing: 🦎 Meta Chameleon 7B & 34B language models that support mixed-modal input and text-only outputs. 🪙 Meta Multi-Token Prediction Pretrained Language Models for code completion using Multi-Token Prediction. 🎼 Meta JASCO Generative text-to-music models capable of accepting various conditioning inputs for greater controllability. Paper available today with a pretrained model coming soon. 🗣️ Meta AudioSeal An audio watermarking model that we believe is the first designed specifically for the localized detection of AI-generated speech, available under a commercial license. 📝 Additional RAI artifacts Including research, data and code to measure and improve the representation of geographical and cultural preferences and diversity in AI systems. We believe that access to state-of-the-art AI creates opportunities for everyone – not just a small handful of Big Tech companies. We’re excited to share this work and to see how the community learns, iterates and builds using this technology. Details and access to everything released by FAIR today ➡️

AI at Meta

380,751 次观看 • 2 年前

Exclusive: Meta just released Llama 3.1 405B — the first-ever open-sourced frontier AI model, beating top closed models like GPT-4o across several benchmarks. I sat down with Mark Zuckerberg, diving into why this marks a major moment in AI history. Timestamps: 00:00 Intro 00:38 Meta’s Llama 3.1 rundown 03:44 Real-world use cases for Llama 3.1 06:15 Educating developers on open-source AI tools 09:43 Societal implications of open-source AI 13:00 Balancing power and managing bad actors 14:40 Open source and global competition 16:59 Accelerating innovation and economic growth 20:04 Zuck on Apple and lessons from the past 24:22 Future of AI: Llama 3 and beyond 26:43 Prediction: Billions of personalized AI agents 31:32 Factors to changing anti-AI sentiment

Exclusive: Meta just released Llama 3.1 405B — the first-ever open-sourced frontier AI model, beating top closed models like GPT-4o across several benchmarks. I sat down with Mark Zuckerberg, diving into why this marks a major moment in AI history. Timestamps: 00:00 Intro 00:38 Meta’s Llama 3.1 rundown 03:44 Real-world use cases for Llama 3.1 06:15 Educating developers on open-source AI tools 09:43 Societal implications of open-source AI 13:00 Balancing power and managing bad actors 14:40 Open source and global competition 16:59 Accelerating innovation and economic growth 20:04 Zuck on Apple and lessons from the past 24:22 Future of AI: Llama 3 and beyond 26:43 Prediction: Billions of personalized AI agents 31:32 Factors to changing anti-AI sentiment

Rowan Cheung

2,643,904 次观看 • 2 年前

Today, we’re thrilled to unveil the Arcee.ai Foundation Models, a new family of GenAI models designed from the ground up for enterprise reality. The first release—AFM-4.5B—is a 4.5-billion-parameter frontier model that delivers excellent accuracy, strict compliance, and very high cost efficiency. In short: enterprise-grade intelligence that can run anywhere—on a smartphone, at the edge, or in the cloud. For a quick taste, you can test AFM-4.5B in our playground and on For a deeper dive into the model’s training pipeline and benchmarks, details are available in our technical blog post. ➡️ Arcee AI playground: ➡️ Together AI playground: ➡️ Launch blog post: ➡️ Technical blog post: PS: As we’ll now focus on our AFM foundation models, and because we love open-source, we're opening up access to our previously closed-source language models. Details in the tech blog post!

Today, we’re thrilled to unveil the Arcee.ai Foundation Models, a new family of GenAI models designed from the ground up for enterprise reality. The first release—AFM-4.5B—is a 4.5-billion-parameter frontier model that delivers excellent accuracy, strict compliance, and very high cost efficiency. In short: enterprise-grade intelligence that can run anywhere—on a smartphone, at the edge, or in the cloud. For a quick taste, you can test AFM-4.5B in our playground and on For a deeper dive into the model’s training pipeline and benchmarks, details are available in our technical blog post. ➡️ Arcee AI playground: ➡️ Together AI playground: ➡️ Launch blog post: ➡️ Technical blog post: PS: As we’ll now focus on our AFM foundation models, and because we love open-source, we're opening up access to our previously closed-source language models. Details in the tech blog post!

Arcee.ai

21,934 次观看 • 1 年前

Introducing "Building with Llama 4." This short course is created with Meta AI at Meta, and taught by Amit Sangani, Director of Partner Engineering for Meta’s AI team. Meta’s new Llama 4 has added three new models and introduced the Mixture-of-Experts (MoE) architecture to its family of open-weight models, making them more efficient to serve. In this course, you’ll work with two of the three new models introduced in Llama 4. First is Maverick, a 400B parameter model, with 128 experts and 17B active parameters. Second is Scout, a 109B parameter model with 16 experts and 17B active parameters. Maverick and Scout support long context windows of up to a million tokens and 10M tokens, respectively. The latter is enough to support directly inputting even fairly large GitHub repos for analysis! In hands-on lessons, you’ll build apps using Llama 4’s new multimodal capabilities including reasoning across multiple images and image grounding, in which you can identify elements in images. You’ll also use the official Llama API, work with Llama 4’s long-context abilities, and learn about Llama’s newest open-source tools: its prompt optimization tool that automatically improves system prompts and synthetic data kit that generates high-quality datasets for fine-tuning. If you need an open model, Llama is a great option, and the Llama 4 family is an important part of any GenAI developer's toolkit. Through this course, you’ll learn to call Llama 4 via API, use its optimization tools, and build features that span text, images, and large context. Please sign up here:

Introducing "Building with Llama 4." This short course is created with Meta AI at Meta, and taught by Amit Sangani, Director of Partner Engineering for Meta’s AI team. Meta’s new Llama 4 has added three new models and introduced the Mixture-of-Experts (MoE) architecture to its family of open-weight models, making them more efficient to serve. In this course, you’ll work with two of the three new models introduced in Llama 4. First is Maverick, a 400B parameter model, with 128 experts and 17B active parameters. Second is Scout, a 109B parameter model with 16 experts and 17B active parameters. Maverick and Scout support long context windows of up to a million tokens and 10M tokens, respectively. The latter is enough to support directly inputting even fairly large GitHub repos for analysis! In hands-on lessons, you’ll build apps using Llama 4’s new multimodal capabilities including reasoning across multiple images and image grounding, in which you can identify elements in images. You’ll also use the official Llama API, work with Llama 4’s long-context abilities, and learn about Llama’s newest open-source tools: its prompt optimization tool that automatically improves system prompts and synthetic data kit that generates high-quality datasets for fine-tuning. If you need an open model, Llama is a great option, and the Llama 4 family is an important part of any GenAI developer's toolkit. Through this course, you’ll learn to call Llama 4 via API, use its optimization tools, and build features that span text, images, and large context. Please sign up here:

Andrew Ng

67,710 次观看 • 1 年前

Earlier this week at GTC, we announced our partnership with Nvidia. We will work with Nvidia to build strong, American open-source models that are at the frontier of scientific reasoning. These models will be essential for the US to compete with China on science in the coming decades. Jensen is committing to spend tens of billions of dollars developing open-source models, and we are excited to be a partner with them in figuring out how to benchmark, train and use those agents to accelerate scientific research. We have already open-sourced some of the work we have done with them, and are looking forward to open-sourcing more. There are few things today that are more important. See our blog post below, and watch the video to learn more, narrated by the man himself.

Earlier this week at GTC, we announced our partnership with Nvidia. We will work with Nvidia to build strong, American open-source models that are at the frontier of scientific reasoning. These models will be essential for the US to compete with China on science in the coming decades. Jensen is committing to spend tens of billions of dollars developing open-source models, and we are excited to be a partner with them in figuring out how to benchmark, train and use those agents to accelerate scientific research. We have already open-sourced some of the work we have done with them, and are looking forward to open-sourcing more. There are few things today that are more important. See our blog post below, and watch the video to learn more, narrated by the man himself.

Sam Rodriques

23,301 次观看 • 4 个月前

Open science is how we continue to push technology forward and today at Meta FAIR we’re sharing eight new AI research artifacts including new models, datasets and code to inspire innovation in the community. More in the video from Joelle Pineau. This work is another important step towards our goal of achieving Advanced Machine Intelligence (AMI). What we’re releasing: • Meta Spirit LM: An open source language model for seamless speech and text integration. • Meta Segment Anything Model 2.1: An updated checkpoint with improved results on visually similar objects, small objects and occlusion handling. Plus a new developer suite to make it easier for developers to build with SAM 2. • Layer Skip: Inference code and fine-tuned checkpoints demonstrating a new method for enhancing LLM performance. • SALSA: New code to enable researchers to benchmark AI-based attacks in support of validating security for post-quantum cryptography. • Meta Lingua: A lightweight and self-contained codebase designed to train language models at scale. • Meta Open Materials: New open source models and the largest dataset of its kind to accelerate AI-driven discovery of new inorganic materials. • MEXMA: A new research paper and code for our novel pre-trained cross-lingual sentence encoder with coverage across 80 languages. • Self-Taught Evaluator: a new method for generating synthetic preference data to train reward models without relying on human annotations. Access to state-of-the-art AI creates opportunities for everyone. We’re excited to share this work and look forward to seeing the community innovation that results from it. Details and access to everything released by FAIR today ➡️

Open science is how we continue to push technology forward and today at Meta FAIR we’re sharing eight new AI research artifacts including new models, datasets and code to inspire innovation in the community. More in the video from Joelle Pineau. This work is another important step towards our goal of achieving Advanced Machine Intelligence (AMI). What we’re releasing: • Meta Spirit LM: An open source language model for seamless speech and text integration. • Meta Segment Anything Model 2.1: An updated checkpoint with improved results on visually similar objects, small objects and occlusion handling. Plus a new developer suite to make it easier for developers to build with SAM 2. • Layer Skip: Inference code and fine-tuned checkpoints demonstrating a new method for enhancing LLM performance. • SALSA: New code to enable researchers to benchmark AI-based attacks in support of validating security for post-quantum cryptography. • Meta Lingua: A lightweight and self-contained codebase designed to train language models at scale. • Meta Open Materials: New open source models and the largest dataset of its kind to accelerate AI-driven discovery of new inorganic materials. • MEXMA: A new research paper and code for our novel pre-trained cross-lingual sentence encoder with coverage across 80 languages. • Self-Taught Evaluator: a new method for generating synthetic preference data to train reward models without relying on human annotations. Access to state-of-the-art AI creates opportunities for everyone. We’re excited to share this work and look forward to seeing the community innovation that results from it. Details and access to everything released by FAIR today ➡️

AI at Meta

150,222 次观看 • 1 年前

Today we are introducing Tara. Biological datasets are a source of insights and a means to train biological AI models. As the ability to reason at scale emerges, they take on a new role: the ground truth for testing what reasoning models produce, and the environment in which those models operate, get feedback, and improve. Tara, our autonomous research agent, is embedded in our ever-expanding datasets, lab-generated and synthetic, and built to test and evolve the hypotheses frontier models generate, matching the pace at which they produce new ideas. By keeping those models grounded in a vast space of high-precision biological data, we believe we can compound biological reasoning and close the impedance mismatch between hypothesis generation and validation.

Today we are introducing Tara. Biological datasets are a source of insights and a means to train biological AI models. As the ability to reason at scale emerges, they take on a new role: the ground truth for testing what reasoning models produce, and the environment in which those models operate, get feedback, and improve. Tara, our autonomous research agent, is embedded in our ever-expanding datasets, lab-generated and synthetic, and built to test and evolve the hypotheses frontier models generate, matching the pace at which they produce new ideas. By keeping those models grounded in a vast space of high-precision biological data, we believe we can compound biological reasoning and close the impedance mismatch between hypothesis generation and validation.

Nima Alidoust

26,864 次观看 • 19 天前

We’re expanding our strategic partnership with Microsoft to: ⚡ Deliver OpenAI’s state-of-the-art models in Snowflake Cortex AI on Microsoft Microsoft Azure. Customers will soon be able to build AI-powered data agents to run analytical workflows on structured and unstructured data using OpenAI’s models. ⚡ Make Snowflake Cortex Agents available in Microsoft 365 Copilot and Microsoft 365 apps, ensuring AI insights become more accessible for users and enabling better decision-making across the enterprise. Learn more about how we’re bringing easy, efficient, and trusted AI to enterprises around the world:

We’re expanding our strategic partnership with Microsoft to: ⚡ Deliver OpenAI’s state-of-the-art models in Snowflake Cortex AI on Microsoft Microsoft Azure. Customers will soon be able to build AI-powered data agents to run analytical workflows on structured and unstructured data using OpenAI’s models. ⚡ Make Snowflake Cortex Agents available in Microsoft 365 Copilot and Microsoft 365 apps, ensuring AI insights become more accessible for users and enabling better decision-making across the enterprise. Learn more about how we’re bringing easy, efficient, and trusted AI to enterprises around the world:

Snowflake

11,011 次观看 • 1 年前

Wrapping up the year and coinciding with #NeurIPS2024, today at Meta FAIR we’re releasing a collection of nine new open source AI research artifacts across our work in developing agents, robustness & safety and new architectures. More in the video from Joelle Pineau. All of this work is part of FAIR’s continued work towards the goal of achieving advanced machine intelligence A few highlights from what we’re releasing today: • Meta Motivo: A first-of-its-kind behavioral foundation model that controls the movements of a virtual embodied humanoid agent to perform complex tasks. • Meta Video Seal: a state-of-the art comprehensive framework for neural video watermarking. • Meta Explore Theory-of-Mind: A program-guided adversarial data generation for theory of mind reasoning. • Meta Large Concept Models: A fundamentally different training paradigm for language modeling that decouples reasoning from language representation. And much more! We’re excited to share this work with the research community and look forward to seeing how it inspires new innovation across the field. Details and access to everything released by FAIR today ➡️

Wrapping up the year and coinciding with #NeurIPS2024, today at Meta FAIR we’re releasing a collection of nine new open source AI research artifacts across our work in developing agents, robustness & safety and new architectures. More in the video from Joelle Pineau. All of this work is part of FAIR’s continued work towards the goal of achieving advanced machine intelligence A few highlights from what we’re releasing today: • Meta Motivo: A first-of-its-kind behavioral foundation model that controls the movements of a virtual embodied humanoid agent to perform complex tasks. • Meta Video Seal: a state-of-the art comprehensive framework for neural video watermarking. • Meta Explore Theory-of-Mind: A program-guided adversarial data generation for theory of mind reasoning. • Meta Large Concept Models: A fundamentally different training paradigm for language modeling that decouples reasoning from language representation. And much more! We’re excited to share this work with the research community and look forward to seeing how it inspires new innovation across the field. Details and access to everything released by FAIR today ➡️

AI at Meta

156,123 次观看 • 1 年前

Today we’re celebrating 10 years of the Meta FAIR lab in Paris by sharing a collection of new models, datasets and some exciting milestones in the impacts of open source — all laddering up to our ongoing work to achieve Advanced Machine Intelligence (AMI). 1️⃣ Meta PARTNR is a framework for human-robot collaboration that builds on our existing work in this space with a new dataset and a large planning model enabling robots to accomplish complex tasks alongside humans. 2️⃣ Meta Audiobox Aesthetics enables the automatic evaluation of audio aesthetics, providing a comprehensive assessment of audio quality across speech, music and sound. 3️⃣ Open Source Machine Translation Benchmark is a carefully crafted collection with the aim of building an unprecedented multilingual machine translation benchmark for the community. 4️⃣ Two new breakthrough studies using AI to further our understanding of language in the brain.

Today we’re celebrating 10 years of the Meta FAIR lab in Paris by sharing a collection of new models, datasets and some exciting milestones in the impacts of open source — all laddering up to our ongoing work to achieve Advanced Machine Intelligence (AMI). 1️⃣ Meta PARTNR is a framework for human-robot collaboration that builds on our existing work in this space with a new dataset and a large planning model enabling robots to accomplish complex tasks alongside humans. 2️⃣ Meta Audiobox Aesthetics enables the automatic evaluation of audio aesthetics, providing a comprehensive assessment of audio quality across speech, music and sound. 3️⃣ Open Source Machine Translation Benchmark is a carefully crafted collection with the aim of building an unprecedented multilingual machine translation benchmark for the community. 4️⃣ Two new breakthrough studies using AI to further our understanding of language in the brain.

AI at Meta

85,774 次观看 • 1 年前