No data, no problem. Introducing agentic synthetic data generation with Cosmos 3: share a few examples, generate more data, automate model training, and automatically deploy the latest version with no downtime. In a benchmark run with Corning Incorporated's optical fiber manufacturing engineering team, a model trained on 8 real defect... images plus synthetic examples generated by Cosmos reached 0.95 mean average precision and perfect recall on the toughest defect class, beating a baseline trained on real data alone. "The Roboflow Agent powered by NVIDIA allows us to generate the training data we need, fine-tune our models, and strengthen model performance and inspection quality while increasing the speed, scalability, and adoption of next-generation technologies." - Jeremy Knopf, chief information officer, Corning Optical Communications

Roboflow

13,652 subscribers

38,811 views • 18 days ago • via X (Twitter)

Related Videos

A humanoid robot policy trained solely on synthetic data generated by a world model. Research Scientist Joel Jang presents NVIDIA's DreamGen pipeline: ⦿ Post-train the world model Cosmos-Predict2 with a small set of real teleoperation demos. ⦿ Prompt the world model to generate synthetic video data with verbs and scenarios not used in the world model’s post-training. ⦿ Auto-label synthetic video data with action sequences. ⦿ Train robot policies using only synthetic data. That's it. Deploy zero-shot to a real humanoid robot.

The Humanoid Hub

20,968 views • 11 months ago

NEWS: Nvidia has just introduced "Cosmos," a world foundation model created to understand the physical world. The model can generate synthetic data to train robotics.

Sawyer Merritt

254,296 views • 1 year ago

We're releasing Paris 2.0, which, to our knowledge, is the world's first decentralized trained video generation model. We benchmarked it against a monolithic model trained on the same data and compute budget, and Paris 2.0 outperformed the monolithic by ~2x on FVD benchmark.

bidhan

451,728 views • 22 days ago

NVIDIA just introduced Cosmos, a platform for world foundation models designed for robotics. ⦿ It features advanced tokenizers, an AI-accelerated data pipeline, and integration with NVIDIA Omniverse. Humanoid makers 1X, Figure, and Agility are among the first to adopt Cosmos. ⦿ Cosmos generates synthetic, physics-based data, accelerating model training and customization. ⦿ It also features a CUDA-accelerated data processing pipeline that enables developers to process, curate, and label 20 million hours of videos in 14 days using the NVIDIA Blackwell platform.

The Humanoid Hub

129,383 views • 1 year ago

Revolutionizing Move Programming with OpenLedger. In this demo, we showcase how Move datasets contributed by data providers to OpenLedger’s datanets are used to fine-tune specialized models with LoRA. As seen in the video, we show an example of how builders can deploy a Move-specialized model that powers Co-pilot agents using our no-code model fine-tuning platform. This is the future of AI and Web3 innovation. Watch this space to see more specialised models and data feeds being built for next-generation agents on top of OpenLedger #Move

OpenLedger

61,662 views • 1 year ago

As robotics teams scale, collecting real-world data can quickly become slow and expensive. 📉 Lightwheel is changing that with a simulation-first platform built on our technology, helping customers generate a 100:1 simulated-to-real data ratio for training autonomous systems. Together, we’re turning synthetic and real data into more capable, more reliable robots. Read the full success story 🔗

NVIDIA Robotics

10,780 views • 2 months ago

Announcing Roboflow Rapid: the first prompt-based model creation engine. Data labeling is dead. Go from idea to deployed model in minutes without labeling data. Upload a video, type a text prompt, and get an API. No data labeling teams. No manual annotation. No infra / dependency hell. If you are spending 95% of your time labeling data and only 5% building your app, you are doing it wrong.

Roboflow

22,938 views • 6 months ago

We’re excited to share that 🥇Llama Nemotron Super 49B v1.5 -- our latest open reasoning model -- is now #1 on the Artificial Analysis Intelligence Index - a leaderboard that spans advanced math, science, and agentic tasks, in the 70B open model category. Llama Nemotron Super 49B v1.5 is trained with high-quality reasoning synthetic data generated from models like Qwen3-235B and DeepSeek R1. It delivers state-of-the-art accuracy and throughput, running on a single H100. Key features: 🎯 Leading accuracy on multi-step reasoning, math, coding, and function-calling 🏗️ Post-trained using RPO, DPO, and RLVR across 26M+ synthetic examples 📊 Fully transparent training data and techniques If you're building AI agents and want a high accuracy, fully-open, and transparent reasoning model that you can deploy anywhere, try Super v1.5 on or download from Hugging Face 🤗 ➡️ Leaderboard:

NVIDIA AI Developer

100,506 views • 10 months ago

UE5.6 AI Real-Time Motion Generation Plugin — powered by a 1-billion-parameter motion model trained on NVIDIA CUDA. It runs locally in real time with 8GB of VRAM and converts webcam or video footage into real-time XYZ skeletal point data and rotation values.

CYANPUPPETS

28,871 views • 5 months ago

UE5.6 AI Real-Time Motion Generation Plugin — powered by a 1-billion-parameter motion model trained on NVIDIA CUDA. It runs locally in real time with 8GB of VRAM and converts webcam or video footage into real-time XYZ skeletal point data and rotation values.

CYANPUPPETS

15,968 views • 4 months ago

UE5.6 AI Real-Time Motion Generation Plugin — powered by a 1-billion-parameter motion model trained on NVIDIA CUDA. It runs locally in real time with 8GB of VRAM and converts webcam or video footage into real-time XYZ skeletal point data and rotation values.

CYANPUPPETS

28,846 views • 5 months ago

Google presents Still-Moving: Customized Video Generation without Customized Video Data. Customizing text-to-image (T2I) models has seen tremendous progress recently, particularly in areas such as personalization, stylization, and conditional generation. However, expanding this progress to video generation is still in its infancy, primarily due to the lack of customized video data. In this work, we introduce Still-Moving, a novel generic framework for customizing a text-to-video (T2V) model, without requiring any customized video data. The framework applies to the prominent T2V design where the video model is built over a text-to-image (T2I) model (e.g., via inflation). We assume access to a customized version of the T2I model, trained only on still image data (e.g., using DreamBooth or StyleDrop). Naively plugging in the weights of the customized T2I model into the T2V model often leads to significant artifacts or insufficient adherence to the customization data. To overcome this issue, we train lightweight Spatial Adapters that adjust the features produced by the injected T2I layers. Importantly, our adapters are trained on "frozen videos" (i.e., repeated images), constructed from image samples generated by the customized T2I model. This training is facilitated by a novel Motion Adapter module, which allows us to train on such static videos while preserving the motion prior of the video model. At test time, we remove the Motion Adapter modules and leave in only the trained Spatial Adapters. This restores the motion prior of the T2V model while adhering to the spatial prior of the customized T2I model. We demonstrate the effectiveness of our approach on diverse tasks including personalized, stylized, and conditional generation. In all evaluated scenarios, our method seamlessly integrates the spatial prior of the customized T2I model with a motion prior supplied by the T2V model.

AK

40,467 views • 1 year ago
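
The Still-Moving abstract describes training on "frozen videos (i.e., repeated images)". As a rough illustration of that data construction only, the sketch below repeats one image along a time axis. The names `Frame` and `make_frozen_video` are hypothetical and do not come from the paper, and the Spatial and Motion Adapters themselves are not shown.

```python
from typing import List

# Hypothetical representation: one grayscale image as rows of pixel values.
Frame = List[List[float]]


def make_frozen_video(image: Frame, num_frames: int) -> List[Frame]:
    """Repeat a single image along a new time axis to form a static clip."""
    if num_frames < 1:
        raise ValueError("num_frames must be at least 1")
    # Copy every row so the frames do not share mutable state.
    return [[row[:] for row in image] for _ in range(num_frames)]


if __name__ == "__main__":
    still = [[0.0, 1.0], [0.5, 0.25]]  # made-up 2x2 image
    clip = make_frozen_video(still, num_frames=4)
    print(len(clip), "identical frames")
```

A clip built this way contains no motion at all, which is why the abstract says a separate Motion Adapter is needed during training to keep the video model's motion prior intact.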

pip install dria. All you need to generate high-quality, grounded synthetic data. Experiment with different models, incorporate tool usage, and generate at scale in just a few lines of code.

Dria

19,351 views • 1 year ago

Scale AI's Alexandr Wang says "the data wall and the wall on progress that we've hit right now is certainly real" and synthetic data isn't living up to its promise, with more human-generated data needed

Tsarathustra

578,870 views • 1 year ago

Model progress is no longer constrained by architecture, but by access to high-quality, human-generated data. Scraped internet data is finite, low-signal and increasingly synthetic. Kled AI pays real people to upload authentic, real world content task by task in their mobile app. We are super excited to back Avi Patel and Kled AI as they build a massive data marketplace. If you are in need of a great data partner, reach out! More from us here (we were not investors when we filmed):

Nichole Wischoff

36,594 views • 2 months ago

🚨 Jensen Huang says everyone panicked about AI data when MOST training data was never REAL to begin with. Ilya Sutskever told the industry pre-training was over. "Ilya said, 'We're out of data,' or something like that. 'Pre-training is over,' or something like that," Huang says. "The industry panicked, you know, that this is the end of AI." "And of course, of course that's obviously not true. We're gonna keep on scaling the amount of data that we have to train with." "A lot of that data is probably gonna be synthetic." That's where the panic came from: synthetic data sounds like cheating. "Most of the data that we are training, that we teach each other with, inform each other with, is synthetic." "It's synthetic because it didn't come out of nature." "You created it. I'm consuming it. I modify it, augment it, I regenerate it, somebody else consumes it." The textbook in your hand is synthetic. The post you're reading is synthetic. The lecture you took is synthetic. Nature didn't make any of it. Humans did. AI just learned to do the same thing, faster. "Training is now limited by compute," Huang says. "Data is now limited by compute." The data wall wasn't a wall. It was a mirror. If you're new here, follow @AiEvolutio for the latest on ChatGPT, Claude, and the AI tools shaping how we work and create. - Jensen Huang, NVIDIA CEO, on Lex Fridman's podcast

AI Evolution

15,565 views • 19 days ago

🆕 How to run (and finetune) open source AI models with a simple API! In 5 mins, I go over how to: ◆ Generate text with DeepSeek R1 & Llama 3 ◆ Generate code with Qwen on LlamaCoder ◆ Generate images with Flux on BlinkShot ◆ Finetune a model on your own data & run it

Hassan

30,236 views • 1 year ago

Tencent presents GameGen-O: Open-world Video Game Generation. We introduce GameGen-O, the first diffusion transformer model tailored for the generation of open-world video games. This model facilitates high-quality, open-domain generation by simulating a wide array of game engine features, such as innovative characters, dynamic environments, complex actions, and diverse events. Additionally, it provides interactive controllability, thus allowing for gameplay simulation. The development of GameGen-O involves a comprehensive data collection and processing effort from scratch. We collect and build the first Open-World Video Game Dataset (OGameData), amassing extensive data from over a hundred next-generation open-world games and employing a proprietary data pipeline for efficient sorting, scoring, filtering, and decoupled captioning. This robust and extensive OGameData forms the foundation of our model's training process. GameGen-O undergoes a two-stage training process, consisting of foundation model pretraining and instruction tuning. In the first phase, the model is pre-trained on OGameData via text-to-video and video continuation, endowing GameGen-O with the capability for open-domain video game generation. In the second phase, the pre-trained model is frozen, and we fine-tune using a trainable InstructNet, which enables the production of subsequent frames based on multimodal structural instructions. This whole training process imparts the model with the ability to generate and interactively control content. In summary, GameGen-O represents a notable initial step forward in the realm of open-world video game generation via generative models. It underscores the potential of generative models to serve as an alternative to rendering techniques, which can efficiently combine creative generation with interactive capabilities.

AK

366,948 views • 1 year ago

Learn to train an LLM with distributed data while ensuring privacy using federated learning in a new two-part short course, Intro to Federated Learning and Federated Fine-tuning of LLMs with Private Data, created with Flower and taught by Daniel J. Beutel and Nic Lane. Federated learning allows a single model to be trained across multiple devices, such as phones, or multiple organizations, such as hospitals, without the need to share data with a central server. This two-part course gives you an introduction to federated learning, and then teaches you how to fine-tune your large language model with distributed data using Flower Labs' open source federated learning framework. You’ll learn:
- How to use federated learning to train a variety of models, ranging from speech and vision models to LLMs, across distributed data while offering data privacy options to users and organizations.
- Privacy Enhancing Technologies like differential privacy (DP), which obscures individual data by adding calibrated noise to query results.
- Two variants of differential privacy, Central and Local, and how to choose depending on your use case.
- How to measure and decrease bandwidth usage to make federated learning more practical and efficient with techniques like using pre-trained models and Parameter-Efficient Fine-Tuning.
- How federated LLM fine-tuning reduces the risk of leaking training data.
Sign up here!

Andrew Ng

64,538 views • 1 year ago
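
The course description above mentions differential privacy, which "obscures individual data by adding calibrated noise to query results". A minimal sketch of that idea is the Laplace mechanism applied to a counting query, shown below. This is a textbook construction and not taken from the course; the function names, the sample records, and the choice of epsilon are all assumptions made for illustration.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) noise as the difference of two exponentials."""
    # If X and Y are independent Exponential(rate=1/scale) variables,
    # then X - Y follows a Laplace distribution with the given scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon: float, rng: random.Random) -> float:
    """Answer a counting query with epsilon-differential privacy."""
    true_count = sum(1 for r in records if predicate(r))
    # Adding or removing one record changes a count by at most 1.
    sensitivity = 1.0
    # Noise is calibrated to sensitivity / epsilon: smaller epsilon, more noise.
    return true_count + laplace_noise(sensitivity / epsilon, rng)


if __name__ == "__main__":
    rng = random.Random(0)
    ages = [34, 51, 29, 62, 45, 38, 70, 23]  # hypothetical records
    noisy = private_count(ages, lambda a: a >= 40, epsilon=0.5, rng=rng)
    print(f"noisy count of ages >= 40: {noisy:.2f}")
```

In the central variant named in the description, a trusted server would add this noise to the aggregated answer. In the local variant, each participant perturbs its own contribution before sending it, which generally costs more accuracy for the same epsilon.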