
Can data owners & LM developers collaborate to build a strong shared model while each retaining data control?

Introducing FlexOlmo💪, a mixture-of-experts LM enabling:
• Flexible training on your local data without sharing it
• Flexible inference to opt in/out your data anytime

At 37B parameters, FlexOlmo is competitive...

Weijia Shi

9,881 subscribers

93,434 views • 11 months ago • via X (Twitter)



10 comments

Weijia Shi • 11 months ago

❓Why FlexOlmo?

The current "monolithic" pretraining paradigm centralizes all data during training & requires one-time decisions on data inclusion/exclusion. Once data is used for training, it's difficult to add or remove. This creates challenges:

For data owners:
• Required to share raw data for model training
• Loss of control once they give the data away

For LM developers:
• Valuable data remains locked behind closed doors
• No straightforward way to update models with new data without catastrophic forgetting

Weijia Shi • 11 months ago

💡FlexOlmo Recipe
1️⃣ Each data owner trains an expert locally using a shared anchor model
2️⃣ Expert modules from different data owners merge into a single MoE without joint training
3️⃣ At inference, you can control which expert modules, along with their data, serve particular users or queries.
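The recipe can be pictured with a toy sketch. This is not the FlexOlmo code: the class `ToyMergedMoE`, the owner labels "news" and "code", and the scaling "experts" are all hypothetical. It also makes two simplifying assumptions that the thread does not state: the shared anchor is always active, and active experts are averaged rather than weighted by a router.

```python
from typing import Callable, Dict, List, Set

Vector = List[float]
ExpertFn = Callable[[Vector], Vector]


class ToyMergedMoE:
    """Toy stand-in for a merged MoE layer (illustrative only)."""

    def __init__(self, anchor: ExpertFn) -> None:
        self.anchor = anchor                    # assumption: shared anchor is always active
        self.experts: Dict[str, ExpertFn] = {}  # owner label -> locally trained expert
        self.opted_out: Set[str] = set()

    def add_expert(self, owner: str, expert: ExpertFn) -> None:
        # Step 2: plug in an expert trained elsewhere; nothing is retrained here.
        self.experts[owner] = expert

    def opt_out(self, owner: str) -> None:
        # Step 3: exclude this owner's expert from inference.
        self.opted_out.add(owner)

    def opt_in(self, owner: str) -> None:
        self.opted_out.discard(owner)

    def active_owners(self) -> List[str]:
        return [o for o in self.experts if o not in self.opted_out]

    def forward(self, x: Vector) -> Vector:
        # Simplification: plain average of the anchor and every active expert.
        outputs = [self.anchor(x)] + [self.experts[o](x) for o in self.active_owners()]
        return [sum(vals) / len(outputs) for vals in zip(*outputs)]


def scale(factor: float) -> ExpertFn:
    """Hypothetical 'expert' that just scales its input."""
    return lambda x: [factor * v for v in x]


moe = ToyMergedMoE(anchor=scale(1.0))
moe.add_expert("news", scale(4.0))
moe.add_expert("code", scale(1.0))

print(moe.forward([1.0, 2.0]))  # all experts active
moe.opt_out("news")
print(moe.forward([1.0, 2.0]))  # "news" expert excluded
```

In this sketch, opting out changes only which experts take part in `forward`; the remaining experts are left exactly as they were.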


Weijia Shi • 11 months ago

📑Paper: ✍️Blog: 💻Code: 🤗Models:

Weijia Shi • 11 months ago

📊 FlexOlmo Performance

Evaluated across 31 tasks with models up to 37B parameters (20B active)

🔧 Training: Start with a 7B public model (pretrained on 1T tokens), then each data owner continues pretraining for 50B tokens on simulated closed data before combining experts.

Key results:
• 41% improvement brought by leveraging the closed data sources
• 10.1% better than existing model merging methods
• Even outperforms standard MoE with unrestricted data access

Weijia Shi • 11 months ago

⚔️ Data Extraction Attack

Can shared expert modules leak your private data? We tested training data extraction attacks on FlexOlmo to find out:
• FlexOlmo: 0.7% extraction rate
• Overfitted model (100 epochs) on the data: 60% extraction rate
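The thread does not describe the attack protocol, so the following is only one common way an extraction rate might be measured: prompt the model with the start of a training document and count the document as extracted if the continuation reproduces the rest verbatim. The function `extraction_rate`, the toy model, and the sample documents are all hypothetical.

```python
from typing import Callable, List


def extraction_rate(generate: Callable[[str], str], documents: List[str], prefix_len: int) -> float:
    """Fraction of documents whose suffix is reproduced verbatim from its prefix."""
    extracted = 0
    for doc in documents:
        prefix, suffix = doc[:prefix_len], doc[prefix_len:]
        # Count a hit only if the generated text begins with the true suffix.
        if suffix and generate(prefix).startswith(suffix):
            extracted += 1
    return extracted / len(documents)


# Toy "model" that has memorized exactly one training document.
memorized = {"The secret": " code is 1234."}


def toy_generate(prefix: str) -> str:
    return memorized.get(prefix, " [unrelated text]")


docs = ["The secret code is 1234.", "Another private note here."]
rate = extraction_rate(toy_generate, docs, prefix_len=10)
print(f"extraction rate: {rate:.1%}")
```

Real evaluations may use token-level prefixes and a looser matching criterion than exact string equality.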

FreeMind • 11 months ago

Can this method be applied to other models? Aren't routers all trained?

Weijia Shi • 11 months ago

It can be applied to other base models as well. The router is not jointly trained. Each expert is associated with a corresponding router embedding that is learned independently.
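A minimal sketch of what a router built from per-expert embeddings could look like. The scoring rule is an assumption, not something the thread specifies: each expert is scored by the dot product of the input with its embedding, the top-k are kept, and a softmax turns the kept scores into weights. The function `route` and the embedding values are hypothetical.

```python
import math
from typing import Dict, List

Vector = List[float]


def route(x: Vector, router_embeddings: Dict[str, Vector], top_k: int = 2) -> Dict[str, float]:
    """Score experts by dot product, keep the top_k, and softmax over the kept scores."""
    scores = {name: sum(a * b for a, b in zip(x, emb)) for name, emb in router_embeddings.items()}
    kept = sorted(scores, key=lambda name: scores[name], reverse=True)[:top_k]
    peak = max(scores[name] for name in kept)  # subtract the max for numerical stability
    exps = {name: math.exp(scores[name] - peak) for name in kept}
    total = sum(exps.values())
    return {name: value / total for name, value in exps.items()}


# Hypothetical embeddings, each imagined as learned separately by one data owner.
router = {
    "public": [1.0, 0.0],
    "news": [0.0, 1.0],
    "code": [-1.0, 0.0],
}

weights = route([2.0, 1.0], router)
print(weights)

# Removing an expert only drops its row; the other embeddings are untouched.
router_without_news = {k: v for k, v in router.items() if k != "news"}
weights_without_news = route([2.0, 1.0], router_without_news)
print(weights_without_news)
```

Because each row is independent of the others in this sketch, adding or removing an expert does not require retraining the remaining rows.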

carlo • 11 months ago

@ShirleyYXWu wow, super impressive! gotta check the minimal hardware for running it 🤓

Jdjf • 11 months ago

Fix You’re a fucking AI bitch you’re the reason why our accounts are being suspended
