You cannot really train all these models to cater to different preferences. Can you have one model that caters to all? Furong Huang unveils a technique to customize AI models on-the-fly to user goals, reducing the computational cost of tailoring AI systems to individual needs.

410,541 views • 1 year ago • via X (Twitter)

1 Comment

FAR.AI • 1 year ago

Follow us for AI safety insights and watch the full video

Related Videos

Small Language Models (SLMs) are the future of AI: "Small" (SLM) instead of "Large" (LLM). These small models are highly specialized, with superhuman abilities on specific tasks. Here are two techniques to build them:

• Spectrum
• Model Merging

I give you a short introduction in the attached video, but here is a quick summary:

Spectrum helps us identify the layers most relevant to one specific task. We can ignore everything else and focus on fine-tuning these layers. Using Spectrum, we can fine-tune models in a heartbeat.

Model Merging combines multiple models into a single model that is much better than any of the individual input models. You can also combine models specialized in different tasks and get a model with multiple abilities.

This is the state of the art of productizing models. It's what Arcee.ai's platform does behind the scenes. Arcee collaborated with me on this post and is sponsoring it.

There are three main steps to produce a model for your particular use case:

1. You create a dataset by uploading your data.
2. You train a model. At this step, Arcee uses Spectrum and Model Merging to produce a highly specialized model for your task.
3. You can deploy that model to any environment you want.

Three important notes:

• The training process is 2x faster and 2x cheaper than regular fine-tuning.
• The resulting models are smaller and have higher accuracy.
• They create these specialized models from open-source models.

Check this site so you can fully appreciate how this works:

If you want to fine-tune an open-source model, consider Arcee's platform. This is the state of the art.

Santiago

164,162 views • 1 year ago

DeepSeek-R1 shattered the assumption that performant AI models must be built closed source with loss-leading computational costs. This is the reality that Web3 x Crypto firms have been waiting for, leading me to believe that the most performant AI models in the future will be built on-chain.

Resource Requirements

DeepSeek R1 (671 billion parameters), which took over a billion dollars, 2,000 Nvidia H800 GPUs, and over 55 days, beat benchmarks held by OpenAI’s o1 model (near 2 trillion parameters), which required hundreds of billions of dollars to develop along with over 16,000 advanced GPUs. The idea that AI models must be closed-source and have loss-leading computational costs to succeed is crumbling.

The Existing Decentralized AI Narrative

AI x Crypto projects believed that crowdsourced, public, decentralized AI would eventually create better models than their centralized counterparts. Thus far this had not been true, as the highest-performing models had come from closed-source companies like OpenAI and Anthropic. Crypto x AI companies have adapted by specializing in infrastructure rather than model-building. For example, GPU marketplaces like The Render Network, io.net, and Exabits have developed sustainable revenues. Companies that let users share their network bandwidth, like touch grass and Gradient, have found their niche in supplying services, such as distributed web scraping, to web2 clients. Storage networks like Arweave Ecosystem, Filecoin, and Ocean Protocol have also done well by being the platform on which these projects are built. Supply networks have flourished because of their ability to tailor their cheaper and more scalable services to off-chain customers.

Renewed Focus

Now that GPU and financial resources are no longer limitations to creating quality AI models, web3 AI companies can focus on replicating DeepSeek’s effectiveness while offering new benefits like modality, user ownership, censorship resistance, privacy, and more.
Pantera Capital has funded companies in this space, like Sahara and Sentient, that believe they can match or exceed the performance of traditional AI companies while offering additional services or benefits. Sahara, for example, is building a platform where anyone can monetize AI models, data sets, and applications in a collaborative space. Users can permissionlessly train models manually, provide training data, and create tailored AI models with no-code tools. They are only able to cater to all these stakeholders (AI developers, users, resource providers) because everything is tied to their native Sahara blockchain. We invested in them precisely for this reason.

The Future of AI will be built with Web3 Infrastructure

I believe that supply-side projects will continue to grow, while consumer-facing projects can begin competing with web2 competitors by taking advantage of their ability to build networks that invite community involvement. Sahara and Sentient, for example, have begun setting up systems for users to train models based on the users’ expertise. These platforms will allow users to pick and choose the data and integrations for whatever they are applying the model to. Sahara already has over 780,000 users on their waitlist, while Sentient has over 1 million interactions. In the near future, I believe that the most performant AI models will be built on-chain.

For the full blog post, read my newsletter.

paul.nft

32,461 views • 1 year ago