Загрузка видео...

Не удалось загрузить видео

Возникла проблема при загрузке этого видео. Это может быть связано с временными проблемами сети или видео может быть недоступно.

На главную

Tune Studio is an end-to-end platform for developing applications using Large Language Models. So far, I haven't seen any other platform like this one. You can do everything here: 1. You can curate your data. 2. Use the playground to play with different models and try your ideas. 3.... Fine-tune an open-source model on your data. 4. Deploy the model when you are done. This is awesome for anyone building generative AI applications. You can use Tune Studio to work with any of the open-source models out there. They were one of the few companies to host Llama 2 and Llama 3 before anyone else. Here is a link to check it out: One of their main selling points is that Tune Studio scales! You don't have to worry about serving your model to lots of users. They also have built-in user management, authentication, on-prem support, user context management, and pretty much everything you need to build generative AI applications. Thanks to the Tune team for collaborating with me on this post. We are living through the best years of development tools for AI developers. The field is unstoppable.show more

Santiago

448,157 subscribers

39,101 просмотров • 2 лет назад •via X (Twitter)

Anya Rossi• Live Now

Private livecam show

Комментарии: 5

Фото профиля Anshuman

Anshuman2 лет назад

🔥🔥🔥

Фото профиля Assist

Assist2 лет назад

Do you need a social media manager for your business or personal brand? We’re here to help! Send us a message to discuss a plan that works for you. We can manage and create content for any platform and any industry. Platforms including Facebook, Instagram, Twitter X, YouTube, TikTok, Pinterest, LinkedIn, and more! Learn more ➡️

Фото профиля dkg://Lyvn

dkg://Lyvn2 лет назад

is similar and opensource. Will give tune studio a try and see how they compare. Thanks

Фото профиля Derek Truesdell

Derek Truesdell2 лет назад

I tried to sign up but got "response 404 (backend NotFound), service rules for the path non-existent" when I tried to follow the process. That is not a great sign =)

Фото профиля Anton Ansalmar

Anton Ansalmar2 лет назад

👍

Похожие видео

You can now try Llama 3.1 405B for free (link below)! This is the largest open-source model out there, and for the first time, an open model is competitive with closed models. This time around, Meta did something new: Llama 3.1 has a license that allows developers to use it to enhance other models. For the first time, you can distill Llama 3.1 405B's capabilities into a smaller, more practical model for your use case. First, here is the link where you can play with Llama 3.1 for free: The model is hosted in Tune Studio, an end-to-end platform for developing applications using Large Language Models. They are sponsoring this post. Take a look at the attached video. It will show you how you can fine-tune a simple model using Llama 3.1 without leaving the platform: 1. You can create an empty dataset 2. Use the playground to generate and record interactions with Llama 3.1 3. Modify the dataset directly using the playground 4. Export the data and fine-tune a smaller model Fast and easy! As long as you have a web browser, you can start experimenting with fine-tuning and Llama 3.1. That's all it takes!

You can now try Llama 3.1 405B for free (link below)! This is the largest open-source model out there, and for the first time, an open model is competitive with closed models. This time around, Meta did something new: Llama 3.1 has a license that allows developers to use it to enhance other models. For the first time, you can distill Llama 3.1 405B's capabilities into a smaller, more practical model for your use case. First, here is the link where you can play with Llama 3.1 for free: The model is hosted in Tune Studio, an end-to-end platform for developing applications using Large Language Models. They are sponsoring this post. Take a look at the attached video. It will show you how you can fine-tune a simple model using Llama 3.1 without leaving the platform: 1. You can create an empty dataset 2. Use the playground to generate and record interactions with Llama 3.1 3. Modify the dataset directly using the playground 4. Export the data and fine-tune a smaller model Fast and easy! As long as you have a web browser, you can start experimenting with fine-tuning and Llama 3.1. That's all it takes!

Santiago

55,609 просмотров • 1 год назад

Small Language Models (SML) are the future of AI. "Small" (SML) instead of "Large" (LLM). These small models are highly specialized models with superhuman abilities on specific tasks. Here are two techniques to build these models: • Spectrum • Model Merging I give you a short introduction in the attached video, but here is a quick summary: Spectrum helps us identify the most relevant layers to solve one specific task. We can ignore everything else and focus on fine-tuning these layers. Using Spectrum, we can fine-tune models in a heartbeat. Model Merging combines multiple models into a unique, much better model than any of the individual input models. You can also combine models specialized in different tasks and get a model with multiple abilities. This is the state of the art of productizing models. It's what Arcee.ai's platform does behind the scenes. Arcee collaborated with me on this post and is sponsoring it. There are three main steps to produce a model for your particular use case: 1. You create a dataset by uploading your data. 2. You train a model. At this step, Arcee uses Spectrum and Model Merging to produce a highly specialized model for your task. 3. You can deploy that model to any environment you want. Three important notes: • Training process is 2x faster and 2x cheaper than regular fine-tuning. • Resultant models are smaller and have higher accuracy. • They create these specialized models from open-source models. Check this site so you can fully appreciate how this works: If you want to fine-tune an open-source model, consider Arcee's platform. This is the state of the art.

Small Language Models (SML) are the future of AI. "Small" (SML) instead of "Large" (LLM). These small models are highly specialized models with superhuman abilities on specific tasks. Here are two techniques to build these models: • Spectrum • Model Merging I give you a short introduction in the attached video, but here is a quick summary: Spectrum helps us identify the most relevant layers to solve one specific task. We can ignore everything else and focus on fine-tuning these layers. Using Spectrum, we can fine-tune models in a heartbeat. Model Merging combines multiple models into a unique, much better model than any of the individual input models. You can also combine models specialized in different tasks and get a model with multiple abilities. This is the state of the art of productizing models. It's what Arcee.ai's platform does behind the scenes. Arcee collaborated with me on this post and is sponsoring it. There are three main steps to produce a model for your particular use case: 1. You create a dataset by uploading your data. 2. You train a model. At this step, Arcee uses Spectrum and Model Merging to produce a highly specialized model for your task. 3. You can deploy that model to any environment you want. Three important notes: • Training process is 2x faster and 2x cheaper than regular fine-tuning. • Resultant models are smaller and have higher accuracy. • They create these specialized models from open-source models. Check this site so you can fully appreciate how this works: If you want to fine-tune an open-source model, consider Arcee's platform. This is the state of the art.

Santiago

164,162 просмотров • 2 лет назад

You can now fine-tune Llama 3 without writing a single line of code! We are moving at breakneck speed. I recorded a video to show you how to fine-tune any open-source model in a few minutes. I'm using a GPT capable of taking a problem and turning it into a fine-tuned model that will solve it. You don't have to write any code. You only need to explain to a GPT what problem you want to solve and tell it you want to use Llama 3. For example, "fine-tune Llama 3" or "deploy zephyr." It feels magic. The system will recommend a dataset and fine-tune the model for you. I'm using Monster API, a platform that specializes in making fine-tuning and deploying open-source models easy and fast. Their stack is well-optimized to maximize fine-tuning efficiency using techniques like Q-Lora and vLLM. They are behind the GPT. Here is what you need to do: 1. Create an account at 2. Load the GPT with the link below This is as simple as it gets. When you are done, you can click a button to deploy the model and start using it. I have 10,000 free credits for anyone using the code "SANTIAGO" in the dashboard. You can use these credits to access, fine-tune, and deploy these open-source models. You can also keep up with their latest updates, and get free credits and special offers on their Discord server:

You can now fine-tune Llama 3 without writing a single line of code! We are moving at breakneck speed. I recorded a video to show you how to fine-tune any open-source model in a few minutes. I'm using a GPT capable of taking a problem and turning it into a fine-tuned model that will solve it. You don't have to write any code. You only need to explain to a GPT what problem you want to solve and tell it you want to use Llama 3. For example, "fine-tune Llama 3" or "deploy zephyr." It feels magic. The system will recommend a dataset and fine-tune the model for you. I'm using Monster API, a platform that specializes in making fine-tuning and deploying open-source models easy and fast. Their stack is well-optimized to maximize fine-tuning efficiency using techniques like Q-Lora and vLLM. They are behind the GPT. Here is what you need to do: 1. Create an account at 2. Load the GPT with the link below This is as simple as it gets. When you are done, you can click a button to deploy the model and start using it. I have 10,000 free credits for anyone using the code "SANTIAGO" in the dashboard. You can use these credits to access, fine-tune, and deploy these open-source models. You can also keep up with their latest updates, and get free credits and special offers on their Discord server:

Santiago

324,602 просмотров • 2 лет назад

99% of AI applications are cool-looking demos. Impressive, but don't get fooled by the hype. It takes a lot to build enterprise-grade products that deliver real value. I have at least three weekly conversations with companies that want to use a Large Language Model with their data. The demand is huge! Here is one idea about what you can do to help. The use cases that most of these companies want to solve are similar: They have an extensive knowledge base and want to build a simple application that uses that information to answer questions. In other words, they need help building Retrieval Augmented Generation (RAG) applications they can use in many different scenarios: 1. To train new employees 2. To help their support team 3. To search old meetings and documents 4. To help with their research However, building these systems is not straightforward. Yes, there's a lot of information online, but there aren't enough people who know how to create solutions that work. Here is the idea: Today, you can build an enterprise-grade RAG application without writing code. A couple of MIT PhDs with 10+ years of experience building AI applications created . It's a no-code platform for building applications using Large Language Models. They are partnering with me on this post. You can use Stack AI to create, test, and deploy an end-to-end production-ready AI system. It's SOC-2, HIPAA, and GDPR compliant and offers SSO, role management, access control, and on-premise deployments. Of course, you can use the platform with any LLM on the market now. It's the whole nine yards for building AI applications. Check them out here: 2023 was about models. 2024 is about the tools using these models to build production-ready applications. That's where I'd start.

99% of AI applications are cool-looking demos. Impressive, but don't get fooled by the hype. It takes a lot to build enterprise-grade products that deliver real value. I have at least three weekly conversations with companies that want to use a Large Language Model with their data. The demand is huge! Here is one idea about what you can do to help. The use cases that most of these companies want to solve are similar: They have an extensive knowledge base and want to build a simple application that uses that information to answer questions. In other words, they need help building Retrieval Augmented Generation (RAG) applications they can use in many different scenarios: 1. To train new employees 2. To help their support team 3. To search old meetings and documents 4. To help with their research However, building these systems is not straightforward. Yes, there's a lot of information online, but there aren't enough people who know how to create solutions that work. Here is the idea: Today, you can build an enterprise-grade RAG application without writing code. A couple of MIT PhDs with 10+ years of experience building AI applications created . It's a no-code platform for building applications using Large Language Models. They are partnering with me on this post. You can use Stack AI to create, test, and deploy an end-to-end production-ready AI system. It's SOC-2, HIPAA, and GDPR compliant and offers SSO, role management, access control, and on-premise deployments. Of course, you can use the platform with any LLM on the market now. It's the whole nine yards for building AI applications. Check them out here: 2023 was about models. 2024 is about the tools using these models to build production-ready applications. That's where I'd start.

Santiago

197,675 просмотров • 2 лет назад

Web scraping is a critical skill, and yet nobody talks about it. How do you think companies are training their Large Language Models? Where do you think the data comes from? But web scraping goes beyond all of that. Imagine giving an AI agent access to any public online data in real time! I like to call this "web-scrapping on demand", and I'm pretty sure it's going to unlock unlimited power for AI applications. I recorded a quick video to show you how you can do this using Apify. I've talked about them before, and they are collaborating with me on this post. They have one of the best open-source web scraping and browser automation libraries out there: But it gets much better than this! You can use MCP to connect your AI Agents and applications to the Apify platform and use any specialized actor on demand to scrape and process online data. In the video, I used Cursor to scrape LinkedIn posts with the words "Machine Learning" in real time. Worked like a charm with no code needed! Here is a link to the platform: Think about this: You can now feed your AI applications with any public data on demand! We aren't ready for what's coming.

Web scraping is a critical skill, and yet nobody talks about it. How do you think companies are training their Large Language Models? Where do you think the data comes from? But web scraping goes beyond all of that. Imagine giving an AI agent access to any public online data in real time! I like to call this "web-scrapping on demand", and I'm pretty sure it's going to unlock unlimited power for AI applications. I recorded a quick video to show you how you can do this using Apify. I've talked about them before, and they are collaborating with me on this post. They have one of the best open-source web scraping and browser automation libraries out there: But it gets much better than this! You can use MCP to connect your AI Agents and applications to the Apify platform and use any specialized actor on demand to scrape and process online data. In the video, I used Cursor to scrape LinkedIn posts with the words "Machine Learning" in real time. Worked like a charm with no code needed! Here is a link to the platform: Think about this: You can now feed your AI applications with any public data on demand! We aren't ready for what's coming.

Santiago

101,473 просмотров • 1 год назад

Building with AI gets easier every day. Here is an open-source library that makes integrating AI into an application extremely easy: Star the repository! This library alone can make React the best front-end framework out there! There are a bunch of cool things I like about CopilotKit. Here are 3 of them: 1. It allows you to take any -powered agent and bring it into your application. (This is a brand-new feature!) 2. You can build an AI-powered chatbot in your application. The chatbot will have access to your context and can act on the application. 3. You can build a RAG workflow to process and answer questions from a real-time knowledge base. I recorded a video to show you how simple it is to make some of this happen. A few lines of code, and you are in business. Here is a link to the sample application: CopilotKit is open-source. You can self-host it. You can use it with any LLM. Thanks to the team for showing me their tool and collaborating with me on this post!

Santiago

108,824 просмотров • 2 лет назад

Choose a model (any model) and build your application with it. Do not spend time swapping models early on. Do not try to optimize before you have a working system. This is one of the first recommendations I make to every new team I consult with. Eventually, it will be time to optimize the model. • You may need a cheaper model • You may need a faster model • You might need a smarter model Good luck if you stitched together 12 different APIs and SDKs from 7 different vendors. Over half of the companies I consult for run on Microsoft software and have access to Microsoft Foundry. Microsoft Foundry is a complete agentic ecosystem. If you're in that world and building AI applications, Microsoft Foundry is where everything lives: • Models (largest selection in the market) • Agentic SDK (Python, C#, JavaScript/TypeScript) • Tools • Evaluations • Monitoring They are fully integrated with GitHub and Visual Studio Code. The best part: Their agentic platform is fully agnostic of the models you use. You can integrate with any model using the same OpenAI-style API. Swapping one model for another takes 1 second.

Choose a model (any model) and build your application with it. Do not spend time swapping models early on. Do not try to optimize before you have a working system. This is one of the first recommendations I make to every new team I consult with. Eventually, it will be time to optimize the model. • You may need a cheaper model • You may need a faster model • You might need a smarter model Good luck if you stitched together 12 different APIs and SDKs from 7 different vendors. Over half of the companies I consult for run on Microsoft software and have access to Microsoft Foundry. Microsoft Foundry is a complete agentic ecosystem. If you're in that world and building AI applications, Microsoft Foundry is where everything lives: • Models (largest selection in the market) • Agentic SDK (Python, C#, JavaScript/TypeScript) • Tools • Evaluations • Monitoring They are fully integrated with GitHub and Visual Studio Code. The best part: Their agentic platform is fully agnostic of the models you use. You can integrate with any model using the same OpenAI-style API. Swapping one model for another takes 1 second.

Santiago

12,014 просмотров • 5 месяцев назад

50% of my consulting work right now is helping companies use open-source models at scale. Everyone knows how to use an open-source LLM on their computers, but it's really hard to do this at scale for thousands of users. Here is how this plays out: 1. A team builds a prototype using DeepSeek. 2. Everything looks good. It works! 3. They follow an online guide to deploy the model online. 4. They ask 10 users to try the app. 5. Latency spikes everywhere. 6. The entire system halts. 7. They blame DeepSeek and try again using a new model. The problem is always with scaling inference, not the model. Here is one recommendation I give companies: Check out Nebius Token Factory if you don't want to ever think about deploying an open-source model again. This is a managed inference platform for deploying open-source LLMs at scale. This is not for prototypes or research experiments. This is for when you have a real application with real users. Three important notes about Token Factory: • You have complete control over how inference runs. • You have predictable tail latency (P99, not averages). • No surprise costs when you scale up. You can preplan your budget. Check it out here: Here are two codes you can use to get 100 hours of GPU usage on me: ymJLFa2ARYSKEdqb AdckcZaYjm7KqYY7 Thanks to the Nebius team for their continuous partnership.

50% of my consulting work right now is helping companies use open-source models at scale. Everyone knows how to use an open-source LLM on their computers, but it's really hard to do this at scale for thousands of users. Here is how this plays out: 1. A team builds a prototype using DeepSeek. 2. Everything looks good. It works! 3. They follow an online guide to deploy the model online. 4. They ask 10 users to try the app. 5. Latency spikes everywhere. 6. The entire system halts. 7. They blame DeepSeek and try again using a new model. The problem is always with scaling inference, not the model. Here is one recommendation I give companies: Check out Nebius Token Factory if you don't want to ever think about deploying an open-source model again. This is a managed inference platform for deploying open-source LLMs at scale. This is not for prototypes or research experiments. This is for when you have a real application with real users. Three important notes about Token Factory: • You have complete control over how inference runs. • You have predictable tail latency (P99, not averages). • No surprise costs when you scale up. You can preplan your budget. Check it out here: Here are two codes you can use to get 100 hours of GPU usage on me: ymJLFa2ARYSKEdqb AdckcZaYjm7KqYY7 Thanks to the Nebius team for their continuous partnership.

Santiago

48,443 просмотров • 5 месяцев назад

108 workflow templates you can use to build AI applications without writing any code. You can use these templates with n8n. I recorded the attached video to show you how it works. n8n is the workhorse behind an open-source, self-hosted AI starter kit you can install on your computer. They are sponsoring this post. Here is the link to the starter kit repository: And here is the spreadsheet with the 108 templates: Whatever idea you have, search for something similar in the list of templates, and you'll save a ton of time. Lately, I've talked to many non-coders who want to start using AI more seriously to build things. n8n is perfect for that.

108 workflow templates you can use to build AI applications without writing any code. You can use these templates with n8n. I recorded the attached video to show you how it works. n8n is the workhorse behind an open-source, self-hosted AI starter kit you can install on your computer. They are sponsoring this post. Here is the link to the starter kit repository: And here is the spreadsheet with the 108 templates: Whatever idea you have, search for something similar in the list of templates, and you'll save a ton of time. Lately, I've talked to many non-coders who want to start using AI more seriously to build things. n8n is perfect for that.

Santiago

78,133 просмотров • 1 год назад

Tired of generic, hallucinating LLMs? We use AI for almost everything these days, whether it’s helping with math homework or looking up remedies when we’re sick. It's become part of our daily routines. Now imagine if Einstein helped you solve that homework, or Eminem wrote the lyrics for your next rap song. What if your AI was built specifically for your world? Trained on your domain, your data, you become the master of it. Introducing OpenLedger AI Studio. A powerful platform to build and use specialized models with built-in explainability and attribution. Fine-tune using your data through our Model Factory, and deploy them seamlessly. Your models won’t just reason - With OpenLedger, you can explain every decision, trace its data, and reward contributors. Truly Open, Verified, and Explainable AI begins here.

Tired of generic, hallucinating LLMs? We use AI for almost everything these days, whether it’s helping with math homework or looking up remedies when we’re sick. It's become part of our daily routines. Now imagine if Einstein helped you solve that homework, or Eminem wrote the lyrics for your next rap song. What if your AI was built specifically for your world? Trained on your domain, your data, you become the master of it. Introducing OpenLedger AI Studio. A powerful platform to build and use specialized models with built-in explainability and attribution. Fine-tune using your data through our Model Factory, and deploy them seamlessly. Your models won’t just reason - With OpenLedger, you can explain every decision, trace its data, and reward contributors. Truly Open, Verified, and Explainable AI begins here.

OpenLedger

172,497 просмотров • 11 месяцев назад

The future of AI is open-source. And ollama is the easiest way to build AI applications with open-source LLMs. Here's how to build a free, private RAG app using open-source tools. We'll use: - Ollama for LLMs and embedding models - PostgreSQL for data storage and retrieval - pgai Vectorizer for embedding creation and sync (I use Nomic for embeddings and tinnyllama as my LLM but you can substitute them for any models on Ollama)

The future of AI is open-source. And ollama is the easiest way to build AI applications with open-source LLMs. Here's how to build a free, private RAG app using open-source tools. We'll use: - Ollama for LLMs and embedding models - PostgreSQL for data storage and retrieval - pgai Vectorizer for embedding creation and sync (I use Nomic for embeddings and tinnyllama as my LLM but you can substitute them for any models on Ollama)

Avthar

34,261 просмотров • 1 год назад

Knowledge graphs for representing information are unbeatable. After this, you will never build a RAG system without knowledge graphs. It will take you five lines of code to build a knowledge graph with your data. I recorded a video to show you how you can do this. I used Cognee, an open-source library that outperforms any basic vector search approach in terms of retrieval relevance. They are collaborating with me on this post. Cognee is: • Easy to use • Reduces hallucinations • Open-source Here is a link to the repository: They also offer a comprehensive platform and UI with Python notebooks you can utilize to manage your data. Here is the link:

Knowledge graphs for representing information are unbeatable. After this, you will never build a RAG system without knowledge graphs. It will take you five lines of code to build a knowledge graph with your data. I recorded a video to show you how you can do this. I used Cognee, an open-source library that outperforms any basic vector search approach in terms of retrieval relevance. They are collaborating with me on this post. Cognee is: • Easy to use • Reduces hallucinations • Open-source Here is a link to the repository: They also offer a comprehensive platform and UI with Python notebooks you can utilize to manage your data. Here is the link:

Santiago

125,928 просмотров • 9 месяцев назад

The Studio Beta is Live ⚡️ After weeks of onboarding calls & preparations - all Beta Testers now have exclusive access to the Beta Studio For the next 2 weeks, we challenged them to train & fine-tune as many AI models as they can with their own data This phase is a crucial part of our iterative process towards the V1, and by involving real users from the start we ensure the final product is intuitive and packed with the functionalities you need We're not just building this for you - we're building this with you.

The Studio Beta is Live ⚡️ After weeks of onboarding calls & preparations - all Beta Testers now have exclusive access to the Beta Studio For the next 2 weeks, we challenged them to train & fine-tune as many AI models as they can with their own data This phase is a crucial part of our iterative process towards the V1, and by involving real users from the start we ensure the final product is intuitive and packed with the functionalities you need We're not just building this for you - we're building this with you.

Vertical AI

90,886 просмотров • 1 год назад

New short course: Prompt Engineering with Llama 2, built in collaboration with Meta AI at Meta, and taught by Amit Sangani! Meta's Llama 2 has been game-changing for AI. Building with open source lets you control your own data, scrutinize errors, update (or not) the models as you please, and work alongside the global community advancing open models. Llama isn't a single model, it's a collection of models. In this course, you'll: - Learn the differences between different Llama 2 flavors, and when to use each. - Prompt the Llama chat models -- you'll also see how Llama's instruction tags work -- so they can help you with day-to-day tasks, like writing or summarization. - Use advanced prompting, like few-shot prompting for classification, and chain-of-thought prompting for solving logic problems. - Use specialized models in the Llama collection for specific tasks, like Code Llama to help you write, analyze, and improve code, and Llama Guard, which checks prompts and model responses for harmful content. The course also touches on how to run Llama 2 locally on your own computer. I hope you’ll take this course and try out these powerful, open models!

New short course: Prompt Engineering with Llama 2, built in collaboration with Meta AI at Meta, and taught by Amit Sangani! Meta's Llama 2 has been game-changing for AI. Building with open source lets you control your own data, scrutinize errors, update (or not) the models as you please, and work alongside the global community advancing open models. Llama isn't a single model, it's a collection of models. In this course, you'll: - Learn the differences between different Llama 2 flavors, and when to use each. - Prompt the Llama chat models -- you'll also see how Llama's instruction tags work -- so they can help you with day-to-day tasks, like writing or summarization. - Use advanced prompting, like few-shot prompting for classification, and chain-of-thought prompting for solving logic problems. - Use specialized models in the Llama collection for specific tasks, like Code Llama to help you write, analyze, and improve code, and Llama Guard, which checks prompts and model responses for harmful content. The course also touches on how to run Llama 2 locally on your own computer. I hope you’ll take this course and try out these powerful, open models!

Andrew Ng

162,842 просмотров • 2 лет назад

How can you solve complex tasks using a Large Language Model? Here is a 2-minute introduction to everything you need to know to 10x the quality of your results. Let's talk about three techniques, in order of complexity, starting with the easiest one: • In-Context Learning • Indexing + In-Context Learning • Fine-tuning In-Context Learning The team that trained GPT-3 found something they couldn't explain: You can condition a model using examples of how you want it to behave. I included an example prompt in the attached video. You can "teach" the model how you want it to interpret questions, select the correct answers, and format the results by giving a few examples. You can also give specific knowledge to the model that will be helpful when formulating answers. We call this approach "grounding the model." There's another example in the video. Indexing + In-Context Learning Unfortunately, there is a limit to how much data you can include in a prompt. We call this the "context size." One version of GPT-4 supports a context of approximately 6,000 words, while the other supports 25,000 words. Although this sounds like a lot, many applications need more than that. Imagine you wrote a book and want to build an application to answer any questions about your story. What happens if your book is longer than the context? That's where Indexing comes in. Using a model, you can turn every book passage into an embedding. These are vectors, numbers that "encode" the passage's text. You can then store these embeddings in a particular database that supports fast retrieval of these vectors. You can then turn any question into an embedding and search the database for the list of passages that are similar to that query. Instead of using the entire book to ask the model, you can now use the relevant passages as in-context information, effectively working around the context size limitation. Fine-tuning Fine-tuning can give you an extra boost to get reliable outputs from your LLM. It is, however, the most complex approach on the list. There are different approaches to fine-tuning a model with your data. A popular technique is to process your data with your LLM and use the outputs to train a new classifier that solves your specific task. Notice that here you aren't modifying the LLM. Instead, you are chaining it with your trained classifier. Another approach is to modify the parameters of the LLM using your data. Think of this as "rewiring" the model in a way that solves your particular task. The results and costs will vary depending on how many layers you want to fine-tune from the original model. Many companies think that fine-tuning is the solution to their problems. In my experience, many will benefit from exploring the other two approaches. I love explaining Machine Learning and Artificial Intelligence ideas. If you enjoy in-depth content like this, follow me Santiago so you don't miss what comes next.

How can you solve complex tasks using a Large Language Model? Here is a 2-minute introduction to everything you need to know to 10x the quality of your results. Let's talk about three techniques, in order of complexity, starting with the easiest one: • In-Context Learning • Indexing + In-Context Learning • Fine-tuning In-Context Learning The team that trained GPT-3 found something they couldn't explain: You can condition a model using examples of how you want it to behave. I included an example prompt in the attached video. You can "teach" the model how you want it to interpret questions, select the correct answers, and format the results by giving a few examples. You can also give specific knowledge to the model that will be helpful when formulating answers. We call this approach "grounding the model." There's another example in the video. Indexing + In-Context Learning Unfortunately, there is a limit to how much data you can include in a prompt. We call this the "context size." One version of GPT-4 supports a context of approximately 6,000 words, while the other supports 25,000 words. Although this sounds like a lot, many applications need more than that. Imagine you wrote a book and want to build an application to answer any questions about your story. What happens if your book is longer than the context? That's where Indexing comes in. Using a model, you can turn every book passage into an embedding. These are vectors, numbers that "encode" the passage's text. You can then store these embeddings in a particular database that supports fast retrieval of these vectors. You can then turn any question into an embedding and search the database for the list of passages that are similar to that query. Instead of using the entire book to ask the model, you can now use the relevant passages as in-context information, effectively working around the context size limitation. Fine-tuning Fine-tuning can give you an extra boost to get reliable outputs from your LLM. It is, however, the most complex approach on the list. There are different approaches to fine-tuning a model with your data. A popular technique is to process your data with your LLM and use the outputs to train a new classifier that solves your specific task. Notice that here you aren't modifying the LLM. Instead, you are chaining it with your trained classifier. Another approach is to modify the parameters of the LLM using your data. Think of this as "rewiring" the model in a way that solves your particular task. The results and costs will vary depending on how many layers you want to fine-tune from the original model. Many companies think that fine-tuning is the solution to their problems. In my experience, many will benefit from exploring the other two approaches. I love explaining Machine Learning and Artificial Intelligence ideas. If you enjoy in-depth content like this, follow me Santiago so you don't miss what comes next.

Santiago

384,510 просмотров • 3 лет назад

"Introducing Multimodal Llama 3.2": As promised two weeks ago, here's the short course on Meta's latest open model! This short course is created with Meta and taught by Amit Sangani, Director of AI Partner Engineering at Meta. Meta’s Llama family of models is leading the way in open models, allowing anyone to download, customize, fine-tune, or build new applications on top of them. Learn about the vision capabilities of the Llama 3.2, and use it for image classification, prompting, tokenization, tool-calling. You'll also learn about the open-source Llama stack, which gives building blocks for many different stages of the LLM application life cycle. In detail, you’ll: - Learn what are the features of Meta's four newest models, and when to use which Llama model. - Learn best practices for multimodal prompting, with applications to advanced image reasoning, illustrated by many examples: Understanding errors on a car dashboard, adding up the total of photographed restaurant receipts, grading written math homework. - Use different roles—system, user, assistant, ipython—in the Llama 3.1 and 3.2 models and the prompt format that identifies those roles. - Understand how Llama uses the tiktoken tokenizer, and how it has expanded to a 128k vocabulary size that improves encoding efficiency and multilingual support. - Learn how to prompt Llama to call built-in and custom tools (functions) with examples for web search and solving math equations. - Learn about Llama Stack, a standardized interface for common toolchain components like fine-tuning or synthetic data generation, useful for building agentic applications. By the end of this course, you’ll be equipped to build out new applications with the new Llama 3.2. Thank you to Ahmad Al-Dahle, Amit Sangani, and the whole AI at Meta team AI at Meta for all the hard work on Llama 3.2 — we’re excited to make these open models even more accessible to more developers with this new course! Please sign up here!

"Introducing Multimodal Llama 3.2": As promised two weeks ago, here's the short course on Meta's latest open model! This short course is created with Meta and taught by Amit Sangani, Director of AI Partner Engineering at Meta. Meta’s Llama family of models is leading the way in open models, allowing anyone to download, customize, fine-tune, or build new applications on top of them. Learn about the vision capabilities of the Llama 3.2, and use it for image classification, prompting, tokenization, tool-calling. You'll also learn about the open-source Llama stack, which gives building blocks for many different stages of the LLM application life cycle. In detail, you’ll: - Learn what are the features of Meta's four newest models, and when to use which Llama model. - Learn best practices for multimodal prompting, with applications to advanced image reasoning, illustrated by many examples: Understanding errors on a car dashboard, adding up the total of photographed restaurant receipts, grading written math homework. - Use different roles—system, user, assistant, ipython—in the Llama 3.1 and 3.2 models and the prompt format that identifies those roles. - Understand how Llama uses the tiktoken tokenizer, and how it has expanded to a 128k vocabulary size that improves encoding efficiency and multilingual support. - Learn how to prompt Llama to call built-in and custom tools (functions) with examples for web search and solving math equations. - Learn about Llama Stack, a standardized interface for common toolchain components like fine-tuning or synthetic data generation, useful for building agentic applications. By the end of this course, you’ll be equipped to build out new applications with the new Llama 3.2. Thank you to Ahmad Al-Dahle, Amit Sangani, and the whole AI at Meta team AI at Meta for all the hard work on Llama 3.2 — we’re excited to make these open models even more accessible to more developers with this new course! Please sign up here!

Andrew Ng

131,767 просмотров • 1 год назад

This is a pretty wild model! You can use it to turn an image into a 3D object with texture. The quality is out of this world! I'm not even a designer, and I've been using this nonstop for the last 2 hours. The model is Hunyuan 3D 2.1. It's open source. You'll find model weights, training/inference code, data pipelines, and architecture on their repository. You can even fine-tune it if you want! GitHub Repository: By the way, the model runs on consumer-grade GPUs. You don't need a datacenter for this! I've been using the model from the HuggingFace demo page: To use it, go to the link and upload an image. That's it! Check out the video I recorded for a couple of examples.

This is a pretty wild model! You can use it to turn an image into a 3D object with texture. The quality is out of this world! I'm not even a designer, and I've been using this nonstop for the last 2 hours. The model is Hunyuan 3D 2.1. It's open source. You'll find model weights, training/inference code, data pipelines, and architecture on their repository. You can even fine-tune it if you want! GitHub Repository: By the way, the model runs on consumer-grade GPUs. You don't need a datacenter for this! I've been using the model from the HuggingFace demo page: To use it, go to the link and upload an image. That's it! Check out the video I recorded for a couple of examples.

Santiago

44,783 просмотров • 1 год назад

This is next-level smart: An open-source platform that evaluates your prompts and automatically refines them based on the results. Of course, it feels obvious after you see it: • You write a prompt • The system evaluates it across different scenarios • Based on the results, it refines it to improve results I recorded a quick video to show you how it works. It's pretty cool stuff! Here are some of the problems and best practices for teams building AI applications: 1. Testing your prompts manually doesn't scale 2. Prompts should not be spread throughout the codebase 3. Non-technical people need easy access to your prompts 4. Prompts can always use a version history to track changes 5. Monitoring the performance of prompts overtime is critical Evaluating the prompts is what keeps me up at night from this list. Of all the conversations I've had with companies and people building AI applications, this is the area that's causing the most pain. Testing a prompt is difficult. Think about how you'd test the response of a model subjectively. What do you account for, "tone," "objectivity," "completeness," "creativity," "readability," etc.? Last week, I met the developers behind Latitude, an open-source prompt engineering platform trying to solve all of these issues. You can try the platform in two ways: • You can self-host the platform. Free and open-source. • If you want to try their online product, their free tier is huge. Here is the link: Thanks to the Latitude team for collaborating with me on this post, and congratulations on going live with their product!

This is next-level smart: An open-source platform that evaluates your prompts and automatically refines them based on the results. Of course, it feels obvious after you see it: • You write a prompt • The system evaluates it across different scenarios • Based on the results, it refines it to improve results I recorded a quick video to show you how it works. It's pretty cool stuff! Here are some of the problems and best practices for teams building AI applications: 1. Testing your prompts manually doesn't scale 2. Prompts should not be spread throughout the codebase 3. Non-technical people need easy access to your prompts 4. Prompts can always use a version history to track changes 5. Monitoring the performance of prompts overtime is critical Evaluating the prompts is what keeps me up at night from this list. Of all the conversations I've had with companies and people building AI applications, this is the area that's causing the most pain. Testing a prompt is difficult. Think about how you'd test the response of a model subjectively. What do you account for, "tone," "objectivity," "completeness," "creativity," "readability," etc.? Last week, I met the developers behind Latitude, an open-source prompt engineering platform trying to solve all of these issues. You can try the platform in two ways: • You can self-host the platform. Free and open-source. • If you want to try their online product, their free tier is huge. Here is the link: Thanks to the Latitude team for collaborating with me on this post, and congratulations on going live with their product!

Santiago

64,146 просмотров • 1 год назад

Forget about HTML, JavaScript, and CSS. You can build AI and data applications using Python alone! Python is the best programming language out there. You can now use it across the stack. Take a look at Taipy, an open-source Python library to build end-to-end production applications. Star their repository: I recorded a quick video showing how to build a simple chat interface to talk to OpenAI GPT-3.5 using Taipy alone. Something important to keep in mind: Taipy's goal is not to replace web developers but to provide an alternative to those who need to build applications without web experience. If you are a data scientist or someone dealing with data, Taipy will simplify your life considerably. Thanks to the team behind Taipy for collaborating with me on this post. Here is a link to the full demo. Download this code and run it. It takes 30 seconds:

Forget about HTML, JavaScript, and CSS. You can build AI and data applications using Python alone! Python is the best programming language out there. You can now use it across the stack. Take a look at Taipy, an open-source Python library to build end-to-end production applications. Star their repository: I recorded a quick video showing how to build a simple chat interface to talk to OpenAI GPT-3.5 using Taipy alone. Something important to keep in mind: Taipy's goal is not to replace web developers but to provide an alternative to those who need to build applications without web experience. If you are a data scientist or someone dealing with data, Taipy will simplify your life considerably. Thanks to the team behind Taipy for collaborating with me on this post. Here is a link to the full demo. Download this code and run it. It takes 30 seconds:

Santiago

233,878 просмотров • 2 лет назад

$I just added the new Llama 3.2 1B and 3B models to LitGPT, the open-source LLM library I help develop (focused on efficiency and code readability). LitGPT allows you to fine-tune and use these models on the cloud or a laptop. So, if you are looking for something to play with this weekend: # 1) Finetune the model litgpt finetune_lora meta-llama/Llama-3.2-1B \ --data JSON \ --data.json_path my_custom_dataset.json \ --train.epochs 1 \ --out_dir out/llama-3.2-finetuned \ --precision bf16-true # 2) Chat with the model litgpt chat out/llama-3.2-finetuned/final # 3) Serve the model via an API endpoint litgpt serve out/llama-3.2-finetuned/final$

I just added the new Llama 3.2 1B and 3B models to LitGPT, the open-source LLM library I help develop (focused on efficiency and code readability). LitGPT allows you to fine-tune and use these models on the cloud or a laptop. So, if you are looking for something to play with this weekend: # 1) Finetune the model litgpt finetune_lora meta-llama/Llama-3.2-1B \ --data JSON \ --data.json_path my_custom_dataset.json \ --train.epochs 1 \ --out_dir out/llama-3.2-finetuned \ --precision bf16-true # 2) Chat with the model litgpt chat out/llama-3.2-finetuned/final # 3) Serve the model via an API endpoint litgpt serve out/llama-3.2-finetuned/final

Sebastian Raschka

65,529 просмотров • 1 год назад