Muratcan Koylan's banner

Muratcan Koylan

@koylanai • 21,600 subscribers

Member of Technical Staff (Agents), Research - @sullyai | ex-99Ravens AI

Videos

Anya Rossi

sweetdream.ai

SweetDream.ai•Sponsored•Livecam

Watch Anya Live

Anya is streaming live right now! Join her private show and enjoy exclusive content.

Exclusive private shows

1.2k viewers online

Private Show

Join now for exclusive access

Free preview available • Premium content

Most "agentic" failures happen because the model lacks specific domain knowledge. Here I'm showing how loading a Skills Plugin solves that for dataset generation. I turned a research paper (shows fine-tuning dataset creation from books) into a Skill and just gave a book link and asked it to generate the dataset. Claude Code produced a clean SFT-ready dataset from a single URL. The same process can be applied to anything.

Most "agentic" failures happen because the model lacks specific domain knowledge. Here I'm showing how loading a Skills Plugin solves that for dataset generation. I turned a research paper (shows fine-tuning dataset creation from books) into a Skill and just gave a book link and asked it to generate the dataset. Claude Code produced a clean SFT-ready dataset from a single URL. The same process can be applied to anything.

Muratcan Koylan

104,519 Aufrufe • vor 6 Monaten

Ollama WebUI is amazing. Download with Docker, choose your open-source model, customize it without writing a single line of code.

Ollama WebUI is amazing. Download with Docker, choose your open-source model, customize it without writing a single line of code.

Muratcan Koylan

149,118 Aufrufe • vor 2 Jahren

I replaced a $2000/month predictive analysis software with Julius AI. In this video, I'll show you how to create a complex B2B marketing dataset using GPT. Then, I'll identify companies with a high likelihood of churn, determining their reasons for doing so. Julius AI achieves this using a chain of deterministic actions to leverage various machine learning models. This tool is incredibly powerful, and I'm excited to share my process with you.

I replaced a $2000/month predictive analysis software with Julius AI. In this video, I'll show you how to create a complex B2B marketing dataset using GPT. Then, I'll identify companies with a high likelihood of churn, determining their reasons for doing so. Julius AI achieves this using a chain of deterministic actions to leverage various machine learning models. This tool is incredibly powerful, and I'm excited to share my process with you.

Muratcan Koylan

53,696 Aufrufe • vor 2 Jahren

Reinforcement Learning from Human Feedback (RLHF) is gaining traction. This field aims to make AI more responsible by including human values and preferences. In this video, Nathan Lambert, a research scientist and RLHF team lead at Hugging Face explores its inner workings, applications and industry impact. RLHF has gained the spotlight in recent years. The growth of language models like Anthropic’s Claude and OpenAI's ChatGPT have increased interest in human-feedback integration. "There are some rumors that Open AI had two teams; one was doing RLHF and the other instruction fine-tuning. And the RLHF team kept getting more and more performance." Understanding RLHF The RLHF process has three main steps: Pre-training: Much like with GPT models, the journey starts with pre-training on a large corpus of data. This can range from text data, web scrapes, to specialized datasets. Reward Modeling: This is the RLHF counterpart of supervised fine-tuning in large language models. This stage involves creating a reward model that resonates with human values and preferences. RL Optimization: This stage parallels reward modeling and reinforcement learning in traditional AI models. The AI system fine-tunes itself based on the reward model, employing reinforcement learning algorithms for that extra layer of optimization. The Data Challenge Data collection and curation in RLHF closely resemble the challenges you'd encounter in large language model training. Datasets from organizations like OpenAI can serve as a useful foundation. However, the need for high-quality, task-specific data cannot be overstated. Implementing RLHF: A Practical Guide If you’re someone who loves getting hands-on with AI libraries like Hugging Face, implementing RLHF is right way to do. It’s essential to understand its limitations. Think about model stability, over-optimization, and exploration strategies, much like you would when prompt engineering. Ongoing Research and Next Steps While he suggests that some basics figured out, there are layers of complexity that still need to be unraveled: 1. New Benchmarks: How do we measure the effectiveness of RLHF? 2. Preference Modeling: How can the model be made to understand human preferences better? 3. Interpreting RLHF: Much like explainability in traditional models, how do we make RLHF more interpretable? 4. System-Wide Evaluation: Going beyond individual performance, how does RLHF affect an entire system? The Transformative Power of RLHF Whether you're an AI developer, a business analyst, or a marketer, RLHF promises to revolutionize your domain. Imagine customer service chatbots that understand human emotions better, or content generators that align more closely with human values. RLHF is an emerging field that focuses on enhancing machine learning models through human feedback. While it tackles important issues like bias and ethics, its broader goal is to improve system performance across various applications. Whether you're deeply invested in the ethics of AI or simply curious about advancements in machine learning, RLHF offers valuable insights. If you're interested in the next wave of AI development, this area is definitely one to watch.

Reinforcement Learning from Human Feedback (RLHF) is gaining traction. This field aims to make AI more responsible by including human values and preferences. In this video, Nathan Lambert, a research scientist and RLHF team lead at Hugging Face explores its inner workings, applications and industry impact. RLHF has gained the spotlight in recent years. The growth of language models like Anthropic’s Claude and OpenAI's ChatGPT have increased interest in human-feedback integration. "There are some rumors that Open AI had two teams; one was doing RLHF and the other instruction fine-tuning. And the RLHF team kept getting more and more performance." Understanding RLHF The RLHF process has three main steps: Pre-training: Much like with GPT models, the journey starts with pre-training on a large corpus of data. This can range from text data, web scrapes, to specialized datasets. Reward Modeling: This is the RLHF counterpart of supervised fine-tuning in large language models. This stage involves creating a reward model that resonates with human values and preferences. RL Optimization: This stage parallels reward modeling and reinforcement learning in traditional AI models. The AI system fine-tunes itself based on the reward model, employing reinforcement learning algorithms for that extra layer of optimization. The Data Challenge Data collection and curation in RLHF closely resemble the challenges you'd encounter in large language model training. Datasets from organizations like OpenAI can serve as a useful foundation. However, the need for high-quality, task-specific data cannot be overstated. Implementing RLHF: A Practical Guide If you’re someone who loves getting hands-on with AI libraries like Hugging Face, implementing RLHF is right way to do. It’s essential to understand its limitations. Think about model stability, over-optimization, and exploration strategies, much like you would when prompt engineering. Ongoing Research and Next Steps While he suggests that some basics figured out, there are layers of complexity that still need to be unraveled: 1. New Benchmarks: How do we measure the effectiveness of RLHF? 2. Preference Modeling: How can the model be made to understand human preferences better? 3. Interpreting RLHF: Much like explainability in traditional models, how do we make RLHF more interpretable? 4. System-Wide Evaluation: Going beyond individual performance, how does RLHF affect an entire system? The Transformative Power of RLHF Whether you're an AI developer, a business analyst, or a marketer, RLHF promises to revolutionize your domain. Imagine customer service chatbots that understand human emotions better, or content generators that align more closely with human values. RLHF is an emerging field that focuses on enhancing machine learning models through human feedback. While it tackles important issues like bias and ethics, its broader goal is to improve system performance across various applications. Whether you're deeply invested in the ethics of AI or simply curious about advancements in machine learning, RLHF offers valuable insights. If you're interested in the next wave of AI development, this area is definitely one to watch.

Muratcan Koylan

27,005 Aufrufe • vor 2 Jahren

A guy from Estonia has built one of the most creative collaborative AI games ever. It's an endless world creation and 2 days ago, the 2,000th hexagon was generated with over 700 registered players. Create here: Incredible work 🔥

A guy from Estonia has built one of the most creative collaborative AI games ever. It's an endless world creation and 2 days ago, the 2,000th hexagon was generated with over 700 registered players. Create here: Incredible work 🔥

Muratcan Koylan

11,866 Aufrufe • vor 2 Jahren

Keine weiteren Inhalte verfügbar