🚀 Introducing Cover-Agent 🧪 An open-source tool that includes a reimplementation of Meta's TestGen-LLM for automatically enhancing test suites. Manager: "We must improve old test suites for better code coverage. Can you handle it?" Me: "Sure, my favorite task... (Not!) 🤷‍♂️" Meta's team had the idea of using LLMs...

139,227 views • 2 years ago • via X (Twitter)

Comments: 10

Itamar Friedman • 2 years ago

Original TestGen-LLM paper:
Cover-Agent, the open-source tool that reimplements Meta's TestGen-LLM:

Itamar Friedman • 2 years ago

Some of the excitement about TestGen-LLM:
Check out our blog for more info about TestGen-LLM and Cover-Agent:

Itamar Friedman • 2 years ago

Cover-Agent in action:
‣ Go example:
‣ Python example:

Qodo (formerly Codium) • 2 years ago

Here is a short how-to and demo video:

Dennis • 2 years ago

this is dope

Eyal Cohen • 2 years ago

Will try it out!

Gil Elbaz • 2 years ago

Looks great! Thnx @itamar_mar

Itamar Friedman • 2 years ago

Thank you! Counting on you to open Issues and maybe even PRs 😃

Hvipublik • 2 years ago

Awesome tool! Can ChatGPT be swapped out for some locally running LLM (Mistral or something) for added IP safety?

Itamar Friedman • 2 years ago

Many are asking for it, and we will enable it! You are, of course, welcome to contribute a solution. There are a few examples to draw on; we will follow the approach implemented in PR-Agent.
