🚀 Introducing Cover-Agent 🧪
An open-source tool that includes a reimplementation of Meta's TestGen-LLM for automatically enhancing test suites.

Manager: "We must improve old test suites for better code coverage. Can you handle it?"
Me: "Sure, my favorite task... (Not!) 🤷‍♂️"

Meta's team had the idea of using LLMs...

139,227 views • 2 years ago • via X (Twitter)

10 Comments

Itamar Friedman • 2 years ago

Original TestGen-LLM paper:
Cover-Agent open-source that reimplements TestGen-LLM by Meta:

Itamar Friedman • 2 years ago

Some of the excitement about TestGen-LLM: Check out our blog for more info about TestGen-LLM and Cover-Agent:

Itamar Friedman • 2 years ago

Cover-Agent in action:
‣ Go example:
‣ Python example:

Qodo (formerly Codium) • 2 years ago

Here is a short how-to and demo video:

Dennis • 2 years ago

this is dope

Eyal Cohen • 2 years ago

Will try it out!

Gil Elbaz • 2 years ago

Looks great! Thnx @itamar_mar

Itamar Friedman • 2 years ago

Thank you! Counting on you to open Issues and maybe even PRs 😃

Hvipublik • 2 years ago

Awesome tool! Can ChatGPT be swapped out for some locally running LLM (Mistral or something) for added IP safety?

Itamar Friedman • 2 years ago

Many are asking for it, we will enable it! You are, of course, welcome to contribute a solution that enables that. There are a few examples -- we will follow the solution implemented in PR-Agent
