Sam Rodriques

@SGRodriques • 20,937 subscribers

Director and CEO at FutureHouse and Edison Scientific. Building an AI scientist. https://t.co/aNx8D1QmfN. https://t.co/rQYoPOwV8Q

Videos

Today, we’re announcing Kosmos, our newest AI Scientist, available to use now. Users estimate Kosmos does 6 months of work in a single day. One run can read 1,500 papers and write 42,000 lines of code. At least 79% of its findings are reproducible. Kosmos has made 7 discoveries so far, which we are releasing today, in areas ranging from neuroscience to materials science and clinical genetics, in collaboration with our academic beta testers. Three of these discoveries reproduced unpublished findings; four are net new, validated contributions to the scientific literature. AI-accelerated science is here.
Our core innovation in Kosmos is the use of a structured, continuously updated world model. As described in our technical report, Kosmos’ world model allows it to process orders of magnitude more information than could fit into the context of even the longest-context language models, allowing it to synthesize more information and pursue coherent goals over longer time horizons than Robin or any of our other prior agents. In this respect, we believe Kosmos is the most compute-intensive language agent released so far in any field, and by far the most capable AI Scientist available today. The use of a persistent world model also enables single Kosmos trajectories to produce highly complex outputs that require multiple significant logical leaps.
As with all of our systems, Kosmos is designed with transparency and verifiability in mind: every conclusion in a Kosmos report can be traced through our platform to the specific lines of code or the specific passages in the scientific literature that inspired it, ensuring that Kosmos’ findings are fully auditable at all times.
We are also using this opportunity to announce the launch of Edison Scientific, a new commercial spinout of FutureHouse, which will be focused on commercializing our agents and applying them to automate scientific research in drug discovery and beyond.
Edison will be taking over management of the FutureHouse platform, where you can access Kosmos alongside our Literature, Molecules, and Precedent agents (previously Crow, Phoenix, and Owl). Edison will continue to offer free tier usage for casual users and academics, while also offering higher rate limits and additional features for users who need them. You can read more about this spinout on our blog, below.
A few important notes if you’re going to try Kosmos. First, Kosmos is different from many other AI tools you might have played with, including our other agents. It is more similar to a Deep Research tool than it is to a chatbot: it takes some time to figure out how to prompt it effectively, and we have tried to include guidelines on this to help (see below). It costs $200/run right now (200 credits per run, at $1/credit), with some free tier usage for academics. This is heavily discounted; people who sign up for Founding Subscriptions now can lock in the $1/credit price indefinitely, but the price ultimately will probably be higher. Again, this is less chatbot and more research tool, something you run on high-value targets as needed.
Some caveats are also warranted. First, we find that roughly 80% of Kosmos findings are reproducible, which also means roughly 20% are not: some things it says will be wrong. Also, Kosmos certainly does produce outputs that are the equivalent of several months of human labor, but it also often goes down rabbit holes or chases statistically significant yet scientifically irrelevant findings. We often run Kosmos multiple times on the same objective in order to sample the various research avenues it can take. There are still a bunch of rough edges on the UI and such, which we are working on. Finally, we are aware that the 6-month figure is much greater than estimates by other AI labs, like METR, about the length of tasks that AI agents can currently perform. You can read a discussion of this in our blog post.
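As a rough illustration of the pricing above, here is a minimal Python sketch of what repeated runs on one objective would cost at the quoted rate. The 200 credits per run and $1 per credit come from the post; the function name, the constants' names, and the example run counts are hypothetical, and the sketch ignores free tier usage and any future price change.

```python
# Illustrative budget arithmetic only; not part of any FutureHouse or Edison API.
KOSMOS_CREDITS_PER_RUN = 200  # stated in the post
USD_PER_CREDIT = 1.00         # stated in the post (current discounted price)


def kosmos_cost_usd(runs: int,
                    credits_per_run: int = KOSMOS_CREDITS_PER_RUN,
                    usd_per_credit: float = USD_PER_CREDIT) -> float:
    """Return the total cost in USD of `runs` Kosmos runs at a flat per-credit price."""
    if runs < 0:
        raise ValueError("runs must be non-negative")
    return runs * credits_per_run * usd_per_credit


if __name__ == "__main__":
    # The run counts below are arbitrary examples of sampling one objective several times.
    for runs in (1, 3, 5):
        print(f"{runs} run(s): ${kosmos_cost_usd(runs):,.2f}")
```

Because each run is billed at the same flat rate, sampling an objective several times scales the cost linearly with the number of runs.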
Huge congratulations to our team that put this together, led by Ludovico Mitchener and Michaela Hinks: Angela Yiu, Benjamin Chang, Sid Narayanan, Edwin Melville-Green, Albert Bou, Arvis Sulovari, Oz Wassie, Jon Laurent. A particular shout-out to Michael Skarlinski and his team that rebuilt the platform for this launch, especially Andy Cai, Richard Magness, Remo Storni, Tyler Nadolski, Mayk Caldas, Sam Cox, and more. This work would not have been possible without significant contributions from academic collaborators Mathieu Bourdenx, Eric Landsness, Dániel Barabási, Nicky Evans, Tonio Buonassisi, Bruna Gomes, Shriya Reddy, Martha Foiani, and Randall Bateman. We also want to thank our numerous supporters, especially Eric Schmidt, who has been a tremendous ally. We will have more to say about our supporters soon!

Sam Rodriques

731,411 views • 7 months ago

Today, we’re announcing the first major discovery made by our AI Scientist with the lab in the loop: a promising new treatment for dry AMD, a major cause of blindness. Our agents generated the hypotheses, designed the experiments, analyzed the data, iterated, even made figures for the paper. The resulting manuscript is a first-of-a-kind in the natural sciences, in which everything that needed to be done to write the paper was done by AI agents, apart from actually conducting the physical experiments in the lab and writing the final manuscript. We are also introducing Robin, the first multi-agent system that fully automates the in-silico components of scientific discovery, which made this discovery. As far as we are aware, this is the first time that hypothesis generation, experimentation, and data analysis have been joined up in a closed loop, and it is the beginning of a massive acceleration in the pace of scientific discovery that will be driven by these agents. We will be open-sourcing the code and data next week.
Robin is a multi-agent system that uses Crow, Falcon, and Finch, the agents on our platform, to generate novel hypotheses, plan experiments, and analyze data. We asked Robin to find a new treatment for dry age-related macular degeneration. Robin considered the disease mechanisms associated with dry AMD, proposed a specific experimental assay that could be used to evaluate hypotheses in the wet lab, and proposed specific molecules we could test in that assay. We tested the molecules and gave it the resulting data, which it analyzed before proposing more experiments. In the end, it identified Ripasudil, a Rho kinase inhibitor (ROCK inhibitor) that is approved in Japan for several other diseases, which seems very promising as a potential treatment for dry AMD. It also identified specific molecular mechanisms that might underlie the effects of Ripasudil in RPE cells, from an RNA sequencing experiment it proposed.
To be clear, no one has proposed using ROCK inhibitors to treat dry AMD in the literature before, as far as we can find, and I think it would have been very difficult for us to come up with this hypothesis without the agents. We have also run the proposed treatment by several experts in AMD, who confirm that it is interesting and novel. Moreover, this project was fast: with Robin in hand, the entire project took about 10 weeks, which is way shorter than it would have taken if we had been doing all of the in-silico components ourselves.
Important caveats: We are real biologists at FutureHouse, so I want to be clear that although the discovery here is exciting, we are not claiming that we have cured dry AMD. Fully validating this hypothesis as a treatment for dry AMD will take human trials, which will take much longer. Also, this discovery is cool, but it is not yet a "move 37"-style discovery. At the current rate of progress, I'm sure we will get to that level soon.
Congratulations to the team. Congratulations in particular to Robin, which generated the hypotheses, proposed the experiments, analyzed the data and generated the figures. And major congratulations also to the human team, which built Robin: Michaela Hinks, Ali Ghareeb, Benjamin Chang, Ludovico Mitchener, Mo Razzak, Kiki Szostkiewicz, and Angela Yiu.

Sam Rodriques

1,106,688 views • 1 year ago

Today, we are launching the first publicly available AI Scientist, via the FutureHouse Platform. Our AI Scientist agents can perform a wide variety of scientific tasks better than humans. By chaining them together, we've already started to discover new biology really fast. With the platform, we are bringing these capabilities to the wider community. Watch our long-form video, in the comments below, to learn more about how the platform works and how you can use it to make new discoveries, and go to our website or see the comments below to access the platform.
We are releasing three superhuman AI Scientist agents today, each with its own specialization: a general-purpose agent (Crow); an agent to automate literature reviews (Falcon); and an agent to answer the question “Has anyone done X before?” (Owl). We are also releasing an experimental agent, Phoenix, that has access to a wide variety of tools for planning experiments in chemistry. More on that below.
The three literature search agents (Crow, Falcon, and Owl) have shown superhuman performance on benchmarks. They also have access to a large corpus of full scientific texts, which means that you can ask them more detailed questions about experimental protocols and study limitations that general-purpose web search agents, which usually only have access to abstracts, might miss. Our agents also use a variety of factors to distinguish source quality, so that they don’t end up relying on low-quality papers or pop-science sources. Finally, and critically, we have an API, which is intended to allow researchers to integrate our agents into their workflows.
Phoenix is an experimental project we put together recently just to demonstrate what can happen if you give the agents access to lots of scientific tools. It is not better than humans at planning experiments yet, and it makes a lot more mistakes than Crow, Falcon, or Owl. We want to see all the ways you can break it!
The agents we are releasing today cannot yet do all (or even most!) aspects of scientific research autonomously. However, as we show in the video, you can already use them to generate and evaluate new hypotheses and plan new experiments way faster than before. Internally, we also have dedicated agents for data analysis, hypothesis generation, protein engineering, and more, and we plan to launch these on the platform in the coming months as well. Within a year or two, it is easy to imagine that the vast majority of desk work that scientists do today will be accelerated with the help of AI agents like the ones we are releasing today.
The platform is currently free to use. Over time, depending on how people use it, we may implement pricing plans. If you want higher rate limits, especially for research projects, get in touch.
Thanks to Michael Skarlinski, Andrew White 🐦‍⬛, Tyler Nadolski, Remo Storni, James Braza, Ludovico Mitchener, Michaela Hinks, as well as Jason Carman and his team for making such fantastic videos of us!

Sam Rodriques

724,382 views • 1 year ago

Today, we’re pushing a major update to Edison Analysis, our data analysis agent, which is tuned for scientific research and SOTA across data analysis benchmarks. In contrast to Kosmos, which runs for 6-12 hours and produces tens of thousands of lines of code, Edison Analysis runs for seconds to minutes and is best for specific, well-defined computational tasks. It is available both on our platform under the Analysis tab and via API, and costs only one credit per run, so it is available to users on both free and paid tiers. Edison Analysis is a modified version of the data analysis agent Kosmos uses in its trajectories. Try it out!
One of the most important improvements over our previous data analysis agents has been the addition of a specialized data retrieval tool. Edison Analysis can either use this tool to access data or pull data down directly via API. To evaluate this tool, we ranked the most commonly used public data repositories across recent papers from bioRxiv, and created a new benchmark that measures the ability of a language agent system to retrieve raw data from those sources. Edison Analysis scores 71% on this benchmark, and we’ll be working to increase this over time. You can read more about our benchmarks in our blog post, linked below.
Some features worth highlighting:
1. Edison Analysis produces a report on the analysis it runs, along with a Jupyter notebook that you can download to reproduce the analysis yourself. Every figure it produces is linked back to the specific lines of code used to produce the figure, to make it easy to reproduce.
2. It works well with both Python and R.
3. One of the best uses for Edison Analysis is to use it to retrieve datasets that you can then analyze with Kosmos.
We have a bunch of major improvements to Edison Analysis coming in the next few months that we’re excited to share. In the meantime, congratulations to the team, especially Ludovico Mitchener, Jon Laurent, Conor Igoe, Alex Andonian, and many more.

Sam Rodriques

61,760 views • 7 months ago
