When evaluated against the WebVoyager benchmark, which tests agent performance on end-to-end, real-world web tasks, Project Mariner achieved a state-of-the-art result of 83.5% in a single-agent setup.
23,958 views • 1 year ago • via X (Twitter)
