Microsoft released a groundbreaking model that can be used for web automation, with an MIT license 🔥👏 OmniParser is a state-of-the-art UI parsing/understanding model that outperforms GPT-4V in parsing. 👏

473,069 views • 1 year ago • via X (Twitter)

9 comments

merve · 1 year ago

Model: An interesting highlight for me was the model's Mind2Web (a benchmark for web navigation) capabilities, which unlock agentic behavior for RPA agents. No need for hefty web automation pipelines that break when the website/app design changes! Amazing work.

merve · 1 year ago

Lastly, the authors also fine-tune this model on open-set detection of interactable regions to see if it can be used as a plug-in for VLMs, and it actually outperforms off-the-shelf open-set detectors like GroundingDINO. 👏

merve · 1 year ago

Here's a bunch of i/o examples for the model ⇓

Sar · 1 year ago

I saw your post, and it made me think of which I had just come across. I would be interested in hearing from @skyvernai about the possibility of using OmniParser to replace their current approach.

Lisan al Gaib · 1 year ago

Incorrect, it has an AGPL-3.0 license, since it is based on YOLOv8 by Ultralytics, which has an AGPL-3.0 license. You can use it commercially; however, your code must be publicly available, which makes it commercially unviable again.

Tarek Ayed · 1 year ago

It's weird to compare to GPT-4V, which is notoriously bad at image understanding and OCR, right? I'd be curious to know how it fares against 4o, Sonnet, Gemini Flash and Pro, etc.

Johannes · 1 year ago

Might be nice to combine with Anthropic's computer use.

Thread Reader App · 1 year ago

Your thread is going viral! #TopUnroll 🙏🏼@dl4senses for 🥇unroll

Didier Lacroix · 1 year ago

@threadreaderapp unroll
