launching our open source OCR tool today! try it out with some terrible pdfs and let me know how it goes:

Tyler Maran

1,428 subscribers

342,540 views • 1 year ago • via X (Twitter)

Science & Technology, Education

10 comments

Karthik Kannan · 1 year ago

how are you dealing with hallucinations if you're using GPT to OCR the text? while great for some use cases, i've noticed hallucinations when it comes to dense text at scale. are there any benchmarks you're comparing against?

Tyler Maran · 1 year ago

hey Karthik. we're working pretty hard on a benchmark right now. Noticed the same issues with dense text (sideways text too). It's one of the big steps we need to take as we start on a fine-tuning dataset. I'll be sharing as we make progress!

Bryan Lee · 1 year ago

Waited a few minutes and nothing happened.

Tyler Maran · 1 year ago

hey Bryan! sorry got more traffic than we expected and we had to up our limits. should be good to go now!

Dan Manastireanu · 1 year ago

You should check out General OCR Theory, a specialized LLM for OCR 2.0. Unfortunately, they don't have a gpt-4o benchmark for comparison, but the results are pretty impressive.

Daniel Tenner · 1 year ago

@ycombinator I’m literally just about to implement structured document OCR for accounts, tax computations and other similar docs, for my biz’s internal CRM. I thought “oh great I’ll use this instead of Textract”. But I can’t because of the “book a demo” thing. And no pricing. 😕

Harshil Prajapati · 1 year ago

@ycombinator Make it open source

Tyler Maran · 1 year ago

@ycombinator ... it is open source 👆

Tom Osman 🐦‍⬛ · 1 year ago

trying now with some different ones but it doesn't seem to be working? Using Arc if that helps at all.

Tyler Maran · 1 year ago

we got a lot of traffic and had to up our limits! should be good to go now!
