we are excited to launch an experimental API focused on data extraction today at Induced. send a URL and a natural language query, get structured data back. no custom scraping scripts required. supports CSV, JSON and markdown, with more to come. free to use. examples below 👇

87,513 views • 2 years ago • via X (Twitter)

Comments: 11

aryan sharma • 2 years ago

1/ our extraction API docs are live on [link] and you can receive your API key on [link]. every extraction request includes a URL and a natural language query. you can optionally pass column names, an output format and the number of rows to capture.
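A rough sketch of how such a request body could be assembled. The field names (`url`, `query`, `columns`, `format`, `rows`) and the format strings are assumptions made for illustration; they are not taken from the Induced docs, and no endpoint is called here.

```python
import json
from dataclasses import dataclass
from typing import List, Optional

# Formats named in the launch post; the exact strings are assumed.
SUPPORTED_FORMATS = {"csv", "json", "markdown"}


@dataclass
class ExtractionRequest:
    """Hypothetical request shape: two required fields, three optional ones."""
    url: str
    query: str
    columns: Optional[List[str]] = None
    output_format: Optional[str] = None
    rows: Optional[int] = None

    def to_payload(self) -> dict:
        """Validate the request and return only the fields that were set."""
        if not self.url or not self.query:
            raise ValueError("url and query are required")
        if self.output_format is not None and self.output_format not in SUPPORTED_FORMATS:
            raise ValueError(f"unsupported format: {self.output_format}")
        if self.rows is not None and self.rows <= 0:
            raise ValueError("rows must be positive")
        payload = {"url": self.url, "query": self.query}
        if self.columns:
            payload["columns"] = list(self.columns)
        if self.output_format is not None:
            payload["format"] = self.output_format
        if self.rows is not None:
            payload["rows"] = self.rows
        return payload


request = ExtractionRequest(
    url="https://www.producthunt.com",
    query="all products with their name, maker and upvotes",
    columns=["name", "maker", "upvotes"],
    output_format="json",
)
body = json.dumps(request.to_payload())  # JSON string ready to send as a request body
```

Leaving unset optional fields out of the payload lets the service apply its own defaults, whatever those turn out to be.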

aryan sharma • 2 years ago

2/ once a request is sent, the API returns an ID for your extraction job. you can use the ID to poll status and get structured data output back when the job is completed. example: extracting all products on @producthunt with their name, maker and upvotes.
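The submit-then-poll flow described here might look roughly like the sketch below. The status strings (`"pending"`, `"running"`, `"completed"`, `"failed"`) and the response shape are assumptions, and `fake_fetch` is a stand-in for the real HTTP status call so the example runs offline.

```python
import time
from typing import Callable


def poll_job(
    fetch_status: Callable[[str], dict],
    job_id: str,
    interval: float = 5.0,
    timeout: float = 180.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the job reaches a terminal state or the timeout elapses.

    fetch_status is any callable returning a dict with a "status" key.
    The status values checked below are assumed, not documented.
    """
    deadline = time.monotonic() + timeout
    while True:
        job = fetch_status(job_id)
        status = job.get("status")
        if status == "completed":
            return job
        if status == "failed":
            raise RuntimeError(f"job {job_id} failed")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} still {status!r} after {timeout}s")
        sleep(interval)


# Placeholder responses standing in for real status calls.
_responses = iter([
    {"status": "pending"},
    {"status": "running"},
    {"status": "completed", "data": [{"name": "Example", "maker": "someone", "upvotes": 1}]},
])


def fake_fetch(job_id: str) -> dict:
    return next(_responses)


result = poll_job(fake_fetch, "job-123", interval=0.0, sleep=lambda s: None)
```

Passing the status call in as a function keeps the loop independent of any particular HTTP client, and a fixed timeout stops a stuck job from polling forever.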

aryan sharma • 2 years ago

3/ this API is great for extracting structured data from unstructured web pages.
- extract trending repositories from github.
- extract most active stocks on google finance.
- extract top 5 videos from youtube trending.

40-60 seconds per task on average.

aryan sharma • 2 years ago

4/ we don't handle pagination or authenticated pages yet - but we'll be releasing a more configurable version soon. browser agents are super powerful for data extraction tasks and we want to help more devs use them. please share feedback! discord:

Saurabh Kumar • 2 years ago

Really nice work. But how do you do data validation, meaning validating that it actually got the data you requested? Here there's a "name" field with "trending repos list": what if it fetched something like "popular repos" instead of trending, despite trending being available (but on a separate route or behind a click event)? Data extraction has to be deterministic, because the one thing you have to be absolutely sure about is the data. N runs of the same script shouldn't return N different scraping outputs, as they can with stochastic embedding.
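One client-side way to approach the concern above, sketched under assumptions: check each returned row against the requested columns, then compare repeated runs for agreement. This is not how Induced validates output (the thread does not say); the helper names and the sample rows are hypothetical.

```python
import json
from typing import Dict, List

Row = Dict[str, object]


def schema_problems(rows: List[Row], columns: List[str]) -> List[str]:
    """Return human-readable problems; an empty list means the shape is fine."""
    problems = []
    if not rows:
        problems.append("no rows returned")
    for i, row in enumerate(rows):
        # A column counts as missing if it is absent, None, or an empty string.
        missing = [c for c in columns if row.get(c) in (None, "")]
        extra = sorted(set(row) - set(columns))
        if missing:
            problems.append(f"row {i}: missing {missing}")
        if extra:
            problems.append(f"row {i}: unexpected {extra}")
    return problems


def canonical(rows: List[Row]) -> List[str]:
    """Order-insensitive form of a result set, for comparing runs."""
    return sorted(json.dumps(row, sort_keys=True) for row in rows)


def runs_agree(runs: List[List[Row]]) -> bool:
    """True when every run produced the same set of rows."""
    if not runs:
        return True
    first = canonical(runs[0])
    return all(canonical(run) == first for run in runs[1:])


# Placeholder rows, not real extraction output.
run_a = [{"name": "repo-one", "stars": 10}, {"name": "repo-two", "stars": 7}]
run_b = [{"name": "repo-two", "stars": 7}, {"name": "repo-one", "stars": 10}]
run_c = [{"name": "repo-three", "stars": 3}]

consistent = runs_agree([run_a, run_b])    # same rows, different order
inconsistent = runs_agree([run_a, run_c])  # different rows
```

Checks like these catch structural drift and run-to-run disagreement. They cannot tell whether a consistent result came from the "trending" list or the "popular" one; that still needs a check against a known source.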

Alessio Fanelli • 2 years ago

@inducedai @AlexReibman

Musthaq • 2 years ago

@inducedai I can see the video is actually clipped from the time it takes to process the request. Assuming you are launching a headless browser, capturing a screenshot, parsing the HTML, AI request to query it, how much time does it usually take to complete this request?

aryan sharma • 2 years ago

@inducedai 30-60s on avg, sometimes more depending on the data. but we run this as an async process, so you can poll for completion status instead of waiting.

Harsh Agrawal | itsharshag.com • 2 years ago

@inducedai can we do this with PDFs?

calix • 2 years ago

@inducedai love this

aryan sharma • 2 years ago

@inducedai thanks calix!
