This assistant has 169 lines of code:

• Gemini Flash
• OpenAI Whisper
• OpenAI TTS API
• OpenCV

GPT-4o is slower than Flash, more expensive, chatty, and very stubborn (it doesn't like to stick to my prompts).

Next week, I'll post a step-by-step video on how to build this.
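The post only names the four components, so here is a rough sketch of how they might fit together in a single turn. It is not the author's code: the `Assistant` class, its field names, and `demo` are hypothetical, and each stage is a plain callable standing in for the real client (Whisper, OpenCV, Gemini Flash, the OpenAI TTS API), so nothing here touches the network.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Assistant:
    """One conversational turn: listen, look, think, speak.

    Every field is a hypothetical stand-in, not a real client:
      transcribe    -> speech-to-text (Whisper in the post)
      capture_frame -> grab a webcam image (OpenCV in the post)
      answer        -> multimodal model call (Gemini Flash in the post)
      speak         -> text-to-speech (OpenAI TTS API in the post)
    """

    transcribe: Callable[[bytes], str]
    capture_frame: Callable[[], bytes]
    answer: Callable[[str, bytes], str]
    speak: Callable[[str], None]

    def handle_turn(self, audio: bytes) -> str:
        question = self.transcribe(audio)      # audio -> text
        frame = self.capture_frame()           # what the camera sees right now
        reply = self.answer(question, frame)   # text + image -> text
        self.speak(reply)                      # text -> audio
        return reply


def demo() -> tuple[str, list[str]]:
    """Run one turn with fake stages so the flow can be checked offline."""
    spoken: list[str] = []
    assistant = Assistant(
        transcribe=lambda audio: audio.decode("utf-8"),
        capture_frame=lambda: b"fake-jpeg-bytes",
        answer=lambda q, img: f"You asked: {q} ({len(img)} image bytes)",
        speak=spoken.append,
    )
    reply = assistant.handle_turn(b"what am I holding?")
    return reply, spoken
```

Keeping each stage behind a callable is one way to swap Gemini Flash for GPT-4o, or the hosted TTS for a local voice, without touching the loop.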

90,296 views • 2 years ago • via X (Twitter)

Comments: 10

Santiago • 2 years ago

The first request takes longer (warming up), but things work faster from that point. A few opportunities to improve this:

1. Stream answers from the model (instead of waiting for the full answer).
2. Add the ability to interrupt the assistant.
3. Run Whisper on a GPU.
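Improvements 1 and 2 above could look something like the sketch below. It assumes the model client hands back text in small chunks and that some `speak` callable plays one sentence at a time; both are assumptions, and the function names `sentences_from_stream` and `speak_stream` are made up for illustration. The idea is to start speaking each sentence as soon as it is complete, and to check a `threading.Event` between sentences so the user can cut the assistant off.

```python
import re
import threading
from typing import Callable, Iterable, Iterator

# A sentence ends at '.', '!' or '?' followed by whitespace.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def sentences_from_stream(chunks: Iterable[str]) -> Iterator[str]:
    """Yield complete sentences as soon as they appear in a chunked stream."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        parts = _SENTENCE_END.split(buffer)
        # The last part may be an unfinished sentence; keep it buffered.
        buffer = parts.pop()
        for sentence in parts:
            sentence = sentence.strip()
            if sentence:
                yield sentence
    # Flush whatever is left once the stream ends.
    if buffer.strip():
        yield buffer.strip()


def speak_stream(
    chunks: Iterable[str],
    speak: Callable[[str], None],
    stop: threading.Event,
) -> list[str]:
    """Speak sentence by sentence; stop before the next one once `stop` is set."""
    spoken: list[str] = []
    for sentence in sentences_from_stream(chunks):
        if stop.is_set():
            break
        speak(sentence)
        spoken.append(sentence)
    return spoken
```

In a real assistant, `stop.set()` would be called from another thread, for example when the microphone picks up new speech. The regex split is deliberately naive and will break on abbreviations such as "e.g."; it is only meant to show the shape of the loop.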

Santiago • 2 years ago

Unfortunately, no local model supports text+images (as far as I know), so I'm stuck running online models. The TTS API (synthesizing text to audio) can also be replaced by a local version. I tried, but the available voices suck (too robotic), so I kept OpenAI's.

Santiago • 2 years ago

I wonder if OpenAI's assistant uses the people's API or if they have a special, secret, much faster version powering their app. I wouldn't be surprised if they have VIP access. They can have ++ bandwidth with the model for faster responses.

Hesam • 2 years ago

169 lines of code is what we used to have to just begin with coding 😁 great job 👌🏻

Santiago • 2 years ago

We've come far!

BluFor • 2 years ago

Time to put this into a plastic gadget and raise $100 million

Santiago • 2 years ago

I'm posting it for free online for those who want to raise the money. Remember me when you make it!

Nicolaj B Andersen • 2 years ago

Out of curiosity, how did it know what “small text” you were referring to? There is also some small text above the small figure.

Santiago • 2 years ago

No idea. It picked one and read it. On a different test, I pointed at a line of text (the top one) and asked it to read that one, and it worked.

ZAZO • 2 years ago

Amazing work @svpino
