Vaibhav (VB) Srivastav

@reach_vb • 52,627 subscribers

founder mode @OpenAI | ex @huggingface | F1 fan | Here for @at_sofdog’s wisdom | *opinions my own

Shorts

Let’s fucking goo!! DeepSeek R1 1.5B running FULLY LOCALLY in your browser at 60 tok/sec powered by WebGPU🔥 Intelligence truly is too cheap to meter! ⚡️

973,147 views

Codex is taking over the world 🌍 We’ve got 15 community events in the next 10 days: 13 Jun - Hyderabad, Jakarta 14 - Pune, Tel Aviv 18 - Athens, Paris x2, Sydney, Warsaw 19 - Amsterdam, Singapore 20 - Hanoi, Miami, Vienna 22 - Ghent Where should we go next? ;)

66,349 views

HOLY SHITT, Sesame Labs just dropped CSM (Conversational Speech Model) - Apache 2.0 licensed! 💥 > Trained on 1 MILLION hours of data 🤯 > Contextually aware, emotionally intelligent speech > Voice cloning & watermarking > Ultra fast, real-time synthesis > Based on llama architecture & Mimi like decoder > Apache 2.0 licensed > Weights on the Hub So cool to see such a strong Speech backbone out in the wild! Kudos Sesame team! 🤗

684,874 views

LMAO Qwen 2.5 VL can perform Computer Use, out of the box, taking on OpenAI Operator HEAD ON! 🐐

192,950 views

HOLY SHIT - generate 3D mesh from a single image in LESS THAN A SECOND 🤯

154,108 views

That's an ElevenLabs-level TTS, fully open-source, running on consumer devices!

142,488 views

Fuck yeah! MaskGCT - New open SoTA Text to Speech model! 🔥 > Zero-shot voice cloning > Emotional TTS > Trained on 100K hours of data > Long form synthesis > Variable speed synthesis > Bilingual - Chinese & English > Available on Hugging Face Fully non-autoregressive architecture: > Stage 1: Predicts semantic tokens from text, using tokens extracted from a speech self-supervised learning (SSL) model > Stage 2: Predicts acoustic tokens conditioned on the semantic tokens. Synthesised: "Would you guys personally like to have a fake fireplace, an electric one, in your house? Or would you rather have a real fireplace? Let me know down below. Okay everybody, that's all for today's video and I hope you guys learned a bunch of furniture vocabulary!" TTS scene keeps getting lit! 🐐

139,085 views

Google released an app that allows you to run LLMs from Hugging Face, fully privately and 100% local 🔥 > Generate code on-the-fly > Chat with images > Supports multi-turn conversations > Choose any model from Hugging Face > Based on LiteRT 🔥 > Sign in with HF Support for iOS coming soon! - exciting times for LiteRT and LocalLlama community! 💥

64,514 views

Introducing Distil-Whisper v3 ⚡ > ~50% fewer parameters and 6x faster than Large-v3. > More accurate than large-v3 on long-form transcription. Available with 🦀 WebGPU, Whisper.cpp, Transformers, Faster-Whisper and Transformers.js support! Drop in; no changes are required! 🔥

90,651 views

Whisper running on WatchOS! 🔥 > Powered by WhisperKit by Argmax > Supports up to Whisper base > Leverages Neural Engine ⚡ > Three lines of code ;) > Works real-time! > MIT license Quite amazed by the speed with which Argmax is shipping. Possibly the fastest & most reliable way to run Whisper on Apple devices!

85,363 views

This is RIDICULOUSLY good, TRELLIS 3D Generation model by Microsoft! 🔥 Generate high-quality 3D assets from text or image prompts. Supports various formats like Radiance Fields, 3D Gaussians, and meshes. Available for FREE on Hugging Face!

62,437 views

BOOOOM! you can now use the latest DeepSeek Prover V2 directly on the model page powered by Novita AI 🔥 Open Source FTW! 💥

31,260 views

PaliGemma 2 running 100% local, on-device, powered by MLX 🔥

30,370 views

Let’s fucking goooo, starting today you can directly try out AI models on FREE Colab notebooks from Hugging Face 🔥 Continuing with our mission to make AI accessible to the masses - we’re excited to support Colaboratory for fast exploration and rapid prototyping! BONUS: you can put a custom “notebook.ipynb” in your model repo and we’ll serve that directly!

21,171 views

Videos

0:39

Starting today you can use Codex in Claude Code 👀 /plugin marketplace add openai/codex-plugin-cc Try it out today with: /codex:review for a normal read-only Codex review /codex:adversarial-review for a steerable challenge review /codex:rescue to let codex rescue your code Enjoy Codex-ing!

Vaibhav (VB) Srivastav

953,951 views • 3 months ago

0:52

GPT 5.6 Sol is pretty GOATED at making games, it made an entire rollercoaster simulator with textures and assets included! It all started with a simple prompt in a /goal and a few artistic directions from my side. It's still not complete and there are some UI inconsistencies, but you can go pretty far with a broad idea and feedback along the way! Goes to show, you're truly bounded by your own ambition. Try it out yourself 🤗

Vaibhav (VB) Srivastav

49,626 views • 7 days ago

0:57

Stoked to announce that Chrome, Computer Use, Memory, Chronicle and more are rolling out across the EU, EEA and UK this week! 🔥 Codex can now use apps across your Mac, automate workflows in Chrome and remember context across your work. Open your Codex and prompt away!

Vaibhav (VB) Srivastav

94,838 views • 1 month ago

0:24

Super excited to host our first OpenAI Developer Office Hours tomorrow! We’ll cover everything new across Codex and the OpenAI platform - /goal, mobile, plugins, Amazon Bedrock and more! Followed by a live AMA. Come with questions. Come with ideas. Come one, come all!

Vaibhav (VB) Srivastav

47,768 views • 1 month ago

1:54

NEW: Kokoro 82M - APACHE 2.0 licensed, Text to Speech model, trained on < 100 hours of audio 🔥

Vaibhav (VB) Srivastav

330,034 views • 1 year ago

1:21

HOLY FUCK! Zyphra just dropped Zonos - Apache 2.0 licensed, Multilingual, Text to Speech model with INSTANT voice cloning! 🔥 > Zero-shot TTS with Voice Cloning: Input text and a 10-30 second speaker sample to generate high-quality text-to-speech output > Audio Prefix Inputs: Enhance speaker matching by adding an audio prefix to the text, enabling behaviors like whispering that are hard to achieve with voice cloning alone > Multilingual Support: Supports English, Japanese, Chinese, French, and German > Audio Quality & Emotion Control: Fine-tune speaking rate, pitch, frequency, audio quality, and emotions (e.g., happiness, anger, sadness, fear) > Fast Performance: Runs at ~2x real-time speed on an RTX 4090 > Available on the Hugging Face Hub 🤗

Vaibhav (VB) Srivastav

298,858 views • 1 year ago

0:38

Fuck it! You can now run any GGUF on the Hugging Face Hub directly with ollama 🔥 This has been a constant ask from the community, starting today you can point to any of the 45,000 GGUF repos on the Hub* *Without any changes whatsoever! ⚡ All you need to do is: ollama run hf.co/{username}/{reponame}:latest For example, to run the Llama 3.2 1B, you can run: ollama run hf.co/bartowski/Llama-3.2-1B-Instruct-GGUF:latest If you want to run a specific quant, all you need to do is specify the Quant type: ollama run hf.co/bartowski/Llama-3.2-1B-Instruct-GGUF:Q8_0 That's it! We'll work closely with Ollama to continue developing this further! ⚡

Vaibhav (VB) Srivastav

317,790 views • 1 year ago
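
The reference format in the ollama post above (hf.co/{username}/{reponame} followed by a colon and either latest or a quant type) can be assembled programmatically. What follows is only a minimal sketch: the helper names `hf_gguf_ref` and `ollama_run_command` are hypothetical and not part of ollama or the Hugging Face Hub, and the validation rule is an assumption made for illustration.

```python
import shlex


def hf_gguf_ref(username: str, reponame: str, tag: str = "latest") -> str:
    """Build the hf.co/{username}/{reponame}:{tag} reference quoted in the post.

    Hypothetical helper; the check below is an illustrative assumption,
    not a rule taken from ollama itself.
    """
    for part in (username, reponame, tag):
        # Reject empty pieces and separators that would change the reference shape.
        if not part or "/" in part or ":" in part:
            raise ValueError(f"invalid component: {part!r}")
    return f"hf.co/{username}/{reponame}:{tag}"


def ollama_run_command(username: str, reponame: str, tag: str = "latest") -> list:
    """Return the argv list for the `ollama run` invocation shown in the post."""
    return ["ollama", "run", hf_gguf_ref(username, reponame, tag)]


if __name__ == "__main__":
    # Same repo and quant type as the example in the post.
    cmd = ollama_run_command("bartowski", "Llama-3.2-1B-Instruct-GGUF", "Q8_0")
    print(shlex.join(cmd))
```

Returning an argv list rather than a single string keeps the command safe to hand to `subprocess.run` without shell quoting concerns.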

1:22

Fuck yeah! Llama 3.2 3B running on your browser! 100% local, powered by WebGPU & MLC 🦙

Vaibhav (VB) Srivastav

282,006 views • 1 year ago

1:21

idk what your AGI definition is but subagents & computer use in codex is pretty close!! *video in realtime

Vaibhav (VB) Srivastav

48,332 views • 3 months ago

0:29

Excited to announce the Codex App: run multiple projects and threads in one focused app! 🔥 The app natively packs a lot of features making it easier to maximise your productivity: > Worktree mode keeps changes isolated - parallel tasks without touching your checkout > Automations run in background worktrees and drop findings into your inbox > Built‑in Git review: diff, stage/revert hunks, inline comments > Integrated terminal for test, lint, git - no need to switch apps > Voice dictation: hold Ctrl+M and speak your prompt > Skills + slash commands for faster workflows. > IDE sync with auto context - ask about files you’re viewing > Local / Worktree / Cloud modes - choose where tasks run > Shared MCP config across app/CLI/IDE Bonus: For a limited time, we've doubled the rate limits across the tiers from Free all the way to Enterprise! Enjoy! 🤗

Vaibhav (VB) Srivastav

72,678 views • 5 months ago

5:00

Fuck it, 685B parameter, DeepSeek V3 0324 running locally on M3 Ultra, fully private 🔥 Powered by llama.cpp & dynamic quants from Unsloth AI ⚡ Step 1: brew install llama.cpp Step 2: llama-cli -hf unsloth/DeepSeek-V3-0324-GGUF:Q2_K_XL That's it! 🤗 Honestly a bit surreal to be able to chat with such a chunky model at the touch of the keyboard - future is going to be wild!!

Vaibhav (VB) Srivastav

168,141 views • 1 year ago
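
For a rough sense of why aggressive quantization matters for the 685B parameter model in the post above, the weight footprint can be estimated as parameters times bits per weight divided by 8. This is a back-of-envelope sketch only: the function name `quantized_size_gb` is hypothetical, and the bits-per-weight values are illustrative assumptions rather than the measured average of the Q2_K_XL file, which mixes quant types across tensors and also carries metadata.

```python
def quantized_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Estimate weight storage in decimal gigabytes (1 GB = 1e9 bytes).

    Ignores KV cache, activations and file metadata, so treat the result
    as a lower bound on the memory actually needed.
    """
    if n_params <= 0 or bits_per_weight <= 0:
        raise ValueError("n_params and bits_per_weight must be positive")
    return n_params * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    n_params = 685e9  # parameter count quoted in the post
    # Assumed average bits per weight, chosen for illustration only.
    for bits in (16.0, 8.0, 4.0, 2.5):
        print(f"{bits:>4} bits/weight: {quantized_size_gb(n_params, bits):.1f} GB")
```

Under these assumptions, halving the average bits per weight halves the estimated footprint, which is the difference between a model that needs a multi-GPU server and one that fits in the unified memory of a single workstation.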

1:28

This tweet was sent by Codex via Computer Use

Vaibhav (VB) Srivastav

42,557 views • 3 months ago

0:44

We got a fully open source, end-to-end, conversational AI that you can run on a MacBook Air before Multi-modal GPT4o!

Vaibhav (VB) Srivastav

209,098 views • 1 year ago

0:33

ICYMI: You can use Voice transcription in both Codex App as well as the CLI! 🎙️ Press the mic button or hit `Ctrl + M` and talk away! Available to 100% of the codex users :)

Vaibhav (VB) Srivastav

52,795 views • 4 months ago

0:30

BOOM! Microsoft just released an upgraded VibeVoice Large ~10B Text to Speech model - MIT licensed 🔥 > Generate multi-speaker podcasts in minutes ⚡ > Works blazingly fast on ZeroGPU with H200 (FREE) Try it out today!

Vaibhav (VB) Srivastav

89,581 views • 10 months ago

0:41

Kyutai released their Streaming Text to Speech model, ~2B param model, ultra low latency (220ms), CC-BY-4.0 license 🔥 Trained on 2.5 Million Hours of audio, it can serve up to 32 users w/ less than 350ms latency on a SINGLE L40 🤯 Incredible release by kyutai folks, go check out their hugging face page now!

Vaibhav (VB) Srivastav

93,512 views • 1 year ago

0:34

MARS5 TTS: Open Source Text to Speech with insane prosodic control! 🔥 > Voice cloning with less than 5 seconds of audio > Two stage Auto-Regressive (750M) + Non-Auto Regressive (450M) model architecture > Used BPE tokenizer to enable control over punctuations, pauses, stops etc. > AR model predicts L0 coarse tokens, refined further by the NAR DDPM model followed by the vocoder Great job Camb AI team! Kudos for open sourcing the artifacts - looking forward to what comes next ;)

Vaibhav (VB) Srivastav

162,180 views • 2 years ago

0:24

new pet, who dis?

Vaibhav (VB) Srivastav

20,519 views • 2 months ago
