Xenova

@xenovacom · 17,207 subscribers

Bringing the power of machine learning to the web. Currently working on Transformers.js (@huggingface 🤗)

Shorts

Opus 4.7 just wrote a custom WebGPU kernel that runs Qwen3.5 up to 13x faster using a fused LinearAttention op! 🤯 Agentic kernel optimization is the future. Now live in 🤗 Transformers.js v4.2.0! P.S. I've updated all our previous demos to use this new version. Enjoy!

77,700 views

Introducing 🤗 Transformers.js v4: state-of-the-art machine learning for the web!
🚀 New WebGPU backend (browser, Node.js, Bun, Deno)
⚡️ Huge performance improvements
🤯 Support for over 200 architectures
🛠️ Complete codebase refactor
Learn more about our biggest release yet! 👇

69,311 views

There has been a huge debate recently about the best approach for image background removal. Here's my attempt:
- In-browser inference w/ 🤗 Transformers.js
- WebGPU accelerated (fast!)
- Costs $0 (no image hosting or server processing)
- No data leaves your device (privacy!)

417,357 views

Inspired by Andrej Karpathy's microgpt, I built microgpt.js: a JavaScript port that runs entirely in your browser! It's an exact numerical implementation, so the randomness and outputs match bit-for-bit! Try it out yourself and train your own GPT by simply opening a webpage! 👇

48,286 views

Run OpenAI's new Whisper Turbo model 100% locally in your browser with Transformers.js! ⚡️ Transcribe 2 minutes of audio in ~12 seconds! 🤯 Demo + source code 👇

137,002 views

IBM just released Granite 4.0 1B Speech, a compact and efficient speech-language model, designed for multilingual speech recognition and bidirectional speech translation. New #1 on the OpenASR leaderboard! It can even run in your browser on WebGPU, thanks to 🤗 Transformers.js

21,231 views

Meta's Segment Anything Model (SAM) can now run in your browser w/ WebGPU (+ fp16), meaning up to 8x faster image encoding (10s → 1.25s)! 🤯⚡️ Video is not sped up! Everything runs 100% locally thanks to 🤗 Transformers.js and onnxruntime-web! 🔗 Demo:

120,352 views

I'm excited to announce that Transformers.js V3 is finally available on NPM! 🔥 State-of-the-art Machine Learning for the web, now with WebGPU support! 🤯⚡️ Install it from NPM with: npm i @huggingface/transformers or via CDN (example below) 👇

87,338 views
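
For readers who want to try the WebGPU support mentioned in the V3 post, here is a minimal sketch of what loading a model on WebGPU might look like in the browser. It is an illustration under assumptions, not the example from the original post: `pickDevice` and `createExtractor` are hypothetical helper names, the feature-extraction task and the `Xenova/all-MiniLM-L6-v2` model ID are picked only for demonstration, and the `'wasm'` fallback is aimed at browsers without WebGPU.

```javascript
// Hypothetical helper (not part of Transformers.js): pick a backend by
// feature-detecting WebGPU on a navigator-like object.
function pickDevice(nav) {
  return nav && nav.gpu ? 'webgpu' : 'wasm';
}

// Sketch: create a pipeline on the detected device.
// Assumes the package was installed with `npm i @huggingface/transformers`.
async function createExtractor() {
  // Dynamic import, so the package is only resolved when this function is called.
  const { pipeline } = await import('@huggingface/transformers');
  const device = pickDevice(globalThis.navigator);
  // Task and model ID are illustrative assumptions.
  return pipeline('feature-extraction', 'Xenova/all-MiniLM-L6-v2', { device });
}
```

Calling `const extractor = await createExtractor()` and then `await extractor('Hello world', { pooling: 'mean', normalize: true })` should yield a sentence embedding. The model files are downloaded on first use.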

New Andrej Karpathy video just dropped! 😍🔥 After watching, if you want to learn more about how different models (e.g., GPT4, Llama, T5, BERT) tokenize text, check out "The Tokenizer Playground": a web-app I built a few months ago with 🤗 Transformers.js! 🔗

75,909 views

WOW! 😍 Sakana AI just released TinySwallow, a 1.5B Japanese LLM created through TAID (Temporally Adaptive Interpolated Distillation), a new knowledge distillation technique. 🐦 hardmaru even spoke about in-browser use on Bloomberg, "we show that [TinySwallow] works entirely inside of a smartphone or entirely inside your web browser, without it calling an API." Love it! 🔥 I recreated their demo with 🤗 Transformers.js, and it's up to 2x faster (50-60 tokens per second on an M3 Max)! Check it out! 👇

17,269 views

WOW! 🤯 Language models are becoming smaller and more capable than ever! Here's SmolLM2 running 100% locally in-browser w/ WebGPU on a 6-year-old GPU. Look at that speed! ⚡️😍 Powered by 🤗 Transformers.js and ONNX Runtime Web! How many tokens/second do you get? Let me know! 👇

12,557 views

Videos