Ettore Di Giacinto

@mudler_it · 3,215 subscribers

dad, creator of LocalAI (https://t.co/ReVYw5Pf4D) and Kairos (https://t.co/R6M51FYVs7), ex @SUSE/@Rancher, ex-Gentoo Dev.

Shorts

NVIDIA just dropped Nemotron-3.5-ASR: one 0.6B model, 40+ languages, streaming. parakeet.cpp already runs it. On a plain CPU it is 2.5x faster than NVIDIA AI's NeMo runtime, with byte-for-byte identical output (WER 0). No GPU needed. Offline or real-time. Pick a language with --lang, or use auto. GPU numbers comparing against the NeMo framework are coming.

76,296 views

parakeet.cpp, from the LocalAI team: native C++/ggml inference for NVIDIA AI Developer's Parakeet, one of the best speech-to-text models out there. It supports every Parakeet model (TDT/CTC/RNNT/hybrid + cache-aware streaming) with byte-for-byte identical output to NeMo, and now runs anywhere with no Python, even a bit faster, on CPU and GPU. Quantized GGUF on Hugging Face 🤗 Huge thanks to Georgi Gerganov for ggml and to NVIDIA AI Developer for releasing Parakeet! 🧵

55,136 views