
Kevin Lin
@KevinQHLin • 2,415 subscribers
multimodal x agent x next postdoc @UniofOxford visiting @Stanford phd @NUSingapore | ex @Meta @Microsoft

🌟Introducing 🎻Violin, an open-source video translation skill. 📹Video is the dominant medium on the internet, yet most high-quality content (lectures, talks, podcasts) is locked behind a single language, leaving global audiences behind. So we built Violin: a video skill that combines speech recognition, LLM translation, and speech synthesis into one seamless pipeline. 🌐 Demo: 📝 Blog: 🔗 GitHub: ✨Key Features: 🎙️High-quality multilingual ASR, translation, and TTS. 🗣️Personalized translation and voice (turn an academic talk into something children can follow). 💬Chat with the video: ask any question and get answers grounded in the video. 🧩Supports web app, CLI, and agent skill. 🍃Fully open-source under MIT. ❤️Built with the wonderful Shang Zhu and advised by James Zou! All features powered by Together AI. Try it and let us know what you think! 🎻
Kevin Lin • 138,241 views • 1 month ago

can AI write engaging news that people can trust? introducing ✨Data2Story: a data journalist agent. give it raw data, and it generates a verifiable, multimodal article. 🔍verifiable: every claim is evidence-grounded and traces back to data, code, or a cited source. 🔮multimodal: the article is a generative UI with images, videos, audio, and interactive charts. not just readable, but trustworthy and playable. 🧵1/N
Kevin Lin • 25,744 views • 16 days ago

Thanks AK for sharing our work!! 🤔Today’s video generation models (e.g., Veo3, Sora) are great at realism, but they still struggle to convey structured knowledge and logical teaching. 🌟Code2Video🌟 takes a different path: starting from Python Manim code, it renders project-level programs into educational videos, bridging coding, visualization, and knowledge! 📷 Code: 🏠 Website: 📄 arXiv: We want to share our gratitude to Grant Sanderson and @manim_community!!! Thanks to the great team, Anno Yanzhe Chen and Mike Shou! #VIDEO #education #Sora2
Kevin Lin • 32,332 views • 9 months ago