
机器之心 JIQIZHIXIN
@jiqizhixin • 18,258 subscribers
China's leading media & information provider for #AI & #MachineLearning
Videos

Xmax X1, the first real-time interactive video model, is here. Powered by autoregressive streaming generation, X1 achieves millisecond-level latency and infinite-length generation, enabling truly natural spatial interaction. The camera is redefined. It’s no longer just a lens, but a magic wand that breaks the barrier between dimensions. Summon virtual beings into your reality and interact with them in real time. A live implementation video is below. Truly amazing work! Definitely one to watch.
机器之心 JIQIZHIXIN • 91,687 views • 3 months ago

This is huge! A UCLA team managed to build an optical generative model that runs on light instead of GPUs. In their demo, a shallow encoder maps noise into phase patterns, which a free-space optical decoder then transforms into images—digits, fashion, butterflies, faces, even Van Gogh–style art—without any computation during synthesis. ⚡ The results rival digital diffusion models, pointing to ultra-fast, energy-efficient AI powered by photonics. Optical generative models | Nature Paper:
机器之心 JIQIZHIXIN • 173,487 views • 8 months ago

What if you could train AI agents on a laptop as easily as on a GPU cluster? Researchers from UIUC's U Lab, led by Prof. Jiaxuan You, just open-sourced OpenTinker. It's a new "Reinforcement-Learning-as-a-Service" (RLaaS) system that decouples the complex training pipeline into simple, distributed services with friendly APIs. The result? It breaks down the major engineering barriers to RL, outperforming traditional frameworks in accessibility and ease of deployment, finally making agent training viable for more developers and teams. Project: Code: U Lab: Our report: 📬 #PapersAccepted by Jiqizhixin
机器之心 JIQIZHIXIN • 15,862 views • 4 months ago

Speed and quality can finally coexist in diffusion-based language generation. Introducing DiDi-Instruct, a Discrete Diffusion Divergence Instruct method that distills a pre-trained discrete diffusion language model (dLLM) into a few-step student for ultra-fast generation. Built on integral KL-divergence minimization, DiDi-Instruct achieves up to 64× faster decoding, surpasses both its teacher and GPT-2, and cuts training time by 20×. Ultra-Fast Language Generation via Discrete Diffusion Divergence Instruct Paper: Code: Project: Our report: 📬 #PapersAccepted by Jiqizhixin
机器之心 JIQIZHIXIN • 18,126 views • 7 months ago

Wow, we can steer diffusion models at inference time! Introducing Diffusion Tree Sampling (DTS): a search-based approach inspired by Monte Carlo Tree Search that turns inference into an anytime, reward-guided optimization process. DTS produces asymptotically exact samples from the target distribution in the limit of infinite rollouts, and its greedy variant, Diffusion Tree Search (DTS⋆), performs a global search for high-reward samples. The results are pretty impressive:
- On MNIST and CIFAR-10 class-conditional generation, DTS matches the FID of the best-performing baseline with up to 10× less compute.
- In text-to-image generation and language completion tasks, DTS⋆ effectively searches for high-reward samples that match best-of-N with up to 5× less compute.
机器之心 JIQIZHIXIN • 19,037 views • 11 months ago