
Wildminder
@wildmindai • 10,137 subscribers
Physicist, Programmer, Designer

More cool stuff from NVIDIA: LocateAnything, a high-speed visual search engine. You provide a text prompt and it instantly pinpoints the named object's exact location in an image.
- 10x speedup for dense object detection
- Qwen2.5-3B + Moon-ViT
- Fast/Slow/Hybrid modes
- trained on 138M samples for UI, docs, and generic grounding
Wildminder • 50,143 views • 8 days ago

PanoWorld. An interesting way to use Qwen-Edit. It converts 2D floor plans into photorealistic, consistent VR home tours. Great for real estate and interior designers: it lets you walk through a home that hasn't been built or furnished yet. Ensures seamless 360° views via CPRoPE.
Wildminder • 20,103 views • 14 days ago

FLUX.2 Klein in pure pixel space! No VAE. AsymFlow generates hyper-realistic images by working directly in pixel space rather than using compressed latent representations.
- sharper textures, superior visual fidelity
- 40% faster
- low-rank noise parameterization to solve high-dimensional bottlenecks
ComfyUI support incoming.
Wildminder • 26,976 views • 21 days ago

CS:GO + Adobe + Wan2.1 = WorldCam. Interactive autoregressive 3D gaming worlds.
> AI now generates playable 3D worlds live
> You move the mouse, the AI builds the map instantly
> Turn around, and it remembers exactly what was there
No traditional game engines. Just neural networks hallucinating reality.
Wildminder • 67,670 views • 2 months ago

LTX is cooking hard. With the Just-Dub-It IC LoRA, you can dub anything.
- instead of separate steps for voice and lips, it generates them simultaneously
- handles occlusions and profile views
- beats HeyGen/MuseTalk
- maintains background noise and speaker identity
It also stretches the video and audio together so the pacing matches the new language perfectly. Just awesome!
Wildminder • 19,732 views • 24 days ago