
Wildminder
@wildmindai • 10,137 subscribers
Physicist, Programmer, Designer

More cool stuff from NVIDIA. LocateAnything: a high-speed visual search engine. You provide a text prompt, and it instantly pinpoints the described object's exact location in an image.
- 10x speedup for dense object detection
- Qwen2.5-3B + Moon-ViT
- Fast/Slow/Hybrid modes
- trained on 138M samples for UI, docs, and generic grounding
Wildminder • 50,143 views • 8 days ago

Awesome. NVIDIA dropped PiD: fast high-res latent decoding via pixel diffusion!
- replaces the VAE
- 4x/8x upsampling
- 2k decoding in <1s on RTX 5090
- works with FLUX.1/SD3/Z
- rapid generation previews, sharper details, and much lower hardware lag than standard methods
Wildminder • 23,205 views • 10 days ago

PanoWorld. An interesting way to use Qwen-Edit. It converts 2D floor plans into photorealistic, consistent VR home tours. Great for real estate and interior designers. It lets you walk through a home that hasn’t been built or furnished yet. Ensures seamless 360-degree views via CPRoPE.
Wildminder • 20,103 views • 14 days ago

FLUX.2 Klein in pure pixel space! No VAE. AsymFlow generates hyper-realistic images by working directly in pixel space rather than in compressed latent representations.
- sharper textures, superior visual fidelity
- 40% faster
- low-rank noise parameterization to solve high-dimensional bottlenecks
ComfyUI support incoming.
Wildminder • 26,976 views • 21 days ago

CS:GO + Adobe + Wan2.1 = WorldCam. Interactive autoregressive 3D gaming worlds.
> AI now generates playable 3D worlds live
> You move the mouse, the AI builds the map instantly
> Turn around, and it remembers exactly what was there
No traditional game engines. Just neural networks hallucinating reality.
Wildminder • 67,670 views • 2 months ago

Pixal3D by Tencent. 3D models from images with unprecedented accuracy.
- Explicit 2D-3D correspondence
- Aggregates multi-view feature volumes for global consistency
- 93.57% IoU on the Toys4K benchmark
- Outperforms TRELLIS/Hunyuan3D-2.1
Based on Direct3D-S2 + DINOv2.
Wildminder • 21,766 views • 23 days ago

Sweet! Kijai has added standalone ComfyUI nodes for SCAIL pose processing.
Wildminder • 119,435 views • 5 months ago

BiRefNet background removal is now native in ComfyUI, no custom nodes needed. Actually fast and clean.
Wildminder • 22,618 views • 26 days ago

LTX is cooking hard. With the Just-Dub-It IC LoRA, you can dub anything.
- instead of separate steps for voice and lips, it generates them simultaneously
- handles occlusions and profile views
- beats HeyGen/MuseTalk
- maintains background noise and speaker identity
It also stretches the video and audio together so the pacing matches the new language perfectly. Just awesome!
Wildminder • 19,732 views • 24 days ago

Good. ComfyUI just got native MoGe support: 3D geometry from monocular images.
Wildminder • 15,988 views • 21 days ago