Wildminder

@wildmindai • 10,320 subscribers

Physicist, Programmer, Designer

Shorts

NVIDIA says: no more "brute force every pixel" video understanding. AutoGaze identifies and removes redundant video patches before they enter a Vision Transformer, so long 4K video can now be processed in real time. Works with SigLIP2 and NVILA.

296,789 views

Klein 9B Sun Direction LoRA: a nice LoRA that lets you move the sun to a specific spot in the sky just by pointing at a reference ball.

11,441 views

LGTM from Apple: 4K feed-forward 3D Gaussian Splatting. Instant 4K 3D scenes without massive GPUs. - predicts a few lightweight 3D shapes and wraps them in ultra-high-res 2D textures; - low memory usage. Take two normal photos of a room and instantly walk around it in flawless 3D.

46,278 views

Video diffusion models are just overqualified depth estimators! Deterministic single-pass depth estimation based on Wan2.1. - SOTA 5.5 AbsRel on ScanNet; - more data-efficient than baselines; - no temporal flicker + infinite-length estimation w/ zero scale drift.

49,391 views

Is it time to delete DaVinci Resolve?

35,069 views

CogOmniControl by Tencent: reasoning-driven controllable video gen. CogVLM + CogOmniDiT translate sparse storyboards/sketches into production-quality video. Beats VINO and VACE-Wan2.1.

24,123 views

LTX2.3 ReStyle LoRA: transfers simpler styles (flat 2D, cel-shaded, monochrome line art); struggles with complex styles (texture, intricate detail, strong material/lighting effects).

24,442 views

3D modeling entirely replaced by stick figures. SK-Adapter brings skeleton-based structural control to native 3D. > Feed it a basic skeleton > Type what you want to see > Get a fully rendered 3D character in under 15 seconds > Already rigged and ready for animation > Zero Blender experience required. Game devs are happy.

26,931 views

LTX 2.3 Creative Upscale IC-LoRA. - generative second-pass refiner for soft or low-resolution video; - enhances detail and clarity without standard upscaling; - output varies based on workflow/settings.

17,151 views

ComfyUI-WanVideoWrapper now supports SteadyDancer: a human image animation framework, like WanAnimate, that produces high-fidelity, coherent motion.

42,246 views

Unsloth dropped new LTX-2.3 GGUFs. > Dev/distilled UD-Q2/Q5

24,502 views

Thanks to Kijai, One-to-All Animation has already been added to ComfyUI.

33,914 views

LightVAE + ComfyUI node: high-performance video VAE; runs 2–3x faster using 50% less memory; LightTAE offers a 10x+ speedup on just ~0.4GB VRAM.

38,092 views

As usual, Kijai has prepared Wan-Move, and it is available in ComfyUI.

25,589 views

Capybara? A 14B model for T2V, T2I, TV2V, TI2I. - based on HunyuanVideo1.5; - byt5-small, Glyph-SDXL-v2, SigLIP; - 480p-1080p; - 16.7GB model, 5GB VAE. Mostly for video editing.

16,891 views

AnyDepth: lightweight zero-shot monocular depth estimation; surpasses DPT; nicely preserves detail.

18,610 views

Your LTX-2 performance boost has arrived. NVIDIA Studio Driver (591.74, January): optimizations for LTX-2 + support for NVFP4/NVFP8 in ComfyUI.

13,803 views

Videos

Netflix dropped some useful stuff. VOID: video object and interaction deletion. - removes objects while realistically simulating the physical consequences; - beats Runway/ProPainter; - CogVideoX-5B + SAM 2. Looks good, no smudges/artifacts.

365,564 views • 3 months ago

llmfit: a useful tool that probes your hardware and tells you exactly which LLMs will actually run. Handles MoE expert offloading, picks the best quantization for your RAM, and estimates tokens/sec before you even pull the weights. Essential for local dev.

247,800 views • 4 months ago

LUNA: universal 3D human animation via LBS-free mapping. - driven from images, sketches, keypoints; - 3DGS + Sapiens for high-fidelity cross-identity animation; - low temporal jitter; - captures loose clothing dynamics.

25,026 views • 20 days ago

A totally new level of pose control in ComfyUI: VNCCS. - full 3D Pose Studio for character posing & lighting; - multi-pose, body generator, pose gallery; - vision-guided QWEN Detailer + camera controls.

164,255 views • 5 months ago

HOT! SCAIL-2 just dropped! End-to-end character animation via in-context conditioning. - no skeleton middleman, it copies pixels directly; - no glitches, no messy hands; - 512p/704p; - unified architecture for character replacement and multi-character tasks; - zero-shot generalization to animal-driven and mesh-based control.

45,052 views • 1 month ago

An impressive video model: LingBot-Video. Sparse MoE for physically consistent video gen; prioritizes physical realism and action-consequence logic. - DiT + Qwen3-VL-4B conditioning + Wan2.1-VAE; - 120B params; - spatiotemporal geometry stability; - T2I, T2V, TI2V; - 1080p.

12,211 views • 12 days ago

SUPIR upscaler is outdated. ASASR turns blurry, low-quality photos into sharp, high-res images and prevents fake hallucinated details. - improves OCR; - high segmentation accuracy; - based on FLUX.1 dev. This looks sweet.

41,951 views • 1 month ago

LTX2.3 Obscura Remove LoRA: a kind of "digital X-ray" for video cleanup. - Strips foreground junk: peels back smoke, haze, or clutter. - Reconstructs hidden scenes: describe the background to fill holes.

56,257 views • 2 months ago

Sweet! Kijai has added standalone ComfyUI nodes for SCAIL pose processing.

119,613 views • 7 months ago

Soprano: an instant, ultra-lightweight TTS model for realistic speech; generates 10 hours of 32kHz audio in <20s; streams with <15ms latency using just 80M params & <1GB VRAM. Has some limitations and drawbacks.

111,542 views • 6 months ago

MotionBricks by NVIDIA: the all-in-one brain for character movement. No manual rigging/animation; you just let this AI handle the physics and style in real time. - 15k FPS on RTX 5090! You could run an entire city of unique NPCs without breaking a sweat. - Smart Primitives: the AI fills in natural-looking motion automatically. - Zero-Shot Skills. - Glitch-Free: keeps movements stable.

23,296 views • 1 month ago

CS:GO + Adobe + Wan2.1 = WorldCam. Interactive autoregressive 3D gaming worlds. > AI now generates playable 3D worlds live > You move the mouse, the AI builds the map instantly > Turn around, and it remembers exactly what was there. No traditional game engines. Just neural networks hallucinating reality.

67,807 views • 4 months ago

LTX-2.3 Foley LoRA. - adds Foley that actually matches the action on screen; - stops the model from defaulting to background scores when you just want real-world noise; - makes generations feel like actual raw footage.

13,864 views • 20 days ago

HappyHorse-1.0 vs All. - 720p, 24fps; - crisp; - pleasant colors, no AI-ish vibrancy; - nice textures, good skin; - dynamic; - good prompt following. If this model runs on consumer GPUs, it'll be an absolute nuke.

49,905 views • 3 months ago

No way. Adobe's new node editor is a straight-up ComfyUI clone! The killer feature is encapsulating an entire graph into a simple tool that can be dropped into Photoshop.

110,443 views • 8 months ago

Klein 9B Sun Direction LoRA: a nice LoRA that lets you move the sun to a specific spot in the sky just by pointing at a reference ball.

11,441 views • 17 days ago

Thinking in Boxes: 3D editing for images with a simple box interface. - FLUX-Kontext + LoRA; - precise translation, rotation, scaling; - good object preservation; - geometric fidelity. Ideal for virtual staging and e-commerce.

15,451 views • 29 days ago

FLUX.2 Klein in pure pixel space! No VAE. AsymFlow: hyper-realistic images by working directly in pixel space rather than using compressed latent representations. - sharper textures, superior visual fidelity; - 40% faster; - low-rank noise parameterization to solve high-dimensional bottlenecks. ComfyUI support incoming.

27,210 views • 2 months ago

Awesome. NVIDIA dropped PiD: fast high-res latent decoding via pixel diffusion! - replaces the VAE; - 4/8x upsampling; - 2K decoding in <1s on RTX 5090; - works with FLUX.1/SD3/Z; - rapid generation previews. Sharper details and much lower hardware lag compared to standard methods.

23,632 views • 1 month ago

InfiniDepth: fine-grained depth estimation at any resolution; models depth as a neural implicit field instead of a grid. - tops DepthAnything; - recovers sharp details w/ a lightweight 15M decoder.

59,172 views • 6 months ago