First fully ML-framework-free 3D Gaussian Splatting implementation in LichtFeld Studio. I’ve completed the migration of the full training pipeline to a custom CUDA-based tensor library. No PyTorch, no LibTorch, no autograd. Every gradient is implemented by hand, either through CUDA kernels or minimal abstractions on top. This makes it the first full training setup for 3D Gaussian Splatting with zero dependencies on existing ML frameworks. It’s not just about independence, it's about control! We now manage every byte of GPU memory, which opens the door to tighter optimization and finer performance tuning. The framework footprint is minimal, without pulling in gigabytes of ML runtime code that was never designed for real-time or graphics-driven applications. A few modules, such as the metrics and 3DGUT interfaces, are still being ported, and some operations are temporarily naïve, so performance is not yet on par with master. But this refactor lays the groundwork for:
- A fully self-contained binary
- Fine-grained memory optimization
- Easier experimentation without the weight of an ML stack
We’re getting close.

MrNeRF
50,487 views • 7 months ago
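To make "every gradient is implemented by hand" in the post above concrete, here is a minimal Python sketch of a hand-derived backward pass checked against finite differences. It is not LichtFeld Studio's code (the post describes CUDA kernels). The sigmoid opacity activation, the L1 loss, and all function names are assumptions chosen purely for illustration.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(raw_opacity, target):
    """Return (loss, cache) for loss = |sigmoid(raw_opacity) - target|."""
    alpha = sigmoid(raw_opacity)
    loss = abs(alpha - target)
    return loss, (alpha, target)  # cache what the backward pass needs

def backward(cache):
    """Hand-derived d(loss)/d(raw_opacity); no autograd involved."""
    alpha, target = cache
    diff = alpha - target
    # Derivative of |x| is sign(x) (taken as 0 at x == 0).
    dloss_dalpha = 1.0 if diff > 0 else (-1.0 if diff < 0 else 0.0)
    # Derivative of sigmoid expressed through its own output.
    dalpha_draw = alpha * (1.0 - alpha)
    return dloss_dalpha * dalpha_draw  # chain rule

def numerical_grad(raw_opacity, target, eps=1e-6):
    """Central finite difference, used only to validate the manual gradient."""
    loss_plus, _ = forward(raw_opacity + eps, target)
    loss_minus, _ = forward(raw_opacity - eps, target)
    return (loss_plus - loss_minus) / (2 * eps)

if __name__ == "__main__":
    raw, target = 0.3, 0.9
    _, cache = forward(raw, target)
    print("manual:", backward(cache), "numerical:", numerical_grad(raw, target))
```

A check like `numerical_grad` is the usual safety net when there is no autograd to fall back on: each hand-written backward function can be validated in isolation before it is ported to a kernel.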
Apple just trained a 3D Gaussian head reconstruction model on 10,000+ subjects. Feed-forward. No test-time optimization. New identity in, reconstructed Gaussian head out. The UV-parameterized Gaussian representation decouples the number of Gaussians from the number and resolution of input images, making it practical to train with many high-resolution views. And the heads are not just static either: text-conditioned identity generation, plus blendshape-driven latent animation across identities. We've been building in the 3D Gaussian Splatting space for a while. The gap between "research demo" and "works on real people at scale" is closing fast.

KIRI Engine - 3D Scanner App
12,013 views • 1 month ago
In the actual account of meeting Brig ML Khetarpal and Brig Naseer, there is no cowardice, there is no repentance, there is no anti-war sentiment. Only Soldierly pride and mutual respect for professionalism. Brig ML Khetarpal was a third generation soldier. He could have never said things like, why his son did not yield (in the face of enemy fire). This fictionalized account is a disgrace to the memory of the paramveer Arun Khetarpal and a disservice to his father Brig ML Khetarpal. Today, Brig Khetarpal is not there to refute this mockery, but are we all dead to accept this farce?

We, the people of India
82,415 views • 5 months ago
Self-Calibrating Gaussian Splatting for Large Field of View Reconstruction
Note: Check below for full video.
Abstract (cited): "In this paper, we present a self-calibrating framework that jointly optimizes camera parameters, lens distortion, and 3D Gaussian representations, enabling accurate and efficient scene reconstruction. Our technique is particularly effective for high-quality scene reconstruction from large field-of-view (FOV) imagery taken with wide-angle lenses, allowing the scene to be modeled from a smaller number of images. We introduce a novel method for modeling complex lens distortions using a hybrid network that combines invertible residual networks with explicit grids. This design effectively regularizes the optimization process, achieving greater accuracy than conventional camera models. Additionally, we propose a cubemap-based resampling strategy to support large FOV images without sacrificing resolution or introducing distortion artifacts. Our method is compatible with the fast rasterization of Gaussian Splatting, adaptable to a wide variety of camera lens distortions, and demonstrates state-of-the-art performance on both synthetic and real-world datasets."

MrNeRF
17,206 views • 1 year ago
F3D-Gaus: Feed-forward 3D-aware Generation on ImageNet with Cycle-Consistent Gaussian Splatting
Contributions:
• We pioneer 3D-aware generation using generalizable feed-forward Gaussian Splatting representation, achieving significant efficiency and favorable rendering quality on monocular datasets.
• We significantly advance the capability of pixel-aligned Gaussian Splatting representations by designing a self-supervised cycle training strategy specifically tailored for monocular datasets.
• We further mitigate the artifacts of 3D-aware representations caused by large viewpoint shifts by introducing geometry-aware video priors.

MrNeRF
14,229 views • 1 year ago
MoE-GS: Mixture of Experts for Dynamic Gaussian Splatting
Contributions:
• MoE-GS: the first dynamic Gaussian splatting framework employing a Mixture-of-Experts architecture, enabling robust and adaptive reconstruction across diverse dynamic scenes.
• A novel Volume-aware Pixel Router integrates expert outputs through differentiable weight splatting, achieving spatially and temporally coherent adaptive blending.
• Efficiency of MoE-GS is improved through single-pass multi-expert rendering and gate-aware Gaussian pruning. A separate knowledge distillation strategy trains individual experts with pseudo-labels from the MoE model, enhancing quality without modifying the architecture.

MrNeRF
10,346 views • 8 months ago
Two weeks ago I fixed one of my teeth with algorithms I wrote a couple of years ago! I got hooked by 3D scanning when I started to work for a software shop in Zurich that was programming 3D computational geometry algorithms for denture scanning to produce crowns (and more). Back then, a typical reconstruction pipeline was like: scan the patient’s teeth using an intraoral scanner, reconstruct the surface mesh, design the restoration digitally, and finally mill the crown out of ceramic. We were working mostly with point clouds and meshes, but it wasn’t just math, it was craftsmanship translated into a digital process. Every micron mattered. You could literally see how a good algorithm meant a better fit in someone’s mouth. Gaussian Splatting isn’t about surface reconstruction, it’s about appearance reconstruction. It doesn’t care about explicit topology, it captures how light interacts with the scene. In a sense, it’s the opposite philosophy of the dental world: instead of modeling what the object is, it models how the object looks. 3D Gaussian Splatting enables applications like training self-driving cars, teaching robots to understand their environment, creating virtual worlds, or monitoring real sites. It represents scenes as millions of small Gaussians rendered in real time without the need for meshes or textures. Coming from a world where precision geometry was everything, this shift felt natural. It’s still about reconstruction, but with a different goal: not manufacturing a perfect object, but reproducing how the world actually looks. Two weeks ago I got my first dental crown, made with the same software, reconstruction algorithms, and Swiss precision I once helped develop. I haven’t worked there in two years, but sitting in that chair and seeing the process from the other side was a proud moment. It reminded me why I love this field.

MrNeRF
289,948 views • 7 months ago
Segment Any 3D Gaussians paper page: Interactive 3D segmentation in radiance fields is an appealing task since its importance in 3D scene understanding and manipulation. However, existing methods face challenges in either achieving fine-grained, multi-granularity segmentation or contending with substantial computational overhead, inhibiting real-time interaction. In this paper, we introduce Segment Any 3D GAussians (SAGA), a novel 3D interactive segmentation approach that seamlessly blends a 2D segmentation foundation model with 3D Gaussian Splatting (3DGS), a recent breakthrough of radiance fields. SAGA efficiently embeds multi-granularity 2D segmentation results generated by the segmentation foundation model into 3D Gaussian point features through well-designed contrastive training. Evaluation on existing benchmarks demonstrates that SAGA can achieve competitive performance with state-of-the-art methods. Moreover, SAGA achieves multi-granularity segmentation and accommodates various prompts, including points, scribbles, and 2D masks. Notably, SAGA can finish the 3D segmentation within milliseconds, achieving nearly 1000x acceleration compared to previous SOTA.

AK
69,542 views • 2 years ago
FastMap: Revisiting Dense and Scalable Structure from Motion
"FASTMAP, a redesigned SfM framework, achieves fast, high-accuracy dense structure from motion. On large scenes with thousands of images, FASTMAP is up to one to two orders of magnitude faster than GLOMAP and COLMAP. ... Importantly, FASTMAP achieves efficiency improvements while keeping comparable performance. Extensive experiments on eight datasets demonstrate pose estimation accuracy and novel view synthesis quality close to GLOMAP and COLMAP."
Contributions:
1. For all the iterative nonlinear optimization problems involved, we design algorithms such that the computational complexity of each iteration is only linear in the number of image pairs, not keypoint pairs or 3D points. This includes replacing the traditional bundle adjustment [50] present in previous SfM frameworks with a novel re-weighting epipolar adjustment algorithm, which is much more efficient.
2. Throughout the entire framework, we formulate as many steps as possible as GPU-friendly dense tensor operations. This allows us to implement the entire method in PyTorch [39], which provides seamless GPU acceleration.

MrNeRF
15,233 views • 1 year ago
More coming soon! I managed to integrate, for the first time, volumetric 3D video made with Kartel.ai inside the AI world generated in Gaussian splatting by World Labs! I coded it all in three.js! And it's possible to integrate elements on the fly ;) So you can imagine what we will soon produce with that in AI: consistency of character and environment, relighting, etc.

Lovis Odin
33,984 views • 1 year ago
In a healthy adult male, the bladder doesn’t have a fixed size, it’s a dynamic organ. Most people start feeling the first urge to urinate around 150–250 mL, a stronger urge around 300–400 mL, and the usual functional capacity is roughly 400–600 mL. It can stretch beyond that, sometimes up to 700–800 mL, but that’s uncomfortable and not considered normal everyday function. Now, once a urinary catheter is in place, the physiology changes. The bladder is no longer “storing” urine the way it normally does. Instead, urine drains continuously through the catheter into the collection bag, so the bladder usually stays relatively empty. That’s why in catheterized patients, what you see accumulating is not bladder capacity, it’s just the total urine output over time. As for the urine bags, standard ones typically hold around 1500–2000 mL, which is accurate and designed to safely collect urine over several hours. So overall, the concept is right, just important to remember that with a catheter, you’re no longer measuring how much the bladder can hold, but how much the kidneys are producing.

Op. Dr. Mehmet Bekir Şen
41,782 views • 17 days ago
Getting a bunch of questions about this! It's a full GPU rendering pipeline with a built-in shader editor for Rive, which can interleave 2D + 3D. Think WebGPU (minus compute, for now) that runs anywhere, not just the browser. You can implement anything at all. Cel shading. Background blurs. Deformation effects. And yes, even Rive vector components as textures. Shaders are precompiled and ship with your .riv. Minimal runtime size impact. Still WIP.

Guido Rosso
22,034 views • 2 months ago
Despite the Framer hype, we still get tons of requests for custom Next.js + Tailwind sites. Because most founders and their teams don't want to learn another tool. They want to own their code, not rent it. Just shipped a landing page in Next.js for a SaaS. Their reasoning:
- Full control over performance optimization
- No vendor lock-in
- Developers can maintain it without learning Framer
- Custom functionality without workarounds
- Better SEO control
Sure, Framer is faster to build. But the no-code movement sold everyone on "anyone can build websites now" and forgot that someone still needs to maintain them. What's your take: Custom code for control or no-code for speed?

Namya @ Supafast
50,594 views • 9 months ago
Gaussian splats don't simulate light. They capture it. I took one photograph of this room and generated it as a fully walkable 3D world in Mint. The volumetric rays, the shadow play on the rug, all preserved. This is what real-time 3D is becoming.

mint
15,339 views • 16 days ago
3D Gaussian Splatting for Real-Time Radiance Field Rendering paper page: Radiance Field methods have recently revolutionized novel-view synthesis of scenes captured with multiple photos or videos. However, achieving high visual quality still requires neural networks that are costly to train and render, while recent faster methods inevitably trade off speed for quality. For unbounded and complete scenes (rather than isolated objects) and 1080p resolution rendering, no current method can achieve real-time display rates. We introduce three key elements that allow us to achieve state-of-the-art visual quality while maintaining competitive training times and importantly allow high-quality real-time (>= 30 fps) novel-view synthesis at 1080p resolution. First, starting from sparse points produced during camera calibration, we represent the scene with 3D Gaussians that preserve desirable properties of continuous volumetric radiance fields for scene optimization while avoiding unnecessary computation in empty space; Second, we perform interleaved optimization/density control of the 3D Gaussians, notably optimizing anisotropic covariance to achieve an accurate representation of the scene; Third, we develop a fast visibility-aware rendering algorithm that supports anisotropic splatting and both accelerates training and allows real-time rendering. We demonstrate state-of-the-art visual quality and real-time rendering on several established datasets.

AK
633,428 views • 2 years ago
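On the "anisotropic covariance" mentioned in the abstract above: the 3DGS paper factors each Gaussian's covariance as Σ = R S Sᵀ Rᵀ, with a rotation R (stored as a quaternion) and a diagonal scale S, so that Σ stays positive semi-definite while its parameters are optimized. The sketch below shows that construction in plain Python. The function names and the (w, x, y, z) quaternion ordering are assumptions for illustration, not taken from any particular codebase.

```python
import math

def quat_to_rotmat(q):
    """Convert a quaternion (w, x, y, z) to a 3x3 rotation matrix."""
    w, x, y, z = q
    n = math.sqrt(w * w + x * x + y * y + z * z)
    w, x, y, z = w / n, x / n, y / n, z / n  # normalize so R is a pure rotation
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]

def covariance(scale, quat):
    """Build Sigma = R S S^T R^T from per-axis scales and a rotation quaternion."""
    R = quat_to_rotmat(quat)
    # M = R @ S with S = diag(scale): scale column j of R by scale[j].
    M = [[R[i][j] * scale[j] for j in range(3)] for i in range(3)]
    # Sigma = M @ M^T, symmetric and positive semi-definite by construction.
    return [[sum(M[i][k] * M[j][k] for k in range(3)) for j in range(3)]
            for i in range(3)]

if __name__ == "__main__":
    # A Gaussian stretched along its local y axis, rotated 90 degrees about z.
    half = math.sqrt(0.5)
    for row in covariance((1.0, 2.0, 3.0), (half, 0.0, 0.0, half)):
        print(row)
```

Optimizing the scale and quaternion rather than the six free entries of Σ directly is what lets plain gradient descent be used without a separate step to keep the matrix valid.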
The West is not dying. It is being killed, and the names of the traitors are known. They occupy our capitals, infest our courts, pollute our newsrooms, and preach in our churches. They open the gates, kneel before the foreigner, and smirk as their own blood is driven from the land. They mock the fallen, defile the heroic, and spit on the blood that raised every city worth defending. They are not misguided. They are not mistaken. They are the enemy. They must be treated as such. For too long, we have been ruled by cowards, “men without chests,” by merchants loyal to nothing but the dollar, by liars who speak of progress while presiding over decay. A new generation now rises, armed not with apologies but with the fire of remembrance, with the memory of what we once were and the will to become greater still. We do not ask permission. We do not seek approval. We will reclaim what is ours, because no one else will. Victory will not come through debate. It will come through discipline, through will, through the unbreakable decision to endure, to outlast, and to return to the excellence and greatness that befit our people. We do not need millions. We require only a vanguard: men of loyalty, endurance, and resolve, hardened by truth and unmoved by fear. I say this not for approval, nor is it offered in hope of a reply, but in the spirit of doing what must be done. It is a promise made in full knowledge of what must come. The time of submission draws to a close. The age of reconquest begins. Let the traitors tremble. Let the weak, the feckless, and the unworthy fall away. The future belongs to those with the strength and the daring to seize it.

Chad Crowley
19,404 views • 10 months ago
We built the fastest 3D world generator capable of creating ANY environment. Just dropped a few pictures of the Palace of Versailles. It spat out a fully navigable 3D Gaussian splat in seconds. Not to mention it's free to start.

mint
13,342 views • 2 months ago
Introducing ml-intern, the agent that just automated the post-training team at Hugging Face. It's an open-source implementation of the real research loop that our ML researchers do every day. You give it a prompt, it researches papers, goes through citations, implements ideas in GPU sandboxes, iterates and builds deeply research-backed models for any use case. All built on the Hugging Face ecosystem. It can pull off crazy things: We made it train the best model for scientific reasoning. It went through citations from the official benchmark paper. Found OpenScience and NemoTron-CrossThink, added 7 difficulty-filtered dataset variants from ARC/SciQ/MMLU, and ran 12 SFT runs on Qwen3-1.7B. This pushed the score 10% → 32% on GPQA in under 10h. Claude Code's best: 22.99%. In healthcare settings it inspected available datasets, concluded they were too low quality, and wrote a script to generate 1100 synthetic data points from scratch for emergencies, hedging, multilingual etc. Then upsampled 50x for training. Beat Codex on HealthBench by 60%. For competitive mathematics, it wrote a full GRPO script, launched training with A100 GPUs, watched rewards climb and then collapse, and ran ablations until it succeeded. All fully backed by papers, autonomously. How does it work? ml-intern makes full use of the HF ecosystem:
- finds papers on arXiv and reads them fully, walks citation graphs, pulls datasets referenced in methodology sections and on
- browses the Hub, reads recent docs, inspects datasets and reformats them before training so it doesn't waste GPU hours on bad data
- launches training jobs on HF Jobs if no local GPUs are available, monitors runs, reads its own eval outputs, diagnoses failures, retrains
ml-intern deeply embodies how researchers work and think. It knows what data should look like and what good models feel like. Releasing it today as a CLI and a web app you can use from your phone/desktop. CLI: Web + mobile: And the best part? We also provisioned $1k in GPU resources and Anthropic credits for the quickest among you to use.

Aksel
1,262,006 views • 2 months ago
This is not a Chinese family going on vacation with their dog. This is a dog likely bought from a wet market for meat. They have tied the poor animal to the back of their car and are driving on the highway as if he is just luggage. He has no idea what is waiting for him. That is what makes this so heartbreaking. This dog is still alive, still trusting, still innocent, and yet he is being treated with no care, no mercy, and no dignity. Dog meat eating is still a cruel reality in parts of China. Many of these dogs are stolen pets or helpless strays. No dog deserves this fate. Dogs are companions. They are not food.

Sandeep Neel
10,709 views • 5 days ago
Disappointed with your ICLR paper being rejected? Ten years ago today, Sergey and I finished training some of the first end-to-end neural nets for robot control 🤖 We submitted the paper to RSS on January 23, 2015. It was rejected for being "incremental" and "unlikely to have much impact". Our resubmission to NeurIPS was also rejected. It now has >4,000 citations (and more importantly, end-to-end training is widely accepted!) It's also cool to think about what's changed and what's the same --
- The network was 92k parameters and trained on ~15 minutes of data
- The code was a combination of MATLAB, Caffe, ROS, a custom CUDA kernel for speed, and a low-level 20 Hz controller in C++, all talking to each other. ROS+MATLAB was as bad as it sounds.
- We pre-trained the encoder and did inference off-board on a workstation with a larger GPU.
- We were paranoid about varying lighting messing up the network, so we did all the experiments after sunset (so long nights running experiments on the robot past 3 am)
Now, we have manipulation policies that are far more dextrous, far more generalizable, and maybe on the cusp of breaking into the real world. :) (the paper:

Chelsea Finn
168,927 views • 1 year ago