
Alex Goldring
@SoftEngineer • 2,288 subscribers
Building Shade — next-gen Web graphics engine | Graphics engineer | consultant
Videos

Apparently animating more than ~20 characters in modern graphics engines is a big deal😅 324 skinned characters animating independently in WebGPU (browser). Each character is animated with a completely separate skeleton and timeline; no two characters sample the same time. Each character has 66 bones and 28,106 triangles. I want to stress that there is no instancing of any kind here, and the CPU is not involved at all.
Alex Goldring • 57,299 views • 1 month ago

"Assassin’s Creed: Odyssey" map of Greece on a single 32,768 x 32,768 texture in the browser. Virtual Texture tech in WebGL 2.0. The source is an 804 MB PNG. Without Virtual Texture tech you'd need to download that and wait a few seconds for your browser to decode it before you could use it. Your browser would then crash when it tried to upload it to the GPU :) If we imagine that the upload to the GPU succeeded, it would take 4 GB of VRAM. The Virtual Texture loads immediately because the image is split into small fixed-size tiles with a full LoD chain (mips), and it uses a 2048x2048 texture, which is just 16 MB of VRAM. Article:
Alex Goldring • 188,343 views • 4 months ago
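
The memory figures in the post above can be checked with a little arithmetic. This is only a sketch, assuming uncompressed RGBA8 storage (4 bytes per texel); the function names are made up for illustration and are not taken from the engine.

```python
# Back-of-the-envelope check of the VRAM figures quoted in the post.
# Assumption: uncompressed RGBA8, i.e. 4 bytes per texel.
BYTES_PER_TEXEL = 4
MIB = 1024 ** 2
GIB = 1024 ** 3


def texture_bytes(width: int, height: int) -> int:
    """Size of one mip level held uncompressed in GPU memory."""
    return width * height * BYTES_PER_TEXEL


def mip_chain_bytes(size: int) -> int:
    """Size of a square texture plus its full mip chain down to 1x1."""
    total = 0
    while size >= 1:
        total += texture_bytes(size, size)
        size //= 2
    return total


# Full-resolution source image, base level only.
source_gib = texture_bytes(32768, 32768) / GIB
# The 2048x2048 texture the virtual texture actually keeps on the GPU.
physical_mib = texture_bytes(2048, 2048) / MIB
# Source image with a complete mip chain, for comparison.
source_with_mips_gib = mip_chain_bytes(32768) / GIB

print(f"source: {source_gib} GiB, physical: {physical_mib} MiB")
print(f"source with mips: {source_with_mips_gib:.2f} GiB")
```

Under that assumption the base level alone comes to 4 GiB and the 2048x2048 texture to 16 MiB, matching the numbers in the post; a full mip chain adds roughly a third on top of the base level.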

Octahedral impostors (WebGL). Single quad, 1024x1024 texture resolution. The intention, of course, is to serve as an LoD and not to be viewed up close. The bake happens in-engine and is fully runtime capable. The bake includes the full G-buffer: normals, ORM, albedo, depth.
Bake parameters:
- atlas texture resolution: how big the overall texture is, here it's 1024
- grid resolution: how many individual projection views we bake, here it's 16x16
- bake mode: full octahedral or hemioctahedral. Hemioct is missing the bottom projection views, but gives you about 2x resolution elsewhere
Engine is meep:
Alex Goldring • 34,172 views • 25 days ago
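
For readers unfamiliar with the mapping behind the bake modes above, here is a minimal sketch of the standard octahedral and hemi-octahedral encodings, plus picking the nearest baked view in a 16x16 grid. It is not meep's code: the choice of z as the pole axis, the cell-centered view lookup, and all function names are assumptions made for illustration.

```python
import math


def _sign(value: float) -> float:
    return 1.0 if value >= 0.0 else -1.0


def oct_encode(x: float, y: float, z: float) -> tuple[float, float]:
    """Map a direction to (u, v) in [0, 1]^2 with the full octahedral mapping."""
    s = abs(x) + abs(y) + abs(z)
    px, py, pz = x / s, y / s, z / s
    if pz < 0.0:
        # Fold the lower hemisphere over the diagonals of the square.
        px, py = (1.0 - abs(py)) * _sign(px), (1.0 - abs(px)) * _sign(py)
    return px * 0.5 + 0.5, py * 0.5 + 0.5


def oct_decode(u: float, v: float) -> tuple[float, float, float]:
    """Inverse of oct_encode; returns a unit direction."""
    fx, fy = u * 2.0 - 1.0, v * 2.0 - 1.0
    z = 1.0 - abs(fx) - abs(fy)
    if z < 0.0:
        fx, fy = (1.0 - abs(fy)) * _sign(fx), (1.0 - abs(fx)) * _sign(fy)
    length = math.sqrt(fx * fx + fy * fy + z * z)
    return fx / length, fy / length, z / length


def hemi_oct_encode(x: float, y: float, z: float) -> tuple[float, float]:
    """Upper hemisphere only (z >= 0); the whole square covers half the sphere."""
    s = abs(x) + abs(y) + abs(z)
    px, py = x / s, y / s
    return (px + py) * 0.5 + 0.5, (px - py) * 0.5 + 0.5


def view_cell(u: float, v: float, grid: int = 16) -> tuple[int, int]:
    """Grid cell (column, row) of the baked view nearest to (u, v)."""
    return min(int(u * grid), grid - 1), min(int(v * grid), grid - 1)


ATLAS_RESOLUTION = 1024
GRID_RESOLUTION = 16
# Texels available to each baked view along one edge of the atlas.
view_resolution = ATLAS_RESOLUTION // GRID_RESOLUTION
```

With the parameters quoted in the post, each of the 16x16 views gets a 64x64 region of the atlas. The hemi-octahedral variant spends the whole grid on the upper hemisphere, which is where the extra resolution comes from.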

1 million unique objects in WebGPU. It appears to be the fashion to show off the limits of one's engine. So here's "Shade", my WebGPU engine running in the browser, rendering 1,000,000 unique meshes. That is, no instancing, and every mesh has a unique material, geometry and transform. We have full cascaded shadows in the mix here, plus a post-processing stack. "Surely you can't have 1M instances, there's a trick, right?" Nope. No tricks, just a state-of-the-art Visibility Buffer-based GPU-resident renderer. That is, on the CPU side we submit a fixed number of draw calls regardless of how many instances we have.
Alex Goldring • 86,497 views • 5 months ago

Preview of Global Illumination (Browser/WebGPU) link: entire lightmap takes up only 2MB of VRAM
Alex Goldring • 21,254 views • 4 months ago

Sparse Volumetric Light maps (WebGPU). Pretty happy with the current state.
- Improved probe placement optimization, reducing light leaks
- Worked on various smaller bugs
Light map stats:
- VRAM Size: 20 MB
- Probe samples: 16,384
- Probe count: 324,674
- Bake time: 267s
- Bake hardware: RTX 4090
Alex Goldring • 20,350 views • 4 months ago

Working on sparse volumetric light maps for WebGPU. Got the generation part sorted. The resolution is adaptive, refining near geometry and skipping through most of the empty space. The video in question uses a map that's only 10 MB in size, including everything. That's less than the amount of GPU memory that a single leaf takes up on one of the trees in the scene, or the texture on one of the sign-posts. We get about 1 sample every 30 centimeters in world space. As a side note: I gave my renderer quite the workout, rendering each probe as a separate non-instanced mesh for debugging.
Alex Goldring • 18,651 views • 4 months ago

Virtual Textures in the Browser. This is a WebGL 2.0 implementation I did in 2023. The car model is from Sketchfab by Pavel Matoušek. It's a worst case for virtual textures because the texture is generated by photogrammetry software and has a pathologically bad UV layout. The texture is 16,384 x 16,384 pixels, which is 180 MB of PNG data. Loading this in Blender would crash Blender for me back then. The GPU memory requirement for this texture is 1 GB. The VT displays the model after loading just 34 KB (one tile). The other tiles are fetched in the background via a smart priority queue as you move the camera around the scene. The physical page texture is 2k, which is 16 times smaller than the source. The solution supports multiple separate virtual textures, and has GLTF integration. Texture filtering and anisotropy are fully supported too. You can try the demo yourself here: Here's a writeup I did while working on this: The video was recorded on a very crummy internet connection and I intentionally reload the page in the middle. The delay comes from the geometry download; the actual texture part is near-instant.
Alex Goldring • 19,368 views • 5 months ago
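
The reason a virtual texture can show something after a single tile is that every lookup can fall back to a coarser mip whose tile is already resident. Below is a rough sketch of that addressing scheme. The 128-texel tile size is a hypothetical value (the post does not state it), and the class and function names are illustrative, not the demo's API.

```python
from dataclasses import dataclass

Tile = tuple[int, int, int]  # (mip, tile_x, tile_y)


@dataclass(frozen=True)
class VirtualTexture:
    size: int       # edge of the virtual texture in texels
    tile_size: int  # edge of one tile in texels (hypothetical value below)

    @property
    def mip_count(self) -> int:
        # Levels from full resolution down to a single tile covering everything.
        return (self.size // self.tile_size).bit_length()

    def tiles_per_side(self, mip: int) -> int:
        return max(1, (self.size >> mip) // self.tile_size)

    def tile_for_uv(self, u: float, v: float, mip: int) -> Tile:
        n = self.tiles_per_side(mip)
        return mip, min(int(u * n), n - 1), min(int(v * n), n - 1)

    def total_tiles(self) -> int:
        return sum(self.tiles_per_side(m) ** 2 for m in range(self.mip_count))


def resolve(vt: VirtualTexture, resident: set[Tile], u: float, v: float, mip: int) -> Tile:
    """Return the finest resident tile covering (u, v), starting at the wanted mip."""
    for level in range(mip, vt.mip_count):
        tile = vt.tile_for_uv(u, v, level)
        if tile in resident:
            return tile
    raise LookupError("the coarsest tile is expected to always be resident")


vt = VirtualTexture(size=16384, tile_size=128)
# Right after page load only the single coarsest tile has arrived.
resident: set[Tile] = {(vt.mip_count - 1, 0, 0)}
first_frame_tile = resolve(vt, resident, 0.3, 0.9, mip=0)
# Number of tiles a 2048x2048 physical texture can hold at this tile size.
physical_capacity = (2048 // vt.tile_size) ** 2
```

With these assumed parameters the pyramid has 8 levels and 21,845 tiles in total, while the physical texture holds 256 of them at a time; a priority queue then decides which of the missing tiles to fetch next.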

Was working on CSM (Cascaded Shadow Maps) blending. My CSM implementation is a little unusual: I switch cascades as late as possible, using the highest-resolution cascade available. Most CSM implementations choose cascades based on view depth. The problem with that is that you waste a lot of high-resolution shadow texels, in my experience easily upwards of 50%. I figure the GPU already spent the effort to compute those shadow texels, so why not use them and get a perceptual resolution increase of about 20-50% in your shadows? This is not new, and MJP showed this in his code too (see below). The problem is in blending: if you use the cascade projection matrix for choosing a cascade, blending becomes non-trivial. So I spent a lot of time working on that, and finally cracked it. The video shows blending of 5 cascades; the blend margin is exaggerated for demonstration purposes, and the cascades are pushed to the near plane for the same reason. The result is a perfect forward-only blend at the cost of a bit of ALU. Matt Pettineo is pretty much an authority on the subject nowadays, and I found his amazing repo very useful in the past:
Alex Goldring • 18,775 views • 5 months ago
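
To make the difference from depth-based selection concrete, here is a small sketch of projection-based cascade selection: take the first (finest) cascade whose shadow-map footprint contains the point. The fade toward the next cascade is a naive edge-distance ramp added only for illustration; it is not the forward-only blend described in the post. The `Cascade` layout (axis-aligned bounds in light space) and all names are assumptions.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Cascade:
    # Axis-aligned footprint of the cascade in light space (hypothetical layout).
    min_x: float
    min_y: float
    max_x: float
    max_y: float

    def to_uv(self, x: float, y: float) -> tuple[float, float]:
        """Light-space position to shadow-map coordinates; [0, 1] means inside."""
        return ((x - self.min_x) / (self.max_x - self.min_x),
                (y - self.min_y) / (self.max_y - self.min_y))


def edge_distance(u: float, v: float) -> float:
    """Distance to the nearest edge of the unit square; negative when outside."""
    return min(u, v, 1.0 - u, 1.0 - v)


def select_cascade(cascades: Sequence[Cascade], x: float, y: float,
                   margin: float = 0.1) -> Optional[tuple[int, float]]:
    """Return (cascade index, weight of the next cascade), or None if uncovered.

    Cascades are ordered from finest to coarsest, so the first hit is the
    highest-resolution cascade that still covers the point.
    """
    for index, cascade in enumerate(cascades):
        distance = edge_distance(*cascade.to_uv(x, y))
        if distance >= 0.0:
            is_last = index == len(cascades) - 1
            # Naive ramp: 0 well inside the cascade, 1 right at its edge.
            blend = 0.0 if is_last else max(0.0, 1.0 - distance / margin)
            return index, blend
    return None


cascades = [Cascade(-1.0, -1.0, 1.0, 1.0), Cascade(-4.0, -4.0, 4.0, 4.0)]
near_center = select_cascade(cascades, 0.0, 0.0)
near_edge = select_cascade(cascades, 0.9, 0.0)
outside_first = select_cascade(cascades, 3.0, 0.0)
```

A depth-based scheme would hand a point to the coarser cascade as soon as its view depth crossed a split distance, even if the finer cascade still had texels covering it; here the finer cascade is used right up to its edge.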

Sparse Volumetric Light maps (WebGPU). Worked on ensuring C0 continuity across the entire map. For now this is achieved by purging incomplete node levels, which works well enough even if it's a bit of a blunt tool. Actual probe locations now go through an optimization phase during the bake, which allows me to push probes that are behind surfaces out into the open, resulting in far fewer light leakage artifacts. This video has only 28,376 probes in the map, which takes 1.8 MB of VRAM.
Alex Goldring • 15,814 views • 4 months ago

Working on Sparse Volumetric Light-maps. Thanks to CynicatPro🎃 for pointing me at Unreal's version. In a nutshell it's just another sparse voxel data structure. My implementation is, no doubt, different from Epic Games' own. I'm using a 4x4x4 probe grid, with intermediate nodes also having a very wide branching factor of 64 (4x4x4). I liked the parameters that Unreal is using, limiting both total memory and the lowest level of detail, which is common in sparse grid implementations. Here's the Bistro scene with just a 1 MB limit. This is roughly equivalent to a 512x512 lightmap texture in 2D, except surface light maps require unique UVs and you typically get very little detail out of a 512-resolution texture, with a lot of light leaking. There is also no directional response. My implementation encodes second-order spherical harmonics for each probe (9 coefficients), encoding RGB channels as RGBE9995 (4 bytes). So far I've only worked on the structure; the actual bake is yet to come. I've been eyeing sparse voxel structures for a while now, and have been studying them roughly since the GigaVoxels paper by Cyril Crassin, but never really implemented anything for the GPU before. I was always the BVH kind of guy. It's a fascinating topic.
---
Stats for the scene:
---
Total memory usage: 1.000 MB
Node count: 609
Unique probe count: 24,025
Probe reuse: 38.36 %
Unexpanded nodes: 15,714
---
Again, note that there is no GI going on here, only the structure of the probe tree and the algorithm for building it from a given scene.
Alex Goldring • 11,487 views • 4 months ago
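
The per-probe storage described above can be sketched as follows. The post's "RGBE9995" layout is not spelled out, so this assumes it behaves like the standard shared-exponent RGB9E5 format (three 9-bit mantissas and one 5-bit exponent in 32 bits); the bit order and function names are assumptions for illustration only.

```python
import math

N_BITS = 9   # mantissa bits per channel
BIAS = 15    # exponent bias
E_MAX = 31   # largest biased exponent that fits in 5 bits
MAX_VALUE = (2 ** N_BITS - 1) / 2 ** N_BITS * 2.0 ** (E_MAX - BIAS)


def pack_rgb9e5(r: float, g: float, b: float) -> int:
    """Pack a non-negative RGB triple into 32 bits with a shared exponent."""
    r, g, b = (min(max(c, 0.0), MAX_VALUE) for c in (r, g, b))
    max_c = max(r, g, b)
    log = math.floor(math.log2(max_c)) if max_c > 0.0 else -BIAS - 1
    exp = max(-BIAS - 1, log) + 1 + BIAS
    scale = 2.0 ** (exp - BIAS - N_BITS)
    if math.floor(max_c / scale + 0.5) == 2 ** N_BITS:
        # Rounding overflowed the 9-bit mantissa; bump the shared exponent.
        exp += 1
        scale *= 2.0
    rs, gs, bs = (math.floor(c / scale + 0.5) for c in (r, g, b))
    return rs | (gs << 9) | (bs << 18) | (exp << 27)


def unpack_rgb9e5(packed: int) -> tuple[float, float, float]:
    """Inverse of pack_rgb9e5 (up to quantization)."""
    mask = 2 ** N_BITS - 1
    scale = 2.0 ** ((packed >> 27) - BIAS - N_BITS)
    return ((packed & mask) * scale,
            ((packed >> 9) & mask) * scale,
            ((packed >> 18) & mask) * scale)


# Storage estimate from the figures in the post: 9 SH coefficients per probe,
# each an RGB triple packed into 4 bytes.
SH_COEFFICIENTS = 9
BYTES_PER_COEFFICIENT = 4
probe_bytes = SH_COEFFICIENTS * BYTES_PER_COEFFICIENT
unique_probe_bytes = 24_025 * probe_bytes
```

Under these assumptions a probe takes 36 bytes, so the 24,025 unique probes come to 864,900 bytes, which is consistent with the 1 MB budget once node storage is added. Note that a shared-exponent format stores only non-negative values, while SH coefficients above the first can be negative, so a real implementation needs some extra handling (a bias or sign bits) that this sketch leaves out.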