Video wird geladen...

Video konnte nicht geladen werden

Zur Startseite

I've been trying to implement a parallel rigid body solver. It's a tricky problem in physics to solve a huge pile of objects on multiple threads. I've tried graph coloring and other means of splitting up the problem, but so far a locking primitive per body is the most efficient..

65,477 Aufrufe • vor 1 Jahr •via X (Twitter)

11 Kommentare

Profilbild von Sebastian Aaltonen
Sebastian Aaltonenvor 1 Jahr

CPU or GPU? I used graph colorization in Claybook GPU solver. You can also accumulate corrections in a lockless way, but that requires more iterations to converge. I wrote four different solvers for GPU before I was happy about it.

Profilbild von Dennis Gustafsson
Dennis Gustafssonvor 1 Jahr

This is on CPU, I tried a Jacobi approach today, but it converges so slowly that it eats up all gains… Not sure if that is what you meant though. Anyway, I’m pretty happy with the unordered Gauss-Seidel I have now. I changed it from one mutex per body to hashing though.

Profilbild von Storm
Stormvor 1 Jahr

I run into this problem every time I start trying to build a game engine. It's a huge rabbit hole. The problem is that you're trying to represent and calculate something continuous (quantum, even) in discrete steps and that *necessarily* requires making some tradeoffs in correctness or speed. In the real world every object interacts with every other object at all times (e.g. gravity), there's not really a notion of "event propagation" because there's not really an "origin" and thus no real "direction" in which to propagate events. You end up running into fundamental issues with our human conception of space and time. (e.g. trying to answer the question of "what is time, really?") As far as I have discovered, this makes it impossible to "correctly" implement the sort of cause-and-effect physics that collision detection requires. Granted, there are plenty of acceptable well-documented tradeoffs to be made! Best of luck to you, hope you solve this!

Profilbild von Alex
Alexvor 1 Jahr

Maybe you'd want to look into the implementation of @RossNordby's Bepuphysics2 on Github. He really knows his way around multithreading and cache opt. The solver code isn't super insane (~4k LOC). For someone with your experience it's probably not too difficult to read.

Profilbild von Awesome New Year Robot
Awesome New Year Robotvor 1 Jahr

Yeah, the "pile of objects" complexity is inherently n-squared :-( This is very annoying for networking, too. Long time ago I tried a system that double-buffered; read object states from an input array, write new object states to the output array. Parallelizes perfectly if you break it by object type and then sub-ranges of each type array. You may need to run narrow-phase for the same pairs multiple times, but that's at worst twice the linear cost, and thus at 3+ threads are still a win. The drawback is that you can't run a "big matrix of constraints" type solver in that fashion, so instead you have to use penalty/damping methods, use double precision, and crank up the frame rate to 1000 Hz. But that's not necessarily a bad thing ...

Profilbild von Kiaran Ritchie
Kiaran Ritchievor 1 Jahr

Impressive pile of boxes! Crazy that a mutex per body is better than graph coloring.

Profilbild von Nicholas Chapman
Nicholas Chapmanvor 1 Jahr

Have a look at Jolt physics, which uses striped mutexes.

Profilbild von Fire Totem Games 🧡 A Webbing Journey 🕷️🍪
Fire Totem Games 🧡 A Webbing Journey 🕷️🍪vor 1 Jahr

This looks sooooo satisfying 🤩

Profilbild von SpeedKomodo
SpeedKomodovor 1 Jahr

Looks like one of those 2000s era PhysX benchmarks.

Profilbild von Charlie Shenton
Charlie Shentonvor 1 Jahr

How many bodies is this, and on what CPU?

Profilbild von Dennis Gustafsson
Dennis Gustafssonvor 1 Jahr

This is around 5000 bodies in an i9 cpu. Not optimized yet though :)

Ähnliche Videos