NOTE: This post is currently WRONG. I am not correctly comparing every particle to every other particle in the gravity calculations. So, consider the listed results and speeds as incorrect. I will update this post once I get a correct working model going.
OpenCL allows you to write programs that run on your video card processor chip (GPU). GPUs can have many hundreds of cores so can process many more things simultaneously than a CPU can. A good candidate for parallel processing is a gravity simulation.
Since I originally posted the following 3D gravity movies to YouTube…
…there were questions and some skepticism in the comments so hopefully this blog post helps clarify things.
It has been a while since I was working with the gravity code, but this should be enough info for those interested.
This is the OpenCL kernel code that does the calculation for every object each frame. I had to screen capture it as WordPress kept deleting lines no matter what tags I used.
The kernel uses the standard Newtonian gravitation formulas. No other fancy calculations are included. So no energy conservation, no collision detection, no objects clustering into planets, no dark matter, no dissipation, no energy conservation, no overall angular momentum constraint, no speed-of-light limit on propagation of gravity taken into account and no inertia of mass. Basically it is the simplest possible formulas that give gravity like results.
The pos, vel, acc and mass are float arrays that hold the position, velocity, acceleration and mass for all the objects. Every frame those arrays are filled and passed to OpenCL using clCreateBuffer and clSetKernelArg. Once the above kernel code processes the objects the values are read back from the GPU using clEnqueueReadBuffer and then OpenGL (GL not CL) is used to display the objects to the screen using transparent billboard quads. Display is currently done in software OpenGL which is why the OpenGL display time is slower than the OpenCL calculations time.
The mingravdist check makes the calculations ignore objects that are too close together. If you allow objects very close together to interact they end up flinging objects out of the simulation space at a very high speed. For a world size of 300 units I use a mingravdist of 1.
Here is a new movie I just rendered with 3,000,000 particles. The frames took 4,700 ms to render on a GeForce GTX 570. Out of that time 844 ms was the OpenCL calculations. The rest of the time was rendering and passing the data back and forth to and from the GPU.
Unfortunately YouTube’s compression ruined the movie quality a bit. Mostly due to the noisy/static like nature of the millions of particles. Not enough I frames and too many P frames. Here are a series of screenshots showing what the frames looked like before the compression.
To push it further, here is a rotating disk of 5,000,000 objects. This one took 6,950 ms per frame with 1,420 ms for OpenCL calculations.
Again, here are some uncompressed screenshots.
The GTX 570 is on the lower end of the GeForce cards these days so the simulation speeds should be faster on the newer cards.
10,000,000 objects was also possible, but until I work out better ways of display it is not worth posting as most of the detail gets lost in the cloud of billboard particles. 10,000,000 was taking 8,780 ms per frame with 2,840 ms OpenCL time.
Try It Yourself
If use Windows you can download Visions of Chaos and see the simulations run yourself. Here are a few quick steps to get you going;
1. Open Visions of Chaos
2. Select Mode->Gravity->3D Gravity to change into the 3D Gravity mode
3. The 3D Gravity Settings dialog will appear
4. Change the number of objects to 1,000,000
5. Change the Time step to 0.02
6. Change the Size of objects to 0.2
7. Check Create AVI frames if you want to create an AVI movie from the simulation
8. Click OK