For the past two months I’ve been blessed to focus fully on my 3D Sacred Geometry engine, called PsiTriangle Engine. This engine is going to be the basis for our upcoming 3D Sacred Geometry Creation program, called Geometrify: Creator.
At the beginning of the year I worked at Vizor (http://vizor.io) for a while, but that lasted only 2 months, as it was not something I truly loved doing.
Getting closer to the metal
Since then I’ve been focusing on improving my engine, getting more performance out of the GPU and calculating as little as possible on the CPU.
The CPU -> GPU bottleneck is a real issue when working with dynamic geometry. Optimally, you keep everything in GPU memory instead of transferring it from main RAM to the GPU, because that transfer is a big bottleneck for performance.
Learning the tricks of GPU programming and getting to really feel the power of the GPU has been a marathon run, but I’m finally approaching the performance I’ve been looking for.
Doing complex things is easy, but just narrowing down to the simple essentials, the least amount of calculations needed, is difficult.
Putting my engineering skills to their use
I’m an automation engineer, and working with 3D equations and math is really an area where I’m starting to see use for that education. It has given me the insight that I can understand and dissect any problem, if I just keep drawing and calculating on paper long enough.
Just draw it out
I’m pretty happy now that I got that education, as without it I probably wouldn’t have the system in place to work like this (*thx math teacher, Pirkka Peltola)
Fast line drawing
In the last weeks, I’ve been completely re-writing my line drawing algorithm to utilize the GPU as much as possible.
Previously I had ported an algorithm by Nicolas P. Rougier from Python code to C++ (based on his paper here: http://jcgt.org/published/0002/02/08/).
But this approach was too general for my case: it did too many calculations and took a long time to upload the vertices from the CPU to the GPU, which really killed performance.
So I decided to just rewrite it from the ground up. A good tool for prototyping graphics drawing is http://processing.org, so I first implemented the algorithm with Processing, and then, once it worked and I understood the process, started porting it to GLSL shader code.
Tesselating Circular Polylines with Processing
Getting to know the Geometry Shader
Modern GPUs have geometry shaders. With these, one can calculate vertices completely on the GPU, utilizing the massive parallelism of modern GPUs.
I started my line drawing re-implementation using only the geometry shader. Here you can see results for that:
Here all the lines, segments and origin points for the circles are calculated on the GPU; nothing is done on the CPU except sending the origin point to the shader.
This is pretty great, but there are limitations. First, the geometry shader has to re-draw all the shapes and calculate all the sines and cosines for each line segment, every time, on every frame. This is slow.
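To put that per-frame cost in concrete terms, here is a minimal CPU-side sketch in C++ of the kind of work involved: one sine and one cosine for every point of a circle approximated by straight segments. This is not the engine's shader code; `Vec2` and `tessellateCircle` are hypothetical names made up for the illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical 2D point type, for illustration only.
struct Vec2 {
    float x;
    float y;
};

// Approximates a circle with `segments` straight line segments and returns the
// closed polyline: the first point is repeated at the end, so the result holds
// segments + 1 points.
std::vector<Vec2> tessellateCircle(Vec2 center, float radius, int segments) {
    assert(segments >= 3);  // fewer than 3 segments cannot enclose an area
    const float twoPi = 6.28318530718f;
    std::vector<Vec2> points;
    points.reserve(static_cast<std::size_t>(segments) + 1);
    for (int i = 0; i <= segments; ++i) {
        // One sine and one cosine per point: this is the work that gets
        // repeated on every frame when nothing is cached.
        const float angle =
            twoPi * static_cast<float>(i) / static_cast<float>(segments);
        points.push_back({center.x + radius * std::cos(angle),
                          center.y + radius * std::sin(angle)});
    }
    return points;
}
```

Calling `tessellateCircle({1.0f, 2.0f}, 3.0f, 8)` yields 9 points. Doing this for every circle on every frame is exactly the repeated trigonometry described above, whether it runs on the CPU or in a geometry shader.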
Second, the geometry shader can only output a limited number of vertices. With my GPU, that limit is 256 vertices of vector4 components. So it’s not really much; you can’t do deep recursion with that.
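A rough back-of-the-envelope check shows how quickly such a budget runs out. The sketch below assumes, purely for illustration, that one shader invocation has to emit the whole recursive formation, that every circle costs the same number of vertices, and that every circle spawns the same number of child circles. `maxRecursionDepth` and its parameters are hypothetical names.

```cpp
#include <cassert>

// Returns the deepest recursion level whose accumulated vertex count still
// fits in `vertexBudget`. Depth 0 is a single circle; -1 means that even one
// circle does not fit.
int maxRecursionDepth(long long vertexBudget,
                      long long verticesPerCircle,
                      long long childrenPerCircle) {
    assert(verticesPerCircle >= 1 && childrenPerCircle >= 1);
    long long circlesAtLevel = 1;  // level 0 holds just the root circle
    long long totalVertices = 0;
    int depth = -1;
    while (true) {
        totalVertices += circlesAtLevel * verticesPerCircle;
        if (totalVertices > vertexBudget) {
            return depth;  // this level no longer fits
        }
        ++depth;
        circlesAtLevel *= childrenPerCircle;  // each circle spawns children
    }
}
```

With a budget of 256 vertices, 17 vertices per circle (16 segments plus the closing point) and 6 children per circle, `maxRecursionDepth(256, 17, 6)` returns 1: the root and its six children fit (119 vertices), but the next level would need 731.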
Bringing in the Transform Buffers
There also exists a thing called ‘Transform Feedback Buffer‘, which basically means you Transform (calculate geometry) and put the results in a Feedback Buffer (store), which you then use to actually draw (read buffer).
These buffers are then only updated when changes occur, and not at the beginning of each frame as with a pure geometry shader approach.
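The update-only-on-change idea can be sketched on the CPU side with a simple dirty flag. This is an analogy in plain C++, not OpenGL transform feedback code; the `CachedCircle` class and its members are hypothetical, and the cached vector stands in for the feedback buffer.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Caches the tessellated vertices of a circle and recomputes them only when a
// parameter has actually changed.
class CachedCircle {
public:
    CachedCircle(float radius, int segments)
        : radius_(radius), segments_(segments) {
        assert(segments >= 3);
    }

    void setRadius(float radius) {
        if (radius != radius_) {
            radius_ = radius;
            dirty_ = true;  // geometry is stale, rebuild on next use
        }
    }

    // Called once per frame; cheap unless something changed.
    const std::vector<float>& vertices() {
        if (dirty_) {
            rebuild();
            dirty_ = false;
        }
        return cache_;
    }

    int rebuildCount() const { return rebuildCount_; }

private:
    void rebuild() {
        cache_.clear();
        for (int i = 0; i <= segments_; ++i) {
            const float angle = 6.28318530718f * static_cast<float>(i) /
                                static_cast<float>(segments_);
            cache_.push_back(radius_ * std::cos(angle));  // x
            cache_.push_back(radius_ * std::sin(angle));  // y
        }
        ++rebuildCount_;
    }

    float radius_;
    int segments_;
    bool dirty_ = true;  // nothing cached yet
    int rebuildCount_ = 0;
    std::vector<float> cache_;
};
```

Calling `vertices()` on every frame only triggers the trigonometry after `setRadius` has changed something; on all other frames the stored result is reused as-is.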
This already got me much better performance:
Much better, but I was still calculating stuff recursively, storing each circular formation as a separate copy of my base class.
This worked well with http://GeoKone.NET, as with software rendering all the data stays in the main memory. But with GPU rendering, we really want to minimize the amount of things calculated.
Drawing as little as possible
At this point, I decided that I know what I want to achieve, and to get there, I really need all the performance I can get, to make it as smooth as possible.
To do that, the current model of doing things recursively, i.e. where a class instance stores num_points class instances and visits each of them to draw their data, continuing down the path recursively with a parent-child model, really didn’t work anymore with the GPU.
With GPUs, what seems to work best is doing things in a linear buffer. We want to have all the data laid out contiguously, so you can just loop through it when calculating and drawing, with a minimum amount of branching and buffer switching along the way.
Basically we just want to blast the data to the shaders, so they can work on it in parallel as much as possible, because that’s the strength of GPUs.
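The difference between the two models can be sketched in C++ as follows. This is only an illustration under assumptions, not the engine's data layout: `Formation`, `CircleRecord`, `buildFormation` and `flatten` are hypothetical names, and the children are simply placed on the parent's circumference with the same radius.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Recursive parent-child model: every formation owns its child formations.
struct Formation {
    float x, y, radius;
    std::vector<Formation> children;
};

// Flat record, suitable for packing into one contiguous buffer.
struct CircleRecord {
    float x, y, radius;
};

// Builds a formation with `numPoints` children on its circumference,
// recursing `depth` levels down (assumed layout, for illustration).
Formation buildFormation(float x, float y, float radius,
                         int numPoints, int depth) {
    assert(numPoints >= 1 && depth >= 0);
    Formation node{x, y, radius, {}};
    if (depth == 0) {
        return node;
    }
    for (int i = 0; i < numPoints; ++i) {
        const float angle = 6.28318530718f * static_cast<float>(i) /
                            static_cast<float>(numPoints);
        node.children.push_back(buildFormation(x + radius * std::cos(angle),
                                               y + radius * std::sin(angle),
                                               radius, numPoints, depth - 1));
    }
    return node;
}

// Walks the tree once (parent first, then children) and appends every circle
// to a single linear array.
void flatten(const Formation& node, std::vector<CircleRecord>& out) {
    out.push_back({node.x, node.y, node.radius});
    for (const Formation& child : node.children) {
        flatten(child, out);
    }
}
```

After `flatten`, drawing becomes one straight loop over (or one upload of) the flat array, with no pointer chasing between class instances. For example, 6 points and 2 levels of recursion give 1 + 6 + 36 = 43 records in a single buffer.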
I’m still seeking the best way to do this, but with this model I could finally reach dynamic geometry in 3D space with performance similar to GeoKone.NET. This is my latest update, showcasing dynamic manipulation of 2D plane sacred geometry in 3D space, which will be the basis for Geometrify: Creator.
Getting there :)
I’m developing this engine on laptop GPUs: my faster MacBook Pro has an Nvidia GT750M 2GB card, and my home computer has an ancient Nvidia GT330M/512MB.
So I really also have to figure out how to make this fast in order to develop, which is a good thing :) But I can’t wait to test this out on modern beasts of GPUs, which are easily 30x faster than the one on my older laptop.
Anyway, development continues. If you are interested in more updates, follow me on Twitter: https://twitter.com/inDigiNeous, where I’ll be updating more frequently.
Now peace out, and bom! ^_^