Fiery fate (May 2013)
It seems that the closer I get to a deadline, the more work I manage to produce. It's not a pressure thing; the output just ramps up as the deadline approaches.
This is problematic when trying to write up what I've produced while I keep on producing. Still, what I have produced has certainly sexed up my application.
The first thing I managed to do was implement a skybox to fill out the dull darkness. Had I been able to implement some of the technologies earlier on, and had more time to implement more pretty shit, I would have had procedurally positioned solar systems, which is similar to how the skybox images were created.
The second thing I managed was to implement more complex lighting, based on the vertex normal. The issue I had with this was that I didn't know where the neighbouring vertices were. But then I realised I could “guess”.
I basically create two more vertex positions perpendicular to the original vertex, work out their positions on the planet, and from those calculate the normal for the main vertex.
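The guessing trick above can be sketched like this: displace two nearby points the same way as the vertex and cross the resulting tangents. The `heightAt` terrain function and the step size are illustrative stand-ins, not the project's actual displacement code.

```cpp
#include <cmath>

struct Vec3 { double x, y, z; };

static Vec3 sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static Vec3 cross(Vec3 a, Vec3 b) {
    return {a.y * b.z - a.z * b.y,
            a.z * b.x - a.x * b.z,
            a.x * b.y - a.y * b.x};
}
static Vec3 normalise(Vec3 v) {
    double len = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    return {v.x / len, v.y / len, v.z / len};
}

// Stand-in for "position on the planet": maps a point on the flat patch to
// its displaced position (here, a simple analytic height field).
static Vec3 heightAt(double u, double v) {
    return {u, std::sin(u) * std::cos(v), v};   // hypothetical terrain
}

// Estimate the normal at (u, v) by "guessing" two neighbouring vertices a
// small step away, displacing them the same way, and crossing the tangents.
Vec3 estimateNormal(double u, double v, double eps = 1e-3) {
    Vec3 p  = heightAt(u, v);
    Vec3 pu = heightAt(u + eps, v);   // neighbour along one patch axis
    Vec3 pv = heightAt(u, v + eps);   // neighbour along the other axis
    return normalise(cross(sub(pv, p), sub(pu, p)));
}
```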
While this creates much more realistic terrain, basing it on the vertex normal (as opposed to calculating it in the pixel shader) creates some artefacts, including emphasising the patch seams.
I simply haven't the time to fix these issues. If I calculated the normal based on pixel position, the vertex positions wouldn't hinder the shading. However, to do this I'd need to pass in the cube's world matrix, and rather than setting it per pixel (which would be maddening) I'd have to use a constant buffer. And to do that, I'd need to allocate aligned memory.
While my land looks better than the totally flat appearance it had when rendered with spherical normals, it's now more noticeably unearth-like: it looks really rounded.
So I decided to try out different fractal terrain techniques to try and find a better terrain.
The first was a heterogeneous multifractal. This made the terrain more jagged inland and smoothed it out towards the coast. However, it needed adjusting: the coast was too smooth and, similarly, the mountains too jagged.
The next was a ridged multifractal (the same as used for creating the nebulas in the skybox). This made more mountain-like structures; however, the land looks more like sand dunes than rock and the coastline doesn't look correct.
I decided instead to use the original fBm but with some form of ridges. They’re not quite mountains, as they run at the same height, but they do look like mountain ranges and less like sand dunes.
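The two flavours can be sketched as below, with a cheap 1D hash-based value noise standing in for the project's actual Perlin noise. Plain fBm just sums octaves; folding in a ridge transform (1 minus the absolute value) turns the noise's zero crossings into sharp crests, which is what gives the mountain-range lines. All constants here are illustrative.

```cpp
#include <cmath>
#include <cstdint>

// Cheap hash-based value noise standing in for the project's Perlin noise.
static double noise1(double x) {
    int32_t i = (int32_t)std::floor(x);
    double f = x - i;
    auto hash = [](uint32_t n) {
        n = (n << 13) ^ n;
        n = n * (n * n * 15731u + 789221u) + 1376312589u;
        return 1.0 - (double)(n & 0x7fffffffu) / 1073741824.0;  // (-1, 1]
    };
    double t = f * f * (3.0 - 2.0 * f);            // smoothstep blend
    return hash((uint32_t)i) * (1.0 - t) + hash((uint32_t)(i + 1)) * t;
}

// Plain fBm: sum octaves, halving amplitude and doubling frequency each time.
double fbm(double x, int octaves) {
    double sum = 0.0, amp = 1.0, freq = 1.0;
    for (int o = 0; o < octaves; ++o) {
        sum += amp * noise1(x * freq);
        amp *= 0.5;
        freq *= 2.0;
    }
    return sum;
}

// fBm with a ridge transform folded in: 1 - |n| turns zero crossings into
// sharp crests, giving mountain-range-like lines.
double ridgedFbm(double x, int octaves) {
    double sum = 0.0, amp = 1.0, freq = 1.0;
    for (int o = 0; o < octaves; ++o) {
        sum += amp * (1.0 - std::fabs(noise1(x * freq)));
        amp *= 0.5;
        freq *= 2.0;
    }
    return sum;
}
```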
I'm not 100% happy with it, but I really need to crack on with my dissertation. Although not before I implement better clipping techniques!
For I can't really do my results section when my application doesn't run as well as it should.
Plan ‘Merging’ together (April 2013)
While I knew in my head I’d be able to come up with a simple solution to merging the patches, I didn’t imagine I’d be able to solve it with such little effort, and so soon after fixing my noise issues too!
The cracks aren’t 100% gone. But they are drastically reduced, enough for this project.
I wanted to take advantage of the DirectX 11 tessellation stage's feature whereby each edge of a tessellated patch can be assigned a different amount of detail from the centre.
So I knew it should be possible to make the edges of my level of detail seamless.
My first attempt came from acknowledging that each level of detail essentially doubles the edge detail: if an edge had 32 vertices, the next level of detail would give it 64. As the two neighbouring patches aren't connected, these edges are open to cracks. So I simply came up with an algorithm that halved the outer edges of the child patches.
However, this only solved the "best case scenario", and I'd overlooked the typical one. As shown in a previous post, a patch may have neighbouring patches of different levels of detail. As each edge can only have a single detail value, I knew it was down to the child patches to make the edges seamless.
Of course, merging LOD is a solved issue, however I wished to come up with a less complex method that worked with my current set up. I wanted to try to do it without having to traverse my quadtree structure more than once, as that would considerably slow down my application.
Fortunately, I managed to do just that.
As the picture shows, the edges of each patch merge seamlessly into the surrounding patches.
However, this technique is limited. Conserving the number of vertices on an edge means that eventually a patch edge will reach a detail of 1 (the smallest possible), so further levels of detail will reopen the possibility of cracks. However, they're drastically reduced in size and much rarer.
To maximise the number of guaranteed crack-free levels of detail, the edge is set to the maximum number of vertices (64), which allows up to 7 seamless levels of detail.
These next pictures aim to give a sense of the difference this merging technique makes.
without the method:
with the method:
The only technical thing really left to do is implement a clipping technique that reduces the number of patches rendered, namely frustum culling.
Noisy excitement (April 2013)
My noise issue is solved.
Since its first implementation in my honours project, the noise hasn't worked. I wasn't sure why, as the implementation had worked in my previous procedural project. However, it wasn't a large issue at first, as the other features had to be implemented.
The cause of the problem lingered, though. Recently, I went over my 3rd year project and realised that while the shader was the same, the implementation in fact wasn't: that project used the effects framework, while I'm using DirectX's API directly. I then wondered about the usage of globals in shader model 5.0.
Sure enough, you cannot simply declare a variable at global scope; it must be static. I knew before even trying this that it would be hazardous for performance, but to check whether this was the issue I proceeded with using a static array for the hash values of my noise.
Sure enough, success:
However, I now needed a way of reading from a static array that isn't in the global scope. The two obvious solutions are a constant buffer or a texture buffer. I went with a constant buffer: besides being easier to implement, constant buffers are optimised for GPU resource updating, and as the buffer is only updated whenever the noise permutation is generated (typically once), it should run rather nice.
And run rather nice it did.
To my surprise, it actually runs better than it had been doing! Moving the permutation array out of the shader and into my application also allowed me to have some fun with generating different Earth-like planets.
I imagined that recreating the array would slow down the application. Again to my surprise, it regenerates exceptionally fast, as shown in my video.
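Regenerating the permutation on the CPU amounts to shuffling 0..255 and duplicating the result, ready to upload into the constant buffer. A minimal sketch, assuming a seeded `std::mt19937` as the randomness source (the project's actual generator may differ); the table is doubled to 512 entries so the shader can index `perm[perm[i] + j]` without wrapping manually.

```cpp
#include <algorithm>
#include <array>
#include <numeric>
#include <random>

// Build a fresh Perlin-style permutation table, ready for the constant buffer.
std::array<int, 512> makePermutation(unsigned seed) {
    std::array<int, 256> base{};
    std::iota(base.begin(), base.end(), 0);        // 0..255
    std::mt19937 rng(seed);
    std::shuffle(base.begin(), base.end(), rng);   // random permutation
    std::array<int, 512> perm{};
    for (int i = 0; i < 512; ++i)
        perm[i] = base[i & 255];                   // duplicate into [256, 512)
    return perm;
}
```

A new seed per planet is what makes "different Earth-like planets" cheap: only this small buffer changes, not the shader.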
Fixing the noise leaves only one more glaring issue: Those pesky cracks.
I attempted to alleviate them with a method I came up with, but I soon realised it only worked for specific patches and didn't take into account that neighbouring patches may have differing levels of detail. As shown in this image:
Where the green boxes are where there are no possible cracks, but the red boxes are where cracks can, and bloody well do, appear.
It's more than possible to remove the cracks; I know how I could do it. However, I wish to come up with a smarter method that solves it and still runs smoothly.
No shortcuts (April 2013)
As my planet was struggling to render in real-time I decided it was time to look into using a texture to relieve some of the computational stress in calculating the fractal Brownian motion.
I knew going into creating the texture that there would be limitations:
- The texture can’t be too large as memory is limited
- When using the texture as a basis, rounding errors will be immediately prominent, as no more detail can be retrieved.
However, before I even got to these issues, I hit implementation problems. Firstly, I had to re-familiarise myself with DirectX 11's API for textures, which took a good while; in its attempt to be diverse and efficient, it's really convoluted in what you need to set up.
It's then not clear exactly how you fill a texture. My first shot was at creating a 1080x1080 texture. This was very quickly shot down by the compiler when it informed me I'd caused a stack overflow. So I decided to start off with a 256x256 texture.
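The overflow makes sense: a 1080x1080 float texture is about 4.4 MB, while the default stack is typically around 1 MB, so declaring the initial data as a local array blows it. My guess at the fix (a sketch, not the project's code) is to build the initial data on the heap, which works regardless of texture size:

```cpp
#include <cstddef>
#include <vector>

// Build texture init data on the heap; a local float[1080*1080] array would
// overflow a typical ~1 MB stack, but std::vector's storage is heap-allocated.
std::vector<float> buildNoiseTexture(std::size_t width, std::size_t height) {
    std::vector<float> texels(width * height);
    for (std::size_t y = 0; y < height; ++y)
        for (std::size_t x = 0; x < width; ++x)
            texels[y * width + x] = 0.0f;   // fill with real noise values here
    return texels;
}
```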
I then ran into problems when sampling the texture: it only appeared to sample the first row. I soon managed to get it to render more of the texture, but it was clear it still wasn't completely correct: it didn't appear to render the entire texture, just a section. Tests were inconclusive in pinning down exactly what was being rendered.
These issues, the likelihood that fixing them and implementing the textures as desired would be more hassle than it was worth, and the inevitable problems that would have come after, led me to revert to calculating the fBm in the shader and instead focus on getting it to run better.
One reason for it running slow is that patches are rendered even when they're not viewable; I've yet to make them clip. This was a particular problem when my planet was relatively small, as the quadtrees were splitting up on the other side of the sphere. So when I drastically enlarged my planet, it actually ran smoother.
I've also managed to implement distance-based LOD for the fBm, where fewer octaves are used when far away. This makes the transition a lot smoother than when implemented through the vertices.
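A hypothetical mapping from camera distance to octave count, in the spirit of the above: full detail up close, dropping one octave per doubling of distance. The constants (16 octaves maximum, 3 minimum, unit near distance) are illustrative assumptions, not the project's values.

```cpp
#include <algorithm>
#include <cmath>

// Distance-based fBm LOD: drop one octave per doubling of camera distance.
int octavesForDistance(double distance) {
    const int maxOctaves = 16, minOctaves = 3;
    if (distance <= 1.0) return maxOctaves;
    int dropped = (int)std::floor(std::log2(distance));
    return std::clamp(maxOctaves - dropped, minOctaves, maxOctaves);
}
```

Because the count falls off smoothly with distance, the detail transition is gradual rather than popping at fixed thresholds.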
I've also improved the aesthetics in preparation for comparative images in my study. I'm still unable to get the correct transformed normals for the planet, so the lighting currently uses incorrect normals. However, because I had to pass in a matrix for the constant buffer anyway, I've been able to take advantage of it to pass in not only the tessellation patch detail but also the camera position and its look vector, which are used for the fBm LOD and specular lighting. I use more detail in the pixel shader as it's possible to feign more land mass.
This also makes the planet a little tidier when getting near the coasts and even allows for smaller islands to appear that wouldn’t have if only based off the vertices.
Here is a shot showing the improved lighting:
I can now record my application straight off the computer (so no more unsteady hands).
I had issues compressing the video, however, and Tumblr didn't really help, so I'll be posting my videos to Vimeo for the time being.
This video shows how my application converts my cube into a sphere.
Onto bigger and better things (March 2013)
So I’m back to my cube to spheres but this time with a new look:
As expected, the LOD doesn’t run fully when approaching the “corners” of the sphere, to the point where detail actually decreases as you reach them.
The next step to rectify this is to base the LOD on the patch's new spherical position.
This still won’t be 100% correct as the elevated vertices will still be based on their flat patches, so if the camera is flying over the top of the mountains, the LOD won’t be as high as if the camera was flying through the mountains.
Another issue that's arisen, though one which had occurred with my split face implementation, is that the higher LODs drop the frame rate dramatically.
As I expected, dropping the detail of each patch didn't fix the issue, which confirmed my suspicion that it lies in the fractal calculations: each patch is recalculating the fBm. This is obviously inefficient, so I'll have to work out a way of calculating the fBm for the object once, which the shader can then use.
The most obvious solution is to pre-calculate the fBm on the CPU into a buffer the shader can access. However, this would create a catch-22 between the detail being limited by the buffer's size and how much memory the buffer takes up, as the project is for multiple planetary bodies.
Closer still (March 2013)
On the back of my previous update, I believe the issue lies in my trying to compensate for the fact that the patch isn't in the position it will be drawn at. The patch is positioned along the face of a unit cube, but I want the distance from the camera to the patch based on where the patch will be once scaled to the size of the object. Even that is only an initial step: once the cube is turned into a sphere, the distance will be off as the camera approaches the corners.
However, I'm obviously not calculating the distance correctly. This was confirmed when I made the scale 1:1: the distance is being based on a 2 x 2 patch even when scaled, which is why it doesn't work correctly on a larger patch. Still, it's comforting to know it does work.
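One way to sketch the intended calculation, under the assumption that the eventual fix maps the patch centre onto the sphere before measuring: normalise the cube-face point onto the unit sphere, scale by the planet radius, and only then take the distance to the camera. This also handles the corner problem mentioned above, since the spherified position is used rather than the flat one.

```cpp
#include <cmath>

struct Vec3 { double x, y, z; };

// Distance from the camera to a patch centre as it will actually be drawn:
// spherify the cube-face point (normalise, then scale by the planet radius)
// before measuring, instead of using the raw unit-cube position.
double patchDistance(Vec3 patchCentreOnCube, double planetRadius, Vec3 camera) {
    double len = std::sqrt(patchCentreOnCube.x * patchCentreOnCube.x +
                           patchCentreOnCube.y * patchCentreOnCube.y +
                           patchCentreOnCube.z * patchCentreOnCube.z);
    Vec3 onSphere = {patchCentreOnCube.x / len * planetRadius,
                     patchCentreOnCube.y / len * planetRadius,
                     patchCentreOnCube.z / len * planetRadius};
    double dx = camera.x - onSphere.x;
    double dy = camera.y - onSphere.y;
    double dz = camera.z - onSphere.z;
    return std::sqrt(dx * dx + dy * dy + dz * dz);
}
```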
I was rather dismayed, however, when it ran very slowly once more detail was added. I tried lowering the detail per patch, but this didn't change the frame rate. Then I shamefully realised I was printing each patch's position every frame. A foolish mistake and, as expected, once this was removed, the application ran silky smooth.
It's also clear I can control how many levels of detail there are; larger planets will need more levels than small ones. As the patch sizes are successive halves, I can test whether the patch size is less than or equal to a limit.
Tantalisingly close (March 2013)
After an initial hiccup in the implementation that disguised itself as a NULL pointer dereference, I learnt more about the XNA maths library. Part of its advantage is its use of the SIMD instruction set; however, data must be appropriately aligned in memory to allow those operations. It was an alignment issue causing the error, not a NULL pointer.
The memory alignment is conveniently taken care of at compile time for ordinary variables, as the compiler forces the allocated memory to be aligned. However, when memory is allocated at runtime from the heap, there is no guarantee it will be aligned.
With the help of my honours supervisor, I wrote functions that allocate aligned memory specifically for when a node is created, so that it will be aligned.
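One way to sketch that idea in modern C++ (the original used helper functions rather than this exact shape, and `std::aligned_alloc` is C++17, which postdates the project): override `operator new` for the node type so every heap allocation comes back 16-byte aligned, as XNA Math's SIMD types require.

```cpp
#include <cstdint>
#include <cstdlib>
#include <new>

// XNA Math's XMVECTOR/XMMATRIX require 16-byte alignment for SIMD loads.
// Stack variables get this from the compiler, but plain heap `new` does not
// guarantee it, so the node type takes over its own heap allocation.
struct QuadtreeNode {
    static void* operator new(std::size_t size) {
        // aligned_alloc requires size to be a multiple of the alignment
        std::size_t rounded = (size + 15) & ~std::size_t(15);
        if (void* p = std::aligned_alloc(16, rounded)) return p;
        throw std::bad_alloc();
    }
    static void operator delete(void* p) { std::free(p); }

    double payload[2];   // placeholder for the real XMMATRIX members
};
```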
Once this issue was fixed, I was able to concentrate back on the main problem at hand: Quadtree based LOD.
I had initially believed my implementation to be ready to work; however, I soon learned it didn't perform as expected. While it changed between different sizes of patches, as intended, their positions were not quite right: they took a spread-out diagonal placing.
Looking at the code that places them, I realised my algorithm was creating unwanted values for the z placement. I rectified this, and running the corrected algorithm revealed that the offsets needed to be half the patch size. Once this was changed, I was presented with something painfully close to what I desire:
At first glance, it appears to work exactly as intended, with detail increasing as the camera approaches the surface. While this holds true to some extent, there is an artefact where the corner at (1,0,1) increases in detail as the camera moves towards the centre.
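The placement rule can be sketched as follows, reading "half the patch size" as half the *child* patch size (a quarter of the parent's), since that is what tiles the parent exactly. The `Patch` struct and field names are illustrative.

```cpp
#include <array>

struct Patch { double cx, cz, size; };   // centre and edge length on the face

// Split a quadtree patch into four children: each child is half the size,
// and its centre is offset from the parent's by half the child's size.
std::array<Patch, 4> split(const Patch& p) {
    double h = p.size / 2.0;   // child patch size
    double q = h / 2.0;        // offset: half the child size
    return {{ {p.cx - q, p.cz - q, h},
              {p.cx + q, p.cz - q, h},
              {p.cx - q, p.cz + q, h},
              {p.cx + q, p.cz + q, h} }};
}
```

With any other offset the children either overlap or leave gaps, which is exactly the diagonal spread described above.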
It's also clear to me now that I'll need to make an alteration to my plans for the edge detail, because a patch may have 2 or more neighbouring patches of different sizes next to it. Therefore, the edge should be based on its parent's detail, and the parent patch cares not for its children's details.
On the verge of something big (March 2013)
I've now overhauled my application so that I don't waste valuable time scaling the planets/stars outside of the matrix multiplication, having figured out what was wrong initially.
I've also now introduced a second constant buffer to be used by the hull shader's constant function. I initially tried to simply pass in a float, which threw up dimension errors: apparently DirectX sees it as a waste of time to send anything smaller than 16 bytes to the GPU.
To get around this, I send in a matrix and set its first member to the amount I wish to tessellate by.
While implementing this, I realised I'd created a false memory of the workings of the tessellation stage. It had been a while since I got it working and I haven't really gone near it since; I'd only done early experiments to get an understanding of what happens.
However, I’d been worried recently when considering my options for implementing CLOD. I’m confident I can get quadtree based level of detail implemented, but it’s those bastarding cracks that I was conscious of.
It was my belief that there were only two values for the tessellation factors: one for the inside of a patch and the other for the outside.
Instead, there are 2 values for the inside of the patch (controlling the amount of detail for the rows and columns) and 4 values for the outside of the patch (one for each of the edges).
It's the latter feature that made me mentally jump for joy when I rediscovered it while implementing the constant-buffer-controlled tessellation factor.
I was imagining I would have to create a difficult (and highly inefficient) algorithm that created extra patches designed to merge the main patches.
This will hopefully no longer be the case: instead, I will only have to work out the detail of the neighbouring patch and then use a halfway value for the shared edge, which will, in theory, close off those pesky cracks.
For example, if one patch has a tessellation value of 16.0 and its neighbouring patch has a value of 15.0, their joining edge value will be set to (16.0 + 15.0) / 2.0 = 15.5.
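The proposed merge rule is tiny; written out (as a sketch, since the real version lives in the hull shader's constant function in HLSL):

```cpp
// Shared edge tessellation factor: the average of the two patch factors.
// DirectX 11 accepts fractional factors, so no rounding is needed.
float sharedEdgeFactor(float patchFactor, float neighbourFactor) {
    return (patchFactor + neighbourFactor) / 2.0f;
}
```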
One of the great advantages of the tessellation stages in DirectX is that the values don’t have to be integers, making merging different levels of detail a dream.
This video highlights the issues with the z-buffer at long distances. Because precision decreases towards the back of the z-buffer, the larger I make the view frustum, the more likely z-fighting is to occur as rounding errors protrude. This means I cannot simply extend the far plane to a large value; I must use a method that holds the distance information without drawing the objects incorrectly.