Grid-Sort RayCasts
Charles Bloom
05-17-04
------------------------------------------------------------------------------
Credits
ray-cast contest won on Munch by Dave Moore with a BSP-Tree
but now he does compact KD trees
KD-Tree in Steef by Thatcher Ulrich
we thought it was slow, but actually it's really fast
Sorted Triangles idea by Drew Card
interesting old-school ray tracer idea, got me thinking
NuGrid suggested by Chris Butcher (but I never found an article)
... so I wound up with this.
------------------------------------------------------------------------------
The Basics
We want to be able to take the position of the ray query and quickly
jump to the set of triangles that might touch. Well, a straight
array-lookup is the best you can possibly do. Any tree descent is going
to be Log(N). Some sort of sorted list binary search will also be
Log(N). What's more, both of those are bad for cache as you jump around
in memory.
Now, our world is 3d, so technically we have 3 axes, but storing a big 3d
array of positions would take too much memory. So, we'll make a grid of
2d cells, and then handle the 3rd axis separately. In the ideal case,
you don't have too many triangles along the 3rd axis. Also in the ideal
case, you don't have to walk to too many different grid cells.
The cool way to handle the 3rd axis is by linking together all the
triangles in that grid cell and sorting them on the 3rd axis. That way
you can do log(N) searches on that 3rd axis. The great thing about
sorted lists is they give you log(N) searches just like a tree, but they
cost zero in memory, it's just an arrangment of data you would have
anyway.
This structure is ideal for very short rays (because they just hop into
one grid cell), and very good for rays that are aligned or nearly
aligned with the grid. That is, if you're casting primarily along some
direction D (such as the direction of the sun in a light-mapper), then
you want to make your grid perpendicular to D, so that D is your 3rd
axis. This is like making a projection plane for the view from the sun
and rasterizing all your triangles into this "image".
Sorted lists are also cool because they're excellent at non-uniform
resolution. They automatically have resolution exactly proportional to
the occupation count of a region! That is, if you have one big triangle
somewhere, it gets one element, and if you have 100 small triangles
somewhere else, they get 100 elements. Grids fail here because the
spatial subdision is uniform. The 1 big triangle might get 10 cells, and
the 100 small triangles might get 10 cells. This is all true, but the
great thing about a grid is that you can instantly reduce your test set
in two axes with just one array-lookup. It's so fast, it can't be beat.
Once you get into the grid cell, you can then do another acceleration
structure inside the cell. Obviously if there are <= 8 or so tris in the
cell, no more acceleration is warranted. One thing to remember is that a
ray-tri test is only about 100 clocks, so for 8 tris, that's 800 clocks.
You better not use any acceleration structure that takes longer than
that! A nice thing about sorted lists is that you always just choose not
to use them. In the end, it's still just a list that you can brute-force
and walk over all elements very fast. The bad thing about sorts is that
it only gives you culling in one axis; it's a one-dimensional structure,
and you need more.
In this description I'm doing a "Grid->Sort" structure where triangles
are added to a grid in two axes (nominally X and Y), and then sorted
along Z within each cell. You could sort along X or Y or anything else
in that cell, but for most purposes, the grid has already done an
excellent job of reducing the candidates in X and Y, since the grid
cells can be very small with modern PC memory budgets.
This structure is not so good for game runtime in memory-limitted
environments. Also note that this structure is very very fast to build,
unlike a BSP or KD-tree.
------------------------------------------------------------------------------
The Algorithm
Take all your triangles from all your meshes. Transform them into world
space and stick them in a big array. Compute their normals or whatever
else you need for ray-casting against them.
Now make your grid. The grid will actually be multi-resolution. We do
this just to save memory and avoid putting very big triangles into many
many cells. We make the finest level grid at 2048x2048. This takes 16M of
RAM, and is the primary fixed over-head. Then we make levels for 1024x1024,
and 512x512, down to 1x1. All the grid cells are initially empty.
We find the extents of all the triangles in the plane of the grid. This
sets the borders for the grid. This also gives us our quantization
function that takes a 3d point to a grid coordinate. Grid coordinates go
from 0,0 to 2048,2048 in the space of the primary grid.
Now we start putting triangles into the grid. For each triangle, you
transform it into grid coordinates. Now we see how big it is in grid
space. If it's "too big", we push it up to the next level of the
multiresolution grid, and keep repeating until it's not taking too many
cells. Each time you push it up, it takes roughly 1/4 the number of
cells. For "too big" I currently use "covers 12 cells or more". This is
a tweak number that doesn't affect performance too much. It should be at
least 8 - if you make it too small, triangles get pushed way up the tree
to the big cells. Also, it must be at least 4, since a triangle lying
over a grid vertex will be in four cells.
Once you decide what level the triangle goes in, you need to put it in
the actual grid cells. To do, this I rasterize the triangle into the
cells so that it only goes into the actual cells that can possibly touch
it. To "put it into cells" what I'm actually doing is building up a
variable length array on the cell. The result will be that the cell will
have a number of triangles and then a list of indeces to those
triangles. The triangles themselves just stay in the original array that
we made; they aren't duplicated around. Note that multiple cells can
point at the same triangle. To avoid multiple visits, we use a "walk
id", which is just a counter that goes up, and equality means you've
already done that triangle on the current call.
Once all the triangles are in a cell, we sort them all the "Z" axis (the
3rd axis perpendicular to the grid plane). We sort just by triangle
centers. Once the sort is done, we have to build up the accumulated
min/maxes with two passes. We build up the minimum Z of all the
triangles to the right of the current one in the list and the current
one, and we build up the maximum Z of all the triangles to the left and
the current one. Note that this "max Z left" > "min Z right".
The resulting structure looks like this :
int grid0[2048][2048];
int grid1[1024][1024];
...
vector indexes;
vector triangles;
The grid[][] has a -1 to mean the cell is empty. If the cell is not
empty, the grid[] indexes a spot in the "indexes" array. In that array,
the first member is the number of triangles, then that # of indexes
follows.
The actual grids are stored more like vector grids. Usually the
top several levels are totally empty, so I pop them off the back of the
grids list.
Note that the indexes are nicely adjacent for cache, but the triangles
may not be. One thing I haven't done that might be good would be
re-order the triangles so that nearby triangles are nearby in the
array. Often the mesh data is arranged this way to begin with.
Now, to ray-cast against this structure, we start with a ray (or
segment/edge). Clip the ray to the bounding box of all the triangles,
which makes it finite in world space. Transform the ray to grid
coordinates, so you have a 2d ray in the [0->2048] coordiantes. Start
from the origin of the ray and test the elements in that grid cell.
We're basically doing an old integer line-drawing here. Step to the next
integer coordinate in x or y and test that grid cell, repeat. If you
only want the first hit, you can return as soon as you see a hit. We're
just doing an integer line-draw and checking each cell as we go. Once
you do one level, you take all the grid coordinates and divide by 2, and
go to the next coarser level. Repeat for all the levels.
Now, to actually test a ray against a cell, we can use the sort on the
3rd axis. We look up the grid[] to find our array of indexes. If there
are none or a few, we just test all those triangles. You can test either
front-to-back or back-to-front depending on the direction of the ray if
you want to do first-hit early returns. If there are many triangles
(>=24), we can use the sort. Take your ray extents on the "Z" axis
(clipped to that cell), and use the mid-point. Do a binary search on the
triangle list to find the triangles closest to your Z. Now you can
search out from there in both directions using the accumulated "max Z
left" and "min Z right". Basically, you keep looking to the left as long
as "max Z left" > your ray's Z min, and you keep looking to the right as
long as "min Z right" < your ray's Z max.
------------------------------------------------------------------------------