Grid-Sort RayCasts Charles Bloom 05-17-04 ------------------------------------------------------------------------------ Credits ray-cast contest won on Munch by Dave Moore with a BSP-Tree but now he does compact KD trees KD-Tree in Steef by Thatcher Ulrich we thought it was slow, but actually it's really fast Sorted Triangles idea by Drew Card interesting old-school ray tracer idea, got me thinking NuGrid suggested by Chris Butcher (but I never found an article) ... so I wound up with this. ------------------------------------------------------------------------------ The Basics We want to be able to take the position of the ray query and quickly jump to the set of triangles that might touch. Well, a straight array-lookup is the best you can possibly do. Any tree descent is going to be Log(N). Some sort of sorted list binary search will also be Log(N). What's more, both of those are bad for cache as you jump around in memory. Now, or world is 3d, so technically we have 3 axes, but storing a big 3d array of positions would take too much memory. So, we'll make a grid of 2d cells, and then handle the 3rd axis separately. In the ideal case, you don't have too many triangles along the 3rd axis. Also in the ideal case, you don't have to walk to too many different grid cells. The cool way to handle the 3rd axis is by linking together all the triangles in that grid cell and sorting them on the 3rd axis. That way you can do log(N) searches on that 3rd axis. The great thing about sorted lists is they give you log(N) searches just like a tree, but they cost zero in memory, it's just an arrangment of data you would have anyway. This structure is ideal for very short rays (because they just hop into one grid cell), and very good for rays that are aligned or nearly aligned with the grid. That is, if you're casting primarily along some direction D (such as the direction of the sun in a light-mapper), then you want to make your grid perpendicular to D, so that D is your 3rd axis. This is like making a projection plane for the view from the sun and rasterizing all your triangles into this "image". Sorted lists are also cool because they're excellent at non-uniform resolution. They automatically have resolution exactly proportional to the occupation count of a region! That is, if you have one big triangle somewhere, it gets one element, and if you have 100 small triangles somewhere else, they get 100 elements. Grids fail here because the spatial subdision is uniform. The 1 big triangle might get 10 cells, and the 100 small triangles might get 10 cells. This is all true, but the great thing about a grid is that you can instantly reduce your test set in two axes with just one array-lookup. It's so fast, it can't be beat. Once you get into the grid cell, you can then do another acceleration structure inside the cell. Obviously if there are <= 8 or so tris in the cell, no more acceleration is warranted. One thing to remember is that a ray-tri test is only about 100 clocks, so for 8 tris, that's 800 clocks. You better not use any acceleration structure that takes longer than that! A nice thing about sorted lists is that you always just choose not to use them. In the end, it's still just a list that you can brute-force and walk over all elements very fast. The bad thing about sorts is that it only gives you culling in one axis; it's a one-dimensional structure, and you need more. In this description I'm doing a "Grid->Sort" structure where triangles are added to a grid in two axes (nominally X and Y), and then sorted along Z within each cell. You could sort along X or Y or anything else in that cell, but for most purposes, the grid has already done an excellent job of reducing the candidates in X and Y, since the grid cells can be very small with modern PC memory budgets. This structure is not so good for game runtime in memory-limitted environments. Also note that this structure is very very fast to build, unlike a BSP or KD-tree. ------------------------------------------------------------------------------ The Algorithm Take all your triangles from all your meshes. Transform them into world space and stick them in a big array. Compute their normals or whatever else you need for ray-casting against them. Now make your grid. The grid will actually be multi-resolution. We do this just to save memory and avoid putting very big triangles into many many cells. We make the finest level grid at 2048x2048. This takes 16M of RAM, and is the primary fixed over-head. Then we make levels for 1024x1024, and 512x512, down to 1x1. All the grid cells are initially empty. We find the extents of all the triangles in the plane of the grid. This sets the borders for the grid. This also gives us our quantization function that takes a 3d point to a grid coordinate. Grid coordinates go from 0,0 to 2048,2048 in the space of the primary grid. Now we start putting triangles into the grid. For each triangle, you transform it into grid coordinates. Now we see how big it is in grid space. If it's "too big", we push it up to the next level of the multiresolution grid, and keep repeating until it's not taking too many cells. Each time you push it up, it takes roughly 1/4 the number of cells. For "too big" I currently use "covers 12 cells or more". This is a tweak number that doesn't affect performance too much. It should be at least 8 - if you make it too small, triangles get pushed way up the tree to the big cells. Also, it must be at least 4, since a triangle lying over a grid vertex will be in four cells. Once you decide what level the triangle goes in, you need to put it in the actual grid cells. To do, this I rasterize the triangle into the cells so that it only goes into the actual cells that can possibly touch it. To "put it into cells" what I'm actually doing is building up a variable length array on the cell. The result will be that the cell will have a number of triangles and then a list of indeces to those triangles. The triangles themselves just stay in the original array that we made; they aren't duplicated around. Note that multiple cells can point at the same triangle. To avoid multiple visits, we use a "walk id", which is just a counter that goes up, and equality means you've already done that triangle on the current call. Once all the triangles are in a cell, we sort them all the "Z" axis (the 3rd axis perpendicular to the grid plane). We sort just by triangle centers. Once the sort is done, we have to build up the accumulated min/maxes with two passes. We build up the minimum Z of all the triangles to the right of the current one in the list and the current one, and we build up the maximum Z of all the triangles to the left and the current one. Note that this "max Z left" > "min Z right". The resulting structure looks like this : int grid0[2048][2048]; int grid1[1024][1024]; ... vector indexes; vector triangles; The grid[][] has a -1 to mean the cell is empty. If the cell is not empty, the grid[] indexes a spot in the "indexes" array. In that array, the first member is the number of triangles, then that # of indexes follows. The actual grids are stored more like vector grids. Usually the top several levels are totally empty, so I pop them off the back of the grids list. Note that the indexes are nicely adjacent for cache, but the triangles may not be. One thing I haven't done that might be good would be re-order the triangles so that nearby triangles are nearby in the array. Often the mesh data is arranged this way to begin with. Now, to ray-cast against this structure, we start with a ray (or segment/edge). Clip the ray to the bounding box of all the triangles, which makes it finite in world space. Transform the ray to grid coordinates, so you have a 2d ray in the [0->2048] coordiantes. Start from the origin of the ray and test the elements in that grid cell. We're basically doing an old integer line-drawing here. Step to the next integer coordinate in x or y and test that grid cell, repeat. If you only want the first hit, you can return as soon as you see a hit. We're just doing an integer line-draw and checking each cell as we go. Once you do one level, you take all the grid coordinates and divide by 2, and go to the next coarser level. Repeat for all the levels. Now, to actually test a ray against a cell, we can use the sort on the 3rd axis. We look up the grid[] to find our array of indexes. If there are none or a few, we just test all those triangles. You can test either front-to-back or back-to-front depending on the direction of the ray if you want to do first-hit early returns. If there are many triangles (>=24), we can use the sort. Take your ray extents on the "Z" axis (clipped to that cell), and use the mid-point. Do a binary search on the triangle list to find the triangles closest to your Z. Now you can search out from there in both directions using the accumulated "max Z left" and "min Z right". Basically, you keep looking to the left as long as "max Z left" > your ray's Z min, and you keep looking to the right as long as "min Z right" < your ray's Z max. ------------------------------------------------------------------------------