Visibility Notes
to myself, and tech doc, June 15, 2000
by Charles Bloom (cbloom@cbloom.com)

----------------------------------------------------------

1. The class of "real-time" visibility algorithms.

I'm going to describe what I think is a good implementation of the class of real-time visibility algorithms (beam tree, HOM, Teller's Coherent Vis from '96). I call these algorithms "real-time" because they do not use pre-computed areas of open visibility (the way a portal engine does); rather, they use pre-computed occluders to block out objects.

The first step in all these algorithms is to find the "good occluders". I'll generalize a bit from Teller here; rather than simply using actual large polygons, the good occluders are a set of large convex volumes which are contained wholly by rendering geometry. There are a couple of nice things about this: these occluder volumes should be very simple, made of many fewer polygons than the render geometry. Also, the render geometry can be allowed to be slightly dynamic (eg. wiggle a little) while the occluder volumes stay unchanged. The occluder volumes can also span multiple polygons, eliminating the problems Teller has with cracks between polygons. There should be relatively few of these occluder volumes, and they should be large. For example, on a terrain, a good occluder volume might lie under a hill or mountain, or along a ridge, totally under the terrain surface.

All polygons in the scene are assumed to be grouped into "objects". Each object has a set of polygons to render, and a bounding box around all those polygons (AABB or OBB, it's up to you). All the occluder volumes and objects in the scene are kept in an oct-tree. When an object moves, it updates its position in the oct-tree. I use the "loose oct-trees" of Thatcher Ulrich, so that each object is kept in one and only one oct-tree node (the object's dimension must be at most half that of the node's).
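The loose oct-tree placement rule above (object dimension at most half the node's) makes node selection a direct computation rather than a search. Here's a minimal sketch of that selection, assuming a cubic world centered at the origin and a looseness factor of 2; all the names (ChooseLooseNode, NodeCoord) are illustrative, not from Ulrich's code.

```cpp
#include <cassert>
#include <cmath>

// Pick the unique loose oct-tree node for an object, after Thatcher Ulrich.
// Assumes a cubic world of size worldSize centered at the origin, and a
// looseness factor of 2 (each node's loose bounds are twice its spacing),
// so an object fits entirely in one node when its maximum dimension is at
// most half the node's size.

struct NodeCoord { int depth; int x, y, z; }; // integer cell coords at 'depth'

NodeCoord ChooseLooseNode(float objMaxDim, float cx, float cy, float cz,
                          float worldSize, int maxDepth)
{
    // find the deepest depth whose node size is still >= 2 * objMaxDim :
    int depth = 0;
    float nodeSize = worldSize;
    while ( depth < maxDepth && (nodeSize * 0.5f) >= 2.0f * objMaxDim )
    {
        nodeSize *= 0.5f;
        depth++;
    }
    // the object's *center* alone picks the unique cell at that depth :
    int cells = 1 << depth;
    auto cell = [&](float c) {
        int i = (int)std::floor((c / worldSize + 0.5f) * cells);
        if ( i < 0 ) i = 0;
        if ( i >= cells ) i = cells - 1;
        return i;
    };
    return { depth, cell(cx), cell(cy), cell(cz) };
}
```

Note that only the object's center matters once the depth is chosen, which is exactly why a moving object's re-insertion is cheap.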
Occluder volumes should be attached to objects so that they are updated automatically when the object moves.

Each frame, you take your camera frustum and descend the oct-tree, computing clip flags for each node and doing hierarchical culling of oct-tree nodes. Nodes should be descended front-first, so that you move away from the camera (the permutation of 01234567 required for front-first walking can be computed just once per frame). (The oct-tree nodes are AABB's, which makes them very fast to test against the frustum planes.) The first (visible) leaf you reach will be the leaf that the camera is in. You render all the objects in the leaf, and add all the occluder volumes to the active occluder list. You do the same for the next few leaves, until you reach some reasonable distance from the camera, after which point you start testing the oct-tree node against the active occluders, and then render all the objects in the node only if it's not occluded. This is a hierarchical descent. Let me make it clearer with some pseudo-code; this is only one descent of the oct-tree :

    global Viewer v, ActiveOccluders ao, Frustum f, int childPermutation[8];

    Render( OctTreeNode n, FrustumClipFlags fcf )
    {
        // clip against frustum, and update clip flags;
        // if fcf == 0, we're entirely in the frustum and no more clipping is necessary
        if ( ClipToFrustum(f,&fcf) )
            return;

        // test the node n against the active occluders
        // (we could skip this for the nodes near the viewer, but who cares?)
        if ( IsOccluded(n,ao,v) )
            return;

        // now render all the objects in the node:
        if ( n->HasObjects )
            RenderObjectsIn(n);

        // go to the children in front-to-back order:
        if ( n->HasChildren )
            for( i = 0 to 7 )
                Render( n->children[ childPermutation[i] ], fcf );

        // now add my occluders to the active list
        if ( n is near v )
            AddOccluders(ao,n,v);
    }

In this structure the only reasonable occlusion algorithm seems to be the shadow-volume algorithm.
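The once-per-frame childPermutation used in the descent can be sketched like this. An oct-tree node has eight children; I'm assuming child index bits 0/1/2 select the +x/+y/+z half of the node (that convention, and the function name, are my own here, not from the text). The child nearest the viewer has each bit set exactly when the view direction points along that negative axis, and XOR-ing that index with 0..7 gives a valid front-to-back visiting order for every node - the standard octree trick.

```cpp
#include <cassert>

// Compute a front-to-back child visiting order from the view direction.
// Child index bit 0/1/2 = the +x/+y/+z half of the node (assumed convention).
void ComputeChildPermutation(float viewDirX, float viewDirY, float viewDirZ,
                             int childPermutation[8])
{
    int nearest = 0;
    if ( viewDirX < 0.f ) nearest |= 1; // looking towards -x, so +x children are nearer
    if ( viewDirY < 0.f ) nearest |= 2;
    if ( viewDirZ < 0.f ) nearest |= 4;
    // visiting nearest^0, nearest^1, ... nearest^7 never visits a child
    // before a sibling that is in front of it along any axis :
    for(int i = 0; i < 8; i++)
        childPermutation[i] = nearest ^ i;
}
```

This ordering isn't a strict distance sort, but it doesn't need to be; it only has to guarantee that occluders are encountered before the nodes they occlude.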
In the shadow-volume algorithm, the AddOccluders function takes the polygonal silhouette of the occluder volume (precomputed) and runs planes from the edges through the view point to create an occlusion frustum. That frustum is then added to the active occluders list. The nodes of the oct-tree can then be rapidly checked against that frustum in the future. The number of frustum-plane tests is limited by only adding large occluders (in screen-space) that are near the camera; this amortizes the worst-case cost of the algorithm.

Teller's algorithm is sort of the inverse of this. He precomputes the planes which touch the silhouette of the potential occludee node n and the occluder o. The result is a frustum for each (o,n) pair. If the view point lies inside this frustum (and in front of the occluder o), then n is occluded. This lets you do some cheaper computations: you only need to test for v moving across one of these frustum planes to change your current visible set. The problem with this is that you have a large number of active frusta floating around that you must test. Of course, you should compute them on the fly and cache them out in a currently-active frustum list so that they aren't stored when not in use. Teller's method is inferior to the occluder-frustum method, because it relies so much on precomputation and coherence. Both ways are O(n*m), where n is the number of nodes visited and m is the number of active occluders.

Teller has a cool and intuitive metric for determining a "good occluder" which should be added to the list of active occluders. You simply use the value :

    (Occluder Area) * ((Occluder Normal) dot (View Normal)) / (View-to-Occluder Distance Squared)

This is obviously a rough estimate of the screen-space area of the occluder.

----------------------------------------------------------

2. The "rough BSP" algorithm, for portal rendering.

The goal here is to construct a set of zones and portals.
The portals are planar polygon outlines that connect pairs of zones. Rendering proceeds by finding the zone the camera is in (using a BSP or oct-tree), and rendering all objects in that zone. All of the portals in the zone are projected into camera space, and the 2d extent rect of each portal is found. If that rect intersects with the view rect, the view rect is reduced to the intersection and rendering recurses into the zone that portal goes to. Note that reducing the view rect is essentially free, and all clipping can still be done by hardware or D3D or whatever (eg. no arbitrary frustum is needed), and the amount of over-draw is greatly reduced. As before, an 'object' indicates a set of polygons with a bounding box. Also note that the zones do not need to be convex volumes, but testing to find the zone that the viewer is in is much easier if the zone is defined by a set of planes.

The best way to actually implement this is, instead of having the zones point at a list of objects, to have the objects find out what zones they're in. The objects could be kept in an oct-tree as described in section 1. First, a pass would be done as above to determine zone visibility; the 2d view rect for each zone would be stored in that zone (and it would be marked visible for the current frame); if a zone was visited more than once, the view rects would be union'ed together. Then, the oct-tree would be descended with frustum clipping as described in part 1, and each object (or each oct-tree node, it's up to you) would determine which zones it was in. This which-zone test could be greatly accelerated by A. caching it out on all non-moved objects, B. storing a list of zones in the oct-tree leaves. The object would then union all the (visible) view rects of the zones it's in and use them to render itself.

Well, this is a very nice algorithm; the problem is just finding the zones and portals.
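The view-rect reduction step above is just a 2d rect intersection: either the portal's projected extent misses the current view rect and the recursion stops, or the shrunken rect is handed into the next zone. A minimal sketch, with illustrative names:

```cpp
#include <cassert>

// 2d screen-space extent rect, as used by the portal recursion above.
struct Rect { float x0, y0, x1, y1; };

// Intersect the current view rect 'a' with a portal's extent rect 'b'.
// Returns false if the portal is not visible through this view rect;
// otherwise writes the reduced view rect to *out.
bool IntersectRect(const Rect & a, const Rect & b, Rect * out)
{
    Rect r;
    r.x0 = (a.x0 > b.x0) ? a.x0 : b.x0;
    r.y0 = (a.y0 > b.y0) ? a.y0 : b.y0;
    r.x1 = (a.x1 < b.x1) ? a.x1 : b.x1;
    r.y1 = (a.y1 < b.y1) ? a.y1 : b.y1;
    if ( r.x0 >= r.x1 || r.y0 >= r.y1 )
        return false; // empty intersection : don't recurse into that zone
    *out = r;
    return true;
}
```

The union of view rects per zone (for zones visited through more than one portal) is the same idea with min/max swapped.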
First of all, creating the zones and portals is a big pre-computation step, so no dynamic geometry is included, and only static geometry will contribute to occlusion. Similarly, small objects are just going to muck things up, so we'll exclude them from our visibility model as a pre-pass.

The basic idea is that we want to build a very-low-polygon rough BSP for our scene. Once this is done, the BSP planes define a set of convex regions (the leaves of the BSP); these become our zones. We can gather the set of planes which define a leaf and add the leaves to the oct-tree leaves that contain them. The planes around the leaves also define the portals between the zones. If we can get this BSP, we've got all we need.

The simplest method to understand is the greedy-merge algorithm. In this algorithm, we take all our "good potential occluder" geometry and add it to the BSP. The result is a massive number of zones and portals. We then start merging the smallest pair of zones connected by a portal (the result of the merge is a non-convex zone which exits to all the portals of the two original zones, discounting the portal connecting them). This greedy merge continues until all zones are larger than some minimum size, or the total number of zones and portals is under some maximum count.

Some notes :

1. As first pointed out by Seth Teller, you should be careful about the order in which you choose polygons to add to the BSP. In particular, you want to choose polygons that have large areas, are near the middle (geometrically) of the region they lie in, and that tend to balance the tree and split very few other polygons.

2. Rather than just merging the smallest pairs of leaves, you can use some better heuristics. Probably the best way is to make all the leaves, and then compute the full leaf-leaf PVS table. You then merge leaves which see the same visible set (or the most similar visible set) of other leaves.
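The greedy merge itself can be sketched as a priority queue of portal-connected zone pairs keyed by combined volume. This is only the skeleton under assumed simplifications: a union-find stands in for actually rebuilding the merged zone's non-convex geometry and portal list, and all names here are illustrative.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Skeleton of the greedy zone merge: repeatedly merge the portal-connected
// pair with the smallest combined volume until only maxZones zones remain.
struct Merger
{
    std::vector<float> volume;  // volume per zone
    std::vector<int>   parent;  // union-find : which zone each zone merged into

    int Find(int z) { return parent[z] == z ? z : parent[z] = Find(parent[z]); }

    // portals : pairs of zone indices connected by a portal.
    // Returns the final zone count.
    int GreedyMerge(const std::vector<std::pair<int,int>> & portals, int maxZones)
    {
        int count = (int)volume.size();
        using Item = std::pair<float, std::pair<int,int>>; // (combined volume, pair)
        std::priority_queue<Item, std::vector<Item>, std::greater<Item>> q;
        for ( const auto & p : portals )
            q.push({ volume[p.first] + volume[p.second], p });
        while ( count > maxZones && !q.empty() )
        {
            auto [vol, p] = q.top(); q.pop();
            int a = Find(p.first), b = Find(p.second);
            if ( a == b ) continue;                 // pair already merged away
            if ( volume[a] + volume[b] != vol ) {   // stale key : re-queue with current volumes
                q.push({ volume[a] + volume[b], { a, b } });
                continue;
            }
            parent[b] = a;          // merge b into a ...
            volume[a] += volume[b]; // ... the real code would also union portal lists
            count--;
        }
        return count;
    }
};
```

Swapping the priority key is exactly where the better heuristics from note 2 (PVS similarity) would plug in.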
A faster heuristic is due to Angel Popov : take two neighbor zones as candidates for a merge. Find the convex hull of the two zones. If that hull contains any "good potential occluder" polygons (eg. large ones), then don't merge them. This heuristic is quite good at merging all zones and leaving zone boundaries in doorways and around corners. As described, this algorithm will prevent the merging of little alcoves onto a large hallway; the way to allow these merges is to take the convex hull formed by the two potential merges and push all its planes in by a small amount; this allows you to merge little dips onto a larger zone, even when you have to turn a corner around a good occluder to grab those dips.

This algorithm is fine (though it suffers the usual imperfection of greedy optimization) except for the fact that building a BSP out of all that geometry is A. very slow, B. splits lots of polygons, C. prone to severe numerical accuracy problems due to very-small and nearly-coplanar polygons. These problems are bad enough that we must discard this algorithm and try to find something which is more robust (note that I'm trying to build scenes with one million polygons plus, and the fact that BSP building is O(n^2) is unacceptable).

So, we can use this algorithm if we just modify it a little bit. My favorite method is the 'minimal largest interior' method. Instead of adding one of the "good potential occluder" objects to the scene, we find a low-poly representation of that object which is as large as possible while being totally enclosed by that object. We then add this minimal version to the BSP. As before, we use the leaves of the BSP to construct our zones and portals, but in this case we already have a very small set of zones and portals. Now we're golden, except for the problem of finding these low-poly representations of our geometry (they need not be convex).
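The core test in Popov's heuristic above, "does the convex hull of the two zones contain a good-occluder polygon", is trivial once the hull is expressed as bounding planes: a polygon is inside iff every vert is behind every plane. The epsilon push-in from the text becomes the 'inset' parameter. A sketch, assuming outward-facing plane normals with inside meaning dot(n,p) + d <= 0; the names are illustrative.

```cpp
#include <cassert>
#include <vector>

struct Vec3  { float x, y, z; };
struct Plane { Vec3 n; float d; };  // outward normal; inside iff dot(n,p) + d <= 0

// Is the polygon (given as its verts) entirely inside the convex hull?
// 'inset' pushes every hull plane inward, so occluders hugging the hull
// boundary (little alcoves, dips) no longer block a merge.
bool HullContainsPolygon(const std::vector<Plane> & hull,
                         const std::vector<Vec3> & verts, float inset)
{
    for ( const Plane & pl : hull )
        for ( const Vec3 & v : verts )
            if ( pl.n.x*v.x + pl.n.y*v.y + pl.n.z*v.z + pl.d + inset > 0.f )
                return false; // some vert pokes outside this (pushed-in) plane
    return true;
}
```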
Finding these low-poly representations isn't bad at all; there are lots of ways. I'll just talk about one of them; it's probably not optimal in the sense of largest volume for fewest polygons, but who cares? It's easiest to think about in 2d, so we'll start there. If you're lucky enough to be working on a "2.5d" game, like "Dungeon Siege" or one of the Diablo/Baldur's Gate type games, then you can build your whole BSP in 2d, and things just work better.

Anyway, in 2d we have some arbitrary closed polygonal curve, and we want to find a low-poly curve that is inside it. You simply start building a new mesh : pick a random vert to start with, then walk N verts and consider keeping the edge between vert 0 (in the walk) and vert N. If any of the verts 1 to (N-1) is on the inside of your edge, that edge would poke outside the model, so you must keep walking. Otherwise, look at the area of the space between your edge and the original polygonal line from verts 0 to N. Choose the N that minimizes this error area (divided by N). Now start the walk again at the vert that edge went to. Once this is all done, do a little optimization by removing any vertex which is locally concave (eg. outside of the edge connecting its two neighbors) and whose removal doesn't change the area by more than a pre-defined tolerance. The result is an outline, consisting only of verts from the original model, whose edges lie totally inside the original model, and which uses as few edges as you like. This is a rather fast operation.

If you are doing a 2.5d game, then you want to create a 2d silhouette by taking all the farthest-out-lying verts and connecting them to make this silhouette polygon. If you are doing a true 3d game then you'll have to do this algorithm in full 3d. In 3d it works by finding the largest triangle you can which is on the inside of the mesh and connects three points on the original model. It's very similar, but more computationally complex because of the difficult 3d connectivity of triangles.
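One step of the 2d walk above can be sketched as follows. I'm assuming a closed counter-clockwise polygon, so its interior is to the left of each directed edge; a chord from walk vert 0 to walk vert N then stays inside (locally) only if every skipped vert lies on or to the right of the chord. Among valid N we pick the one minimizing cut-area / N, as in the text. Global self-intersection of the chord against far parts of the curve is ignored here, and the names are illustrative.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

struct Vec2 { float x, y; };

// > 0 when p is left of the directed line a->b (the interior side, for CCW).
static float Cross(const Vec2 & a, const Vec2 & b, const Vec2 & p)
{
    return (b.x - a.x) * (p.y - a.y) - (b.y - a.y) * (p.x - a.x);
}

// From 'start', find the best step N (2..maxSkip) whose chord start->start+N
// keeps all skipped verts on its outer side, minimizing cutArea/N.
// Writes the cut-off area to *area; returns 1 (just the next edge) if no
// longer chord is valid.
int BestChord(const std::vector<Vec2> & poly, int start, int maxSkip, float * area)
{
    const int numVerts = (int)poly.size();
    int bestN = 1; float bestScore = 1e30f; *area = 0.f;
    for(int N = 2; N <= maxSkip; N++)
    {
        const Vec2 & a = poly[start];
        const Vec2 & b = poly[(start + N) % numVerts];
        bool valid = true;
        for(int i = 1; i < N && valid; i++)
            if ( Cross(a, b, poly[(start + i) % numVerts]) > 0.f )
                valid = false; // skipped vert on the interior side : chord pokes outside
        if ( ! valid ) continue; // keep walking to larger N
        // shoelace area of the region between the chord and the original path :
        float cut = 0.f;
        for(int i = 0; i < N; i++)
        {
            const Vec2 & p = poly[(start + i) % numVerts];
            const Vec2 & q = poly[(start + i + 1) % numVerts];
            cut += p.x * q.y - q.x * p.y;
        }
        cut += b.x * a.y - a.x * b.y; // close the loop back along the chord
        cut = 0.5f * std::fabs(cut);
        float score = cut / (float)N;
        if ( score < bestScore ) { bestScore = score; bestN = N; *area = cut; }
    }
    return bestN;
}
```

The outer loop of the algorithm just calls this repeatedly, restarting each walk at the vert the chosen chord went to, until it wraps back around to the start.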
You may want to find another technique, like voxelizing and then only keeping large interior voxels (this could be done with an oct-tree), or building a BSP, or something. I'd love some tips here. ----------------------------------------------------------