Visibility
Notes to myself, and tech doc, June 15, 2000
by Charles Bloom (cbloom@cbloom.com)
----------------------------------------------------------
1. The class of "real-time" visibility algorithms.
I'm going to describe what I think is a good implementation of the
class of real-time visibility algorithms (beam tree, HOM, Teller's
Coherent Vis from 96). I call these algorithms "real-time" because
they do not use pre-computed areas of open-visibility (the way a
portal engine does); rather, they use pre-computed occluders to block
out objects.
The first step in all these algorithms is to find the "good occluders".
I'll generalize a bit from Teller here; rather than simply using actual
large polygons, the good occluders are a set of large convex volumes
which are contained wholly by rendering geometry. There are a couple of
nice things about this; these occluder volumes should be very simple, made of
many fewer polygons than the render-geometry. Also, the render geometry
can be allowed to be slightly dynamic (eg. wiggle a little) and the
occluder volumes can be unchanged. The occluder volumes can also span
multiple polygons, eliminating the problems with cracks between polygons
that Teller has. There should be relatively few of these occluder
volumes, and they should be large. For example, on a terrain, a good
occluder volume might lie under a hill or mountain, or along a ridge,
totally under the terrain surface.
All polygons in the scene are assumed to be grouped into "objects". Each
object has a set of polygons to render, and a bounding box around all
those polygons (AABB or OBB, it's up to you).
All the occluder volumes and objects in the scene are kept in an
oct-tree. When an object moves, it updates its position in the
oct-tree. I use the "loose oct-trees" of Thatcher Ulrich, so that each
object is kept in one and only one oct-tree node (the object's dimension
must be at most half that of the node's). Occluder volumes should be attached to
objects so that they are updated automatically when the object moves.
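That fitting rule picks the deepest node whose size is still at least twice the object's. A little sketch in Python (the function name, world size, and depth cap are mine):

```python
import math

def loose_octree_depth(world_size, obj_size, max_depth=8):
    """Deepest loose-oct-tree level whose (regular) node size is still at
    least twice the object's size, per the half-the-node rule above.
    Sketch with my own names; real code would also pick the node from the
    object's center."""
    if obj_size <= 0:
        return max_depth
    # node size at depth d is world_size / 2**d; we need node >= 2*obj_size
    d = int(math.floor(math.log2(world_size / (2.0 * obj_size))))
    return max(0, min(d, max_depth))
```

For example, a 10-unit object in a 1024-unit world lands at depth 5 (node size 32, which is the smallest node still twice the object's size).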
Each frame, you take your camera frustum, and descend the oct-tree,
computing clip flags for each node and doing hierarchical culling of
oct-tree nodes. Nodes should be descended front-first, so that you move
away from the camera (the permutation of the eight children required for
front-first walking can be computed just once per frame). (The oct-tree
nodes are AABB's, which makes them very fast to test against the frustum planes.)
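The per-frame child permutation might be computed like this (Python sketch; using only the view direction is a common approximation that ignores the camera's position relative to each node, and the octant bit convention is mine):

```python
def octree_child_order(view_dir):
    """Approximate front-to-back visit order for the 8 octants, computed
    once per frame from the view direction.  Child index bit k set means
    the child lies on the +axis side for axis k (my convention)."""
    # The octant nearest the viewer comes first; flipping one bit steps to
    # a next-nearest octant, so XOR-ing the canonical order 0..7 with the
    # 'near' octant mask yields a front-to-back permutation.
    near = 0
    for axis, d in enumerate(view_dir):
        if d < 0:            # looking toward -axis: +side children are nearer
            near |= 1 << axis
    return [i ^ near for i in range(8)]
```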
The first (visible) leaf you reach will be the leaf that the camera is
in. You render all the objects in the leaf, and add all the occluder
volumes to the active occluder list. You do the same for the next few
leaves, until you reach some reasonable distance from the camera, after
which point you start testing the oct-tree node against the active
occluders, and then render all the objects in the node only if it's not
occluded. This is a hierarchical descent. Let me make it clearer with
some pseudo-code; this is only one descent of the oct-tree :
global Viewer v, ActiveOccluders ao, Frustum f, int childPermutation[8];
Render( OctTreeNode n, FrustumClipFlags fcf )
{
	// clip the node against the frustum and update the clip flags;
	// if fcf == 0, we're entirely in the frustum and no more clipping is necessary
	if ( ClipToFrustum(n,f,&fcf) )
		return;
	// test the node n against the active occluders
	// (we could skip this for the nodes near the viewer, but who cares?)
	if ( IsOccluded(n,ao,v) )
		return;
	// now render all the objects in the node:
	if ( n->HasObjects )
		RenderObjectsIn(n);
	// go to the children in front-to-back order:
	if ( n->HasChildren )
		for( i = 0 to 7 )
			Render( n->children[ childPermutation[i] ], fcf );
	// now add my occluders to the active list:
	if ( n is near v )
		AddOccluders(ao,n,v);
}
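ClipToFrustum might look like the following (Python sketch; the plane representation and clip-flag convention are mine):

```python
def clip_to_frustum(box_min, box_max, planes, flags):
    """Sketch of the ClipToFrustum step above: test an AABB against the
    frustum planes, skipping planes the parent node was already fully
    inside of.  planes: list of (nx, ny, nz, d), with n.p + d >= 0 meaning
    'inside'.  flags: bitmask of still-active planes.
    Returns (culled, new_flags)."""
    for i, (nx, ny, nz, d) in enumerate(planes):
        if not (flags & (1 << i)):
            continue  # parent was fully inside this plane; children are too
        # p-vertex: the box corner farthest along the plane normal
        px = box_max[0] if nx >= 0 else box_min[0]
        py = box_max[1] if ny >= 0 else box_min[1]
        pz = box_max[2] if nz >= 0 else box_min[2]
        if nx * px + ny * py + nz * pz + d < 0:
            return True, flags          # fully outside this plane: culled
        # n-vertex: the nearest corner; if it is inside too, the whole box is
        qx = box_min[0] if nx >= 0 else box_max[0]
        qy = box_min[1] if ny >= 0 else box_max[1]
        qz = box_min[2] if nz >= 0 else box_max[2]
        if nx * qx + ny * qy + nz * qz + d >= 0:
            flags &= ~(1 << i)          # fully inside: deactivate this plane
    return False, flags
```

When the returned flags hit zero, the node and all its children are entirely inside the frustum and the test can be skipped for the whole subtree.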
In this structure the only reasonable occlusion algorithm seems to be
the shadow-volume algorithm. In this algorithm, the AddOccluders
function takes the polygonal silhouette of the occluder volume
(precomputed) and runs planes from its edges through the view point to
create an occlusion frustum. That frustum is then added to the active
occluders list. The nodes of the oct-tree can then be rapidly checked
against it in the future. The number of frustum-plane tests is
limited by only adding large occluders (in screen-space) that are near
the camera; this amortizes the worst-case cost of the algorithm.
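A sketch of building and testing such an occlusion frustum in Python; the winding and plane-sign conventions are mine (flip the cross product if your silhouette winds the other way):

```python
def cross(u, v):
    return (u[1]*v[2]-u[2]*v[1], u[2]*v[0]-u[0]*v[2], u[0]*v[1]-u[1]*v[0])

def dot(u, v):
    return u[0]*v[0] + u[1]*v[1] + u[2]*v[2]

def sub(u, v):
    return (u[0]-v[0], u[1]-v[1], u[2]-v[2])

def occlusion_frustum(view, silhouette, occluder_plane):
    """Sketch of AddOccluders: each silhouette edge plus the view point
    spans a plane; together with the occluder's supporting plane they
    bound the shadow volume.  The silhouette is assumed wound so the edge
    planes' positive sides face the shadow; occluder_plane is (normal, d)
    facing the viewer.  Planes are (normal, d), with dot(n, p) + d >= 0
    meaning the occluded side."""
    planes = []
    for i in range(len(silhouette)):
        a = silhouette[i]
        b = silhouette[(i + 1) % len(silhouette)]
        n = cross(sub(a, view), sub(b, view))   # plane through view, a, b
        planes.append((n, -dot(n, view)))
    # flip the occluder's plane so its positive side is the far (shadow) side
    n, d = occluder_plane
    planes.append(((-n[0], -n[1], -n[2]), -d))
    return planes

def point_occluded(p, planes):
    """A point is hidden iff it's on the inner side of every shadow plane."""
    return all(dot(n, p) + d >= 0 for n, d in planes)
```

IsOccluded for an oct-tree node would conservatively require all eight box corners to pass point_occluded for some single occlusion frustum.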
Teller's algorithm is sort of the inverse of this. He precomputes the
planes which touch the silhouette of the potential occludee node n, and
the occluder o. The result is a frustum for each (o,n) pair. If the view
point lies inside this frustum (and in front of the occluder o), then n
is occluded. This lets you do some cheaper computations: you only need
to test for v moving across one of these frustum planes to change your
current visible set. The problem with this is that you have a large
number of active frusta floating around that you must test. Of course,
you should compute them on the fly and cache them out in a currently
active frustum list so that they aren't stored when not in use.
Teller's method is inferior to the occluder-frustum method, because it
relies so much on precomputation and coherence.
Both ways are O(n*m), where n is the number of nodes visited and m is
the number of active occluders.
Teller has a cool and intuitive metric for determining a "good occluder"
which should be added to the list of active occluders. You simply use
the value :
(Occluder Area) * ((Occluder Normal) dot (View Normal)) /
(View-to-Occluder Distance)^2
This is obviously a rough estimate of the screen-space area of the
occluder.
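In code, the metric is just (Python sketch, assuming unit-length normals):

```python
def occluder_quality(area, occ_normal, view_normal, dist):
    """Teller's metric as given above: a rough estimate of the
    screen-space area of the occluder.  Normals are assumed unit length;
    dist is the viewer-to-occluder distance."""
    d = sum(a * b for a, b in zip(occ_normal, view_normal))
    return area * d / (dist * dist)
```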
----------------------------------------------------------
2. The "rough BSP" algorithm, for portal rendering.
The goal here is to construct a set of zones and portals. The portals
are planar polygon outlines that connect pairs of zones. Rendering
proceeds by finding the zone the camera is in (using a BSP or Oct-Tree),
and rendering all objects in that zone. All of the portals in the zone
are projected into camera space, and the 2d extent rect of that portal
is found. If that rect intersects with the view rect, the view rect is
reduced to the intersection and rendered into the zone that portal goes
to.
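Reducing the view rect is just a 2d rect intersection; a sketch (rects as (x0, y0, x1, y1), my convention):

```python
def intersect_rect(a, b):
    """Clip view rect a against portal rect b; rects are (x0, y0, x1, y1).
    Returns None when they don't overlap, i.e. the zone behind the portal
    is not visible through this portal."""
    x0, y0 = max(a[0], b[0]), max(a[1], b[1])
    x1, y1 = min(a[2], b[2]), min(a[3], b[3])
    if x0 >= x1 or y0 >= y1:
        return None
    return (x0, y0, x1, y1)
```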
Note that reducing the view rect is essentially free, and all clipping
can still be done by hardware or D3D or whatever (eg. no arbitrary
frustum is needed), and the amount of over-draw is greatly reduced. As
before, an 'object' indicates a set of polygons with a bounding box.
Also note that the zones do not need to be convex volumes, but testing
to find the zone that the view is in is much easier if the zone is
defined by a set of planes.
The best way to actually implement this is, instead of having the zones
point at a list of objects, to have the objects find out what zones
they're in. The objects could be kept in an oct-tree as described in section 1.
First, a pass would be done as above to determine zone visibility; the
2d view rect for each zone would be stored in that zone (and it would be
marked visible for the current frame); if a zone was visited more than
once, the view rects would be unioned together. Then, the oct-tree
would be descended with frustum clipping as described in part 1, and
each object (or each oct-tree node, it's up to you) would determine
which zones it was in. This which-zone test could be greatly accelerated
by A. caching it out on all non-moved objects, B. storing a list of
zones in the oct-tree leaves. The object would then union all the
(visible) view rects of the zones it's in and use them to render itself.
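Unioning the view rects is equally trivial; a sketch (rects as (x0, y0, x1, y1), my convention, with None standing for "no rect yet"):

```python
def union_rect(a, b):
    """Union of two screen rects (x0, y0, x1, y1): used when a zone is
    reached through more than one portal, or when an object must gather
    the rects of all the visible zones it's in."""
    if a is None:
        return b
    if b is None:
        return a
    return (min(a[0], b[0]), min(a[1], b[1]),
            max(a[2], b[2]), max(a[3], b[3]))
```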
Well, this is a very nice algorithm; the problem is just finding the
zones and portals. First of all, creating the zones and portals is a big
pre-computation step, so no dynamic geometry is included and only static
geometry will contribute to occlusion. Similarly, small objects are just
going to muck things up, so we'll just exclude them from our visibility
model as a pre-pass.
The basic idea is that we want to build a very low-polygon rough BSP for
our scene. Once this is done, the BSP planes define a set of convex
regions (the leaves of the BSP); these become our zones. We can gather
the set of planes which define a leaf, and add the leaves to the oct-tree
leaves that contain them. The planes around the leaves also define the
portals between the zones. If we can get this BSP, we've got all we
need.
The simplest method to understand is the greedy-merge algorithm. In this
algorithm, we take all our "good potential occluder" geometry and add it
to the BSP. The result is then a massive number of zones and portals. We
start merging the smallest pair of zones connected by a portal (the
result of the merge is a non-convex zone which exits to all the portals
of the two original zones, discounting the portal connecting them). This
greedy merge continues until all zones are larger than some minimum
size, or the total number of zones and portals is under some maximum
count.
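A sketch of the greedy merge in Python; reducing zones to ids with volumes and a union-find is my simplification (real zones would also carry geometry and portal polygons):

```python
import heapq

def greedy_merge(zone_volumes, portals, min_volume):
    """Greedy merge as described above: repeatedly merge the smallest pair
    of zones joined by a portal until every zone exceeds min_volume.
    Zones are just ids with volumes here; portals are (zone_a, zone_b)
    pairs.  Returns, for each zone, the id of the merged zone it ended in."""
    parent = list(range(len(zone_volumes)))   # union-find over zone ids
    volume = list(zone_volumes)

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]     # path halving
            i = parent[i]
        return i

    heap = [(volume[a] + volume[b], a, b) for a, b in portals]
    heapq.heapify(heap)
    while heap:
        v, a, b = heapq.heappop(heap)
        ra, rb = find(a), find(b)
        if ra == rb:
            continue                          # portal now internal: discard
        if volume[ra] >= min_volume and volume[rb] >= min_volume:
            continue                          # both zones big enough
        parent[rb] = ra                       # merge b's zone into a's
        volume[ra] += volume[rb]
        # re-queue the merged zone's remaining portals with updated volumes
        for pa, pb in portals:
            rpa, rpb = find(pa), find(pb)
            if (rpa == ra or rpb == ra) and rpa != rpb:
                heapq.heappush(heap, (volume[rpa] + volume[rpb], pa, pb))
    return [find(i) for i in range(len(zone_volumes))]
```

Note how the portal connecting a merged pair disappears on its own: once both endpoints find the same root, the entry is simply discarded.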
Some notes :
1. As first pointed out by Seth Teller, you should be careful about the
order in which you choose polygons to add to the BSP. In particular, you
want to choose polygons that have large areas, are near the middle
(geometrically) of the region they lie in, and that tend to balance the
tree and split very few other polygons.
2. Rather than just merging the smallest pairs of leaves, you can use
some better heuristics. Probably the best way is to make all the leaves,
and then compute the full leaf-leaf PVS table. You then merge leaves
which look at the same visible set (or the most similar visible set) of
other leaves. A faster heuristic is due to Angel Popov : take two
neighbor zones as candidates for a merge. Find the convex hull of the
two zones. If that hull contains any "good potential occluder" polygons
(eg. large ones) then don't merge them. This heuristic is quite good at
merging all zones and leaving zone boundaries in doorways and around
corners. As described, this algorithm will prevent the merging of little
alcoves onto a large hallway; the way to allow these merges is by taking
the convex hull formed by two potential merges and pushing all those
planes in by a small amount; this allows you to merge little dips onto a
larger zone, even when you have to turn a corner around a good occluder
to grab those dips.
This algorithm is fine (though it suffers the usual imperfection
of greedy optimization) except for the fact that building a BSP out of all
that geometry is A. very slow, B. splits lots of polygons, C. prone to
severe numerical accuracy problems due to very-small and nearly-coplanar
polygons. These problems are bad enough that we must discard this
algorithm and try to find something which is more robust (note that I'm
trying to build scenes with one million polygons plus, and the fact that
BSP building is O(n^2) is unacceptable).
So, we can use this algorithm if we just modify it a little bit.
My favorite method is the 'minimal largest interior' method. Instead of
adding one of the "good potential occluder" objects to the BSP directly,
we find a low-poly representation of that object which is as large as
possible while being totally enclosed by that object. We then add this
minimal version to the BSP. As before, we use the leaves of the BSP to
construct our zones and portals, but in this case we already have a very
small set of zones and portals.
Now we're golden except for the problem of finding these low-poly
representations of our geometry (they need not be convex). This isn't
bad at all; there are lots of ways, and I'll just talk about one of
them. It's probably not optimal in the sense of largest volume for
fewest polygons, but who cares? It's easiest to think about in 2d, so
we'll start there. If you're lucky enough to be working on a "2.5-d"
game, like "Dungeon Siege" or one of the Diablo/Baldur's Gate type
games, then you can build your whole BSP in 2d, and things just work
better.
Anyway, in 2d we have some arbitrary closed polygonal curve, and we want
to find a low-poly curve that is inside it. You simply start building a
new mesh : pick a random vert to start with, then walk N verts and keep
the edge between vert 0 (in the walk) and vert N. If any of the verts 1
- (N-1) is on the inside of your edge, then you must keep walking.
Otherwise, look at the area of the space between your edge and the
original polygonal line from verts 0 to N. Choose the N that minimizes
this error area (divided by N). Now start the walk again at the vert
that edge went to. Once this is all done, do a little optimization pass
by removing any vertex which is locally concave (eg. outside of the edge
connecting its two neighbors) and whose removal doesn't change the area
by more than a pre-defined tolerance. The result is an outline,
consisting only of verts from the original model, whose edges lie
totally inside the original model, and is of as few polygons as you
like. This is a rather fast operation.
If you are doing a 2.5-d game, then you want to create a 2d silhouette
by taking all the farthest-out-lying verts and connecting them to make
this silhouette polygon. If you are doing a true 3d game then you'll
have to do this algorithm in full 3d. In 3d it works by finding the
largest triangle you can which is on the inside of the mesh and connects
three points on the original model. It's very similar, but more
computationally complex because of the difficult 3d connectivity of
triangles. You may want to find another technique, like voxelizing and
then only keeping large interior voxels (this could be done with an
oct-tree), or building a BSP, or something. I'd love some tips here.
----------------------------------------------------------