View Culling
October 31, 2000 (updated Nov 2)
by Charles Bloom (cbloom@cbloom.com)
- Introduction --------------------------------------------------------
A description of an efficient view-culling pipeline, with analysis.
I'll attempt to address the needs of both novices and experts in
this articles.
What I'll describe here is how to quickly and approximately reject
objects which are not visible. Once you find an object to be
visible, you can skin the mesh, and render it with D3D/HT&L or with
a pipeline like I describe in other articles. To drive the culling
you need a spatial heirarchy, such as a BSP, oct-tree or scene-graph.
I'm not going to describe that part here. I've got a little description
of one in my "what your engine should be/dogma" article. You can also
get some good info here :
http://www.tulrich.com/geekstuff/partitioning.html
from Thatcher Ulrich.
So, the algorithm structure looks like :
Spatial Heirarchy ->
View Culling ->
Object Renderer
and I'm doing the middle part.
Notez : the heirarchy is very important to culling. View culling an
object returns one of three states (a trinary return) : 1. the object
was totally rejected, 2. the object was totally included, 3. or the object
was partially included. You must render the object if it returns 2 or 3.
If the return value is 1 or 2, then you need not cull any of the children
in the spatial heirachy - all kids are either included or excluded. Only
on a return of 3 must you further test the children.
There are good papers on this stuff at :
http://www.ce.chalmers.se/staff/uffe/
For lots of stuff-stuff intersection testing, see
http://www.realtimerendering.com/int/
In particular, I won't go into plane-sphere or plane-box testing since
that's well documented elsewhere.
- What's the view frustum ----------------------------------------------
(for novices). The "view frustum" is the pyramidal geometry which
represents all the visible volume through your view screen. Take your
view position in world space, that is the location of the camera, and
take the (imaginary) view screen in world space. Make a triangle from
the view position to each side of the view screen. Now extend these
triangles out to infinity. This pyramidal shape is the view frustum
without near or far clipping. To add that, cut off the tip of the
pyramid at the near distance, and then cut it again (perpendicular to
the main axis of the pyramid) at the "far" distance from the tip. This
gives you a pyramid which looks like a trapezoid from the side.
To find the set of potentially visible objects, you could take this
trimmed-pyramid and do a perfectly polygon-accuracte intersection test
against the objects in your level. Any object that intersects needs to
be rendered. Obviously this is a bit too expensive for culling, so we'll
simplify.
The view frustum is often referred to as six planes. These are the six
side of this capped pyramid. The planes are named "front" and "back" (or
"near" and "far"), left,right, top and bottom. They are oriented, so
that a point is visible if and only if it is on the "inside" of all six
planes.
- Bounding spheres are great -------------------------------------------
I love bounding spheres; here's why. (novices: you want some kind of
rough bounding volume around your objects to use for rough culling. This
volume should be as small as possible while still totally enclosing your
object. The main choices are : a sphere, an axis-aligned bounding box,
and "oriented" bounding box, and a polygonal convex hull).
I can store a bounding sphere with just one float - the radius of the
sphere. To do that, I'm assuming that each of your models has a
ModelToWorld transform (MTW). If you set the origin of the model at the
center of its bounding sphere (not a bad thing to do anyway), then the
translation of the MTW gives you the position of the bounding sphere in
world space (hence, it's free). Now, I'm assuming that the MTW doesn't
have scaling in it. Many novices in 3d engine design think that putting
support in for scaling and shearing and whatnot in your MTW is a great
thing. It is cool, and it gives you the freedom to duplicate models and
size them a bit. It ends up biting you, though, if you ever want to use
quaternions in your rotation computations, or if you want to use D3D for
lighting (d3d simply cannot handle transforms with scaling in them (*1
see appendix)).
Thus, my world-space bounding sphere is just made from the radius which
I store in each model, and the translation of the MTW. Whenever I rotate
a model, it's world-space bounding sphere doesn't change at all! Thus I
only need to update its location in the world's spatial heirarchy when
the model is translated.
So, from here on out I'm going to describe culling using bounding
spheres as the basic primitive.
Since I want to use bounding spheres for my primitives, I need to use
bounding spheres for my spatial heirarchy, so that I can use the same
view-frustum cull for leaf and parent nodes in the heirarchy. Thus, the
most natural spatial heirarchy are a sphere-based scene graph, or any
kind of sphere-tree. A sphere-tree is very easily to implement for
static scenes; it's somewhat trickier to do a good implementation for
scenes with lots of dynamic geometry.
- The rough cull --------------------------------------------------------
The first step is to very quickly reject the objects that are way out of
the view frustum. The first cull I look to is a "whole frustum" sphere
test. That is, you make a bounding sphere for the capped pyramid which
is the view frustum. You then take this sphere and test it against your
object sphere. Any object which doesn't intersect this sphere can be
tossed out immediately.
A sphere-sphere test is very quick. If each sphere is a structure or
{radius,center} then the test is simply :
d = DistanceSquared( sphere1.center , sphere2.center );
return ( d <= Square( sphere1.radius + sphere2.radius ) );
That is, compare the distance between the centers to the sum of the radii.
You can also tell if the frustum sphere contains the test sphere (in which
case you can return a "totally include" trinary value). This is done with
( d <= Square( sphere1.radius - sphere2.radius ) );
And should be done before the earlier test. Thus, the whole algorithm is :
d = DistanceSquared( frustum_sphere.center , model_sphere.center );
if ( d <= Square( frustum_sphere.radius - model_sphere.radius ) ) , totally include
else if ( d <= Square( frustum_sphere.radius + model_sphere.radius ) ) , partially include
else totally reject
Now, this test is very rough, but very fast. It culls away essentially
all the objects that will be culled by the near and far plane, but it's
not so good at culling stuff around the sides of the view. The next test
I propose is a cone-sphere test. To do this, you need to build a cone
around the view frustum. This cone simply has its apex at the view
position, and runs along the view direction. The angular spread of the
cone is just big enough to enclose the corner of the screen. The cone is
computed just once per frame, and is stored as { apex_location,
direction_normal, cos(angle), sin(angle) } where the angle is the spread
of the cone. To test a sphere against a cone, you compute :
V = sphere.center - cone.apex_location
a = V * cone.direction_normal
b = a * cone.tan
c = sqrt( V*V - a*a )
d = c - b
e = d * cone.cos
now if ( e >= sphere.radius ) , cull the sphere
else if ( e <=-sphere.radius ) , totally include the sphere
else the sphere is partially included.
What's going on in this cone-sphere test? Basically, I'm trying to find
'e' which is the shortest distance from the center of the sphere to the
surface of the cone. You can draw some pictures and see what's going on.
'a' is how far along the ray of the cone the sphere's center is. 'b' is
the radius of the cone at 'a'. 'c' is the distance from the center of
the sphere to the axis of the cone, and 'd' is the distance from the
center of the sphere to the surface of the cone, along a line
perpendicular to the axis of the cone (which is not the closest
distance).
Note that once you compute 'a' you could tell that the sphere intersects
the cone just by testing
Square( a ) <= V*V * Square( cone.cos )
This is very fast and nice, but it can't tell us if the sphere was
partially included or totally included, so it's no good for heirarchy.
Thus, I don't bother and just go on with it.
Now, this seems like a lot of math compared to a sphere-plane test, but
we're getting a lot more culling out of it. The sqrt (square root) is
really the only unpleasant part. On modern CPU's the cost of the sqrt is
usually better than doing a lot of branches, and here we only have to do
1 branch, so it's a win.
Dave Eberly's Magic Software actually has a reasonably readable section
on sphere-cone intersection. See it at :
http://www.magic-software.com/Source/MgcIntersection3D/MgcIntr3DSphrCone.cpp
His approach is somewhat different, but the math comes out the same.
- The fine cull --------------------------------------------------------
After doing the entire-sphere and the cone culls, you'll have a pretty
good set of visible objects. At this point, you can do more culling, but
you must balance the cost of the cull against the cost of just
rendering. If the object is not too complex, it'll be cheaper to just
render at this point (since it will probably not be culled anyway).
What I propose is a compromise. If the object was totally-included by
the above tests (that is, totally enclosed by the frustum's sphere and
cone) then don't do any more testing, just render it. If it was
partially included, then do a more thorough test. At this point, you
could do a sphere test against the six planes of the frustum. I prefer
to get even more hard-core, since this test is pretty rare, and we could
cull some things if we had more accuracy.
To do this, I propose an OBB-Frustum test (OBB = oriented bounding box,
a rotated rectangular prism). The model must contain an OBB in model
space (four vectors required). Transform this OBB into world space.
First, take the OBB and test it against the six planes of the frustum
using a standard OBB-plane test. If any plane rejects the OBB, you're
done.
If all six planes include the OBB, then go on to the next step to make
sure you get the corners right. Take the normals of the 3 faces of the
OBB, and the cross products of the edges of the OBB and the frustum
(these are the directions from the separating axis theorem), and make
new planes which are aligned to those normals and go through the front 4
corners of the frustum (don't worry about the back 8; you can make your
frustum-sphere tighter to aggressively cull the back 4 corners of the
frustum, junk doesn't matter out there). Then test the OBB against these
new planes with the standard OBB-plane test. If the OBB is included now,
you're done.
You may want to do a cost analysis to decide whether to do this full
test. The benefit from doing a cull operation is :
(cull probability) * (render time) - (cull time)
So, if render time is very high, then even a small increase in cull
probability can balance a large increase in cull time, giving you a
positive increase in the cull benefit. That's basically what I'm working
from in the preceding paragraphs. If you can tell that a given model is
very simple (eg. very low poly count) then you should do a simpler cull,
eg. just skip this fine-grain cull.
Depending on your engine balance, you should try four options here, I'll
list them from fastest+roughest to slowest+tightest :
1. skip the fine cull stage
2. cull sphere vs. 6 frustum planes
3. cull OBB vs. 6 frustum planes
4. cull OBB vs. 6 + N frustum planes (exact obb-frustum test)
- appendix ---------------------------------------------------------------
*1 : about D3D handling a ModelToView xform with scaling :
Ok, so I need to be more precise about this point. If you do
NORMALIZENORMALS, then your life will be tolerable, but not
actually right. The problem is that D3D just uses the Model->View
xform for the normals, and this is not the right thing. The
right thing to do is to use the Transpose of the Inverse of the
Model->View xform. This will give the same result unless the
xform has non-uniform scalings or shears.
My "pipeline" article has a bit on this. Basically, if you do it he D3D
way, the normals will be skewed in the wrong direction. For example, if
you have a sphere, and you non-uniformly scale it into an ellipse, the
normals will actually be bent *away* from the way they should be bent.
See
http://www.gignews.com/realtime020100.htm
An excerpt from "real-time rendering".
- eof ------------------------------------------------------------------