From cbloom@cbloom.com Thu Dec 12 22:38:37 2002
Date: Sun, 10 Nov 2002 12:21:48 -0800
From: Charles Bloom
To: Casey Muratori , Jonathan Blow
Cc: cbloom@cbloom.com
Subject: Re: (very) rough draft
Comments :
A) I would start section 2 with some general comments on interpolation of
transformations. Something like :
Alexa boldly tries to attack the general problem of interpolation of
any type of transformation. We will show that this is impossible and
impractical, and in fact not even desirable. For the majority of
this paper we will consider the special case of a "TRS" (Translation,
Rotation and Scale) transformation for concreteness, and because it's
the most important case in computer graphics.
(Since Alexa starts out attacking the general case and only later admits
that he has to specialize to rotations, I would start our paper with
the general case, rather than digging right into rotations. There's
another key point that I think should be made, perhaps at the same
early point or maybe later in) :
The fundamental problem with Alexa's approach is that he does not
address the global structure of transformations. Alexa's approach boils
down to one well-known prescription : "Interpolate transformations in
their Lie space". All transformations are elements of a group. That
group has lie generators which define a local manifold in each neighborhood
of the group. Alexa is suggesting that you take the local neighborhood
around the identity operation, use the Lie basis to define a flat
Reimann topology for the transformations, and then to use that topology
globally. This works fine for groups that have globally flat topology,
such as the Translation and Scale groups (ie the Lie algebra is commutative).
Because Translation and Scale are locally flat, and we can write our TRS
transformation as a group product, the only problem with interpolation
between TRS transformations comes from the rotation part. The Rotation
group (in dimension > 2) is a group that does not have globally flat
topology. Alexa's approach is akin to making a flat projection of the
group topology, and then working in that projected space.
The rotation group has topology S3/Z2. S3 is the 3-dimensional surface of a
sphere (embedded in 4 dimensions); the division by Z2 refers to the fact that
opposite rotations are identical. That is, rotation around N by theta, and
rotation around -N by -theta are the same rotation. Note that the rotation
around N by theta and the rotation around N by (2pi-theta) lead to the same
orientation, but they are not the same rotation. The topology S3 means
that any flat projection (ala Alexa) is going to cause distortion Failure to
account for the Z2 (double-cover) will cause global neighborhood errors.
Two major problems arise from trying to use Alexa's method in general : distortion
and global neighborhood errors. Distortion means that distances in the
exponential map (Lie space) are not proportional to distances in the group. This
can cause you to take unnatural curved paths through the space of transformations.
Close to the identity, this distortion is small; infinitesimally close, the
exponential map is a perfect coverage; the error gets worse the farther you get
from the identity operation. Global neighborhood errors typically manifest by
taking an unnecessary long path between two transformations. In a flat Reimann
space (like R3) there is only one straight line between two points. On a sphere,
there are many geodesics between two points; if you try to make an unwrapping of
the surface of a sphere (to a disc), you must also considering the paths which
go around the boundary of the disc. In general, there may be many ways that two
different points on the unwrapping have zero distance between them; for example, a
torus may be unwrapped to a rectangle, with opposite edges representing the same
points in the group.
The problem with Alexa's technique in general is that it produces much more
distortion than rival techniques, and it doesn't account for the global topology
problems at all. The only way to handle the topology problems is to explicitly
know what group you are dealing with, and to account for that space. Thus, generic
interpolation of transformations is impossible and undesirable.
B) I would talk about the topology a bit. Casey, you sent a good mail on visualizing
this earlier. I would say something like :
To practically interpolate rotations, we must get away from the difficult (S3/Z2)
topology. Quaternions do this using a double-cover; the topology of quaternions
is S3, which means that for each rotation there are two quaternions. Linear
interpolation of quaternions is a mapping to R4, simply by allowing the quaternions
in S3 to leave the surface of their sphere. (Note that there are many ways to
boost rotations from S3 to R4; Casey's "Lerp" is one, but see also the Rational
Map papers). The exponential map is a projection/unwrapping of the sphere to R3.
C) I would take this chance to explicitly describe the "Lerp" technique (just letting
quats stray from S3 into R4); maybe in an appendix. You refer to "Lerp" in section 2,
but haven't really described it.
D) The bit at the end of page 2, beginning of page 3, starting with "one may ask oneself" :
I think this bit is a little questionable. It is correct, but there are a few problems.
For one thing, I would make it clear that we are talking about both "operations" and
"states" here; that is, a rotation R represents the "operation" of rotating by R; it also
represents the orientation which you get by applying R to the identity. Without this, I
think it's confusing whether we are talking about "rotations" or "orientations" - really
we're talking about both. Really, I think I might leave this section out of the main
paper, like in an appendix, because it doesn't really address any problem in Alexa. If
you're going to mention it, you should mention that orientations in the real world actually
have the topology of quaternions (SU(2)) not rotations (SO(3)). The classic example of
this is of course Dirac's "belt trick"; you can also find web sites of Feynman's "Candle Dances",
and knot theoretical issues. That is, a rigid body in space rotations like SO(3) - that is,
a 2pi rotation takes the object to itself. If, however, you attach that object to three strings
that extend to infinity, a 2pi rotation is no longer the identity, but a 4pi rotation still is!
E) On your capitalized part that you mark as unclear; this corresponds to Alexa section 6.2
First of all, the first paragraph in 6.2 is wrong. He says the angle of rotation is
(x+y+z)/phi
Actually, it is
|(x,y,z)|/phi
(the magnitude of the vector (x,y,z)).
The value "r" that he talks about here is (pi * phi). This "phi" factor is very
confusing, so I'll drop it from now on, and just use phi = 1. The choice of phi=1
means that my rotations are :
Rx = e^(Jx)
log(Rx) = x_hat
and
T = e^(J*n theta)
T is rotation around "n" by "theta"
And
x = = n_x theta
In particular,
{x,y,z} = n * theta
Are the parameters in the exponential map space.
Now, Alexa's "r" is just pi. Alexa then considers each
parameter separately. He says that if you look at a parameter "x",
then values "pi" and "-pi" are the same rotation. In the exponential
map, "x" is a segment; by equating the ends, it's a circle; he's
saying that the rotations around X have the topology S1. This is
true when you only do a rotation around X, but it is WRONG for
a 3d rotation. Alexa is treating the global topology as S1*S1*S1,
which is just wrong, that's not the same as S3, which is what he
should be using. In particular, the rotations :
{x,y,z} and {2pi-x,y,z}
are NOT the same rotation. They are only the same if y and z are zero.
The correct relation is that :
{ n * theta } and { n * (theta - 2pi) }
are the same rotation. If we take that theta starts in [0,pi] , then this
means that opposite points on the surface of the sphere of vectors of
magnitude pi are equal rotations.
I just worked out the correct relations for choosing the shortest path in
the exponential map.
Consider two vectors in the exponential map :
r1 = n1 * theta1
r2 = n2 * theta2
Here the "n" are unit vectors an the theta are angles in [0,pi].
Each rotation has a "mirror" that must be considered :
r1' = n1 * (theta1 - 2pi)
r2' = n2 * (theta2 - 2pi)
We must consider four different paths. You can interpolate
r1->r2, r1->r2', r1'->r2, or r1'->r2'. I used just the Euclidean
distance between the vectors in the exponential map. That's
actually questionable, but it's the best we can do. I call
these paths "A","B","C", and "D", respectively.
Put theta1 = pi * u
theta2 = pi * v
so u is in [0,1] and v is in [0,1]
Also put
z = n1 * n2
So z is in [-1,1]
We find these terms :
d(A) = 0
d(B) = 1 - v + z*u
d(C) = 1 - u + z*v
d(D) = 2*(z+1) + (z-1)*(u+v)
You should choose the path that has the smallest "d". The
first thing we see is that
d(D) >= Min{ d(A), d(B), d(C) }
So that the "D" path (r1'->r2') is never the best one.
We also see that
if z >= 0 , then A is the best path. That is, the rotation
axes must be opposing for the "strange" paths to be favorable.
Also, if (u+v) <= 1, then A is the best path. The sum of
angle must be >= pi for the strange paths to be preferred.
It's fun to plot the change-over point for when the two
strange paths B & C become shorter. If you make a plot
of u vs. v and draw the lines of change-over for various
values of "z", you get this triangle in (u+v) <= 1 where
the basic path A is always preferred. At z = -1, both B
and C are equally good, and they start at u+v = 1. As
z gets bigger, up to "1", the B lines rotate back towards
v = 1, and the C lines rotate up to u = 1.
I think all this material is appropriate for an appendix.
It's a good record of how to correctly handle the exponential
map, and it's a good demonstration that it's definitely
not "simple".
F) I think this is a good opportunity for you to write a little
more on your use of Lerp and just doing splines of the quats in R4.
-------------------------------------------------------
Charles Bloom cb@cbloom.com http://www.cbloom.com