From cbloom@cbloom.com Thu Dec 12 22:38:37 2002 Date: Sun, 10 Nov 2002 12:21:48 -0800 From: Charles Bloom To: Casey Muratori , Jonathan Blow Cc: cbloom@cbloom.com Subject: Re: (very) rough draft Comments : A) I would start section 2 with some general comments on interpolation of transformations. Something like : Alexa boldly tries to attack the general problem of interpolation of any type of transformation. We will show that this is impossible and impractical, and in fact not even desirable. For the majority of this paper we will consider the special case of a "TRS" (Translation, Rotation and Scale) transformation for concreteness, and because it's the most important case in computer graphics. (Since Alexa starts out attacking the general case and only later admits that he has to specialize to rotations, I would start our paper with the general case, rather than digging right into rotations. There's another key point that I think should be made, perhaps at the same early point or maybe later in) : The fundamental problem with Alexa's approach is that he does not address the global structure of transformations. Alexa's approach boils down to one well-known prescription : "Interpolate transformations in their Lie space". All transformations are elements of a group. That group has lie generators which define a local manifold in each neighborhood of the group. Alexa is suggesting that you take the local neighborhood around the identity operation, use the Lie basis to define a flat Reimann topology for the transformations, and then to use that topology globally. This works fine for groups that have globally flat topology, such as the Translation and Scale groups (ie the Lie algebra is commutative). Because Translation and Scale are locally flat, and we can write our TRS transformation as a group product, the only problem with interpolation between TRS transformations comes from the rotation part. The Rotation group (in dimension > 2) is a group that does not have globally flat topology. Alexa's approach is akin to making a flat projection of the group topology, and then working in that projected space. The rotation group has topology S3/Z2. S3 is the 3-dimensional surface of a sphere (embedded in 4 dimensions); the division by Z2 refers to the fact that opposite rotations are identical. That is, rotation around N by theta, and rotation around -N by -theta are the same rotation. Note that the rotation around N by theta and the rotation around N by (2pi-theta) lead to the same orientation, but they are not the same rotation. The topology S3 means that any flat projection (ala Alexa) is going to cause distortion Failure to account for the Z2 (double-cover) will cause global neighborhood errors. Two major problems arise from trying to use Alexa's method in general : distortion and global neighborhood errors. Distortion means that distances in the exponential map (Lie space) are not proportional to distances in the group. This can cause you to take unnatural curved paths through the space of transformations. Close to the identity, this distortion is small; infinitesimally close, the exponential map is a perfect coverage; the error gets worse the farther you get from the identity operation. Global neighborhood errors typically manifest by taking an unnecessary long path between two transformations. In a flat Reimann space (like R3) there is only one straight line between two points. On a sphere, there are many geodesics between two points; if you try to make an unwrapping of the surface of a sphere (to a disc), you must also considering the paths which go around the boundary of the disc. In general, there may be many ways that two different points on the unwrapping have zero distance between them; for example, a torus may be unwrapped to a rectangle, with opposite edges representing the same points in the group. The problem with Alexa's technique in general is that it produces much more distortion than rival techniques, and it doesn't account for the global topology problems at all. The only way to handle the topology problems is to explicitly know what group you are dealing with, and to account for that space. Thus, generic interpolation of transformations is impossible and undesirable. B) I would talk about the topology a bit. Casey, you sent a good mail on visualizing this earlier. I would say something like : To practically interpolate rotations, we must get away from the difficult (S3/Z2) topology. Quaternions do this using a double-cover; the topology of quaternions is S3, which means that for each rotation there are two quaternions. Linear interpolation of quaternions is a mapping to R4, simply by allowing the quaternions in S3 to leave the surface of their sphere. (Note that there are many ways to boost rotations from S3 to R4; Casey's "Lerp" is one, but see also the Rational Map papers). The exponential map is a projection/unwrapping of the sphere to R3. C) I would take this chance to explicitly describe the "Lerp" technique (just letting quats stray from S3 into R4); maybe in an appendix. You refer to "Lerp" in section 2, but haven't really described it. D) The bit at the end of page 2, beginning of page 3, starting with "one may ask oneself" : I think this bit is a little questionable. It is correct, but there are a few problems. For one thing, I would make it clear that we are talking about both "operations" and "states" here; that is, a rotation R represents the "operation" of rotating by R; it also represents the orientation which you get by applying R to the identity. Without this, I think it's confusing whether we are talking about "rotations" or "orientations" - really we're talking about both. Really, I think I might leave this section out of the main paper, like in an appendix, because it doesn't really address any problem in Alexa. If you're going to mention it, you should mention that orientations in the real world actually have the topology of quaternions (SU(2)) not rotations (SO(3)). The classic example of this is of course Dirac's "belt trick"; you can also find web sites of Feynman's "Candle Dances", and knot theoretical issues. That is, a rigid body in space rotations like SO(3) - that is, a 2pi rotation takes the object to itself. If, however, you attach that object to three strings that extend to infinity, a 2pi rotation is no longer the identity, but a 4pi rotation still is! E) On your capitalized part that you mark as unclear; this corresponds to Alexa section 6.2 First of all, the first paragraph in 6.2 is wrong. He says the angle of rotation is (x+y+z)/phi Actually, it is |(x,y,z)|/phi (the magnitude of the vector (x,y,z)). The value "r" that he talks about here is (pi * phi). This "phi" factor is very confusing, so I'll drop it from now on, and just use phi = 1. The choice of phi=1 means that my rotations are : Rx = e^(Jx) log(Rx) = x_hat and T = e^(J*n theta) T is rotation around "n" by "theta" And x = = n_x theta In particular, {x,y,z} = n * theta Are the parameters in the exponential map space. Now, Alexa's "r" is just pi. Alexa then considers each parameter separately. He says that if you look at a parameter "x", then values "pi" and "-pi" are the same rotation. In the exponential map, "x" is a segment; by equating the ends, it's a circle; he's saying that the rotations around X have the topology S1. This is true when you only do a rotation around X, but it is WRONG for a 3d rotation. Alexa is treating the global topology as S1*S1*S1, which is just wrong, that's not the same as S3, which is what he should be using. In particular, the rotations : {x,y,z} and {2pi-x,y,z} are NOT the same rotation. They are only the same if y and z are zero. The correct relation is that : { n * theta } and { n * (theta - 2pi) } are the same rotation. If we take that theta starts in [0,pi] , then this means that opposite points on the surface of the sphere of vectors of magnitude pi are equal rotations. I just worked out the correct relations for choosing the shortest path in the exponential map. Consider two vectors in the exponential map : r1 = n1 * theta1 r2 = n2 * theta2 Here the "n" are unit vectors an the theta are angles in [0,pi]. Each rotation has a "mirror" that must be considered : r1' = n1 * (theta1 - 2pi) r2' = n2 * (theta2 - 2pi) We must consider four different paths. You can interpolate r1->r2, r1->r2', r1'->r2, or r1'->r2'. I used just the Euclidean distance between the vectors in the exponential map. That's actually questionable, but it's the best we can do. I call these paths "A","B","C", and "D", respectively. Put theta1 = pi * u theta2 = pi * v so u is in [0,1] and v is in [0,1] Also put z = n1 * n2 So z is in [-1,1] We find these terms : d(A) = 0 d(B) = 1 - v + z*u d(C) = 1 - u + z*v d(D) = 2*(z+1) + (z-1)*(u+v) You should choose the path that has the smallest "d". The first thing we see is that d(D) >= Min{ d(A), d(B), d(C) } So that the "D" path (r1'->r2') is never the best one. We also see that if z >= 0 , then A is the best path. That is, the rotation axes must be opposing for the "strange" paths to be favorable. Also, if (u+v) <= 1, then A is the best path. The sum of angle must be >= pi for the strange paths to be preferred. It's fun to plot the change-over point for when the two strange paths B & C become shorter. If you make a plot of u vs. v and draw the lines of change-over for various values of "z", you get this triangle in (u+v) <= 1 where the basic path A is always preferred. At z = -1, both B and C are equally good, and they start at u+v = 1. As z gets bigger, up to "1", the B lines rotate back towards v = 1, and the C lines rotate up to u = 1. I think all this material is appropriate for an appendix. It's a good record of how to correctly handle the exponential map, and it's a good demonstration that it's definitely not "simple". F) I think this is a good opportunity for you to write a little more on your use of Lerp and just doing splines of the quats in R4. ------------------------------------------------------- Charles Bloom cb@cbloom.com http://www.cbloom.com