Transformation matrices

From DmWiki

Please note, this article requires an understanding of vectors and matrices.

Transformation matrices are a widely used tool in game development, and can neatly solve a great variety of problems.

Basic Usage

When a matrix is multiplied by a vector, the result is another vector. By manipulating the matrix, we can perform a slew of useful operations on the vector. What makes this even more useful is that the matrix can be stored and applied to a whole group of vectors automatically by the graphics hardware. This constitutes the T in 'T&L' (transform and lighting), an essential function of a modern graphics card.

Common uses of matrices are to translate (move around), rotate and scale vectors.
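
As a minimal sketch of the matrix-vector product described above, the following multiplies a 4×4 matrix by a 4-component column vector. The names Vec4, Mat4 and Transform are hypothetical and not taken from any particular library. The 4×4 size and the column-major storage are assumptions chosen to match the Homogeneous Coordinates and Inversion sections of this article.

```cpp
#include <array>

// Hypothetical minimal types for illustration only.
struct Vec4 { float x, y, z, w; };

// 4x4 matrix stored column-major: element (row r, column c) lives at m[c * 4 + r].
using Mat4 = std::array<float, 16>;

// Returns M * v, treating v as a column vector.
Vec4 Transform(const Mat4& m, const Vec4& v)
{
    return {
        m[0] * v.x + m[4] * v.y + m[8]  * v.z + m[12] * v.w,
        m[1] * v.x + m[5] * v.y + m[9]  * v.z + m[13] * v.w,
        m[2] * v.x + m[6] * v.y + m[10] * v.z + m[14] * v.w,
        m[3] * v.x + m[7] * v.y + m[11] * v.z + m[15] * v.w
    };
}
```

Each component of the result is the dot product of one row of the matrix with the input vector, so changing the entries of the matrix changes what happens to every vector passed through it.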

Daisy Chaining (Combining Transformations)

Matrices can be easily combined so that one matrix does the work of many. For example, suppose matrix A rotates around the x axis and matrix B translates along the z axis. One could multiply each vector by A, then by B. Alternatively, one could multiply the two matrices together once (in the order described below) and then multiply each vector by the single resultant matrix, halving the number of matrix-vector multiplications.

Order matters

One should be very wary of the order in which matrices are applied. Matrix multiplication is in general not commutative, so the order of the transformations matters. This can easily be seen by considering a rotation matrix R and a translation matrix T. The transformation TR rotates the object (around the origin) and translates it afterwards. RT, on the other hand, first moves the object away from the origin and then rotates it, swinging it around the origin. So the matrices must be combined in the same order in which you would otherwise multiply the vector by them. In equation terms,

\mathbf{AB} \neq \mathbf{BA}.

Note that multiplying AB by a vector corresponds to first performing the transformation B, followed by the transformation A. In other words, the matrices must be multiplied in the reverse of the order in which you want the transformations to take place.
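
The difference between TR and RT can be checked numerically. The sketch below is only an illustration: the type aliases and helper functions are hypothetical, matrices are assumed to be stored column-major, and the rotation is a 90 degree turn about the z axis written out with exact entries so that no trigonometry is needed.

```cpp
#include <array>

using Mat4 = std::array<float, 16>;   // column-major: element (r, c) at index c * 4 + r
using Vec4 = std::array<float, 4>;

// Returns the matrix product a * b.
Mat4 Multiply(const Mat4& a, const Mat4& b)
{
    Mat4 out{};
    for (int c = 0; c < 4; ++c)
        for (int r = 0; r < 4; ++r)
            for (int k = 0; k < 4; ++k)
                out[c * 4 + r] += a[k * 4 + r] * b[c * 4 + k];
    return out;
}

// Returns m * v for a column vector v.
Vec4 Transform(const Mat4& m, const Vec4& v)
{
    Vec4 out{};
    for (int r = 0; r < 4; ++r)
        for (int k = 0; k < 4; ++k)
            out[r] += m[k * 4 + r] * v[k];
    return out;
}

// Rotation by 90 degrees about z (cos = 0, sin = 1 written out exactly).
const Mat4 R = { 0, 1, 0, 0,   -1, 0, 0, 0,   0, 0, 1, 0,   0, 0, 0, 1 };
// Translation by one unit along x.
const Mat4 T = { 1, 0, 0, 0,    0, 1, 0, 0,   0, 0, 1, 0,   1, 0, 0, 1 };
// The point (1, 0, 0).
const Vec4 p = { 1, 0, 0, 1 };

// TR: rotate first, then translate. Gives (1, 1, 0, 1).
Vec4 RotateThenTranslate() { return Transform(Multiply(T, R), p); }
// RT: translate first, then rotate. Gives (0, 2, 0, 1).
Vec4 TranslateThenRotate() { return Transform(Multiply(R, T), p); }
```

The same point ends up at (1, 1, 0) in one case and at (0, 2, 0) in the other, which is the non-commutativity described above.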

Homogeneous Coordinates

In 3D graphics we work with vectors in 3-dimensional space, so we would naturally use 3×3 matrices to transform them. However, 3×3 matrices are not big enough to allow for some transformations that we want to perform, such as translation and perspective projection (both described later). To allow these, we must use 4×4 matrices instead.

In order to use a 4×4 matrix we must have a 4-dimensional vector to multiply it by. We do this by taking a standard 3D vector (x, y, z) and adding either a 0 or a 1 on the end, to form (x, y, z, 0) or (x, y, z, 1). The 4×4 transformation matrix can then be multiplied by this vector.

When do we use 1, and when do we use 0? The answer lies in what the vector represents. If the vector represents a point in space, let the last component be 1; if it represents a direction, let it be 0. The reason for this is that if 1 is used, translation works as normal, and if a 0 is used, translation has no effect (but rotations and such work as normal). Directions (like north, south, east and west) shouldn't be affected by translation since they are the same no matter where you are in space, so we use a 0 for their fourth components.

After transformation, in the vast majority of cases the fourth component of the result vector will still be 0 or 1, whichever it started with. Then you can simply strip it off and use the x, y, z components as normal. The only time this will not be true is if your matrix includes a perspective projection. This is discussed more later.

Translation

For translation, a 4×4 matrix is used. The position vector p is composed as follows (with x, y, z corresponding to a position in 3D space):

\mathbf{p} = \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}

The translation matrix T is as follows (t_x, t_y, t_z are the distances to translate along each axis):

\mathbf{T} = \begin{bmatrix} 1 & 0 & 0 & t_x \\ 0 & 1 & 0 & t_y \\ 0 & 0 & 1 & t_z \\ 0 & 0 & 0 & 1 \end{bmatrix}

It is good to manually compute the result on paper, to see it in action. For completeness, the result is shown here:

\mathbf{T} \mathbf{p} = \begin{bmatrix} x + t_x \\ y + t_y \\ z + t_z \\ 1 \end{bmatrix}
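
Multiplying T by a vector whose fourth component is a general w, rather than 1, shows why points and directions behave differently: each translation distance is multiplied by w. The sketch below writes out that product directly. Vec4, Translate, MakePoint and MakeDirection are hypothetical names used only for illustration.

```cpp
// Hypothetical minimal homogeneous vector type.
struct Vec4 { float x, y, z, w; };

// Product of the translation matrix T(tx, ty, tz) with v, written out by hand.
// The translation distances are weighted by v.w.
Vec4 Translate(float tx, float ty, float tz, const Vec4& v)
{
    return { v.x + tx * v.w, v.y + ty * v.w, v.z + tz * v.w, v.w };
}

// A point carries w = 1, so translation moves it.
Vec4 MakePoint(float x, float y, float z)     { return { x, y, z, 1.0f }; }
// A direction carries w = 0, so translation leaves it unchanged.
Vec4 MakeDirection(float x, float y, float z) { return { x, y, z, 0.0f }; }
```

For example, translating the point (1, 2, 3) by 10 units along x gives (11, 2, 3), while the direction (1, 2, 3) comes out unchanged.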

Scaling

Scaling transformations are used to make objects bigger or smaller. Uniform scalings make all vectors longer or shorter by a certain factor, but without changing their directions. Nonuniform scalings involve different scale factors in each of the x, y, and z directions, and can be used to make objects appear squished or stretched. Also, using a negative scale factor causes the object to be reflected.

The matrix for a scaling takes three parameters, s_x, s_y, and s_z, which are the x, y, and z scaling factors. For a uniform scaling, set all three to the same value.

\mathbf{S} = \begin{bmatrix} s_x & 0 & 0 & 0 \\ 0 & s_y & 0 & 0 \\ 0 & 0 & s_z & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}

Note that if you set all the scale factors to 1, the identity matrix is the result, which is what you would expect since scaling by 1 results in no change to the size.

For example, setting all three scale factors to 2 gives a matrix that doubles the length of any vector, thus doubling the size of 3D objects. If you wanted to make an object twice as long in the x direction without changing it in the y or z directions, you could set s_x to 2 and s_y, s_z to 1.

Note that you should usually do scaling of an object in local coordinates (see coordinate systems). This is so that the origin of the coordinates is at the center of the object. If you scale an object that is off-center, it will appear to move toward or away from the center of the coordinate system. This is usually not what you want.
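
The drift of an off-center object can be seen with a small sketch. The Vec3 type and both functions below are hypothetical. The second function scales about a chosen center by moving that center to the origin, scaling, and moving it back, which is one possible way to avoid the drift when local coordinates are not available.

```cpp
// Hypothetical minimal 3D vector type.
struct Vec3 { float x, y, z; };

// Scales p about the origin: the same result as multiplying by the scaling matrix S.
Vec3 ScaleAboutOrigin(const Vec3& s, const Vec3& p)
{
    return { s.x * p.x, s.y * p.y, s.z * p.z };
}

// Scales p about an arbitrary center c: translate c to the origin, scale, translate back.
Vec3 ScaleAboutPoint(const Vec3& s, const Vec3& c, const Vec3& p)
{
    return { c.x + s.x * (p.x - c.x),
             c.y + s.y * (p.y - c.y),
             c.z + s.z * (p.z - c.z) };
}
```

Take an object whose ends lie at x = 4 and x = 6, so its center is at x = 5. Doubling it about the origin moves the ends to 8 and 12, so the center drifts to 10. Doubling it about its own center moves the ends to 3 and 7 and leaves the center at 5.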

Rotation

In 3D, rotation is performed around an axis (a line). The x, y, and z axes are common choices, but any axis can be chosen. Here we give x, y, and z axis rotation matrices as well as an arbitrary-axis rotation matrix.

In all cases, the amount of rotation is described by an angle θ. Note that in most programming environments θ must be given in radians. The rotation matrices here follow the right-hand rule. That is, if you make a thumbs-up with your right hand, and your thumb points along the rotation axis, then the direction that your fingers curl is the direction of rotation.

Here are the x, y, and z rotation matrices:

R_X = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos \theta & -\sin \theta & 0 \\ 0 & \sin \theta & \cos \theta & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}
R_Y = \begin{bmatrix} \cos \theta & 0 & \sin \theta & 0 \\ 0 & 1 & 0 & 0 \\ -\sin \theta & 0 & \cos \theta & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}
R_Z = \begin{bmatrix} \cos \theta & -\sin \theta & 0 & 0 \\ \sin \theta & \cos \theta & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}

The following matrix allows rotation about an arbitrary axis. It requires a unit vector (u_x, u_y, u_z) pointing in the direction of the axis.

R = \begin{bmatrix} 1 + (1 - \cos \theta)(u_x^2 - 1) & (1 - \cos \theta)u_x u_y - u_z \sin \theta & (1 - \cos \theta)u_x u_z + u_y \sin \theta & 0 \\ (1 - \cos \theta)u_x u_y + u_z \sin \theta & 1 + (1 - \cos \theta)(u_y^2 - 1) & (1 - \cos \theta)u_y u_z - u_x \sin \theta & 0 \\ (1 - \cos \theta)u_x u_z - u_y \sin \theta & (1 - \cos \theta)u_y u_z + u_x \sin \theta & 1 + (1 - \cos \theta)(u_z^2 - 1) & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}
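
The arbitrary-axis matrix above can be transcribed into code entry by entry. The following is a sketch under a few assumptions: the Mat4 alias and the function name are hypothetical, the result is stored column-major, and the caller is trusted to pass a normalized axis.

```cpp
#include <array>
#include <cmath>

using Mat4 = std::array<float, 16>;   // column-major: element (r, c) at index c * 4 + r

// Builds the rotation by `angle` radians about the unit axis (ux, uy, uz).
// The axis is assumed to be normalized already; this function does not check it.
Mat4 RotationAboutAxis(float ux, float uy, float uz, float angle)
{
    const float c = std::cos(angle);
    const float s = std::sin(angle);
    const float k = 1.0f - c;

    Mat4 m{};                               // all entries start at zero
    // First column
    m[0]  = 1.0f + k * (ux * ux - 1.0f);
    m[1]  = k * ux * uy + uz * s;
    m[2]  = k * ux * uz - uy * s;
    // Second column
    m[4]  = k * ux * uy - uz * s;
    m[5]  = 1.0f + k * (uy * uy - 1.0f);
    m[6]  = k * uy * uz + ux * s;
    // Third column
    m[8]  = k * ux * uz + uy * s;
    m[9]  = k * uy * uz - ux * s;
    m[10] = 1.0f + k * (uz * uz - 1.0f);
    // Fourth column
    m[15] = 1.0f;
    return m;
}
```

A quick sanity check is to pass one of the coordinate axes: with the axis (0, 0, 1) the result should agree with R_Z for the same angle, up to floating-point rounding.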

Rotations, like scalings, should usually be done in the local coordinate system. Again, this is because the rotation will be around the origin. To rotate around a point other than the origin, you would use a translation to move the desired point to the origin, then perform the rotation, and finally use another translation to move the object back to its original position.
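
The translate, rotate, translate-back sequence can be sketched for the simple case of a rotation about an axis parallel to z. Vec3 and RotateZAboutPoint are hypothetical names, and the three steps are written out by hand rather than as matrix products to keep the example short.

```cpp
#include <cmath>

// Hypothetical minimal 3D vector type.
struct Vec3 { float x, y, z; };

// Rotates p by `angle` radians about a line parallel to the z axis that passes through `pivot`.
Vec3 RotateZAboutPoint(const Vec3& pivot, float angle, const Vec3& p)
{
    // 1. Translate so that the pivot sits at the origin.
    const float x = p.x - pivot.x;
    const float y = p.y - pivot.y;

    // 2. Rotate about the z axis (same entries as R_Z).
    const float c = std::cos(angle);
    const float s = std::sin(angle);
    const float rx = c * x - s * y;
    const float ry = s * x + c * y;

    // 3. Translate back to the original position.
    return { rx + pivot.x, ry + pivot.y, p.z };
}
```

For instance, rotating the point (2, 1, 0) by a quarter turn about the pivot (1, 1, 0) gives approximately (1, 2, 0). In matrix form the same operation is the product of a translation by the pivot, the rotation, and a translation by the negated pivot, applied right to left.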

Rotations can also be represented by quaternions, and this has many advantages over working with rotation matrices directly. However, a quaternion will need to be converted into a rotation matrix in order to use it with an API, such as OpenGL or DirectX.

Inversion

Transformation matrices involving only rotation and translation (i.e. no scale) have convenient properties that make them very easy to invert. A 3×3 rotation matrix can be inverted by transposing it, and a translation matrix can be inverted by negating its translation part. When inverting the combined matrix, the rotation part and translation part are inverted separately and then combined in the opposite order, since:

\mathbf{(AB)}^{-1} = \mathbf{B}^{-1} \mathbf{A}^{-1}

When a transformation matrix is known to only contain rotation and translation, it can therefore be inverted like so:

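#include <utility>  // std::swap

void InvertRotation(float *m)
{
    // Invert the rotation part by transposing the upper-left 3x3 block
    // (column-major storage: element (row, col) is m[col * 4 + row])
    std::swap(m[1], m[4]);
    std::swap(m[2], m[8]);
    std::swap(m[6], m[9]);
}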

void InvertTranslation(float *m)
{
    // Invert the translation part by negating it, then rotating it by the
    // already-inverted rotation part (so InvertRotation must be called first)
    Vec3 t = Vec3(-m[12], -m[13], -m[14]);
    m[12] = t.x * m[0] + t.y * m[4] + t.z * m[8];
    m[13] = t.x * m[1] + t.y * m[5] + t.z * m[9];
    m[14] = t.x * m[2] + t.y * m[6] + t.z * m[10];
}

If the matrix has been scaled, the scale must first be inverted. The scale vector [s_x, s_y, s_z] equals the lengths of the first three column vectors. To remove the scale, each column vector is divided by its corresponding scale factor. To invert it, each column is divided by the scale factor once more, so overall we divide by the squared length of each column vector. The squared length is also cheaper to compute than the length, since it needs no square root.

void InvertScale(float *m)
{
    float sx = 1.0f / ( m[0]*m[0] + m[1]*m[1] + m[2]*m[2] );
    float sy = 1.0f / ( m[4]*m[4] + m[5]*m[5] + m[6]*m[6] );
    float sz = 1.0f / ( m[8]*m[8] + m[9]*m[9] + m[10]*m[10] );
    
    m[0] *= sx; m[1] *= sx; m[2] *= sx;
    m[4] *= sy; m[5] *= sy; m[6] *= sy;
    m[8] *= sz; m[9] *= sz; m[10] *= sz;
}

In conclusion, to invert a transformation matrix involving translation, rotation, and scale:

void InvertTransform(float *m)
{
    InvertScale(m);
    InvertRotation(m);
    InvertTranslation(m);
}

Please note that the code above assumes that the matrix is stored in column-major order, as shown below (array indices on the left, matrix entries on the right). Also keep in mind that this cannot be used to invert a projection matrix.

\begin{bmatrix} 0 & 4 & 8 & 12 \\ 1 & 5 & 9 & 13 \\ 2 & 6 & 10 & 14 \\ 3 & 7 & 11 & 15 \end{bmatrix} = \begin{bmatrix} x_1 & y_1 & z_1 & t_1 \\ x_2 & y_2 & z_2 & t_2 \\ x_3 & y_3 & z_3 & t_3 \\ 0 & 0 & 0 & 1 \end{bmatrix}
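
For comparison, the three steps can also be folded into a single function that writes to a new matrix instead of modifying the input in place. This is only a sketch under the same assumptions as the code above: the matrix is a product of translation, rotation and scale (in that order), it contains no shear or projection, and no scale factor is zero. The Mat4 alias and the name InverseTRS are hypothetical.

```cpp
#include <array>

using Mat4 = std::array<float, 16>;   // column-major, matching the layout shown above

// Returns the inverse of m, assuming m = T * R * S with non-zero scale factors.
Mat4 InverseTRS(const Mat4& m)
{
    Mat4 out{};                             // all entries start at zero
    for (int c = 0; c < 3; ++c)
    {
        const float* col = &m[c * 4];
        // The squared length of column c equals the squared scale factor.
        const float invSq = 1.0f / (col[0] * col[0] + col[1] * col[1] + col[2] * col[2]);
        // Divide by the squared scale and transpose in one step:
        // column c of m becomes row c of the result.
        for (int r = 0; r < 3; ++r)
            out[r * 4 + c] = col[r] * invSq;
    }
    // New translation = -(inverted 3x3 part) * old translation.
    for (int r = 0; r < 3; ++r)
        out[12 + r] = -(out[r] * m[12] + out[4 + r] * m[13] + out[8 + r] * m[14]);
    out[15] = 1.0f;
    return out;
}
```

A simple way to validate either version is to multiply the result by the original matrix and confirm that the product is the identity, up to floating-point rounding.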

Interpolation

Interpolating between matrices is not as easy as multiplying each by a weight and adding them together, as in linear interpolation. The three basis vectors must remain orthogonal at all times. An easy but slightly inaccurate way to achieve this is to interpolate two basis vectors linearly and get the third basis vector using the cross product. All three vectors must then be normalized. The following does not work for scaled matrices.

Mat4 InterpolateMatrices(Mat4 a, Mat4 b, float t)
{
    Vec3 x = a.GetXAxis() * (1-t) + b.GetXAxis() * t;
    x.Normalize();
    
    Vec3 y = a.GetYAxis() * (1-t) + b.GetYAxis() * t;
    y.Normalize();
    
    Vec3 z = x.Cross(y);
    z.Normalize();
    
    y = z.Cross(x);
    y.Normalize();
    
    // Named 'translation' so it does not clash with the interpolation parameter t
    Vec3 translation = a.GetTranslation() * (1.0f - t) + b.GetTranslation() * t;
    
    Mat4 m;
    m.SetXAxis(x);
    m.SetYAxis(y);
    m.SetZAxis(z);
    m.SetTranslation(translation);
    
    return m;
}

The GetXAxis and related methods depend on how your matrix is implemented. If your translation vector is stored as a column of the matrix, then the basis vectors are also columns; likewise, if it is stored as a row, the basis vectors are rows.

Projection

This section has not yet been written.

