(Doom 3's models)

Written by David Henry, 21st August 2005

**NOTE:** this has nothing to do with the cryptographic hash
function also called “MD5”.

The MD5 model format comes from *id Software*'s Doom 3 first-person
shooter, released in August 2004. The mesh data and animation data are stored
in separate files. These are ASCII files and are human readable. Here are
some general facts about the MD5 format:

- Model's geometric data are stored in *.md5mesh files;
- Animations are stored in *.md5anim files;
- Supports Skeletal Animation;
- Supports Vertex Skinning;
- Uses quaternions for orientation.

Textures are in separate files (TGA, DDS, or whatever you want). In Doom 3,
they are controlled by the *.mtr files in the `/materials` directory from
the game's *.pk4 files. The MTR files are not covered here.

The MD5 Mesh and MD5 Anim formats work with quaternions. Quaternions are magic mathematical objects which can represent an orientation. Quaternions are an extension of the complex numbers. If you are just discovering them, or don't know how to use them, take a look at a computer graphics math book or at an online FAQ about them.

Quaternions are an alternative to matrices for representing a rotation. Unlike 4x4 matrices, quaternions can't hold information about position, only orientation. They can hold the same information as 3x3 rotation matrices.

There are not many things to know about quaternions here, just a few formulas:

- Quaternion multiplication (Quat × Quat);
- Rotation of a point by a quaternion;
- Quaternion inverse;
- Quaternion normalization;
- Quaternion interpolation (SLERP), for smooth animation.

Quaternions are represented by four components: w, x, y and z. Orientation quaternions are unit quaternions.

In the MD5 Mesh and MD5 Anim files, only the x, y and z components are stored. You'll have to compute the w-component yourself, given the three others.

Since we deal with only unit quaternions (their length is 1.0), we can obtain the last component with this formula:

    float t = 1.0f - (q.x * q.x) - (q.y * q.y) - (q.z * q.z);

    if (t < 0.0f)
      {
        q.w = 0.0f;
      }
    else
      {
        q.w = -sqrt (t);
      }

Here is a really quick overview of the quaternion operations and formulas needed. For more about them, refer to a 3D math book, an online FAQ or Wikipedia.

Quaternion multiplication concatenates two rotations. The product of two quaternions
Q_{a} and Q_{b}, each written as a scalar part w and a vector part v, is given by the following formula:

Q_{a}.Q_{b} = (w_{a}, v_{a})(w_{b}, v_{b}) = (w_{a}w_{b} - v_{a}·v_{b}, w_{a}v_{b} + w_{b}v_{a} + v_{a}×v_{b})

After expanding and making simplifications, we have the following result :

    r.w = (qa.w * qb.w) - (qa.x * qb.x) - (qa.y * qb.y) - (qa.z * qb.z);
    r.x = (qa.x * qb.w) + (qa.w * qb.x) + (qa.y * qb.z) - (qa.z * qb.y);
    r.y = (qa.y * qb.w) + (qa.w * qb.y) + (qa.z * qb.x) - (qa.x * qb.z);
    r.z = (qa.z * qb.w) + (qa.w * qb.z) + (qa.x * qb.y) - (qa.y * qb.x);

Be careful! Quaternion multiplication is non-commutative, i.e. Q_{a}.Q_{b} ≠
Q_{b}.Q_{a}.

The rotation of a point by a quaternion is given by the formula:

R = Q.P.Q^{*}

Where R is the resultant quaternion, Q is the orientation quaternion by which you want to
perform a rotation, Q^{*} the conjugate of Q and P is the point converted to a quaternion.
To convert a 3D vector to a quaternion, copy the x, y and z components and set the w component
to 0. This is the same for quaternion to vector conversion: take the x, y and z components and
forget the w.

Note: here the “.” is the multiplication operator.
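The rotation formula above can be sketched in C as follows. The names `vec3_t`, `quat4_t`, `Quat_multQuat` and `Quat_rotatePoint` mirror those used in the code later in this document, but the bodies shown here and the component layout (x, y, z, w stored at indices 0 to 3) are assumptions made for this illustration, not necessarily what the sample code does.

```c
typedef float vec3_t[3];
typedef float quat4_t[4];   /* assumed layout: x, y, z, w */

enum { X, Y, Z, W };

/* out = qa . qb (quaternion product, expanded form) */
static void Quat_multQuat (const quat4_t qa, const quat4_t qb, quat4_t out)
{
  out[W] = (qa[W] * qb[W]) - (qa[X] * qb[X]) - (qa[Y] * qb[Y]) - (qa[Z] * qb[Z]);
  out[X] = (qa[X] * qb[W]) + (qa[W] * qb[X]) + (qa[Y] * qb[Z]) - (qa[Z] * qb[Y]);
  out[Y] = (qa[Y] * qb[W]) + (qa[W] * qb[Y]) + (qa[Z] * qb[X]) - (qa[X] * qb[Z]);
  out[Z] = (qa[Z] * qb[W]) + (qa[W] * qb[Z]) + (qa[X] * qb[Y]) - (qa[Y] * qb[X]);
}

/* out = q . (in, w = 0) . q*, valid only when q is a unit quaternion */
static void Quat_rotatePoint (const quat4_t q, const vec3_t in, vec3_t out)
{
  quat4_t p    = { in[X], in[Y], in[Z], 0.0f };   /* point as a quaternion */
  quat4_t conj = { -q[X], -q[Y], -q[Z], q[W] };   /* conjugate of q */
  quat4_t tmp, res;

  Quat_multQuat (q, p, tmp);
  Quat_multQuat (tmp, conj, res);

  /* Back to a vector: keep x, y and z, forget w */
  out[X] = res[X];
  out[Y] = res[Y];
  out[Z] = res[Z];
}
```

For instance, rotating the point (1, 0, 0) by a quarter turn around the Z axis, i.e. the quaternion with x = 0, y = 0, z = w ≈ 0.7071, gives approximately (0, 1, 0).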

Quaternion inverse can be obtained, **for unit quaternions**, by negating the
x, y and z components (this is equal to the conjugate quaternion):

inverse(<w, x, y, z>) = conjugate(<w, x, y, z>) = <w, -x, -y, -z>

Quaternion normalization is exactly the same as for vectors, but with four components.

I will not cover quaternion spherical linear interpolation (SLERP) in detail here; you can find the formula in the sample code (at the end of this document), in books, or on the web. Spherical linear interpolation is used to interpolate between two orientations. It is useful for skeletal animation.

The MD5 Mesh files have the “md5mesh” extension. They contain the geometric data of the models:

- Model's bind-pose skeleton;
- One or multiple meshes. Each mesh has its own data:
  - Vertices;
  - Triangles;
  - Vertex weights;
  - A shader name.

When parsing the MD5 Mesh file, you may find comments. They start with the “//” string and go to the end of the line. They are there only for humans who want to look at the file with a text editor; they don't affect the model's data, so you can ignore them.

Before loading the geometric data, you will find some precious variables needed to check if this is a valid md5mesh file and for allocating memory:

    MD5Version 10
    commandline "<string>"

    numJoints <int>
    numMeshes <int>

The first line tells you the version of the format. This is an integer. Doom 3's MD5 version is 10. This document covers version 10 of the format. Older (or newer) versions may differ in some points of the structure.

Then comes the `commandline` string used by Doom 3 with the `exportmodels`
console command. I have nothing to tell you about it.

`numJoints` is the number of joints of the model's skeleton. `numMeshes` is
the number of meshes of the model contained in the md5mesh file.

After that you have the bind-pose skeleton's joints:

    joints {
        "name" parent ( pos.x pos.y pos.z ) ( orient.x orient.y orient.z )
        ...
    }

`name` (string) is the joint's name. `parent` (int) is the joint's parent
index. If it is equal to -1, then the joint has no parent joint and is what we call a *root*
joint. `pos.x`, `pos.y` and `pos.z` (float) are the joint's
position in space. `orient.x`, `orient.y` and `orient.z` (float)
are the joint's orientation quaternion x, y and z components. After reading a joint, you must
calculate the w component.
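One possible way to read a joint entry is with `sscanf`, as in the sketch below. The structure layout, the 63-character name limit and the function name `ParseJointLine` are hypothetical choices for this illustration; the w component still has to be computed afterwards with the formula given earlier.

```c
#include <stdio.h>

/* Hypothetical structure for one bind-pose joint */
struct md5_joint_t
{
  char name[64];
  int parent;
  float pos[3];
  float orient[4];   /* x, y, z are read here; w is computed later */
};

/* Returns 1 if the line held a complete joint entry, 0 otherwise
   (for example on a comment line or on the closing brace) */
static int ParseJointLine (const char *line, struct md5_joint_t *joint)
{
  int n = sscanf (line, " \"%63[^\"]\" %d ( %f %f %f ) ( %f %f %f )",
                  joint->name, &joint->parent,
                  &joint->pos[0], &joint->pos[1], &joint->pos[2],
                  &joint->orient[0], &joint->orient[1], &joint->orient[2]);

  return (n == 8);
}
```

Whitespace in the format string matches any amount of blanks or tabs, so the entry may be indented and a trailing comment on the same line is simply left unread.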

After the skeleton, there are the meshes. Each mesh is in the form:

    mesh {
        shader "<string>"

        numverts <int>
        vert vertIndex ( s t ) startWeight countWeight
        vert ...

        numtris <int>
        tri triIndex vertIndex[0] vertIndex[1] vertIndex[2]
        tri ...

        numweights <int>
        weight weightIndex joint bias ( pos.x pos.y pos.z )
        weight ...
    }

The `shader` string is defined in the MTR files (`/materials` directory)
of Doom 3 and tells you which textures to apply to the mesh and how to combine them.

`numverts` (int) is the number of vertices of the mesh. After this variable, you have
the vertex list. `vertIndex` (int) is the vertex index. `s` and `t`
(float) are the texture coordinates (also called UV coords). In the MD5 Mesh format, a vertex
doesn't have a position of its own. Instead, its position is computed from vertex weights (this is
explained later in this document). `countWeight` (int) is the number of weights,
starting from the `startWeight` (int) index, which are used to calculate the final
vertex position.

`numtris` is the number of triangles of the mesh. `triIndex` (int)
is the index of the triangle. Each triangle is defined by the three vertex indices composing it:
`vertIndex[0]`, `vertIndex[1]` and `vertIndex[2]` (int).

`numweights` (int) is the number of weights of the mesh. `weightIndex` (int)
is the weight index. `joint` (int) is the joint it depends on. `bias` (float)
is a factor in the (0.0, 1.0] range which defines the contribution of this weight when computing
a vertex position. `pos.x`, `pos.y` and `pos.z` (float) are the
weight's position in space.
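The three kinds of mesh entries could be read with `sscanf` along these lines. This is only a sketch: the field names `start`, `count`, `joint`, `bias` and `pos` follow the code shown later in this document, while `st`, `index` and the `Parse*` function names are hypothetical.

```c
#include <stdio.h>

struct md5_vertex_t   { float st[2]; int start; int count; };
struct md5_triangle_t { int index[3]; };
struct md5_weight_t   { int joint; float bias; float pos[3]; };

/* Each function returns 1 when the line matched and its index is in range */

static int ParseVert (const char *line, struct md5_vertex_t *verts, int numVerts)
{
  struct md5_vertex_t v;
  int i;

  if (sscanf (line, " vert %d ( %f %f ) %d %d",
              &i, &v.st[0], &v.st[1], &v.start, &v.count) != 5)
    return 0;
  if (i < 0 || i >= numVerts)
    return 0;

  verts[i] = v;
  return 1;
}

static int ParseTri (const char *line, struct md5_triangle_t *tris, int numTris)
{
  struct md5_triangle_t t;
  int i;

  if (sscanf (line, " tri %d %d %d %d",
              &i, &t.index[0], &t.index[1], &t.index[2]) != 4)
    return 0;
  if (i < 0 || i >= numTris)
    return 0;

  tris[i] = t;
  return 1;
}

static int ParseWeight (const char *line, struct md5_weight_t *weights,
                        int numWeights)
{
  struct md5_weight_t w;
  int i;

  if (sscanf (line, " weight %d %d %f ( %f %f %f )",
              &i, &w.joint, &w.bias, &w.pos[0], &w.pos[1], &w.pos[2]) != 6)
    return 0;
  if (i < 0 || i >= numWeights)
    return 0;

  weights[i] = w;
  return 1;
}
```

The arrays are assumed to have been allocated beforehand from `numverts`, `numtris` and `numweights`; the range checks guard against a file whose indices exceed those counts.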

The model's skeleton stored in the MD5 Mesh files is what we call the “bind-pose skeleton”. It is generally in a position in which the model has been created.

Its joints are already in their final position; you don't have to do any precomputation on them, like adding the parent joint's position and rotating it. Their positions are in object space and independent of other joints.

As said before, the vertex positions must be calculated from the weights. Each vertex has one or more weights, each of them having a position dependent of a joint (the position is in joint's local space), and a factor telling us how much it affects the vertex position. The sum of all weight factors of a vertex should be 1.0. This technique is called “vertex skinning” and allows a vertex to depend on more than one joint of the skeleton for better animation rendering.

First, each weight position of a vertex must be converted from joint local space to object space. Then, sum all the converted positions multiplied by their bias value:

finalPos = (weight[0].pos * weight[0].bias) + ... + (weight[N].pos * weight[N].bias)

The vertex data that comes from the MD5 Mesh file has a `start` index and a
`count` value. The `start` index is the index of the first weight
used by the vertex. All the other weights of the vertex come right after this one. The
`count` value indicates the number of weights used, starting from the first weight.
Here is the code to compute the final vertex positions (in object space) from their
weights:

    /* Setup vertices */
    for (i = 0; i < mesh->num_verts; ++i)
      {
        vec3_t finalVertex = { 0.0f, 0.0f, 0.0f };

        /* Calculate final vertex to draw with weights */
        for (j = 0; j < mesh->vertices[i].count; ++j)
          {
            const struct md5_weight_t *weight =
              &mesh->weights[mesh->vertices[i].start + j];
            const struct md5_joint_t *joint = &joints[weight->joint];

            /* Calculate transformed vertex for this weight */
            vec3_t wv;
            Quat_rotatePoint (joint->orient, weight->pos, wv);

            /* the sum of all weight->bias should be 1.0 */
            finalVertex[0] += (joint->pos[0] + wv[0]) * weight->bias;
            finalVertex[1] += (joint->pos[1] + wv[1]) * weight->bias;
            finalVertex[2] += (joint->pos[2] + wv[2]) * weight->bias;
          }

        ...
      }

Each vertex has its own texture coordinates. The ST (or UV) texture coordinates for the upper-left corner of the texture are (0.0, 0.0). The ST texture coordinates for the lower-right corner are (1.0, 1.0).

The vertical direction is the inverse of the standard OpenGL direction for the
T coordinate. This is like the DirectDraw Surface way. When loading a texture (other
than a DDS file), you'll have to flip it vertically or take the opposite of the T
texture coordinate for MD5 Mesh vertices (i.e., `1.0 - T`).

You will probably need to compute normal vectors, for example for lighting. Here is how to compute them in order to get “weight normals”, like the weight positions (this method also works for tangents and bi-tangents):

First, compute all model's vertex positions in bind-pose (using the bind-pose skeleton).

Compute the vertex normals. You now have the normals in object space for the bind-pose skeleton.

For each weight of a vertex, transform the vertex normal by the inverse joint's orientation quaternion of the weight. You now have the normal in joint's local space.

Then when calculating the final vertex positions, you will be able to do the same for the normals, except you won't have to translate from the joint's position when converting from joint's local space to object space.
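The last two steps can be sketched as follows. The helper names `NormalToJointSpace` and `AccumulateNormal` are hypothetical, the quaternion helpers are repeated so that the example is self-contained, and the x, y, z, w array layout is an assumption.

```c
typedef float vec3_t[3];
typedef float quat4_t[4];   /* assumed layout: x, y, z, w */

static void Quat_multQuat (const quat4_t qa, const quat4_t qb, quat4_t out)
{
  out[3] = (qa[3] * qb[3]) - (qa[0] * qb[0]) - (qa[1] * qb[1]) - (qa[2] * qb[2]);
  out[0] = (qa[0] * qb[3]) + (qa[3] * qb[0]) + (qa[1] * qb[2]) - (qa[2] * qb[1]);
  out[1] = (qa[1] * qb[3]) + (qa[3] * qb[1]) + (qa[2] * qb[0]) - (qa[0] * qb[2]);
  out[2] = (qa[2] * qb[3]) + (qa[3] * qb[2]) + (qa[0] * qb[1]) - (qa[1] * qb[0]);
}

static void Quat_rotatePoint (const quat4_t q, const vec3_t in, vec3_t out)
{
  quat4_t p    = { in[0], in[1], in[2], 0.0f };
  quat4_t conj = { -q[0], -q[1], -q[2], q[3] };
  quat4_t tmp, res;

  Quat_multQuat (q, p, tmp);
  Quat_multQuat (tmp, conj, res);
  out[0] = res[0];
  out[1] = res[1];
  out[2] = res[2];
}

/* Done once at load time: bring an object-space bind-pose normal
   into the local space of the joint a weight depends on */
static void NormalToJointSpace (const quat4_t jointOrient,
                                const vec3_t normal, vec3_t weightNormal)
{
  /* For a unit quaternion, the inverse is the conjugate */
  quat4_t inv = { -jointOrient[0], -jointOrient[1],
                  -jointOrient[2], jointOrient[3] };

  Quat_rotatePoint (inv, normal, weightNormal);
}

/* Done for each weight when skinning: rotate the weight normal by
   the joint's current orientation (no translation) and accumulate */
static void AccumulateNormal (const quat4_t jointOrient,
                              const vec3_t weightNormal,
                              float bias, vec3_t accum)
{
  vec3_t wn;

  Quat_rotatePoint (jointOrient, weightNormal, wn);
  accum[0] += wn[0] * bias;
  accum[1] += wn[1] * bias;
  accum[2] += wn[2] * bias;
}
```

With the bind-pose skeleton itself, accumulating the weight normals of a vertex gives back its original object-space normal, which is a convenient sanity check.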

The MD5 Anim files have the “md5anim” extension. They store information about skeletal animation of MD5 Mesh models:

- Skeleton hierarchy with flags for each joint for animation data;
- A bounding box for each frame of the animation;
- A baseframe skeleton from which the animated skeleton is computed;
- A list of frames, each containing data to compute a skeleton from the baseframe skeleton.

MD5 Anim files have the same syntax as MD5 Mesh files. Comments begin with “//” and run until the end of the line. There is also a header with a version number, a command line and some variables for memory allocation:

    MD5Version 10
    commandline "<string>"

    numFrames <int>
    numJoints <int>
    frameRate <int>
    numAnimatedComponents <int>

The version number is the same for all MD5 files, so it should be 10. The
`commandline` is an internal Doom 3 command.

`numFrames` (int) is the number of frames of the animation. An animation
is composed of multiple frames, each one being a copy of the skeleton at a particular
position. Running all frames gives you an animation.

`numJoints` (int) is the number of joints of the frame skeletons. It
must be the same as the MD5 Mesh file's number of joints for the animation to be playable on the model.

`frameRate` (int) is the number of frames per second to draw for the
animation. The duration of a frame can be obtained by simply inverting
`frameRate`.

`numAnimatedComponents` (int) is the number of parameters per frame
used to compute the frame skeletons. These parameters, combined with the baseframe
skeleton given in the MD5 Anim file, make it possible to build a skeleton for each frame.

After the header comes the skeleton hierarchy. It provides information about the joints for building the frame skeletons from the baseframe data:

    hierarchy {
        "name" parent flags startIndex
        ...
    }

`name` (string) is the joint's name. `parent` (int) is
the joint's parent index. If parent is -1, then the joint has no parent. Using these
two pieces of information and the number of joints, it is reasonable to compare the hierarchy
with the MD5 Mesh's skeleton to ensure that the animation is valid for this model.
`flags` (int) is a set of bit flags which tell you how to compute
the skeleton of a frame for this joint. `startIndex` (int) is an
index to the beginning of the parameters used to compute the frame skeletons.

After the hierarchy comes the frame bounds. There is a bounding box for each frame:

    bounds {
        ( min.x min.y min.z ) ( max.x max.y max.z )
        ...
    }

`min.x`, `min.y` and `min.z` (float)
represent the minimum 3D coordinates of the box; `max.x`,
`max.y` and `max.z` (float) represent the maximum.
These coordinates are in object space. They are useful for computing
an AABB or OBB for frustum culling and basic collision detection.

After bounds you'll find the baseframe data. It contains the position and orientation of each joint from which the frame skeletons will be built. There is a line for each joint:

    baseframe {
        ( pos.x pos.y pos.z ) ( orient.x orient.y orient.z )
        ...
    }

`pos.x`, `pos.y` and `pos.z` (float) are
the joint's position. `orient.x`, `orient.y` and
`orient.z` (float) are the x, y and z components of the joint's orientation quaternion.

After the baseframe data comes the frame data. There is a chunk of data for each frame. These data are the parameters used to compute the frame's skeleton:

    frame frameIndex {
        <float> <float> <float> ...
    }

`frameIndex` (int) is the index of the frame. Between the braces,
you have an array of float values. There are `numAnimatedComponents`
values. When you have collected all these data for a frame, you can build
the skeleton of this frame.

From the baseframe data, the hierarchy info and the frame data, you can build a skeleton for a particular frame. Here is how it works for each joint: we start with the baseframe joint's data (position and orientation). Then we replace some of the position and orientation components by a value from the frame's data. The joint's flags (from the hierarchy information) indicate which ones.

Description of the `flags` variable: starting from the right (the least significant bit), the first
three bits are for the position vector and the next three for the orientation
quaternion. If a bit is set, then you have to replace the corresponding
(x, y, z) component by a value from the frame's data. Which value? This is
given by `startIndex`. You begin at `startIndex`
in the frame's data array and advance by one position each time you
replace a component.

Once you have computed the “animated” joint's position and orientation, you must compute the joint's position and orientation in object space. Before that, don't forget to compute the w component of the “animated” orientation!

For the position, if the joint has a parent, you must rotate the “animated”
position by the parent's orientation quaternion, and add the result to the parent's
position. If the joint is a *root* joint (no parent), then just copy
the “animated” position.

For the orientation, if the joint has a parent, you must concatenate the two orientations: first the parent's orientation and then the “animated” orientation. Just multiply (with the formula given at the beginning of the document) the parent's orientation by the “animated” orientation and renormalize the result (orientation quaternions must be unit quaternions). If the joint has no parent, then just copy the “animated” orientation.

Here is the code to build a frame skeleton:

    for (i = 0; i < num_joints; ++i)
      {
        const struct baseframe_joint_t *baseJoint = &baseFrame[i];
        vec3_t animatedPos;
        quat4_t animatedOrient;
        int j = 0;

        memcpy (animatedPos, baseJoint->pos, sizeof (vec3_t));
        memcpy (animatedOrient, baseJoint->orient, sizeof (quat4_t));

        if (jointInfos[i].flags & 1) /* Tx */
          {
            animatedPos[0] = animFrameData[jointInfos[i].startIndex + j];
            ++j;
          }

        if (jointInfos[i].flags & 2) /* Ty */
          {
            animatedPos[1] = animFrameData[jointInfos[i].startIndex + j];
            ++j;
          }

        if (jointInfos[i].flags & 4) /* Tz */
          {
            animatedPos[2] = animFrameData[jointInfos[i].startIndex + j];
            ++j;
          }

        if (jointInfos[i].flags & 8) /* Qx */
          {
            animatedOrient[0] = animFrameData[jointInfos[i].startIndex + j];
            ++j;
          }

        if (jointInfos[i].flags & 16) /* Qy */
          {
            animatedOrient[1] = animFrameData[jointInfos[i].startIndex + j];
            ++j;
          }

        if (jointInfos[i].flags & 32) /* Qz */
          {
            animatedOrient[2] = animFrameData[jointInfos[i].startIndex + j];
            ++j;
          }

        /* Compute orient quaternion's w value */
        Quat_computeW (animatedOrient);

        /* NOTE: we assume that this joint's parent has
           already been calculated, i.e. joint's ID should
           never be smaller than its parent ID. */
        struct md5_joint_t *thisJoint = &skelFrame[i];

        int parent = jointInfos[i].parent;
        thisJoint->parent = parent;
        strcpy (thisJoint->name, jointInfos[i].name);

        /* Has parent? */
        if (thisJoint->parent < 0)
          {
            memcpy (thisJoint->pos, animatedPos, sizeof (vec3_t));
            memcpy (thisJoint->orient, animatedOrient, sizeof (quat4_t));
          }
        else
          {
            struct md5_joint_t *parentJoint = &skelFrame[parent];
            vec3_t rpos; /* rotated position */

            /* Add positions */
            Quat_rotatePoint (parentJoint->orient, animatedPos, rpos);
            thisJoint->pos[0] = rpos[0] + parentJoint->pos[0];
            thisJoint->pos[1] = rpos[1] + parentJoint->pos[1];
            thisJoint->pos[2] = rpos[2] + parentJoint->pos[2];

            /* Concatenate rotations */
            Quat_multQuat (parentJoint->orient, animatedOrient, thisJoint->orient);
            Quat_normalize (thisJoint->orient);
          }
      }

`jointInfos` contains the hierarchy information. `animFrameData`
is an array containing the frame data. Also don't forget to copy the parent index
from the hierarchy info to your new joint structure. The joint's name can also be
useful sometimes.

You must do this operation for all frames, or at least for the ones you need.

Animating the model consists of calculating the current frame to draw and the next frame, and updating the time elapsed since the beginning of the current frame.

The current frame index increases when the frame's maximum duration has been reached.
Remember that this maximum duration is the inverse of `frameRate`.

You can then proceed to interpolate between the current frame's skeleton and
the next frame's skeleton. The interpolation factor is obtained by multiplying
the time elapsed since the current frame began by the animation's `frameRate`.
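The frame bookkeeping described above could be sketched like this. The structure `anim_info_t`, its fields and the function names are hypothetical; looping back to frame 0 at the end of the animation is also an assumption, since the text does not say what happens after the last frame.

```c
/* Hypothetical playback state for one animation */
struct anim_info_t
{
  int curr_frame;
  int next_frame;
  double last_time;   /* seconds elapsed since curr_frame began */
  double max_time;    /* duration of one frame: 1.0 / frameRate */
};

/* Advance the animation by dt seconds */
static void Animate (struct anim_info_t *info, int numFrames, double dt)
{
  int maxFrame = numFrames - 1;

  info->last_time += dt;

  /* Move to the next frame each time the frame duration is reached;
     the loop also handles a dt longer than a single frame */
  while (info->last_time >= info->max_time)
    {
      info->curr_frame++;
      info->next_frame++;
      info->last_time -= info->max_time;

      /* Assumed behaviour: wrap around so the animation loops */
      if (info->curr_frame > maxFrame)
        info->curr_frame = 0;
      if (info->next_frame > maxFrame)
        info->next_frame = 0;
    }
}

/* Interpolation factor between curr_frame and next_frame, in [0, 1) */
static double InterpFactor (const struct anim_info_t *info, int frameRate)
{
  return info->last_time * frameRate;
}
```

A playback state would typically start with `curr_frame = 0`, `next_frame = 1`, `last_time = 0.0` and `max_time = 1.0 / frameRate`, with `Animate` called once per rendered frame.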

To interpolate two skeletons, you just have to interpolate each of their joints. And to interpolate two joints, you just have to interpolate the position and the orientation.

For the position, just perform a linear interpolation:

    finalJoint->pos.x = jointA->pos.x + interp * (jointB->pos.x - jointA->pos.x);
    finalJoint->pos.y = jointA->pos.y + interp * (jointB->pos.y - jointA->pos.y);
    finalJoint->pos.z = jointA->pos.z + interp * (jointB->pos.z - jointA->pos.z);

For the orientation, it's better to perform a spherical linear interpolation rather than a simple linear interpolation, unless the rotations are very small. For the SLERP formula, look in a math book or on the web:

    Quat_slerp (jointA->orient, jointB->orient, interp, finalJoint->orient);

**Sample code 1:** md5.c (14 KB). Only MD5 Mesh. No
texture mapping, no lighting, no animation. This light demo fits in less than 650 lines
of code.

**Sample code 2:** md5mesh.c (15 KB),
md5anim.c (13 KB), md5model.h
(3.8 KB). MD5 Mesh and Anim. No texture mapping, no lighting. Less than 1300 lines.

This document is
available under the terms of the GNU
Free Documentation License (GFDL)

© David Henry – contact : tfc.duke (AT)
gmail (DOT) com