MD5Mesh and MD5Anim file formats
(Doom 3's models)

Written by David Henry, 21st August 2005

NOTE: this has nothing to do with the cryptographic hash function also called “MD5”.

Introduction

The MD5 model format comes from id Software's Doom 3 first-person shooter, released in August 2004. The mesh data and animation data are stored in separate files. These are ASCII files and are human readable. Here are some general facts about the MD5 format:

Model's geometric data are stored in *.md5mesh files;
Animations are stored in *.md5anim files;
Supports Skeletal Animation;
Supports Vertex Skinning;
Uses quaternions for orientation.

Textures are in separate files (TGA, DDS, or whatever you want). In Doom 3, they are controlled by the *.mtr files in the /materials directory from the game's *.pk4 files. The MTR files are not covered here.

Working with quaternions

The MD5 Mesh and MD5 Anim formats work with quaternions. Quaternions are magic mathematical objects which can represent an orientation. They are an extension of the complex numbers. If you are just discovering them now, or don't know how to use them, take a look at a computer graphics math book or at an online FAQ about them.

Quaternions are an alternative to matrices for representing a rotation. Unlike 4x4 matrices, quaternions can't hold information about position, just the orientation. They can hold the same information as 3x3 rotation matrices.

There are not many things to know about quaternions here, just some formulas:

Quaternion multiplication (Quat × Quat);
Rotation of a point by a quaternion;
Quaternion inverse;
Quaternion normalization;
Quaternion interpolation (SLERP), for smooth animation.

Quaternions are represented by four components: w, x, y and z. Orientation quaternions are unit quaternions.

In the MD5 Mesh and MD5 Anim files, only the x, y and z components are stored. You'll have to compute the w-component yourself, given the three others.

Computing the w-component

Since we deal only with unit quaternions (their length is 1.0), we can obtain the last component with this formula (note that the negative root is used):

float t = 1.0f - (q.x * q.x) - (q.y * q.y) - (q.z * q.z);

if (t < 0.0f)
  {
    q.w = 0.0f;
  }
else
  {
    q.w = -sqrt (t);
  }

Other quaternion operations

A really quick overview of the needed quaternion operations and formulas. For more about them, refer to a 3D math book, an online FAQ or Wikipedia.

Quaternion multiplication allows you to concatenate two rotations. The product of two quaternions Q_a and Q_b is given by the following formula, where w is the scalar part and v the vector part (x, y, z) of each quaternion:

Q_a.Q_b = (w_a, v_a)(w_b, v_b) = (w_a w_b - v_a·v_b, w_a v_b + w_b v_a + v_a × v_b)

After expanding and simplifying, we have the following result:

r.w = (qa.w * qb.w) - (qa.x * qb.x) - (qa.y * qb.y) - (qa.z * qb.z);
r.x = (qa.x * qb.w) + (qa.w * qb.x) + (qa.y * qb.z) - (qa.z * qb.y);
r.y = (qa.y * qb.w) + (qa.w * qb.y) + (qa.z * qb.x) - (qa.x * qb.z);
r.z = (qa.z * qb.w) + (qa.w * qb.z) + (qa.x * qb.y) - (qa.y * qb.x);

Be careful! Quaternion multiplication is non-commutative, i.e. Q_a × Q_b ≠ Q_b × Q_a.

The rotation of a point by a quaternion is given by the formula:

R = Q.P.Q^*

Where R is the resultant quaternion, Q is the orientation quaternion by which you want to perform a rotation, Q^* the conjugate of Q and P is the point converted to a quaternion. To convert a 3D vector to a quaternion, copy the x, y and z components and set the w component to 0. This is the same for quaternion to vector conversion: take the x, y and z components and forget the w.

Note: here the “.” is the multiplication operator.
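As a minimal sketch, the product and rotation formulas above could be written in C as follows. It assumes quaternions are stored as four floats in x, y, z, w order, an assumption chosen to match the array indexing used by the code later in this document. The function names mirror the Quat_multQuat and Quat_rotatePoint calls that appear further down, but the bodies here are only one possible implementation.

```c
#include <assert.h>

typedef float vec3_t[3];  /* x, y, z */
typedef float quat4_t[4]; /* assumed layout: x, y, z, w */

/* out = qa * qb. out must not alias qa or qb. */
void Quat_multQuat (const quat4_t qa, const quat4_t qb, quat4_t out)
{
  assert (out != qa && out != qb);

  out[3] = (qa[3] * qb[3]) - (qa[0] * qb[0]) - (qa[1] * qb[1]) - (qa[2] * qb[2]);
  out[0] = (qa[0] * qb[3]) + (qa[3] * qb[0]) + (qa[1] * qb[2]) - (qa[2] * qb[1]);
  out[1] = (qa[1] * qb[3]) + (qa[3] * qb[1]) + (qa[2] * qb[0]) - (qa[0] * qb[2]);
  out[2] = (qa[2] * qb[3]) + (qa[3] * qb[2]) + (qa[0] * qb[1]) - (qa[1] * qb[0]);
}

/* out = q * in * conjugate(q), for a unit quaternion q */
void Quat_rotatePoint (const quat4_t q, const vec3_t in, vec3_t out)
{
  quat4_t p = { in[0], in[1], in[2], 0.0f };    /* point as quaternion, w = 0 */
  quat4_t conj = { -q[0], -q[1], -q[2], q[3] }; /* conjugate of q */
  quat4_t tmp, r;

  Quat_multQuat (q, p, tmp);
  Quat_multQuat (tmp, conj, r);

  /* Back to a vector: keep x, y, z and forget w */
  out[0] = r[0];
  out[1] = r[1];
  out[2] = r[2];
}
```

For example, rotating the point (1, 0, 0) by a quarter turn around the Z axis, i.e. the quaternion (x, y, z, w) = (0, 0, sin 45°, cos 45°), gives approximately (0, 1, 0).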

Quaternion inverse can be obtained, for unit quaternions, by negating the x, y and z components (this is equal to the conjugate quaternion):

inverse(<w, x, y, z>) = conjugate(<w, x, y, z>) = <w, -x, -y, -z>

Quaternion normalization is exactly the same as for vectors, but with four components.

I will not cover quaternion spherical linear interpolation (SLERP) in detail here, but you can look at the sample code (at the end of this document), in books, or on the web for the formula. Spherical linear interpolation is used to interpolate between two orientations. It is useful for skeletal animation.

MD5 Mesh

The MD5 Mesh files have the “md5mesh” extension. They contain the geometric data of the models:

Model's bind-pose skeleton;
One or multiple meshes. Each mesh has its own data:
- Vertices;
- Triangles;
- Vertex weights;
- A shader name.

Reading a md5mesh file

When parsing the MD5 Mesh file, you can find comments. They start with the “//” string and go to the end of the line. They are here just for humans who want to take a look at the file with a text editor; they don't affect the model's data. You can ignore them.

Before loading the geometric data, you will find some useful variables needed to check whether this is a valid md5mesh file and to allocate memory:

MD5Version 10
commandline "<string>"

numJoints <int>
numMeshes <int>

The first line tells you the version of the format. This is an integer. Doom 3's MD5 version is 10. This document covers version 10 of the format. Older (or newer) versions may differ in some points of the structure of the format.

Then comes the commandline string used by Doom 3 with the exportmodels console command. I have nothing to tell you about it.

numJoints is the number of joints of the model's skeleton. numMeshes is the number of meshes of the model contained in the md5mesh file.
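Reading these header variables can be sketched with sscanf, one line at a time. The struct and function names below are hypothetical, and treating non-positive joint or mesh counts as invalid is an assumption of this sketch rather than something the format description states.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical container for the md5mesh header variables */
struct md5_header_t
{
  int version;
  int num_joints;
  int num_meshes;
};

/* Try to read one header variable from a line of text.
   Returns 1 if the line set a field of hdr, 0 otherwise
   (comments, blank lines, the commandline string, ...). */
int MD5_parseHeaderLine (const char *line, struct md5_header_t *hdr)
{
  assert (line != NULL && hdr != NULL);

  if (sscanf (line, " MD5Version %d", &hdr->version) == 1)
    return 1;

  if (sscanf (line, " numJoints %d", &hdr->num_joints) == 1)
    return 1;

  if (sscanf (line, " numMeshes %d", &hdr->num_meshes) == 1)
    return 1;

  return 0;
}

/* This document only covers version 10 of the format */
int MD5_headerIsValid (const struct md5_header_t *hdr)
{
  return (hdr->version == 10) && (hdr->num_joints > 0) && (hdr->num_meshes > 0);
}
```

A loader would typically feed each line obtained with fgets to MD5_parseHeaderLine, then check MD5_headerIsValid before allocating the joint and mesh arrays.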

After that you have the bind-pose skeleton's joints:

joints {
    "name" parent ( pos.x pos.y pos.z ) ( orient.x orient.y orient.z )
    ...
}

name (string) is the joint's name. parent (int) is the joint's parent index. If it is equal to -1, then the joint has no parent joint and is what we call a root joint. pos.x, pos.y and pos.z (float) are the joint's position in space. orient.x, orient.y and orient.z (float) are the joint's orientation quaternion x, y and z components. After reading a joint, you must calculate the w component.

After the skeleton, there are the meshes. Each mesh is in the form:

mesh {
    shader "<string>"

    numverts <int>
    vert vertIndex ( s t ) startWeight countWeight
    vert ...

    numtris <int>
    tri triIndex vertIndex[0] vertIndex[1] vertIndex[2]
    tri ...

    numweights <int>
    weight weightIndex joint bias ( pos.x pos.y pos.z )
    weight ...
}

The shader string is defined in the MTR files (/materials directory) of Doom 3 and tells you which textures to apply to the mesh and how to combine them.

numverts (int) is the number of vertices of the mesh. After this variable, you have the vertex list. vertIndex (int) is the vertex index. s and t (float) are the texture coordinates (also called UV coords). In the MD5 Mesh format, a vertex does not have a position of its own. Instead, its position is computed from vertex weights (this is explained later in this document). countWeight (int) is the number of weights, starting from the startWeight (int) index, which are used to calculate the final vertex position.

numtris (int) is the number of triangles of the mesh. triIndex (int) is the index of the triangle. Each triangle is defined by its three vertex indices: vertIndex[0], vertIndex[1] and vertIndex[2] (int).

numweights (int) is the number of weights of the mesh. weightIndex (int) is the weight index. joint (int) is the index of the joint the weight depends on. bias (float) is a factor in the (0.0, 1.0] range (greater than 0.0, at most 1.0) which defines the contribution of this weight when computing a vertex position. pos.x, pos.y and pos.z (float) are the weight's position in space.
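The three kinds of lines inside a mesh block can be recognized with sscanf in the same spirit. The struct layouts, the enum and the function name below are hypothetical; a real loader would also check each index against numverts, numtris or numweights before storing the element.

```c
#include <assert.h>
#include <stdio.h>

struct md5_vertex_t   { float st[2]; int start; int count; };
struct md5_triangle_t { int index[3]; };
struct md5_weight_t   { int joint; float bias; float pos[3]; };

enum md5_line_t
{
  MD5_LINE_OTHER, /* shader, numverts, comments, braces, ... */
  MD5_LINE_VERT,
  MD5_LINE_TRI,
  MD5_LINE_WEIGHT
};

/* Parse one line of a mesh block. On success, the element's own
   index (vertIndex, triIndex or weightIndex) is stored in *index
   and the matching output struct is filled. */
enum md5_line_t
MD5_parseMeshLine (const char *line, int *index,
                   struct md5_vertex_t *v,
                   struct md5_triangle_t *t,
                   struct md5_weight_t *w)
{
  assert (line != NULL && index != NULL);
  assert (v != NULL && t != NULL && w != NULL);

  /* vert vertIndex ( s t ) startWeight countWeight */
  if (sscanf (line, " vert %d ( %f %f ) %d %d", index,
              &v->st[0], &v->st[1], &v->start, &v->count) == 5)
    return MD5_LINE_VERT;

  /* tri triIndex vertIndex[0] vertIndex[1] vertIndex[2] */
  if (sscanf (line, " tri %d %d %d %d", index,
              &t->index[0], &t->index[1], &t->index[2]) == 4)
    return MD5_LINE_TRI;

  /* weight weightIndex joint bias ( pos.x pos.y pos.z ) */
  if (sscanf (line, " weight %d %d %f ( %f %f %f )", index,
              &w->joint, &w->bias,
              &w->pos[0], &w->pos[1], &w->pos[2]) == 6)
    return MD5_LINE_WEIGHT;

  return MD5_LINE_OTHER;
}
```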

The bind-pose skeleton

The model's skeleton stored in the MD5 Mesh files is what we call the “bind-pose skeleton”. It is generally in the pose in which the model was created.

Its joints are already in their final position; you don't have to do any precomputation on them, like adding the parent joint's position and rotating it or anything. Their positions are in object space and independent of the other joints.

Computing vertex positions

As said before, the vertex positions must be calculated from the weights. Each vertex has one or more weights, each of them having a position dependent on a joint (the position is in the joint's local space), and a factor telling us how much it affects the vertex position. The sum of all weight factors of a vertex should be 1.0. This technique is called “vertex skinning” and allows a vertex to depend on more than one joint of the skeleton for better animation rendering.

First, each weight position of the vertex must be converted from joint local space to object space. Then, sum all the weights multiplied by their bias value:

finalPos = (weight[0].pos * weight[0].bias) + ... + (weight[N].pos * weight[N].bias)

The vertex data that comes from the MD5 Mesh file has a start index and a count value. The start index is the index of the first weight used by the vertex. All the other weights of the vertex come right after this one. The count value indicates the number of weights used, starting from the first weight. Here is the code to compute the final vertex positions (in object space) from their weights:

/* Setup vertices */
for (i = 0; i < mesh->num_verts; ++i)
  {
    vec3_t finalVertex = { 0.0f, 0.0f, 0.0f };

    /* Calculate final vertex to draw with weights */
    for (j = 0; j < mesh->vertices[i].count; ++j)
      {
        const struct md5_weight_t *weight = &mesh->weights[mesh->vertices[i].start + j];
        const struct md5_joint_t *joint = &joints[weight->joint];

        /* Calculate transformed vertex for this weight */
        vec3_t wv;
        Quat_rotatePoint (joint->orient, weight->pos, wv);

        /* the sum of all weight->bias should be 1.0 */
        finalVertex[0] += (joint->pos[0] + wv[0]) * weight->bias;
        finalVertex[1] += (joint->pos[1] + wv[1]) * weight->bias;
        finalVertex[2] += (joint->pos[2] + wv[2]) * weight->bias;
      }

    ...
  }

Texture coordinates

Each vertex has its own texture coordinates. The ST (or UV) texture coordinates for the upper-left corner of the texture are (0.0, 0.0). The ST texture coordinates for the lower-right corner are (1.0, 1.0).

The vertical direction is the inverse of the standard OpenGL direction for the T coordinate. This is the same convention as DirectDraw Surface files. When loading a texture (other than a DDS file), you'll have to flip it vertically or invert the T texture coordinate of the MD5 Mesh vertices (i.e., use 1.0 - T).
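Both options can be sketched in a few lines. The function names are hypothetical, and the image is assumed to be stored as contiguous rows of raw bytes, which may not match what your texture loader gives you.

```c
#include <assert.h>
#include <stddef.h>

/* Option 1: flip an image upside down in place by swapping rows.
   pixels holds `height` rows of `row_bytes` bytes each, contiguously. */
void Image_flipVertically (unsigned char *pixels, int height, size_t row_bytes)
{
  int top, bottom;
  size_t i;

  assert (pixels != NULL && height >= 0);

  for (top = 0, bottom = height - 1; top < bottom; ++top, --bottom)
    {
      unsigned char *a = pixels + ((size_t)top * row_bytes);
      unsigned char *b = pixels + ((size_t)bottom * row_bytes);

      /* Swap the two rows byte by byte */
      for (i = 0; i < row_bytes; ++i)
        {
          unsigned char tmp = a[i];
          a[i] = b[i];
          b[i] = tmp;
        }
    }
}

/* Option 2: leave the image alone and invert the T coordinate */
float MD5_flipT (float t)
{
  return 1.0f - t;
}
```

Only one of the two should be applied; doing both brings the texture back to its original, upside-down mapping.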

Precomputing normals

You will probably need to compute normal vectors, for example for lighting. Here is how to compute them in order to get “weight normals”, like the weight positions (this method also works for tangents and bi-tangents):

First, compute all model's vertex positions in bind-pose (using the bind-pose skeleton).

Compute the vertex normals. You now have the normals in object space for the bind-pose skeleton.

For each weight of a vertex, transform the vertex normal by the inverse of the orientation quaternion of the weight's joint. You now have the normal in the joint's local space.

Then when calculating the final vertex positions, you will be able to do the same for the normals, except you won't have to translate from the joint's position when converting from joint's local space to object space.
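The last two steps can be sketched as follows. This is not the author's code: the function names are hypothetical, quaternions are assumed to be unit length and stored in x, y, z, w order, and the rotation helper uses the expanded form v' = v + 2 u × (u × v + w v), with u the (x, y, z) part of the quaternion, which gives the same result as Q.P.Q^*.

```c
#include <assert.h>

typedef float vec3_t[3];  /* x, y, z */
typedef float quat4_t[4]; /* assumed layout: x, y, z, w */

/* Rotate v by the unit quaternion q: v' = v + 2 * u x (u x v + w * v) */
void rotateByQuat (const quat4_t q, const vec3_t v, vec3_t out)
{
  /* t = u x v + w * v */
  float tx = (q[1] * v[2]) - (q[2] * v[1]) + (q[3] * v[0]);
  float ty = (q[2] * v[0]) - (q[0] * v[2]) + (q[3] * v[1]);
  float tz = (q[0] * v[1]) - (q[1] * v[0]) + (q[3] * v[2]);
  /* v' = v + 2 * (u x t) */
  float rx = v[0] + 2.0f * ((q[1] * tz) - (q[2] * ty));
  float ry = v[1] + 2.0f * ((q[2] * tx) - (q[0] * tz));
  float rz = v[2] + 2.0f * ((q[0] * ty) - (q[1] * tx));

  out[0] = rx;
  out[1] = ry;
  out[2] = rz;
}

/* Precomputation: object-space bind-pose normal -> joint-local
   "weight normal", using the inverse (conjugate) orientation. */
void MD5_normalToJointSpace (const quat4_t jointOrient,
                             const vec3_t normal, vec3_t weightNormal)
{
  quat4_t inv = { -jointOrient[0], -jointOrient[1],
                  -jointOrient[2], jointOrient[3] };

  rotateByQuat (inv, normal, weightNormal);
}

/* At draw time: add one weight's contribution to the final normal.
   Unlike positions, no joint translation is added. */
void MD5_accumulateNormal (const quat4_t jointOrient,
                           const vec3_t weightNormal, float bias,
                           vec3_t finalNormal)
{
  vec3_t n;

  assert (bias > 0.0f && bias <= 1.0f);

  rotateByQuat (jointOrient, weightNormal, n);
  finalNormal[0] += n[0] * bias;
  finalNormal[1] += n[1] * bias;
  finalNormal[2] += n[2] * bias;
}
```

After all weights of a vertex have been accumulated, the resulting normal should be renormalized, since a weighted sum of unit vectors is generally shorter than 1.0.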

MD5 Anim

The MD5 Anim files have the “md5anim” extension. They store information about skeletal animation of MD5 Mesh models:

Skeleton hierarchy with flags for each joint for animation data;
A bounding box for each frame of the animation;
A baseframe skeleton from which the animated skeleton is computed;
A list of frames, each containing data to compute a skeleton from the baseframe skeleton.

Reading a md5anim file

MD5 Anim files have the same syntax as MD5 Mesh files. Comments begin with “//” and extend to the end of the line. There is also a header with a version number, a command line and some variables for memory allocation:

MD5Version 10
commandline "<string>"

numFrames <int>
numJoints <int>
frameRate <int>
numAnimatedComponents <int>

The version number is the same for all MD5 files, so it should be 10. The commandline is an internal Doom 3 command.

numFrames (int) is the number of frames of the animation. An animation is composed of multiple frames, each one being a copy of the skeleton in a particular pose. Playing all the frames in sequence gives you an animation.

numJoints (int) is the number of joints of the frame skeletons. It must be the same as the number of joints in the MD5 Mesh file for the animation to be playable on the model.

frameRate (int) is the number of frames per second to draw for the animation. The duration of a frame can be obtained by simply inverting frameRate (1.0 / frameRate seconds).

numAnimatedComponents (int) is the number of parameters per frame used to compute the frame skeletons. These parameters, combined with the baseframe skeleton given in the MD5 Anim file, allow you to build a skeleton for each frame.

After the header comes the skeleton hierarchy. It provides information about the joints for building the skeleton frames from the baseframe data:

hierarchy {
    "name"   parent flags startIndex
    ...
}

name (string) is the joint's name. parent (int) is the joint's parent index. If parent is -1, then the joint has no parent. With these two pieces of information, and the number of joints, it is reasonable to compare with the MD5 Mesh's skeleton to ensure that the animation is valid for this model. flags (int) is a set of bit flags which tell you how to compute the skeleton of a frame for this joint. startIndex (int) is an index to the beginning of the parameters used to compute the frame skeletons.

After the hierarchy come the frame bounds. There is a bounding box for each frame:

bounds {
    ( min.x min.y min.z ) ( max.x max.y max.z )
    ...
}

min.x, min.y and min.z (float) represent the minimum 3D coordinates of the box; max.x, max.y and max.z (float) represent the maximum. These coordinates are in object space. They are useful for computing an AABB or OBB for frustum culling and basic collision detection.

After bounds you'll find the baseframe data. It contains the position and orientation of each joint from which the frame skeletons will be built. There is a line for each joint:

baseframe {
    ( pos.x pos.y pos.z ) ( orient.x orient.y orient.z )
    ...
}

pos.x, pos.y and pos.z (float) are the joint's position. orient.x, orient.y and orient.z (float) are the x, y and z components of the joint's orientation quaternion.

After the baseframe data comes the frame data. There is a chunk of data for each frame. These data are the parameters used to compute the frame's skeleton:

frame frameIndex {
    <float> <float> <float> ...
}

frameIndex (int) is the index of the frame. Between the braces, you have an array of float values. There are numAnimatedComponents values. When you have collected all these data for a frame, you can build the skeleton of this frame.

Building the frame skeletons

From the baseframe data, the hierarchy info and the frame data, you can build a skeleton for a particular frame. Here is how it works for each joint: we start with the baseframe joint's data (position and orientation). Then we replace some of the position and orientation components by a value from the frame's data. The joint's flags (from the hierarchy information) indicate which ones.

Description of the flags variable: starting from the right, the first three bits are for the position vector and the next three for the orientation quaternion. If a bit is set, then you have to replace the corresponding (x, y, z) component by a value from the frame's data. Which value? This is given by the startIndex. You begin at the startIndex in the frame's data array and advance by one each time you replace a component.

Once you have computed the “animated” joint's position and orientation, you must compute the joint's position and orientation in object space. Before that, don't forget to compute the w component of the “animated” orientation!

For the position, if the joint has a parent, you must rotate the “animated” position by the parent's orientation quaternion, and add the result to the parent's position. If the joint is a root joint (no parent), then just copy the “animated” position.

For the orientation, if the joint has a parent, you must concatenate the two orientations; first the parent's orientation and then the “animated” orientation. Just multiply (with the formula given at the beginning of the document) the parent's orientation by the “animated” orientation and renormalize the result (orientation quaternions must be unit quaternions). If the joint has no parent, then just copy the “animated” orientation.

Here is the code to build a frame skeleton:

for (i = 0; i < num_joints; ++i)
  {
    const struct baseframe_joint_t *baseJoint = &baseFrame[i];
    vec3_t animatedPos;
    quat4_t animatedOrient;
    int j = 0;

    memcpy (animatedPos, baseJoint->pos, sizeof (vec3_t));
    memcpy (animatedOrient, baseJoint->orient, sizeof (quat4_t));

    if (jointInfos[i].flags & 1) /* Tx */
      {
        animatedPos[0] = animFrameData[jointInfos[i].startIndex + j];
        ++j;
      }

    if (jointInfos[i].flags & 2) /* Ty */
      {
        animatedPos[1] = animFrameData[jointInfos[i].startIndex + j];
        ++j;
      }

    if (jointInfos[i].flags & 4) /* Tz */
      {
        animatedPos[2] = animFrameData[jointInfos[i].startIndex + j];
        ++j;
      }

    if (jointInfos[i].flags & 8) /* Qx */
      {
        animatedOrient[0] = animFrameData[jointInfos[i].startIndex + j];
        ++j;
      }

    if (jointInfos[i].flags & 16) /* Qy */
      {
        animatedOrient[1] = animFrameData[jointInfos[i].startIndex + j];
        ++j;
      }

    if (jointInfos[i].flags & 32) /* Qz */
      {
        animatedOrient[2] = animFrameData[jointInfos[i].startIndex + j];
        ++j;
      }

    /* Compute orient quaternion's w value */
    Quat_computeW (animatedOrient);

    /* NOTE: we assume that this joint's parent has
       already been calculated, i.e. joint's ID should
       never be smaller than its parent ID. */
    struct md5_joint_t *thisJoint = &skelFrame[i];

    int parent = jointInfos[i].parent;
    thisJoint->parent = parent;
    strcpy (thisJoint->name, jointInfos[i].name);

    /* Has parent? */
    if (thisJoint->parent < 0)
      {
        memcpy (thisJoint->pos, animatedPos, sizeof (vec3_t));
        memcpy (thisJoint->orient, animatedOrient, sizeof (quat4_t));
      }
    else
      {
        struct md5_joint_t *parentJoint = &skelFrame[parent];
        vec3_t rpos; /* rotated position */

        /* Add positions */
        Quat_rotatePoint (parentJoint->orient, animatedPos, rpos);
        thisJoint->pos[0] = rpos[0] + parentJoint->pos[0];
        thisJoint->pos[1] = rpos[1] + parentJoint->pos[1];
        thisJoint->pos[2] = rpos[2] + parentJoint->pos[2];

        /* Concatenate rotations */
        Quat_multQuat(parentJoint->orient, animatedOrient, thisJoint->orient);
        Quat_normalize (thisJoint->orient);
      }
  }

jointInfos contains the hierarchy information. animFrameData is an array containing the frame data. Also don't forget to copy the parent index from the hierarchy info to your new joint structure. The joint's name can also be useful sometimes.

You must do this operation for all frames, or at least for the ones you need.

Animating the model

Animating the model consists of calculating the current frame to draw and the next frame, and updating the time elapsed since the beginning of the current frame.

The current frame index increases when the frame's max time has been reached. Remember that this max time is the inverse of frameRate.

You can then proceed to the interpolation of the current frame's skeleton and the next frame's skeleton. The interpolation factor (between 0.0 and 1.0) is obtained by multiplying the time elapsed since the current frame changed by the animation's frameRate.
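The time bookkeeping described above might look like the following sketch. The anim_info_t struct and Anim_advance function are hypothetical names, and wrapping back to frame 0 after the last frame (a looping animation) is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical playback state for one animation */
struct anim_info_t
{
  int curr_frame;
  int next_frame;
  double last_time; /* seconds elapsed since curr_frame started */
};

/* Advance the animation by dt seconds and return the interpolation
   factor between curr_frame and next_frame, in the [0.0, 1.0) range. */
double Anim_advance (struct anim_info_t *anim, int num_frames,
                     int frame_rate, double dt)
{
  double max_time;

  assert (anim != NULL && num_frames > 0 && frame_rate > 0 && dt >= 0.0);

  /* Duration of one frame: the inverse of frameRate */
  max_time = 1.0 / frame_rate;

  anim->last_time += dt;

  /* Move to the next frame each time the frame's max time is reached;
     a loop is used in case dt spans several frames */
  while (anim->last_time >= max_time)
    {
      anim->last_time -= max_time;
      anim->curr_frame = (anim->curr_frame + 1) % num_frames;
    }

  anim->next_frame = (anim->curr_frame + 1) % num_frames;

  /* Elapsed time since the current frame changed, times frameRate */
  return anim->last_time * frame_rate;
}
```

The returned factor is what gets passed as the interpolation parameter when blending the skeleton of curr_frame with the skeleton of next_frame.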

Skeleton Interpolation

For interpolating two skeletons, you just have to interpolate each joint of them. And for interpolating two joints, you just have to interpolate the position and the orientation.

For the position, just perform a linear interpolation:

finalJoint->pos.x = jointA->pos.x + interp * (jointB->pos.x - jointA->pos.x);
finalJoint->pos.y = jointA->pos.y + interp * (jointB->pos.y - jointA->pos.y);
finalJoint->pos.z = jointA->pos.z + interp * (jointB->pos.z - jointA->pos.z);

For the orientation, it's better to perform a spherical linear interpolation rather than a simple linear interpolation, unless the rotations are very small. For the SLERP formula, look in a math book or on the web:

Quat_slerp (jointA->orient, jointB->orient, interp, finalJoint->orient);

Sample code 1: md5.c (14 KB). Only MD5 Mesh. No texture mapping, no lighting, no animation. This light demo fits in less than 650 lines of code.

Sample code 2: md5mesh.c (15 KB), md5anim.c (13 KB), md5model.h (3.8 KB). MD5 Mesh and Anim. No texture mapping, no lighting. Less than 1300 lines.

This document is available under the terms of the GNU Free Documentation License (GFDL)
© David Henry – contact : tfc.duke (AT) gmail (DOT) com
