The C++ Augmented Reality Toolkit

6. Camera Orientation

Once a marker had been successfully identified, the final step in the process was to determine the location of the marker in world space. This step was needed so that, when augmented graphics were rendered on top of the image, their scale and orientation would match the detected marker.

Determining the location of the marker in world space raised some interesting problems. The main one was that the only information available was the 2D image coordinates of the reference square and the marker corners. A further complication was that the marker corners were subject to perspective distortion.

In computer vision, this problem goes by several names, including Pose Estimation [ABIDI and CHANDRA 1989], Camera Calibration [EASON et al. 1984] and Object Pose [LEPETIT 2004]. Despite the different names, each attempts to calculate the camera’s internal and external parameters.

The internal parameters, referred to as intrinsic parameters, are the focal lengths in the u and v directions. These two values control the perspective scaling of the augmented graphical objects. The external parameters, called extrinsic parameters, define the camera’s location and orientation with respect to the marker.

Many solutions assume that the focal length is already known and can be substituted into the relevant equations. ARToolkit [KATO 2005] uses this approach and supplies the required tools for calibrating new cameras. For ARLib, a solution was needed that could calculate the focal length automatically. Such a solution is described in [MALIK 2002], in which a planar marker in world space is mapped to the image plane (see figure 6.1). It is this technique that was implemented in ARLib.

Figure 6.1: Planar mapping from world to image space.

The 2D to 2D projection matrix calculated in section 4 was used as the starting point. The technique implemented in ARLib uses the values of this matrix to calculate the focal lengths and orientation automatically.

The general perspective matrix is shown in equation 6.1. A simplified form of this matrix is shown in equation 6.2.

Equation 6.1 Equation 6.2

The following rules hold for the matrix H:

Equation 6.3 Equation 6.4 Equation 6.5

Using the rules in equations 6.3 to 6.5, matrix H could then be rearranged into two equations that could be solved for fu and fv:

Equation 6.6 Equation 6.7
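As a rough guide to the structure of these equations, the block below sketches the standard form of this derivation for a planar marker. It assumes the principal point lies at the image origin, zero skew, and that h_ij denotes the entries of H; these symbols and the exact arrangement are assumptions and may differ from equations 6.1 to 6.7.

```latex
% Assumed form of H (defined up to a scale factor):
H \simeq
\begin{bmatrix}
f_u r_{11} & f_u r_{12} & f_u t_1 \\
f_v r_{21} & f_v r_{22} & f_v t_2 \\
r_{31}     & r_{32}     & t_3
\end{bmatrix}

% The first two columns of a rotation matrix are unit length and orthogonal:
r_{11}^2 + r_{21}^2 + r_{31}^2 = 1, \qquad
r_{12}^2 + r_{22}^2 + r_{32}^2 = 1, \qquad
r_{11} r_{12} + r_{21} r_{22} + r_{31} r_{32} = 0

% Substituting the entries of H gives two equations that are linear
% in 1/f_u^2 and 1/f_v^2:
\frac{h_{11} h_{12}}{f_u^2} + \frac{h_{21} h_{22}}{f_v^2} + h_{31} h_{32} = 0

\frac{h_{11}^2 - h_{12}^2}{f_u^2} + \frac{h_{21}^2 - h_{22}^2}{f_v^2}
  + h_{31}^2 - h_{32}^2 = 0
```

Under these assumptions, the two equations form a 2x2 linear system in 1/fu² and 1/fv², which has no unique solution when the marker directly faces the camera.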

Once the intrinsic values had been calculated from the above equations, the remaining matrix values could be calculated using the following equations:

Equation 6.8 to 6.19

where λ is a scaling factor calculated using the following equation:

Equation 6.20
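The steps above can be sketched in code. The following is a minimal sketch, not ARLib's implementation: it assumes the 2D to 2D projection matrix is a 3x3 homography of the form s · diag(fu, fv, 1) · [r1 r2 t], with the principal point at the image origin and no skew. The names `CameraPose` and `decomposeHomography`, the row-major storage, the positive sign chosen for λ and the degeneracy threshold are all hypothetical choices made for illustration.

```cpp
#include <array>
#include <cmath>

// Hypothetical container for the recovered parameters (not an ARLib type).
struct CameraPose {
    double fu = 0.0, fv = 0.0;   // focal lengths in the u and v directions
    std::array<double, 9> r{};   // rotation, row-major: r[3*i + j] is r(i+1)(j+1)
    std::array<double, 3> t{};   // translation t1, t2, t3
};

// Decomposes a row-major 3x3 homography h (h[3*i + j] is h(i+1)(j+1)),
// assuming H = s * diag(fu, fv, 1) * [r1 r2 t] for some positive scale s.
// Returns false when the constraints cannot determine fu and fv,
// for example when the marker directly faces the camera.
bool decomposeHomography(const std::array<double, 9>& h, CameraPose& out) {
    // Unknowns: x = 1/fu^2 and y = 1/fv^2.
    //   r1 . r2 = 0   ->  a1*x + b1*y = c1
    //   |r1| = |r2|   ->  a2*x + b2*y = c2
    const double a1 = h[0] * h[1];
    const double b1 = h[3] * h[4];
    const double c1 = -h[6] * h[7];
    const double a2 = h[0] * h[0] - h[1] * h[1];
    const double b2 = h[3] * h[3] - h[4] * h[4];
    const double c2 = h[7] * h[7] - h[6] * h[6];

    // Solve the 2x2 system with Cramer's rule (threshold is an assumption).
    const double det = a1 * b2 - a2 * b1;
    if (std::abs(det) < 1e-12) return false;
    const double x = (c1 * b2 - c2 * b1) / det;
    const double y = (a1 * c2 - a2 * c1) / det;
    if (x <= 0.0 || y <= 0.0) return false;  // no real focal length exists

    out.fu = 1.0 / std::sqrt(x);
    out.fv = 1.0 / std::sqrt(y);

    // Scale factor that makes the first rotation column unit length.
    // The positive root is taken here, which is an assumption.
    const double lambda =
        1.0 / std::sqrt(h[0] * h[0] * x + h[3] * h[3] * y + h[6] * h[6]);

    // First two rotation columns and the translation.
    const double r11 = lambda * h[0] / out.fu;
    const double r21 = lambda * h[3] / out.fv;
    const double r31 = lambda * h[6];
    const double r12 = lambda * h[1] / out.fu;
    const double r22 = lambda * h[4] / out.fv;
    const double r32 = lambda * h[7];
    out.t = {lambda * h[2] / out.fu, lambda * h[5] / out.fv, lambda * h[8]};

    // Third rotation column is the cross product r1 x r2.
    const double r13 = r21 * r32 - r31 * r22;
    const double r23 = r31 * r12 - r11 * r32;
    const double r33 = r11 * r22 - r21 * r12;

    out.r = {r11, r12, r13, r21, r22, r23, r31, r32, r33};
    return true;
}
```

Solving for 1/fu² and 1/fv² rather than fu and fv directly keeps the system linear. The early returns cover views in which the two constraints carry no information about the focal lengths.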

6.2 OpenGL Matrices

Once the values had been calculated for fu, fv, r11-r33 and t1-t3, they could then be used to construct the projection and model view matrices.

To create the projection matrix, the fu and fv values were used in the glFrustum function call:


    void glFrustum(GLdouble left, GLdouble right, GLdouble bottom, GLdouble top,
                   GLdouble zNear, GLdouble zFar);

Typical values used in the glFrustum call were:

    right  = imageWidth / fu
    left   = -right
    top    = imageHeight / fv
    bottom = -top

The model view matrix could be constructed from the remaining values r11-r33 and t1-t3. The OpenGL function glLoadMatrix expects matrices in column-major order, so the values needed to be transposed from the matrix in equations 6.8 to 6.19. The sixteen elements of the glLoadMatrix parameter were set as follows:

M[0] = r11 M[4] = r21 M[8] = r31 M[12] = 0
M[1] = r12 M[5] = r22 M[9] = r32 M[13] = 0
M[2] = r13 M[6] = r23 M[10] = r33 M[14] = 0
M[3] = t1 M[7] = t2 M[11] = t3 M[15] = 1
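The assignments above can be written as a short loop. The sketch below copies the index assignments exactly as listed; `buildModelViewArray` is a hypothetical name, and the row-major storage of the inputs is an assumption. OpenGL reads elements 12 to 14 of the array as the translation column, so whether this listing is the intended layout depends on how the indices in equations 6.8 to 6.19 are defined.

```cpp
#include <array>

// Fills the sixteen-element array following the listing above.
// r holds r11..r33 row-major (r[3*i + j] is r(i+1)(j+1)); t holds t1..t3.
std::array<double, 16> buildModelViewArray(const std::array<double, 9>& r,
                                           const std::array<double, 3>& t) {
    std::array<double, 16> m{};  // all elements start at zero
    for (int i = 0; i < 3; ++i) {
        for (int j = 0; j < 3; ++j) {
            m[4 * i + j] = r[3 * i + j];  // M[0] = r11, M[1] = r12, M[4] = r21, ...
        }
        m[4 * i + 3] = t[i];              // M[3] = t1, M[7] = t2, M[11] = t3
    }
    m[15] = 1.0;                          // M[12], M[13], M[14] stay at 0
    return m;
}
```

The array's `data()` pointer could then be passed to `glLoadMatrixd`, the double-precision form of glLoadMatrix.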

6.3 Augmentation

Once the OpenGL projection and model view matrices had been initialised, standard calls to GLUT and OpenGL could be made to render 3D objects. Any rendered objects would be positioned and scaled on the marker within the scene.

Because there could be multiple markers within a particular image, the whole process of calculating the camera intrinsic and extrinsic parameters needed to be repeated for each marker. The call to glFrustum and the loading of the model view matrix also needed to be carried out for each marker.

Figure 6.2: Examples of augmented graphics.