In this lesson, we will learn how to use 4x4 transformation matrices to change the position, rotation and scale of 3D objects. So far, we have assumed that the geometry we render stays where the model was initially created. We learned how to ray-trace spheres with an arbitrary center position. The position of a polygon mesh in the scene, however, is defined by the position of the vertices making up the mesh, and in most cases this is not where we want the object to be in the final scene. Thus we need to transform it. Imagine, for example, that you have modelled a tree but want to render an image of a forest. To create the forest, you take the tree model, duplicate it a large number of times and apply a random transformation to each copy so that each tree has a unique scale, position and rotation. By doing so, you introduce variety in the way that single model looks from any viewpoint, giving the illusion that the forest is made of many unique tree models. More generally, transformations are used to place any model into its final position in the rendered scene (position is used here in a general sense: it includes position, rotation and scale). In CG, artists call this step layout or set dressing. It consists of taking objects as they were modelled (possibly taking multiple copies of the same model), and moving, rotating and scaling them.
When they are created in modeling software such as Maya or Blender, 3D models are generally centred around the world origin. Most often, the base of the model also lies in the xz-plane (in Maya, or the xy-plane in 3DSMax). When a model is in this position, we say that it is defined in object space. If we change the size, rotation and position of this object, using a 4x4 transformation matrix for example, we say the object is defined in world space. The matrix that transforms the object from object space to world space is called the object-to-world matrix (in OpenGL this matrix is also known as the model matrix).
From a programming point of view, transforming an object from object space to world space is straightforward. All we need to do is loop over all the vertices of the mesh and transform them with the object-to-world matrix:
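The loop described above can be sketched as follows. This is a minimal illustration rather than the lesson's actual source: the Vec3f and Matrix44f types, the multVecMatrix method and the transformVertices function are stand-ins introduced here, and the sketch assumes a row-vector convention (the point is multiplied on the left of the matrix, so the translation sits in the fourth row).

```cpp
#include <cassert>
#include <vector>

struct Vec3f { float x, y, z; };

// Hypothetical minimal 4x4 matrix (assumption: row-vector convention, p' = p * M),
// initialised to the identity.
struct Matrix44f {
    float m[4][4] = {{1, 0, 0, 0}, {0, 1, 0, 0}, {0, 0, 1, 0}, {0, 0, 0, 1}};

    // Transform a point: the implicit w = 1 picks up the translation row.
    Vec3f multVecMatrix(const Vec3f& p) const {
        float x = p.x * m[0][0] + p.y * m[1][0] + p.z * m[2][0] + m[3][0];
        float y = p.x * m[0][1] + p.y * m[1][1] + p.z * m[2][1] + m[3][1];
        float z = p.x * m[0][2] + p.y * m[1][2] + p.z * m[2][2] + m[3][2];
        float w = p.x * m[0][3] + p.y * m[1][3] + p.z * m[2][3] + m[3][3];
        assert(w != 0.0f);  // w is 1 for affine matrices
        return {x / w, y / w, z / w};
    }
};

// Loop over every object-space vertex and move it to world space.
std::vector<Vec3f> transformVertices(const std::vector<Vec3f>& objectVerts,
                                     const Matrix44f& objectToWorld) {
    std::vector<Vec3f> worldVerts;
    worldVerts.reserve(objectVerts.size());
    for (const Vec3f& v : objectVerts)
        worldVerts.push_back(objectToWorld.multVecMatrix(v));
    return worldVerts;
}
```

With a matrix that scales x by 2 and translates y by 5, the object-space vertex (1, 1, 1) ends up at (2, 6, 1) in world space.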
The code above is written for clarity, but it can be made faster. The object-to-world matrix can often be queried directly in 3D applications. For example, in Maya you can select the object and use the MEL command xform -q -ws -m or getAttr .worldMatrix.
The object-to-world matrix is a property of any object in a scene, thus, we can define it as a member variable of the Object class in our program. The matrix will be passed to the constructor of the class (line 4).
All objects will have access to this matrix, since they are all derived from this base class (quadrics, polygon meshes, etc.). For example, here is how it works for the TriangleMesh class. The object-to-world matrix is passed to the constructor of the TriangleMesh class (line 6), which in turn passes it on to the constructor of the Object class (line 13). Finally, in the constructor of the TriangleMesh class, we loop over all the vertices making up the mesh and set the mesh vertices to the input vertices transformed by the object-to-world matrix (lines 19-22):
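A minimal sketch of this class layout is given below. It is not the lesson's own listing, so the line numbers quoted in the text do not apply to it. The member names (objectToWorld, P), the Vec3f and Matrix44f helpers and the row-vector matrix convention are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Vec3f { float x, y, z; };

// Hypothetical minimal 4x4 matrix (assumption: row-vector convention, p' = p * M).
struct Matrix44f {
    float m[4][4] = {{1, 0, 0, 0}, {0, 1, 0, 0}, {0, 0, 1, 0}, {0, 0, 0, 1}};

    Vec3f multVecMatrix(const Vec3f& p) const {
        float x = p.x * m[0][0] + p.y * m[1][0] + p.z * m[2][0] + m[3][0];
        float y = p.x * m[0][1] + p.y * m[1][1] + p.z * m[2][1] + m[3][1];
        float z = p.x * m[0][2] + p.y * m[1][2] + p.z * m[2][2] + m[3][2];
        float w = p.x * m[0][3] + p.y * m[1][3] + p.z * m[2][3] + m[3][3];
        assert(w != 0.0f);  // w is 1 for affine matrices
        return {x / w, y / w, z / w};
    }
};

// Base class: every object in the scene stores its object-to-world matrix.
class Object {
public:
    explicit Object(const Matrix44f& o2w) : objectToWorld(o2w) {}
    virtual ~Object() = default;
    Matrix44f objectToWorld;
};

class TriangleMesh : public Object {
public:
    // The matrix is forwarded to the Object constructor, then used to
    // move the input (object-space) vertices to world space.
    TriangleMesh(const Matrix44f& o2w, const std::vector<Vec3f>& verts)
        : Object(o2w), P(verts.size()) {
        for (std::size_t i = 0; i < verts.size(); ++i)
            P[i] = objectToWorld.multVecMatrix(verts[i]);
    }
    std::vector<Vec3f> P;  // vertex positions, stored in world space
};
```

Storing the vertices directly in world space means the transformation is paid once, at construction time, rather than for every ray.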
We now have a full picture of the transformation pipeline. Objects are first transformed from object space to world space. If rasterization is used, objects are then transformed from world space to camera space. Vertices are then projected onto the screen (using the perspective projection matrix) to screen space and then remapped to NDC (as part of the perspective projection matrix). Finally, points on the image plane in NDC coordinates are converted to their final raster or pixel coordinates. In ray-tracing, we only care about the object-to-world transformation matrix. Rays are defined in world space and the ray-geometry intersection test occurs in world space (it can also take place in object space, as explained later in this lesson, though this is more of a special case).
Remember from the lesson on Geometry that while points and vectors are transformed by a 4x4 matrix, normals have to be transformed by the transpose of the inverse of that matrix (here, the object-to-world matrix). Thus, in addition to setting up the object-to-world matrix in the Object class constructor, we will also compute its inverse, the world-to-object matrix (line 4), which is needed to transform normals:
In the constructor of the TriangleMesh class, we now need to transform the normals by the transpose of the world-to-object matrix. We first compute the transpose matrix (line 27) and use it to transform all normals (line 33):
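The normal transformation can be sketched as below. This is an illustrative stand-in, not the lesson's listing: Matrix44f, its inverse() and transposed() methods, and transformNormal are hypothetical names, the inverse is computed with a generic Gauss-Jordan elimination chosen for brevity, and a row-vector convention is assumed.

```cpp
#include <cassert>
#include <cmath>
#include <utility>

struct Vec3f { float x, y, z; };

// Hypothetical minimal 4x4 matrix (assumption: row-vector convention, v' = v * M).
struct Matrix44f {
    float m[4][4] = {{1, 0, 0, 0}, {0, 1, 0, 0}, {0, 0, 1, 0}, {0, 0, 0, 1}};

    // Transform a direction: only the upper-left 3x3 block is used,
    // so translation has no effect.
    Vec3f multDirMatrix(const Vec3f& d) const {
        return {d.x * m[0][0] + d.y * m[1][0] + d.z * m[2][0],
                d.x * m[0][1] + d.y * m[1][1] + d.z * m[2][1],
                d.x * m[0][2] + d.y * m[1][2] + d.z * m[2][2]};
    }

    Matrix44f transposed() const {
        Matrix44f t;
        for (int i = 0; i < 4; ++i)
            for (int j = 0; j < 4; ++j) t.m[i][j] = m[j][i];
        return t;
    }

    // Gauss-Jordan elimination with partial pivoting on the augmented matrix [M | I].
    Matrix44f inverse() const {
        float aug[4][8];
        for (int i = 0; i < 4; ++i)
            for (int j = 0; j < 4; ++j) {
                aug[i][j] = m[i][j];
                aug[i][j + 4] = (i == j) ? 1.0f : 0.0f;
            }
        for (int col = 0; col < 4; ++col) {
            int pivot = col;
            for (int r = col + 1; r < 4; ++r)
                if (std::fabs(aug[r][col]) > std::fabs(aug[pivot][col])) pivot = r;
            assert(aug[pivot][col] != 0.0f);  // the matrix must be invertible
            for (int j = 0; j < 8; ++j) std::swap(aug[col][j], aug[pivot][j]);
            float inv = 1.0f / aug[col][col];
            for (int j = 0; j < 8; ++j) aug[col][j] *= inv;
            for (int r = 0; r < 4; ++r) {
                if (r == col) continue;
                float f = aug[r][col];
                for (int j = 0; j < 8; ++j) aug[r][j] -= f * aug[col][j];
            }
        }
        Matrix44f out;
        for (int i = 0; i < 4; ++i)
            for (int j = 0; j < 4; ++j) out.m[i][j] = aug[i][j + 4];
        return out;
    }
};

// Normals are transformed by the transpose of the world-to-object matrix,
// then renormalised because scaling changes their length.
Vec3f transformNormal(const Vec3f& n, const Matrix44f& worldToObject) {
    Vec3f r = worldToObject.transposed().multDirMatrix(n);
    float len = std::sqrt(r.x * r.x + r.y * r.y + r.z * r.z);
    assert(len > 0.0f);
    return {r.x / len, r.y / len, r.z / len};
}
```

A quick way to see why the inverse transpose matters is a non-uniform scale of 2 along x: a surface with tangent (1, -1, 0) and normal (1, 1, 0) gets the world-space tangent (2, -1, 0). Transforming the normal with the object-to-world matrix itself gives (2, 1, 0), which is no longer perpendicular to the tangent, whereas transformNormal returns a vector along (0.5, 1, 0), which is.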
Note that texture coordinates don't need to be transformed. They live in their own 2D space and are not affected by the object's transformation. It is of course possible to transform texture coordinates but a 3x3 matrix in this case should be enough (to handle scale, rotation and translation in 2D space). Though in practice this is rarely done. Texture coordinates are generally adjusted in the 3D application and the transformations are directly baked into the coordinates themselves.
If we put all the pieces together and render a new object (the famous Utah teapot, an iconic object in the world of computer graphics) we can produce the two following images:
The image on the left uses face normals and the image on the right uses vertex normals. Note that the faceted look of the object in the left image disappeared in the right image. This is what vertex normals are used for, but don't worry too much about it, we will learn about smooth shading in the next lesson. While it is not obvious in these images that the teapot has been transformed, you can change the matrix in the program to the identity matrix and render the image again. You will more easily be able to spot the difference and the effect of the object-to-world matrix.
A Special Use of the World-to-Object Matrix in Ray-Tracing
In this part of the lesson, we will show one interesting use of the world-to-object matrix in ray-tracing. In the lesson A Minimal Ray-Tracer: Rendering Simple Shapes (Sphere, Cube, Disk, Plane, etc.) we studied two techniques to compute the ray-sphere intersection: a geometric method and a parametric method. While both methods let us specify the centre and the radius of the sphere we wish to render, let's imagine a situation in which we only know how to ray-trace a sphere that has radius 1 and is centred at the origin of the world coordinate system. How could we still change the sphere's size, rotation and position? The solution is in fact simple. If the sphere's new scale, position and rotation are defined by a 4x4 transformation matrix, then rather than transforming the sphere with this matrix, we transform the ray into the sphere's object space, by transforming its origin and direction with the sphere's world-to-object matrix (the inverse of the sphere's object-to-world matrix). Imagine that we translate the sphere by 2 units in Y. Instead of computing the intersection of the ray with the translated sphere, we keep the sphere at the origin and move the ray origin by -2 in Y (as shown in figure 2). This technique is quite common in ray-tracing, because it is sometimes more convenient to compute the intersection with a shape while the shape is in its object space. Thus, rather than transforming the shape by the object-to-world matrix and testing the ray-shape intersection with the ray in world space, we transform the ray to the shape's object space and perform the ray-geometry test there.
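The idea can be sketched as follows, using the translation example from the text. The names (Ray, transformRay, intersectUnitSphere) and the row-vector matrix convention are assumptions made for this illustration, and the intersection routine is a plain quadratic solve that only handles the unit sphere at the origin.

```cpp
#include <cassert>
#include <cmath>

struct Vec3f { float x, y, z; };

inline float dot(const Vec3f& a, const Vec3f& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Hypothetical minimal 4x4 matrix (assumption: row-vector convention, p' = p * M).
struct Matrix44f {
    float m[4][4] = {{1, 0, 0, 0}, {0, 1, 0, 0}, {0, 0, 1, 0}, {0, 0, 0, 1}};

    // Points pick up the translation stored in the fourth row.
    Vec3f multVecMatrix(const Vec3f& p) const {
        float x = p.x * m[0][0] + p.y * m[1][0] + p.z * m[2][0] + m[3][0];
        float y = p.x * m[0][1] + p.y * m[1][1] + p.z * m[2][1] + m[3][1];
        float z = p.x * m[0][2] + p.y * m[1][2] + p.z * m[2][2] + m[3][2];
        float w = p.x * m[0][3] + p.y * m[1][3] + p.z * m[2][3] + m[3][3];
        assert(w != 0.0f);  // w is 1 for affine matrices
        return {x / w, y / w, z / w};
    }

    // Directions ignore the translation.
    Vec3f multDirMatrix(const Vec3f& d) const {
        return {d.x * m[0][0] + d.y * m[1][0] + d.z * m[2][0],
                d.x * m[0][1] + d.y * m[1][1] + d.z * m[2][1],
                d.x * m[0][2] + d.y * m[1][2] + d.z * m[2][2]};
    }
};

struct Ray { Vec3f orig, dir; };

// Move a world-space ray into the object space of a shape.
Ray transformRay(const Ray& r, const Matrix44f& worldToObject) {
    return {worldToObject.multVecMatrix(r.orig), worldToObject.multDirMatrix(r.dir)};
}

// Intersect a ray with the unit sphere centred at the origin by solving
// |orig + t * dir|^2 = 1 for the nearest non-negative t.
bool intersectUnitSphere(const Ray& ray, float& t) {
    float a = dot(ray.dir, ray.dir);
    float b = 2.0f * dot(ray.orig, ray.dir);
    float c = dot(ray.orig, ray.orig) - 1.0f;
    float disc = b * b - 4.0f * a * c;
    if (a == 0.0f || disc < 0.0f) return false;
    float s = std::sqrt(disc);
    float t0 = (-b - s) / (2.0f * a);
    float t1 = (-b + s) / (2.0f * a);
    if (t0 < 0.0f) t0 = t1;  // origin inside the sphere: use the far root
    if (t0 < 0.0f) return false;
    t = t0;
    return true;
}
```

For a sphere translated by 2 units in Y, the world-to-object matrix is a translation by -2 in Y. A world-space ray starting at (0, 2, 5) and pointing down -Z becomes a ray starting at (0, 0, 5) in object space, and intersectUnitSphere reports a hit at t = 4.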
The problem with this technique is that \(t\), the intersection distance, is no longer expressed in world space but in object space. The value computed by this method can't be compared directly to the value returned, for instance, by the intersect() method of the TriangleMesh class. To fix the problem, you need to compute the intersection point in object space, transform the hit point to world space, and then compute \(t\) by taking the distance between the ray origin and the hit point in world space. These complications are only worth it if computing the ray-geometry intersection is much more efficient in object space than in world space. We won't implement this method in our program. Mentioning the method is enough, but you can give it a try as an exercise by adapting the code above.
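As a starting point for that exercise, the conversion step could look like the sketch below. The function name objectToWorldDistance and the helper types are hypothetical, a row-vector matrix convention is assumed, and the sketch further assumes that the object-space ray direction was renormalised before the intersection test (which is what makes the object-space \(t\) differ from the world-space one when the matrix contains a scale).

```cpp
#include <cassert>
#include <cmath>

struct Vec3f { float x, y, z; };

// Hypothetical minimal 4x4 matrix (assumption: row-vector convention, p' = p * M).
struct Matrix44f {
    float m[4][4] = {{1, 0, 0, 0}, {0, 1, 0, 0}, {0, 0, 1, 0}, {0, 0, 0, 1}};

    Vec3f multVecMatrix(const Vec3f& p) const {
        float x = p.x * m[0][0] + p.y * m[1][0] + p.z * m[2][0] + m[3][0];
        float y = p.x * m[0][1] + p.y * m[1][1] + p.z * m[2][1] + m[3][1];
        float z = p.x * m[0][2] + p.y * m[1][2] + p.z * m[2][2] + m[3][2];
        float w = p.x * m[0][3] + p.y * m[1][3] + p.z * m[2][3] + m[3][3];
        assert(w != 0.0f);  // w is 1 for affine matrices
        return {x / w, y / w, z / w};
    }
};

struct Ray { Vec3f orig, dir; };

// Convert an object-space hit distance into a world-space distance.
// The result equals the world-space t when worldRay.dir has unit length.
float objectToWorldDistance(const Ray& worldRay, const Ray& objectRay,
                            float tObject, const Matrix44f& objectToWorld) {
    // 1. Hit point in object space.
    Vec3f hitObject = {objectRay.orig.x + tObject * objectRay.dir.x,
                       objectRay.orig.y + tObject * objectRay.dir.y,
                       objectRay.orig.z + tObject * objectRay.dir.z};
    // 2. Bring the hit point back to world space.
    Vec3f hitWorld = objectToWorld.multVecMatrix(hitObject);
    // 3. Distance from the world-space ray origin to the world-space hit point.
    float dx = hitWorld.x - worldRay.orig.x;
    float dy = hitWorld.y - worldRay.orig.y;
    float dz = hitWorld.z - worldRay.orig.z;
    return std::sqrt(dx * dx + dy * dy + dz * dz);
}
```

For a sphere scaled uniformly by 2, a world-space ray starting at (0, 0, 10) and pointing down -Z maps to an object-space ray starting at (0, 0, 5). The unit sphere is hit at an object-space distance of 4, at the point (0, 0, 1), which maps back to (0, 0, 2) in world space, so the world-space distance is 8.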
As usual, the source code of the full program can be found in the last chapter of this lesson.