Placing a Camera: the LookAt Function

Distributed under the terms of the CC BY-NC-ND 4.0 License.


Framing: The LookAt Function

Reading time: 12 mins.

In this short lesson, we will study a simple but useful method for placing 3D cameras. To understand this lesson, you will need to be familiar with the concept of a transformation matrix and with the cross product between vectors. If that's not already the case, you may want to read the lesson Geometry first.

Placing the Camera

Being able to place the camera in a 3D scene is essential. In most of the lessons on Scratchapixel, however, we set the camera position and rotation (remember that scaling a camera doesn't make sense) using a 4x4 matrix, often labeled the camera-to-world matrix. Setting up a 4x4 matrix by hand is not very practical.

Thankfully, we can use a method that is commonly referred to as the look-at method. The idea is simple. To set a camera position and orientation, all you need is a point in space to set the camera position and a point to define what the camera is looking at (an aim). Let's label our first point "from" and our second point "to".

We can easily create a camera-to-world 4x4 matrix from these two points, as we will demonstrate in this lesson.

Before we get any further, however, let's address an issue that can be a source of confusion. Remember that in a right-hand coordinate system, if the x-axis points to the right and the y-axis points upward, then the z-axis points towards you, as shown in the figure below.

Therefore, quite naturally, when we create a new camera, it feels normal to orient it as if we were looking at the right-hand coordinate system with the z-axis pointing towards the camera (as shown in the image above). Because cameras are conventionally oriented this way, some books (e.g. Physically Based Rendering / PBRT) explain it by saying that cameras are defined not in a right-hand coordinate system but in a left-hand one. A left-hand coordinate system is one in which, if you look down the z-axis, the z-axis points away from you (in the same direction as the line of sight). But if the right-hand convention is the rule, why should we make an exception for cameras? This explanation is not inaccurate as such; it is nonetheless a potential source of confusion.

We prefer to say that cameras use a right-hand coordinate system, like all the other objects in our 3D application. However, we do flip the orientation of the camera at render time by "scaling" the ray direction by -1 along the camera's local z-axis when we cast rays into the scene. If you check the lesson Ray-Tracing: Generating Camera Rays, you will notice that the ray direction's z-component is set to -1 before the ray direction vector is transformed by the camera-to-world matrix. Strictly speaking, this is not a scaling: we simply flip the ray direction along the camera's local z-axis.
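
To make this more concrete, here is a small, self-contained sketch of how such a camera-space ray direction is typically built for a given pixel. The variable names, the field-of-view handling and the resolution are illustrative assumptions, not code taken from the Ray-Tracing: Generating Camera Rays lesson; the only point that matters here is that the z-component is set to -1.

#include <cmath>
#include <iostream>

// Illustrative sketch: computing a camera-space primary ray direction for one
// pixel. The x and y components come from the pixel's position on the image
// plane; the z component is set to -1 so that the ray points down the camera's
// negative z-axis. The resulting vector would then be normalized and
// transformed by the camera-to-world matrix (rotation only, as it is a direction).
int main()
{
    int width = 640, height = 480;  // image resolution (arbitrary)
    int px = 320, py = 240;         // pixel for which we build a ray
    float fov = 90.f;               // vertical field of view in degrees (arbitrary)

    float aspect = width / static_cast<float>(height);
    float scale = std::tan(fov * 0.5f * 3.14159265f / 180.f);

    float x = (2 * (px + 0.5f) / width - 1) * aspect * scale;
    float y = (1 - 2 * (py + 0.5f) / height) * scale;
    float z = -1;  // flip along the camera's local z-axis

    std::cout << x << " " << y << " " << z << "\n";
    return 0;
}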

Bottom line: if you use a right-hand coordinate system for your application, then to keep things consistent the camera should also be defined in a right-hand coordinate system, like any other 3D object. But because we cast rays in the opposite direction, it is as if the camera were indeed looking down the negative z-axis. With this clarification out of the way, let's now see how we build this matrix.

Figure 1: the local coordinate system of the camera aimed at a point.
Figure 2: computing the forward vector from the position of the camera and target point.

Remember that a 4x4 matrix encodes the three axes of a Cartesian coordinate system. If this is not obvious to you, please read the lesson on Geometry. There are two conventions you need to pay attention to when dealing with matrices and coordinate systems. For matrices, you need to choose between the row-major and column-major representations; let's use the row-major notation. For the coordinate system, you need to choose between the right-hand and the left-hand conventions; let's use a right-hand coordinate system. The fourth row of the 4x4 matrix (in the row-major representation) encodes the translation values.

$$ \begin{bmatrix} \color{red}{Right_x}&\color{red}{Right_y}&\color{red}{Right_z}&0\\ \color{green}{Up_x}&\color{green}{Up_y}&\color{green}{Up_z}&0\\ \color{blue}{Forward_x}&\color{blue}{Forward_y}&\color{blue}{Forward_z}&0\\ T_x&T_y&T_z&1 \end{bmatrix} $$
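
As a quick reminder of what this convention implies (this worked equation is added here for clarity; it follows directly from the matrix above): with the row-major notation, a point is treated as a row vector and multiplied by the matrix from the left, which is why the translation values end up in the fourth row:

$$ P' = \begin{bmatrix} P_x & P_y & P_z & 1 \end{bmatrix} \cdot M \quad \rightarrow \quad P'_x = P_x \cdot \color{red}{Right_x} + P_y \cdot \color{green}{Up_x} + P_z \cdot \color{blue}{Forward_x} + T_x $$

The y and z components are obtained in the same way from the second and third columns. Note that the translation only affects points (whose fourth coordinate is 1), not directions (whose fourth coordinate is 0), which is why only the rotation part of the camera-to-world matrix applies to ray directions.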
Figure 3: the vector (0,1,0) is in the plane defined by the forward and up vector. The vector perpendicular to this plane is thus the right vector.

How you name the axes of a Cartesian coordinate system is entirely up to you. You can call them x, y and z, but in this lesson, for clarity, we will name them right (for the x-axis), up (for the y-axis) and forward (for the z-axis). This is illustrated in Figure 1. The method for building a 4x4 matrix from the from-to pair of points can be broken down into four steps:

Step 1: compute the forward axis. As shown in Figure 2, the forward axis is the normalized vector going from the to point to the from point: forward = normalize(from - to).

Step 2: compute the right vector. As shown in Figure 3, we take an arbitrary vector (0,1,0), which lies in the plane defined by the forward and up vectors, and compute the cross product of this arbitrary vector with the forward vector. The result, once normalized, is the right vector, which is perpendicular to that plane.

Step 3: compute the up vector. Since the forward and right vectors are orthogonal unit vectors, their cross product gives the camera's true up vector: up = cross(forward, right).

Step 4: set the translation. The fourth row of the matrix is simply set to the from point, i.e. the camera position in world space.

Again, if you are unsure about why we do that, check the lesson on Geometry. Finally, here is the source code of the complete function. It computes a camera-to-world matrix from two arguments, the from and to points (the result is returned through the function's last parameter). Note that the function's third parameter (up) is the arbitrary vector used in the computation of the right vector. It is set to (0,1,0) in the main function, but you may want to normalize it for safety (in case a user passes a non-normalized vector).

#include <cmath> 
#include <cstdint> 
#include <iostream> 
 
struct float3 
{ 
public: 
    float x{ 0 }, y{ 0 } , z{ 0 }; 
    float3 operator - (const float3& v) const 
    { return float3{ x - v.x, y - v.y, z - v.z }; } 
}; 
 
// Normalizes the vector in place (divides it by its length).
void normalize(float3& v)
{
    float len = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    v.x /= len, v.y /= len, v.z /= len;
}
 
float3 cross(const float3& a, const float3& b) 
{ 
    return { 
        a.y * b.z - a.z * b.y, 
        a.z * b.x - a.x * b.z, 
        a.x * b.y - a.y * b.x 
    }; 
} 
 
struct mat4 
{ 
public: 
    float m[4][4] = {{1, 0, 0, 0},  {0, 1, 0, 0}, {0, 0, 1, 0}, {0, 0, 0, 1}}; 
    float* operator [] (uint8_t i) { return m[i]; } 
    const float* operator [] (uint8_t i) const { return m[i]; } 
    friend std::ostream& operator << (std::ostream& os, const mat4& m) 
    { 
        return os << m[0][0] << ", " << m[0][1] << ", " << m[0][2] << ", " << m[0][3] << ", " 
                  << m[1][0] << ", " << m[1][1] << ", " << m[1][2] << ", " << m[1][3] << ", " 
                  << m[2][0] << ", " << m[2][1] << ", " << m[2][2] << ", " << m[2][3] << ", " 
                  << m[3][0] << ", " << m[3][1] << ", " << m[3][2] << ", " << m[3][3]; 
    } 
 
}; 
 
void lookat(const float3& from, const float3& to, const float3& up, mat4& m)
{
    // Forward axis: from the target point to the camera position.
    float3 forward = from - to;
    normalize(forward);
    // Right axis: cross product of the arbitrary up vector with forward.
    float3 right = cross(up, forward);
    normalize(right);
    // True up axis: cross product of forward and right.
    float3 newup = cross(forward, right);

    // First three rows: the camera's axes. Fourth row: the camera position.
    m[0][0] = right.x,   m[0][1] = right.y,   m[0][2] = right.z;
    m[1][0] = newup.x,   m[1][1] = newup.y,   m[1][2] = newup.z;
    m[2][0] = forward.x, m[2][1] = forward.y, m[2][2] = forward.z;
    m[3][0] = from.x,    m[3][1] = from.y,    m[3][2] = from.z;
}
 
 
int main() 
{ 
    mat4 m; 
 
    float3 up{ 0, 1, 0 }; 
    float3 from{ 1, 1, 1 }; 
    float3 to{ 0, 0, 0 }; 
 
    lookat(from, to, up, m); 
 
    std::cout << m << std::endl; 
 
    return 0; 
} 

Should produce:

0.707107, 0, -0.707107, 0, -0.408248, 0.816497, -0.408248, 0, 0.57735, 0.57735, 0.57735, 0, 1, 1, 1, 1
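
As a quick sanity check and usage example, you can transform a camera-space point by the returned matrix using the row-vector convention described earlier. The helper below is not part of the program above; it is just a sketch of how such a function might look, reusing the float3 and mat4 types defined earlier.

// Hypothetical helper (not part of the program above): transforms a point by a
// 4x4 matrix using the row-major, row-vector convention of this lesson.
float3 multPointMatrix(const float3& p, const mat4& m)
{
    return {
        p.x * m[0][0] + p.y * m[1][0] + p.z * m[2][0] + m[3][0],
        p.x * m[0][1] + p.y * m[1][1] + p.z * m[2][1] + m[3][1],
        p.x * m[0][2] + p.y * m[1][2] + p.z * m[2][2] + m[3][2]
    };
}

For instance, transforming the camera-space origin (0,0,0) by the matrix printed above yields (1,1,1), i.e. the from point, which is exactly what we expect from a camera-to-world matrix.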

Limitations of the Look-At Method

The method is very simple and generally works well. However, it has a weakness. When the camera points straight down or straight up, the forward axis gets very close to the arbitrary vector used to compute the right axis. The extreme case is, of course, when the forward vector and this arbitrary vector are perfectly parallel, e.g. when the forward vector is either (0,1,0) or (0,-1,0). Unfortunately, in this case, the cross product fails to produce a result for the right vector (it returns the zero vector). There is no perfect solution to this problem: you can either detect this case and set the vectors by hand (since you know what the configuration of the vectors should be anyway), as sketched below, or develop a more elegant solution using quaternion interpolation.
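
Here is one possible sketch of the first workaround, reusing the float3, normalize, cross and mat4 helpers from the program above. The threshold value and the choice of fallback axis are arbitrary assumptions; this is just one way of handling the degenerate case, not the only one.

// Sketch of a look-at variant that guards against a forward vector (nearly)
// parallel to the up vector. Threshold and fallback axis are arbitrary choices.
void lookatSafe(const float3& from, const float3& to, float3 up, mat4& m)
{
    float3 forward = from - to;
    normalize(forward);
    normalize(up);
    // If forward and up are nearly parallel, the cross product below would be
    // (close to) the zero vector and could not be normalized reliably.
    float d = forward.x * up.x + forward.y * up.y + forward.z * up.z;
    if (std::fabs(d) > 0.9999f) {
        // Pick, by hand, another axis that cannot be parallel to forward.
        up = (std::fabs(forward.x) < 0.9f) ? float3{ 1, 0, 0 } : float3{ 0, 0, 1 };
    }
    float3 right = cross(up, forward);
    normalize(right);
    float3 newup = cross(forward, right);

    m[0][0] = right.x,   m[0][1] = right.y,   m[0][2] = right.z;
    m[1][0] = newup.x,   m[1][1] = newup.y,   m[1][2] = newup.z;
    m[2][0] = forward.x, m[2][1] = forward.y, m[2][2] = forward.z;
    m[3][0] = from.x,    m[3][1] = from.y,    m[3][2] = from.z;
}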
