Placing a Camera: the LookAt Function
Keywords: Matrix, LookAt, camera, cross product, transformation matrix, transform, camera-to-world, look-at, Gimbal lock.Want to report bugs, errors or send feedback? Write at feedback@scratchapixel.com. You can also contact us on Twitter and Discord. We do this for free, please be kind. Thank you.

News (August, 31): We are working on Scratchapixel 3.0 at the moment (current version of 2). The idea is to make the project open source by storing the content of the website on GitHub as Markdown files. In practice, that means you and the rest of the community will be able to edit the content of the pages if you want to contribute (typos and bug fixes, rewording sentences). You will also be able to contribute by translating pages to different languages if you want to. Then when we publish the site we will translate the Markdown files to HTML. That means new design as well.

That's what we are busy with right now and why there won't be a lot of updates in the weeks to come. More news about SaP 3.0 soon.

We are looking for native Engxish (yes we know there's a typo here) speakers that will be willing to readproof a few lessons. If you are interested please get in touch on Discord, in the #scratchapixel3-0 channel. Also looking for at least one experienced full dev stack dev that would be willing to give us a hand with the next design.

Feel free to send us your requests, suggestions, etc. (on Discord) to help us improve the website.

And you can also donate). Donations go directly back into the development of the project. The more donation we get the more content you will get and the quicker we will be able to deliver it to you.

13 mns read.
In this short lesson we will study a simple but useful method to move 3D cameras. To understand this lesson, you will need to be familiar with the concept of transformation matrix and cross-product between vectors. If that's not already the case, you can first read the lesson devoted to geometry.

Moving the Camera

Being able to move the camera in a 3D scene is essential. However, in most of the lessons from Scratchapixel we usually set the camera position and rotation (remember that scaling a camera doesn't make sense) using a 4x4 matrix which is often labeled the camera-to-world matrix. However, setting up a 4x4 matrix by hand is not friendly.

Thankfully, we can use a method that is commonly referred to as the look-at method. The idea is simple. To set a camera position and orientation, all you need is a point in space to set the camera position and a point to define what the camera is looking at (an aim). Let's label our first point "from" and our second point "to".

We can easily create a world-to-camera 4x4 matrix from these two points as we will demonstrate in this lesson.

Before we get any further, however, let's address an issue that can be a source of confusion. Remember that in a right-hand coordinate system, if you are looking along the z-axis, the x-axis is pointing to the right, the y-axis is pointing upward and the z-axis is pointing towards you as shown in the figure below.

Therefore quite naturally, when we think of creating a new camera, it feels normal to orient the camera as if we were looking at the right-hand coordinate system with the z-axis pointing towards the camera (as shown in the image above). Because by convention cameras are oriented that way, books (e.g. Physically Based Rendering / PBRT) sometimes suggest that this is because cameras are not defined in a right-hand coordinate system but a left-hand one. If you look down the z-axis, a left-hand coordinate system is one in which the z-axis points away from you (the same direction as the line of sight). Assuming the right-hand coordinate is the rule, why should we make an exception for cameras? This explanation is not inaccurate as such, it is nonetheless potentially a source of confusion.

We prefer to say that cameras are using a right-hand coordinate system like all the other objects in our 3D application. However, we do flip the orientation of the camera at render time, by "scaling" the ray direction by -1 along the camera's local coordinate z-axis when we cast rays into the scene. If you check the lesson Ray-Tracing: Generating Camera Rays you will notice that the ray-direction z-component is set to -1 before the ray direction vector is itself transformed by the camera-to-world matrix. This is not stricto sensu a scaling. We just flip the direction of the ray direction vector along the camera local coordinate system z-axis.

Bottom-line: if you use a right-hand coordinate system for your application, to keep things consistent, the camera should also be defined in a right-hand coordinate system like with any other 3D object. But as we cast rays in the opposite direction, it is as if the camera was indeed looking down along the negative z-axis. With this clarification out of the way, let's now see how we build this matrix.

Figure 1: the local coordinate system of the camera aimed at a point.

Figure 2: computing the forward vector from the position of the camera and target point.

Remember that a 4x4 matrix encodes the three axes of a Cartesian coordinate system. If this is not obvious to you, please read the lesson on Geometry. Remember that there are two conventions you need to pay attention to when you deal with matrices and coordinate systems. For matrices, you need to choose between row-major and column-major representations. Let's use the row-major notation. As for the coordinate system, you need to choose between the right-hand and the left-hand coordinate systems. Let's use a right-hand coordinate system. The fourth row of the 4x4 matrix (in a row-major matrix representation) encodes translation values.

$$ \begin{matrix} \color{red}{Right_x}&\color{red}{Right_y}&\color{red}{Right_z}&0\\ \color{green}{Up_x}&\color{green}{Up_y}&\color{green}{Up_z}&0\\ \color{blue}{Forward_x}&\color{blue}{Forward_y}&\color{blue}{Forward_z}&0\\ T_x&T_y&T_z&1 \end{matrix} $$

How you name the axis of a Cartesian coordinate system is entirely up to you. You can call them x, y and z but in this lesson for clarity, we will name them right (for x-axis), up (for the y-axis) and forward for the (z-axis). This is illustrated in figure 1. The method of building a 4x4 matrix from the from-to pair of points can be broken down into four steps:

Again, if you are unsure about why we do that, check the lesson on Geometry. Finally here is the source code of the complete function. It computes and return a camera-to-world matrix from two arguments, the from and to points. Note that the function third parameter (up) is the arbitrary vector used in the computation of the right vector. It is set to (0,1,0) in the main function but you may have to normalize it for safety (in case a user would input a non-normalized vector).

#include <cmath> #include <cstdint> #include <iostream> struct float3 { public: float x{ 0 }, y{ 0 } , z{ 0 }; float3 operator - (const float3& v) const { return float3{ x - v.x, y - v.y, z - v.z }; } }; void normalize(float3& v) { float len = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z); std::cout << v.x << " " << v.y << " " << v.z << "\n"; v.x /= len, v.y /= len, v.z /= len; std::cout << v.x << " " << v.y << " " << v.z << "\n"; } float3 cross(const float3& a, const float3& b) { return { a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x }; } struct mat4 { public: float m[4][4] = {{1, 0, 0, 0}, {0, 1, 0, 0}, {0, 0, 1, 0}, {0, 0, 0, 1}}; float* operator [] (uint8_t i) { return m[i]; } const float* operator [] (uint8_t i) const { return m[i]; } friend std::ostream& operator << (std::ostream& os, const mat4& m) { return os << m[0][0] << ", " << m[0][1] << ", " << m[0][2] << ", " << m[0][3] << ", " << m[1][0] << ", " << m[1][1] << ", " << m[1][2] << ", " << m[1][3] << ", " << m[2][0] << ", " << m[2][1] << ", " << m[2][2] << ", " << m[2][3] << ", " << m[3][0] << ", " << m[3][1] << ", " << m[3][2] << ", " << m[3][3]; } }; void lookat(const float3& from, const float3& to, const float3& up, mat4& m) { float3 forward = from - to; normalize(forward); float3 right = cross(up, forward); normalize(right); float3 newup = cross(forward, right); m[0][0] = right.x, m[0][1] = right.y, m[0][2] = right.z; m[1][0] = newup.x, m[1][1] = newup.y, m[1][2] = newup.z; m[2][0] = forward.x, m[2][1] = forward.y, m[2][2] = forward.z; m[3][0] = from.x, m[3][1] = from.y, m[3][2] = from.z; } int main() { mat4 m; float3 up{ 0, 1, 0 }; float3 from{ 1, 1, 1 }; float3 to{ 0, 0, 0 }; lookat(from, to, up, m); std::cout << m << std::endl; return 0; }

Should produce:

0.707107, 0, -0.707107, 0, -0.408248, 0.816497, -0.408248, 0, 0.57735, 0.57735, 0.57735, 0, 1, 1, 1, 1

The Look-At Method Limitations

The method is very simple and works generally well. Though it has a weakness. When the camera is vertical looking straight down or straight up, the forward axis gets very close to the arbitrary axis used to compute the right axis. The extreme case is of course when the forward axis and this arbitrary axis are perfectly parallel e.g. when the forward vector is either (0,1,0) or (0,-1,0). Unfortunately, in this particular case, the cross product fails to produce a result for the right vector. There is no real solution to this problem. You can either detect this case and choose to set the vectors by hand (since you know what the configuration of the vectors should be anyway). A more elegant solution can be developed using quaternion interpolation.