Scratchapixel 2.0
Computing the Pixel Coordinates of a 3D Point
Keywords: rasterisation, perspective projection, wireframe, perspective projection matrix, camera coordinate system, projective transformation, world space, world origin, world coordinate system, coordinate system, transformation, matrix, matrices, inverse matrix, camera space, image plane, perspective divide, screen space, raster space, screen coordinate system, NDC, normalized device coordinate, viewport.

One of the Most Common Questions about 3D Rendering on the Web

"How do I find the 2D pixel coordinates of a 3D point?" is one of the most common questions about 3D rendering on the Web. It is an important question because it describes the fundamental method by which an image of a 3D scene is formed. In the context of this lesson, we will use the term rasterisation to describe the process of finding the 2D pixel coordinates of 3D points. Rasterisation, in its broader sense, refers to the process of converting 3D shapes into a raster image. A raster image, as explained in the previous lesson, is the technical term for a digital image; it designates a two-dimensional array (or rectangular grid, if you prefer) of pixels.

Don't be mistaken: different rendering techniques exist for producing images of 3D scenes. Rasterisation is only one of them. Ray-tracing is another. Note though that all these techniques rely on the same concept to produce that image: the concept of perspective projection. Therefore, for a given camera and a given 3D scene, all rendering techniques produce the same visual result; they just use a different approach to produce that result.

Note also that computing the 2D pixel coordinates of 3D points is only one of the two steps in the process of creating a photo-realistic image. The other step is shading, in which the color of these points is computed to simulate the appearance of objects. You need more than just converting 3D points to pixel coordinates to produce a "complete" image.

To understand rasterisation, you first need to be familiar with a series of very important techniques, which we will also introduce in this chapter (such as the concept of local vs. global coordinate systems, learning how to interpret 4x4 matrices as coordinate systems, converting points from one coordinate system to another, etc.). Read this lesson carefully, as it will provide you with the very basic tools that almost all rendering techniques are built upon.

We will use matrices a lot in this lesson, so read the Geometry lesson first if you are not yet comfortable with coordinate systems and matrices.

We will apply the technique studied in this lesson to render a wireframe image of a 3D object (adjacent image). The files of this program can be found in the source code chapter of the lesson as usual.

A Quick Refresher on the Perspective Projection Process

Figure 1: to create an image of a cube, we just need to extend lines from the object's corners towards the eye and find the intersection of these lines with a flat surface (the canvas) perpendicular to the line of sight.

We have talked about the perspective projection process in quite a few lessons already. Check out, for instance, the chapter "The Visibility Problem" in the lesson "Rendering an Image of a 3D Scene: an Overview". However, let's quickly recall here what perspective projection is. In short, this technique can be used to create a 2D image of a 3D scene by projecting the points or vertices making up the objects of that scene onto the surface of a canvas. Why do we do that? Because this is more or less the way the human eye works, and since we are used to seeing the world through our eyes, it is quite natural that images created this way also look "real" to us. You can see the human eye as just a "point" in space (figure 2). What we see of the world is the result of light rays (reflected by objects) travelling to this point and entering the eye (the eye is obviously not exactly a point; it is an optical system converging rays onto a small surface - the retina). So again, one way of making an image of a 3D scene in CG is to do the same thing, which you can achieve by projecting vertices onto the surface of the canvas (or the surface of the screen) as if they were sliding along the straight lines connecting the vertices to the eye.

It is important to understand that perspective projection is just an arbitrary way of representing 3D geometry on a two-dimensional surface. It is the most commonly used way because it simulates foreshortening, which is one of the most important properties of human vision: objects in the distance appear smaller than objects close by. Nonetheless, as mentioned in the Wikipedia article on perspective, it is important to understand that even when creating realistic images, perspective remains an "approximate representation, on a flat surface (such as paper), of an image as it is seen by the eye". The important word here is "approximate".

Figure 2: among all light rays reflected by an object, some enter the eye, and the image we have of this object is the result of these rays.

Figure 3: the projection process can be seen as if the point we want to project was moved down along a line connecting the point or the vertex itself to the eye. We can stop moving the point along that line when it lies on the plane of the canvas. Obviously we don't "slide" the point along this line explicitly, but this is how the projection process can be interpreted.

In the aforementioned lesson, we also explained how the coordinates of the projection of a point located in front of the camera (and enclosed within the camera's viewing frustum, thus visible to the camera) can be computed using a simple geometric construction based on one of the properties of similar triangles (figure 3). We will review this technique one more time in this lesson. Computing the projected point's coordinates this way is not very complex, but it nonetheless requires a series of operations on the original point's coordinates: this is what you will learn in this lesson. It turns out that the perspective projection process, and its associated equations, can also be expressed in the form of a 4x4 matrix, as we will demonstrate in lesson 5. This is what we call the perspective projection matrix. Expressed in this form, the series of operations reduces to a single point-matrix multiplication; being able to represent this critical operation in such a compact and easy-to-use form is the main advantage of the approach. Multiplying any point whose coordinates are expressed with respect to the camera coordinate system (see below) by this perspective projection matrix gives you the position (or coordinates) of that point on the canvas.
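
As a rough illustration of the similar-triangles construction mentioned above, here is a minimal C++ sketch. It assumes conventions not stated in this section: the point is already expressed in camera space, the camera looks down the negative z-axis, and the canvas sits one unit in front of the eye by default. The types Vec3 and Vec2 and the function projectToCanvas are hypothetical names chosen for this example, not the lesson's own code.

```cpp
#include <cassert>

// Hypothetical minimal types, for illustration only.
struct Vec3 { float x, y, z; };
struct Vec2 { float x, y; };

// Projects a camera-space point onto a canvas placed `canvasDistance` units
// in front of the eye. Assumption: the camera looks down the negative z-axis,
// so points in front of the camera have z < 0.
Vec2 projectToCanvas(const Vec3 &p, float canvasDistance = 1.0f)
{
    assert(p.z < 0.0f && "point must lie in front of the camera");
    // Similar triangles: x' / canvasDistance = x / -z (and likewise for y).
    float scale = canvasDistance / -p.z;
    return { p.x * scale, p.y * scale };
}
```

Under these assumptions, projectToCanvas({2, 1, -4}) gives (0.5, 0.25), and moving the same point twice as far away, to z = -8, gives (0.25, 0.125): the projected coordinates shrink as the distance grows, which is the foreshortening effect described earlier.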

In CG, transformations are almost always linear. It is important to know, however, that the perspective projection, which belongs to the more general family of projective transformations, is a non-linear transformation.
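
One way to see why the perspective projection is not linear: a linear transformation f must satisfy f(k * P) = k * f(P), yet scaling a point by a positive factor k moves it along the line through the eye and therefore leaves its projection unchanged. The sketch below checks this numerically. It assumes the point is in camera space, the camera looks down the negative z-axis, and the canvas is one unit from the eye; project and violatesHomogeneity are hypothetical helper names introduced only for this example.

```cpp
#include <cassert>

// Hypothetical minimal types, for illustration only.
struct Vec3 { float x, y, z; };
struct Vec2 { float x, y; };

// Perspective projection onto a canvas one unit in front of the eye.
// Assumption: the camera looks down the negative z-axis.
Vec2 project(const Vec3 &p)
{
    assert(p.z < 0.0f && "point must lie in front of the camera");
    return { p.x / -p.z, p.y / -p.z };
}

// A linear map f satisfies f(k * p) == k * f(p). Returns true when the
// projection breaks that property for the given point and factor k > 0.
bool violatesHomogeneity(const Vec3 &p, float k)
{
    Vec2 scaledThenProjected = project({ k * p.x, k * p.y, k * p.z });
    Vec2 projected = project(p);
    Vec2 projectedThenScaled = { k * projected.x, k * projected.y };
    return scaledThenProjected.x != projectedThenScaled.x ||
           scaledThenProjected.y != projectedThenScaled.y;
}
```

For the point (2, 1, -4) and k = 2, the scaled point (4, 2, -8) still projects to (0.5, 0.25), whereas a linear map would have to return (1, 0.5), so violatesHomogeneity reports true. The division by z is what breaks linearity.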

Again, in this lesson, we will learn how to compute the 2D pixel coordinates of a 3D point without using the perspective projection matrix. To do so, we will need to learn how to "project" a 3D point onto a 2D drawable surface (which we will call a canvas in this lesson) using some simple rules of geometry. Once we understand the mathematics of this process (and of all the other steps involved in computing these 2D coordinates, as the projection is just one step among several), we will be ready to study the construction and use of the perspective projection matrix, a matrix used to simplify the projection step (and the projection step only). This will be the topic of the next lesson.

Some History

The mathematics behind perspective projection started to be understood and mastered by artists towards the end of the fourteenth and the beginning of the fifteenth century. Artists greatly contributed to educating others in the mathematical basis of perspective drawing through books that they wrote and illustrated themselves. A notable example is Albrecht Dürer's "The Painter's Manual", first published in 1525 and reissued in a revised edition in 1538 (the illustration above comes from this book). Perspective drawing is largely characterized by two concepts: that objects appear smaller as their distance from the viewer increases, and foreshortening. Foreshortening describes the impression, or optical illusion, that an object or a distance is smaller than it really is because it is angled towards the viewer. Another rule related to foreshortening states that vertical lines are parallel, while nonvertical lines converge to a perspective point, thereby appearing shorter than they really are. These effects give a sense of depth, which is useful in evaluating the distance of objects from the viewer. Today the same mathematical principles are used in computer graphics to create a perspective view of a 3D scene.