**Contents**

**News (August, 31)**: We are working on Scratchapixel 3.0 at the moment (current version of 2). The idea is to make the project open source by storing the content of the website on GitHub as Markdown files. In practice, that means you and the rest of the community will be able to edit the content of the pages if you want to contribute (typos and bug fixes, rewording sentences). You will also be able to contribute by translating pages to different languages if you want to. Then when we publish the site we will translate the Markdown files to HTML. That means new design as well.

That's what we are busy with right now and why there won't be a lot of updates in the weeks to come. More news about SaP 3.0 soon.

We are looking for native Engxish (yes we know there's a typo here) speakers that will be willing to readproof a few lessons. If you are interested please get in touch on Discord, in the #scratchapixel3-0 channel. Also looking for at least one experienced full dev stack dev that would be willing to give us a hand with the next design.

Feel free to send us your requests, suggestions, etc. (on Discord) to help us improve the website.

And you can also donate). Donations go directly back into the development of the project. The more donation we get the more content you will get and the quicker we will be able to deliver it to you.

## Quick Review

In the previous chapter, we gave a high-level overview of the rasterization rendering technique. It can be decomposed into two main stages: first, the projection of the triangle's vertices onto the canvas, then the rasterization of the triangle itself. Rasterization really means in this case, **"breaking apart" the triangle's shape into pixels** or **raster element squares**; this is how pixels used to be called in the past. In this chapter, we will review the first step. We have already described this method in the two previous lessons, thus we won't explain it here again. If you have any doubt about the principles behind perspective projection, check these lessons again. However, in this chapter, we will study a couple of new tricks related to projection that are going to be useful when we will get to the lesson on the perspective projection matrix. We will learn about a new method to remap the coordinates of the projected vertices from screen space to NDC space. We will also learn more about the role of the z-coordinate in the rasterization alogrithm and how it should be handled at the projection stage.

Keep in mind as already mentioned in the previous chapter, that the goal of the rasterization rendering technique is to solve the visibility or hidden surface problem, which is to determine with parts of a 3D object are visible and which parts are hidden.

## Projection: What Are We Trying to Solve?

What are we trying to solve here at that stage of the rasterization algorithm? As explained in the previous chapter, the principle of rasterization is to find if pixels in the image overlap triangles. To do so, we first need to project triangles onto the canvas, and then convert their coordinates from screen space to raster space. Pixels and triangles are then defined in the same space, which means that it becomes possible to compare their respective coordinates (we can check the coordinates of a given pixel against the raster-space coordinates of a triangle's vertices).

The goal of this stage, is thus to convert the vertices making up triangles from camera space to raster space.

## Projecting Vertices: Mind the Z-Coordinate!

In the previous two lessons, we mentioned that when we compute the raster coordinates of a 3D point what we really need in the end are its x- and y-coordinates (the position of the 3D point in the image). As a quick reminder, recall that these 2D coordinates are obtained by dividing the x and y coordinates of the 3D point in camera space, by the point respective z-coordinate (what we called the perspective divide), and then remapping the resulting 2D coordinates from screen space to NDC space and then NDC space to raster space. Keep in mind that because the image plane is positioned at the near clipping plane, we also need to multiply the x- and y-coordinate by the near clipping plane. Again, we explained this process in great detail in the previous two lessons.

$$ \begin{array}{l} Pscreen.x = \dfrac{ near * Pcamera.x }{ -Pcamera.z }\\ Pscreen.y = \dfrac{ near * Pcamera.y }{ -Pcamera.z }\\ \end{array} $$Note that so far, we have been considering points in screen space as essentially 2D points (we didn't need to use the points' z-coordinate after the perspective divide). From now on though, we will declare points in screen-space, as 3D points and set their z-coordinate to the camera-space points' z-coordinate as follow:

$$ \begin{array}{l} Pscreen.x = \dfrac{ near * Pcamera.x }{ -Pcamera.z }\\ Pscreen.y = \dfrac{ near * Pcamera.y }{ -Pcamera.z }\\ Pscreen.z = { -Pcamera.z }\\ \end{array} $$It is best at this point to set the projected point z-coordinate to the inverse of the original point z-coordinate, which as you know by now, is negative. Dealing with positive z-coordinates will make everything simpler later on (but this is not mandatory).

**Keeping track of the vertex z-coordinate in camera space is needed to solve the visibility problem**. Understanding why is easier if you look at figure 1. Imagine two vertices v1 and v2 which when projected onto the canvas, have the same raster coordinates (as shown in figure 1). If we project v1 before v2 then v2 will be visible in the image when it should actually be v1 (v1 is clearly in front of v2). However if we store the z-coordinate of the vertices along with their 2D raster coordinates, we can use these coordinates to define which point is closest to camera independently of the order in which the vertices are projected (as shown in the code fragment below).

What we want to render though are triangles not vertices. So the question is, how does the method we just learned about apply to triangles? In short, we will use the triangle vertices coordinates to find the position of the point on the triangle that the pixel overlaps (and thus its z-coordinate). This idea is illustrated in figure 2. If a pixel overlaps two or more triangles, we should be able to compute the position of the points on the triangles that the pixel overlap, and use the z-coordinates of these points as we did with the vertices, to know which triangle is the closest to the camera. This method will be described in detail in chapter 4 (The Depth Buffer. Finding the Depth Value of a Sample by Interpolation).

## Screen Space is Also Three-Dimensional

To summarize, to go from camera space to screen space (which is the process during which the perspective divide is happening), we need to:

- Perform the perspective divide: that is dividing the point in camera space x- and y-coordinate by the point z-coordinate. $$ \begin{array}{l} Pscreen.x = \dfrac{ near * Pcamera.x }{ -Pcamera.z }\\ Pscreen.y = \dfrac{ near * Pcamera.y }{ -Pcamera.z }\\ \end{array} $$
- But also set the projected point z-coordinate to the original point z-coordinate (the point in camera space). $$ Pscreen.z = { -Pcamera.z } $$

Practically, this means that our projected point is not a 2D point anymore, but in fact a 3D point. Or to say it differently, that screen space is not two- by three-dimensional. In his thesis Ed-Catmull writes:

You should now be able to understand this quote. The process is also illustrated in figure 3. First the geometry vertices are defined in camera space (top image). Then, each vertex undergoes a perspective divide. That is, the vertex x- and y-coordinates are divided by its z-coordinate, but as mentioned before, we also set the resulting projected point z-coordinate to the inverse of the original vertex z-coordinate. This by the way, infers a change of direction in the z-axis of the screen space coordinate system. As you can see, the z-axis is now pointing inward rather than outward (middle image in figure 3). But the most important thing to notice, is that the resulting object is a deformed version of the original object but nonetheless a three-dimensional object. Furthermore what Ed-Catmull means when he writes "an orthogonal projection of the object onto the x-y plane, would result in the expected perspective image", is that once the object is in screen space, if we trace lines perpendicular to the x-y image plane from the object to the canvas, then we get an perspective representation of that object (as shown in figure 4). This is an interesting observation because it means that the image creation process can be seen as **a perspective projection followed by an orthographic projection**. Don't worry if you don't understand clearly the difference between the perspective and orthographic projection. It is the topic of the next lesson. However, try to remember this observation, as it will become handy later.

## Remapping Screen Space Coordinates to NDC Space

In the previous two lessons, we explained that once in screen space, the x- and y-coordinates of the projected points need to be remapped to NDC space. In the previous lessons, we also explained that in NDC space, points on the canvas had their x- and y-coordinates contained in the range [0,1]. In the GPU world though, coordinates in NDC space are contained in the range [-1,1]. Sadly, this is one of these conventions again, that we need to deal with. We could have kept the convention [0,1] but because GPUs are the reference when it comes to rasterization, it is best to stick to the way the term is defined in the GPU world.

Thus once the points have been converted from camera space to screen space, the next step is to remap them from the range [l,r] and [b,t] for the x- and y-coordinate respectively, to the range [-1,1]. The term l, r, b, t here denotes the left, right, bottom and top coordinates of the canvas. By re-arranging the terms, we can easily find a equation that performs the remapping we want:

$$l < x < r$$Where x here is the x-coordinate of a 3D point in screen space (remember that from now on, we will assume that points in screen space are three-dimensional as explained above). If we remove the term l from the equation we get:

$$0 < x - l < r - l$$By dividing all term by (r-l) we get:

$$ \begin{array}{l} 0 < \dfrac {(x - l)}{(r - l)} < \dfrac {(r - l)}{(r - l)} \\ 0 < \dfrac {(x - l)}{(r -l)} < 1 \\ \end{array} $$

We can now develop the term in the middle of the equation:

$$0 < \dfrac {x}{(r -l)} - \dfrac {l}{(r -l)}< 1$$We can now multiply all terms by 2:

$$0 < 2 * \dfrac {x}{(r -l)} - 2 * \dfrac {l}{(r -l)}< 2$$We now remove 1 from all terms:

$$-1 < 2 * \dfrac {x}{(r -l)} - 2 * \dfrac {l}{(r-l)} - 1 < 1$$If we develop the terms and regroup them, we finally get:

$$ \begin{array}{l} -1 < 2 * \dfrac {x}{(r -l)} - 2 * \dfrac {l}{(r-l)} - \dfrac{(r-l)}{(r-l)}< 1 \\ -1 < 2 * \dfrac {x}{(r -l)} + \dfrac {-2*l+l-r}{(r-l)} < 1 \\ -1 < 2 * \dfrac {x}{(r -l)} + \dfrac {-l-r}{(r-l)} < 1 \\ -1 < \color{red}{\dfrac {2x}{(r -l)}} \color{green}{- \dfrac {r + l}{(r-l)}} < 1\\ \end{array} $$This is a very important equation because the red and green term of the equation in the middle of the formula will become coefficients of the perspective projection matrix. We will study this matrix in the next lesson. But for now, we will just apply this equation **to remap the x-coordinate of a point in screen space to NDC space** (any point that lies on the canvas has its coordinates contained in the range [-1.1] when defined in NDC space). If we apply the same reasoning to the y-coordinate we get:

## Putting Things Together

At the end of this lesson, we now can perform the first stage of the rasterization algorithm which you can decompose into two steps:

- Convert a point in camera space to screen space. It essentially projects a point onto the canvas, but keep in mind that we also need to store the original point z-coordinate. The point in screen-space is tree-dimensional and the z-coordinate will be useful to solve the visibility problem later on. $$ \begin{array}{l} Pscreen.x = \dfrac{ near * Pcamera.x }{ -Pcamera.z }\\ Pscreen.y = \dfrac{ near * Pcamera.y }{ -Pcamera.z }\\ Pscreen.z = { -Pcamera.z }\\ \end{array} $$
- We then convert the x- and y-coordinates of these points in screen space to NDC space using the following formulas: $$ \begin{array}{l} -1 < \color{}{\dfrac {2x}{(r -l)}} \color{}{- \dfrac {r + l}{(r-l)}} < 1\\ -1 < \color{}{\dfrac {2y}{(t - b)}} \color{}{- \dfrac {t + b}{(t-b)}} < 1 \end{array} $$ Where l, r, b, t denote the left, right, bottom and top coordinates of the canvas.

From there, it is extremely simple to convert the coordinates to raster space. We just need to remap the x- and y-coordinates in NDC space to the range [0,1] and multiply the resulting number by the image width and height respectively (don't forget that in raster space the y-axis goes down while in NDC space it goes up. Thus we need to change y's direction during this remapping process). In code we get:

Note that the coordinates of points or vertices in raster space are still defined as floating point numbers here and not integers (which is the case for pixels coordinates).

## What's Next?

We now have projected the triangle onto the canvas, and converted these projected vertices to raster space. Both the vertices of the triangle and the pixel live in the same coordinate system. We are now ready to loop over all pixels in the image and use a technique to find if they overlap a triangle. This is the topic of the next chapter.