Home
About
Donate
Rasterization: a Practical Implementation

Improving the Rasterization Algorithm

Figure 1: jagged edges pixel artefacts can be reduced using anti-aliasing.

Figure 2: when one sample is used the triangle is missed. But by using sub-pixels, we can detect that the pixel overlaps the triangle at least partially. The pixel color is equal to the sum of the sub-pixels colors divided by the total number of sub-pixels or samples (here this example 16 samples).

Figure 3: pixels can't properly capture the shape of continuous surfaces.

Figure 4: using sub-pixels to improve fight aliasing.

Figure 5: anti-aliasing helps with smoothing jagged edges.

Figure 6: 1 sample per pixels or 1 pps (top) vs. 4 samples per pixels (bottom).

Figure 7: if the 4 corners of an 8x8 pixel grid overlap the triangle then all remaining pixels of the grids cover the triangle too.

Aliasing and Anti-Aliasing

All the techniques we presented in the previous chapters are really the foundation of the rasterization algorithm. Though, we have only implemented these techniques in a very basic way. The GPU rendering pipeline and other rasterization based production renderers use the same concepts but they used highly optimized version of these algorithms. Presenting all the different tricks that are used to speed up the algorithm goes way beyond the scope of an introduction. We will just review some of them quickly now but we plan to devote a lesson to this topic in the future.

First, let's consider one basic problem with 3D rendering. If you zoom up on the image of the triangle that we renderered in the previous chapter, you will notice that the edges of the triangle are not regular (in fact, it is not specific to the edge of the triangle, you can also see that the checkerboard pattern is irregular on the edge of the squares). The steps that you can easily see in figure 1, are called jaggies. These jagged edges or stair-stepped edges (depending on how you prefer to call them) are not really an artifact. This is just the result of the fact that the triangle is broken down into pixels. What we do with the rasterization process is breaking down a continuous surface (the triangle) into discrete elements (the pixels), a process that we already mentioned in the introduction to rendering. The problem is similar to trying to represent a continuous curve or surface with Lego bricks. You simply can't, you will always see the bricks (figure 2). The solution to this problem in rendering is called anti-aliasing (also denoted AA). Rather that rendering only 1 sample per pixel (check if the pixel overlaps the triangle by testing if the point in the center of the pixel covers the triangles), we split the pixel into sub-pixels and repeat the coverage test for each sub-pixels. Of course each sub-pixel is nothing else than another brick thus this doesn't solve the problem entirely. Nonetheless it allows to capture edges of the objects with slightly more precision. Pixels are most of the time divided into a N by N number of sub-pixels where N is generally a power of 2 (2, 4, 8, etc.), though it can technically take on any value greater or equal to 1 (1, 2, 3, 4, 5, etc.). There are in fact different ways of addressing this aliasing issue. The method we described belong to the category of sampling-based anti-aliasing methods.

The pixel final color is computed as the sum of all the sub-pixels color divided by the total number of sub-pixels. Let's take an example (as with pixels, if a sub-pixel or sample covers a triangle, it then takes on the color of that triangle, otherwise it takes on the background color which is usully black). Imagine that the triangle is white. If only 2 or the 4 samples overlap the triangle, the final pixel color will be equal to 0+0+1+1/4=0.5. The pixel won't be completely white, but it won't be complete black either. Thus rather than having a "binary" transition between the edge of the triangle and the background, that transition is more gradual, which has for effect to visually reduce the stair-stepped pixels artifact. This is what we call anti-aliasing. The understand anti-aliasing you need to study signal processing theory, which again is a very large and pretty complex topic on its own.

The reason why it is best to choose N as a power of 2, is because most processors these days, can run several instructions in parallel and the number of instruction run in parallel is also generally a power of 2. You can look on the Web for things such as SSE instruction sets which are specific to CPUs, but GPUs use the same concept. SSE is a feature that is available on most modern CPU and that can be used to run generally 4 or 8 floating point calculations at the same (in one cycle). All this means is that, for the price of 1 floating point operation, you actually get 3 or 7 for free. This can in theory speed up your rendering time by 4 or 8 (you can never reach that level of performance though because you need to pay a small penalty for setting these instructions up). You can use SSE instructions for example to render 2x2 sup-pixels for the cost of computing 1 pixel, and as a result you get smoother edges (the stair-stepped edges are less visible).

Rendering Blocks of Pixels

Another common technique to accelerate rasterization is to render blocks of pixels, but rather than testing all pixels contained in the block, we first start to test pixels at the corners of the block. GPUs algorithms can use blocks of 8x8 pixels. In the fact, the technique that is being used is more elaborate and is based on a concept of tiles, but we won't detail it here. If all four corners of that 8x8 grid cover the triangle, then necessarily, the other pixels of the grid also cover the rectangle (as shown in figure 7). In that case, there is no need to test all the other pixels which saves obviously a lot of time. They can just be filled up with the triangle colors. If vertex attributes needs to interpolated across the pixels block, this is also straightforward because if you have computed them at the block's corners, then all you need to do is linearly interpolate them in both direction (horizontally and vertically). This optimisation only works when triangles are close to the screen and thus large in screen space. Small triangles don't benefit from this technique.

Optimizing the Edge Function

The edge function too can be optimized. Let's have a look at the function implementation again:

int orient2d(const Point2D& a, const Point2D& b, const Point2D& c) { return (b.x-a.x)*(c.y-a.y) - (b.y-a.y)*(c.x-a.x); }

Recall that a and b in this function are the triangle vertices and that c is the pixel coordinates (in raster space). Once interesting to note is that this function is going to be called for each pixel that is contained within the bounding box of the triangle. Though while we iterate over multiples pixels only c changes. The variable a and b stay the same. Suppose we evaluate the equation one time and get a result w0:

w0 = (b.x-a.x)*(c.y-a.y) - (b.y-a.y)*(c.x-a.x);

Then c.x gets incremented by an amount s (the per pixel step). The new value of w0 is going to be:

w0_new = (b.x-a.x)*(c.y-a.y) - (b.y-a.y)*(c.x+s-a.x);

Subtracting the first equation from the second, we get:

w0_new - w0 = -(b.y-a.y)*s;

-(b.y-a.y)*s is a constant value for a given triangle, because s is the same amount each time (one pixel), and a and b as already mentioned are constant too. We can calculate it once and store it in a variable (call it w0_step) and then the calculation reduces to:

w0_new = w0 + w0step;

You can do this for w1 and w2, and also do a similar thing for the c.y step.

The edge function uses 2 mults and 5 subs by with this trick it can be reduced to a simple addition (of course you need to compute a few initial values). This technique is well documented on the internet. We won't be using it in this lesson but we will study it in more detail and implement it in another lesson devoted to advanced rasterization techniques.

Fixed Point Coordinates

Figure 8: fixed point coordinates.

Finally and to conclude this section, we will briefly talk about the technique that consists of converting vertex coordinates which are initially defined in floating-point format to fixed-point format just before the rasterization stage. Fixed-point is the fancy word (in the fact the correct technical term) for integer. When vertex coordinates are converted from NDC to raster space, they are then also converted from floating point numbers to fixed point numbers. Why do we do that? There is no easy and quick answer to this question. But to be short, let's just say that GPUs use fixed point arithmetic because from a computing point of view, manipulating integers is easier and faster than manipulating floats or doubles (it only requires logical bit operations). Again this is just a very generic explanation. The conversion from floating point to integer coordinates and how is the rasterization process implemented using integer coordinates is a large and complex topic which is not documented on the Internet (you will find very little information about this topic which is very strange considering that this very process is central to the way modern GPUs work).

The conversion step involves to round off the vertex coordinates to the nearest integer. Though if you only do so, then you sort of snap the vertex coordinates to the nearest pixel corner coordinates. This is not so much an issue when you render a still image, but it creates visual artefacts with animation (vertices are snapped to different pixels from frame to frame). The workaround is to actually convert the number to the smallest integer value but to also reserve some bits to encode the sub-pixel position of the vertex (the fractional part of the vertex position). GPUs typical use 4 bits to encode sub-pixel precision (you can search graphics APIs documentation for the term sub-pixel precision). In other words, on a 32 bits integer, 1 bit might is used to encode the number's sign, 27 bits are used to encode the vertex integer position, and 4 bits are used to encode the fractional position of the vertex within the pixel. This means that the vertex position is "snapped" to the closest corner of a 16x16 sub-pixel grid as shown in figure 8 (with 4 bits, you can represent any integer number in the range [1:15]). Somehow the vertex position is still snapped to some grid corner, but snapping in this case is less of a problem than when the vertex is snapped to the pixel coordinates. This conversion process leads to many other issues one of which is integer overflow (overflow occurs when the result of an arithmetic operation produces a number that is greater than the number that can be encoded with the number of available bits). This may happen because integers cover a smaller range of values than floats. Things gets also sort of complex when anti-aliasing is thrown into the mix. It would take a lesson on its own to explore the topic in detail.

Fixed points coordinates allow to to speed up the rasterization process and the edge function even more. This is one of the reasons for converting vertex coordinates to integers. These optimisation techniques will be presented in another lesson.

Notes Regarding our Implementation of the Rasterization Algorithm

Finally, we are going to quickly review the code provided in the source code chapter. Here is a description of its main components:

Use this code for learning purposes only. It is not efficient at all. Many simple optimisations could be made but we wrote it for clarity, not efficiency. Our goal is not to write production code but code that helps to learn how things work in principle. Optimizations are left to you but trying to improve the code could be a good exercise.
The object that is being rendered by the program is stored in an include file. This is fine for this small program but this would never be done in a professional application. Geometry files can be very large and including this data in the program means that the size of the program will become very large as well (which is not a good thing, plus compile time is going to increase a lot). Though again for this simple test and because the object we are using is quite light, this is not an issue. Though more generally if you are insure about the way we represent the geometry of a 3D object in a program, check the lesson on this topic in the Modeling and Geometry section. The only information we will be using in this program are the triangle vertices positions (in world space) and their texture or st coordinates.

Here is the result the program should produce. On the left you can see a render of the object in Maya. On the right, our render. As you can see the results are identical in terms of objects position and shading. Maya uses a technique call stochastic sampling that has for effect to make the aliasing artefact slightly less visible than in our render, but the difference is really subtle.

As you can see, there is really nothing magic about rendering. When you know the rules, you can reproduce images produced by professional applications.

As a bonus, we exported the position in world space of the point on the triangle that each pixel in the image overlaps. We then displayed all the points in a 3D viewer. You can see the results below. Not surprisingly, points only show up on parts of the object that are directly visible to camera. This shows that the depth-buffer technique works as expected. Every triangle on the back of the object that is hidden by another triangle is not rendered. The second image just shows a close up of the same point set (right).

Conclusion

The main advantage of rasterization is simplicity and speed. The main drawback is that it is only useful to solve the visibility problem. Remember that rendering involves two steps: visibility and shading. The algorithm is of no use when it comes to shading.

In this lesson, we have told you everything you really needed to know about the rasterization algorithm. Everything else you can learn from there is not so much about the technique itself but more about optimizing it (how to run it in parallel, how to make it efficient, how to run in on the GPU, etc.).

We hope to have provided one of the best introductions to the rasterization technique. If you appreciated this work and learned something while reading this lesson, please consider making a donation.

Exercises

Reference

A Parallel Algorithm for Polygon Rasterization. Juan Pineda. Siggraph 1988.

A Subdivision Algorithm for Computer Display of Curved Surfaces. Edwin Earl Catmull. Thesis 1974.

Fundamentals of Texture Mapping and Image Wrapping. Paul S. Heckbert. Thesis 1989.