Rendering an Image of a 3D Scene: an Overview
Keywords: ray tracing, rasterization, rasterisation, pixel, discretisation, raster, aliasing, vector graphics, triangle, triangulation, smooth shading, sampling, tessellation, polygonal mesh, NURBS, subdivision surface, voxel, implicit surface, meatballs, bloodies, rendering primitive, perspective projection, foreshortening effect, orthographic projection, perspective divide, hidden surface elimination, hidden surface determination, hidden surface removal, occlusion culling, visible surface determination, shading, absorption, reflection, normal, scattering, shader, light transport, forward tracing, backward tracing, indirect diffuse, wireframe, frustum, screen space, raster space, depth sorting algorithm, acceleration structure, law of reflection, reflected direction, incident direction, Snell law, Fresnel equation, roughness, specular reflection, subsurface scattering, indirect diffuse, indirect specular, caustics, soft shadows, global illumination, radiosity, Monte Carlo ray tracing, primary ray, secondary ray, shadow ray, unidirectional path tracing, Appel, Whitted, photon mapping, albedo, energy conservation, physically plausible rendering, cosine law.

Introduction

Lesson 1, provided you with a quick introduction to some important concepts in rendering and computer graphics in general, as well as the source code of a small ray tracer (with which we rendered a scene containing a few spheres). Ray tracing is a very popular technique for rendering a 3D scene (mostly because it is easy to implement and also a more natural way of thinking of the way light propagates in space, as quickly explained in lesson 1), however other methods exist. In this lesson, we will look at what rendering really means, what sort of problems we need to solve to render an image of a 3D scene as well as quickly review the most important techniques that were developed to solve these problems specifically: our studies will be focused on the ray tracing and rasterization method, two popular algorithms used to solve the visibility problem (finding out which objects making up the scene is visible through the camera). We will also look at shading, the step in which the appearance of the objects as well as their brightness is defined.

The journey in the world of computer graphics starts... with a computer. It might sound strange to start this lesson by stating what may seems obvious to you, but it is in fact so obvious that we do take this for granted and never think of what it really means when it comes to making images with a computer. More than a computer, what we should be concerned about is how we display images with a computer: the computer screen. Both the computer and the computer screen have something important in common. They work with discrete structures to the contrary of the world around us, which is made of continuous structures (at least at the macroscopic level). These discrete structures are the bit for the computer and the pixel for the screen. Let's take a simple example. Take a thread in the real world. It is indivisible. But the representation of this thread onto the surface of a computer screen requires to "cut" or "break" it into small pieces called pixels. This idea is illustrated in figure 1.

Figure 1: in the real world, everything is "continuous". But in the world of computers, an image is made of discrete blocks, the pixels.

Figure 2: the process of representing an object on the surface of a computer can be seen as if a grid was laid out on the surface of the object. Every pixel of that grid overlapping the object is filled in with the color of the underlying object. But what happens when the object only partially overlaps the surface of a pixel? Which color should we fill the pixel with?

In computing the process of actually converting any continuous object (a continuous function in mathematics, a digital image of a thread) is called discretisation. Obvious? Yes and yet, most problems if not all problems in computer graphics comes from the very nature of the technology a computer is based onto: 0, 1 and pixels.

You may still think "who cares?". For someone watching a video on a computer, it's probably not very important indeed. But if you have to create this video, this is probably something you should care about. Think about this. Let's imagine we need to represent a sphere on the surface of a computer screen. Let's look at a sphere and apply a grid of top of it. The grid represents the pixels your screen is made of (figure 2). The sphere overlaps some of the pixels completely. Some of the pixels are also empty. However, some of the pixels have a problem. The sphere overlaps them only partially. In this particular case what should we fill the pixel with: the color of the background or the color of the object?

Intuitively you might think "if the background occupies 35% of the pixel area, and the object 75%, let's assign a color to the pixel which is composed of the background color for 35% and of the object color for 75%". This is pretty good reasoning, but in fact, you just moved the problem around. How do you compute these areas in the first place anyway? One possible solution to this problem is to subdivide the pixel into sub-pixels and count the number of sub-pixels the background overlaps and assume all over sub pixels are overlapped by the object. The area covered by the background can be computed by taking the number of sub-pixels overlapped by the background over the total number of sub-pixels.

Figure 4: to approximate the color of a pixel which is both overlapping a shape and the background, the surface can be subdivided into smaller cells. The pixels color can be found by computing the number of cells overlapping the shape multiplied by the shape's color plus the number of cells overlapping the background multiplied by the background color, divided by the entire number of cells. However no matter how small the cells, some of them will always overlap both the shape and the background.

However, no matter how small the sub-pixels, there will always be a some of them overlapping both the background and the object. While you might get a pretty good approximation of the object and background coverage that way (the smaller the sub-pixels the better the approximation), it will always just be an approximation. Computers can only approximate. Different techniques can be used to compute this approximation (subdividing the pixel into sub-pixels is just one of them), but what we need to remember from this example, is that a lot of the problems we will have to solve in computer sciences and computer graphics, comes from having to "simulate" the world which is made of continuous structures with discrete structures. And having to go from one to the other raises in fact all sort of complex problems (or maybe simple in their comprehension, but complex in their resolution).

Another way of solving this problem, is also obviously to increase the resolution of the image. In other words, to represent the shame shape (the sphere) using more pixels. However, even then, are we limited by the resolution of the screen.

Images and screens using a two-dimensional array of pixels to represent or display images, are called raster graphics and raster displays respectively. The term raster more generally defines a grid of x and y coordinates on a display space. We will learn more about rasterisation, in the chapter on perspective projection.

As suggested, the main issue with representing images of objects with a computer, is that the object shapes needs to be "broken" down into discrete surfaces, the pixels. Computers more generally can only deal with discrete data, but more importantly the definition with which numbers can be defined in the memory of the computer, is limited by the numbers of bits used to encode these numbers. The number of colors for example that you can display on a screen is limited by the number of bits used to encode RGB values. In the really early days of computers, a single bit was used to encode the "brightness" of pixels on the screen. When the bit had the value 0 the pixel was black and when it was 1, the pixel would be white. The first generation of computers using color displays, encoded color using a single byte or 8 bits. With 8 bits (3 bits for the red channel, 3 bits for the green channel and 2 bits for the blue channel) you can only define 256 distinct colors (2^3 * 2^3 * 2^2). What happens then when you want to display a color which is not one of the colors you can use? The solution is to find the closest possible matching color from the palette to the color you ideally want to display, and display this matching color instead. This process is called color quantization.

Figure 5: our eyes can perceive very small color variation. When too few bits are used to encode colors, banding occurs (right).

The problem with color quantization is that when we don't have enough colors to accurately sample a continuous gradation of color tones, continuous gradients appear as a series of discrete steps or bands of color. This effect is called banding (it's also known under the term posterisation or false contouring).

There's obviously no need to care about banding so much these days (the most common image formats uses 32 bits to encode colors. With 32 bits you can display about 16 millions distinct colors), however keep in mind that fundamentally, colors and pretty much any other continuous function that we may need to represent in the memory of a computer, have to be broken down into a series of discrete or single quantum values which precision is limited by the number of bits used to encode these values.

Figure 6: the shape of objects smaller than a pixel can't be accurately captured by a digital image.

Finally, having to break down continuous function into discrete values may lead to what's know in signal processing and computer graphics as aliasing. The main problem with digital images is that the amount of details you can capture depends on the image resolution. The main issue with this is that small details (roughly speaking, details smaller than a pixel) can't be captured by the image accurately. Imagine for example that you want to take a photograph with a digital camera of a teapot which is so far away though that the object is smaller than a pixel in the image (figure 6). A pixel is a discrete structure thus we can only fill it up with a constant color. If in this example, we fill it up with the teapot's color (assuming the teapot has a constant color which is probably not the case if it's shaded), your teapot will only show up as a dot in the image: you failed capturing the teapot's shape (and shading). In reality, aliasing is far more complex than that, but you should know about the term and know for now, and should keep in mind that by the very nature of digital images (because pixels are discrete elements), an image of a given resolution can only accurately represent objects of a given size. We will explain what the relationship between the objects size and the image resolution is in the lesson on Aliasing (which you can find in the Mathematics and Physic of Computer Graphics section)

Images are just a collection of pixels. As mentioned before, when an image of the real world is stored in a digital image, shapes are broken down into discrete structures, the pixels. The main drawback of raster images (and raster screens) is that the resolution of the images we can store or display is limited by the image or the screen resolution (its dimension in pixels). Zooming in, doesn't reveal more details in the image. Vector graphics were designed to address this issue. With vector graphics, you do not store pixels but represent the shape of objects (and their colors) using mathematical expressions. That way, rather than being limited by the image resolution, the shapes defined in the file can be rendered on the fly at the desired resolution, producing an image of the object's shapes which is always perfectly sharp.

All we wanted to put the emphasise on in this introduction, is again and to summarize, that computers work with quantum values when in fact, processes from the real world which we want to simulate with computers, are generally (if not always) continuous (at least at the macroscopic and even microscopic scale). And in fact, this is a very fundamental issue which is causing all sort of very puzzling problems, to which a very large chunk of computer graphics research and theory is devoted to.

Another field of computer graphics in which the discrete representation of the world is a particular issue, is fluid simulation. The flow of fluids by their very nature is a continuous process, but to simulate the motion of fluids with a computer, we need to divide space into "discrete" structures generally small cubes called cells.