An Overview of the Ray-Tracing Rendering Technique
Keywords: ray-tracing, primary ray, camera ray, eye ray, secondary ray, diffuse ray, reflection ray, refraction ray, shader, ray, illumination model, triangle, recursion, light transport, visibility, Whitted-style ray-tracing.
One problem with discussing ray-tracing is that ray-tracing is an overloaded term.

Ray-tracing is a technique for computing the visibility between points. Light transport algorithms are designed to simulate the way light propagates through space (while interacting with objects). They are used (in short) to compute the color of a point in the scene. Do not mix ray-tracing with light transport algorithms. They are two different things. Ray-tracing is not a light-transport algorithm. It is only a technique to compute visibility between points.

Given a set of obstacles in the Euclidean space, two points in the space are said to be visible to each other, if the line segment that joins them does not intersect any obstacles. (definition of visibility on Wikipedia)

## An Overview on the Ray-Tracing Rendering Technique Figure 1: "sliding" a raster image in front of the camera image plane. Figure 2: a ray can hit or miss the scene geometry.

In the first lesson of this section we have already quickly introduced you to the concept of rendering in general and how to produce images of 3D scenes using ray-tracing in particular. We have mentioned the ray-tracing technique in other lessons such as the lesson Rendering an Image of a 3D Scene and Rasterization: a Practical Implementation but the goal in these lessons was more to highlight the differences between the rasterization and the ray-tracing techniques, which are the main two common frameworks (or rendering techniques) for rendering images of 3D objects.

Shall we re-introduce the ray-tracing rendering technique one more time? You should already be familiar with the concept but this lesson is the first in a series in which we will study techniques that are specific to ray-tracing so it seems like a good idea to review one more time what the method is about, how it works, the basic principles of the technique, and what they main differences between rasterization and ray-tracing are.

Ray-tracing is all about rays, and rays is one of the main topics of this lesson. We will learn what they are and how they can be defined in programming. As we know by now, a raster image is made out of pixels. One way of producing an image of a 3D scene, is to actually somehow "slide" this raster image along the image plane of our virtual camera (figure 1), and shoot rays though each pixel in the image, to find which part of the scene each pixel covers. The way we do that, is simply by casting a ray originating from the eye (the camera position) and passing through the centre of each pixel. We then find what object (or objects) from the scene, these rays intersect. If a pixel "sees" something surely it sees the object that is right in front of it in the direction pointed by that ray. The ray direction as we just mentioned can simply be constructed by tracing a line from the camera's origin to the pixel centre and then extending than line into the scene (figure 2).

Now that we know "what" a pixel sees, all we need to do, is repeat this process for every pixel in the image. By setting up the pixel color with the color of the object that each ray passing through the centre of each pixel intersects, then we can form an image of the scene as seen from a particular viewpoint. Note that this method requires to loop over all the pixels in the image and cast a ray into the scene for each one of these pixels. The second step, the intersection step, requires to loop over all the objects in the scene to test if a ray intersects any of these objects. Here is an implementation of this technique in pseudo-code:

// loop over all pixels Vec3f *framebuffer = new Vec3f[imageWidth * imageHeight]; for (int j = 0; j < imageHeight; ++j) { for (int i = 0; i < imageWidth; ++i) { for (int k = 0; k < numObjectsInScene; ++k) { Ray ray = buildCameraRay(i, j); if (intersect(ray, objects[k]) { // do complex shading here but for now basic (just constant color) framebuffer[j * imageWidth + i] = objects[k].color; } else { // or don't do anything and leave it black framebuffer[j * imageWidth + i] = backgroundColor; } } } }

Note that some of the rays might not intersect any geometry at all. For example if you look at figure 2, one of the rays does not intersect the sphere. In this particular case, we generally leave the pixel's color black or set it up to any other color we want the background to be (line 10). Figure 3: trying to recreate reality is difficult.

Note also that in the code above, we set the pixel color with the object color at the intersection point. Though objects from the real world don't look flat. They have complex appearances or looks (which is very hard to reproduce convincingly in painting for example - and that is because the appearance of objects is generally visually very complex). Their brightness changes depending on the amount of light they receive, some are shiny, some are matte, etc. The goal of photorealistic rendering is not only to depict accurately the geometry as seen from a given viewpoint (and solve the visibility problem) but also to simulate the appearance of objects convincingly. Thus, most likely this process involves something more complicated than just returning a constant color (which is what we have done in the code above). In computer graphics, the task of defining the object's actual color at any given point on its surface (knowing that this color changes depending as we just said on external parameters such as how much light an object receives as we all internal parameters such as how shiny or matter the object is) is called shading. It will take quite a few more lessons before we can start studying shading, but it is important that you know about it at this stage.

Ray-tracing is said to be image centric. The outer loops iterate over all the pixels in the image and the inner loop iterates over the objects in the scene. The rasterization algorithm in comparison, is object centric. It requires to loop over all the geometric primitives in the scene (the outer loop), project these primitives onto the screen and then loop over all pixels in the image to find which one of these pixels overlap the screen projected geometry (the inner loop). The inner and outer loop in the two algorithms are swapped. Here is a quick implementation of the rasterization loops in pseudo-code (you can find a complete description of this algorithm in the lesson Rasterization: a Practical Implementation).

// loop over all pixels Vec3f *framebuffer = new Vec3f[imageWidth * imageHeight]; for (int k = 0; k < numObjectsInScene; ++k) { // project object onto the scene using perspective projection ... for (int j = 0; j < imageHeight; ++j) { for (int i = 0; i < imageWidth; ++i) { if (pixelCoversGeometry(i, j, objects[k]) { framebuffer[j * imageWidth + i] = objects[k].color; } } } }

As mentioned several times in the previous lessons, both rasterization and ray-tracing can be used to solve the visibility problem. Though remember that the rendering process is a two steps process: it includes visibility and shading. Rasterization is good for visibility (and much faster than ray-tracing) but not as good as ray-tracing when it comes to shading. Ray-tracing on the other hand can be used for both. Figure 4: computing the visibility between points is easy with ray-tracing.

What are going to explain is pretty central to rendering, so take a moment to read it carefully. Often you will read that ray-tracing is better than rasterization for shading but most text books fail to explain why. (Almost) everything in rendering is about computing the visibility between a point in space and the first visible surface in a given direction, or the visibility between two points. The former is used to solve the visibility problem, the latter to solve things such as shadowing. Rasterization (combined with the depth-buffer algorithm) is very good to find the first visible surface, but is inefficient to solve the visibility between two points. Ray-tracing on the other end, can handle both cases efficiently. Finding the first visible surface is useful to solve the visibility problem. Thus both ray-tracing and rasterization do a good job at that. Shading on the other hand, requires to solve visibility between surfaces. This is used to compute shadows, soft shadows when area lights are used, and more generally global illumination effects such as reflection, refraction, indirect reflections and indirect diffuse. For this particular part of the rendering process, ray-tracing is thus more efficient than rasterization. But keep in mind that any technique that computes the visibility between points can be used for shading and solving the visibility problem. It doesn't have to either be ray-tracing or rasterization but ray-tracing is somehow the lazy way of doing it, though it comes at a price as we will see at the end of this chapter. We will come back on this important concept in the next chapter when we will talk about light transport algorithms.

Ray-tracing sounds like a much better method than rasterization when you look at it that way, though the main issue with ray-tracing is that it involves to compute the intersection of rays with geometry which is a costly operation (i.e. slow). In real-time graphics applications, speed is more important than photo-realism which is why GPUs use rasterization. When photo-realism is more important than speed, ray-tracing is a much better choice, though it is harder to produce images with ray-tracing at interactive frame rates and even less so at real-time frame rates.

"Ray tracing isn't too slow; computers are too slow." (Kajiya 1986)

In summary, using ray-tracing to compute photo-realistic images of 3D objects can be decomposed in essentially three steps:

• Casting Rays: cast a ray for each pixel in the image
• Ray-Geometry Intersection: test if a ray intersects any of the objects in the scene (this requires to loop over all the objects for each cast ray).
• Shading: find out what the object "looks like" at the intersection point between the ray and the object (if an intersection has occurred).

Let's now review these steps one by one.

## Casting Rays into the scene

The first thing we need to do in order to create an image in ray-tracing, is to cast a ray for each pixel in the image. These rays are called camera or primary rays (because they are simply the first rays we will cast into the scene). As we will see later in this lesson, more rays can be spawned from primary rays. These other rays are called secondary rays. They are used to find if a given point in the scene is in shadow or to compute shading effects such as reflection or refraction. When a primary ray is cast into the scene, the next step is to find if it intersects any object in the scene.

When primary rays are used to solve the visibility problem, we can use the term ray-casting.

Generating primary or camera rays is the topic of the next lesson (Ray-Tracing: Generating Camera Rays).

## Testing for Ray-Geometry Intersections

Testing if a ray intersects any object in the scene requires to loop over all the objects in the scene and test the current object against the ray.

Ray ray = buildCameraRay(i, j); for (int k = 0; k < numObjects; ++k) { if (intersect(ray, objects[k]) { // this ray intersects objects[k] } }

Geometry in 3D can be defined in many different ways. Simple shapes such as sphere, disks, planes can be defined mathematically, or parametrically. The shapes of more complex objects can only be described using polygon meshes, subdivisions surfaces or NURBS surfaces (or something equivalent). Figure 5: triangulated meshes are easier to ray-trace than other types of geometry.

The first category of objects (objects represented mathematically) can often be tested for a ray intersection using a geometric or an analytical approach. The ray-sphere intersection test for example can be resolved using both approaches. This is the topic of the lesson Ray-Tracing: Rendering Simple Shapes.

The main problem with the second category of objects is that a ray-geometry intersection methods needs to be implemented for each supported type of surface. NURBS surfaces for example can be ray-traced directly though the solution for doing so is very different from the solution being used to test the intersection between a ray and a polygon mesh. Thus, if you want to support all geometry types (subdivision surface, NURBS, polygon mesh), you need to write one ray-geometry intersection routine per supported type of geometry. This potentially adds a fair degree of complexity to the program's code. The alternative solution (which is the approach chosen by almost ever professional program) is to convert each geometry type into the same internal representation, which in almost 99% of the cases, is going to be a triangulated polygon mesh. Ray-tracing loves triangles!

Why triangles? As suggested, first, it is a better approach to convert the different geometry types to the same internal geometry representation (a process also called tessellation), and it happens that converting almost any type of surface to a polygon mesh is generally easy. From there, converting a polygonal mesh to a triangulated mesh is also pretty straightforward (the only difficulty is with concave polygons). Though triangles are also advantageous because they can be used as a the basic geometric primitive in both ray-tracing and rasterization. Why do both algorithms love triangles is because they have interesting geometrical properties that other types don't have. They are co-planar which is not necessarily the case of faces with more than three vertices. Computing their barycentric coordinates using the edge function approach which we described in the lesson on rasterization is also straightforward. Barycentric coordinates play an important role in shading.

For this reason, a lot of research went into finding the best possible method for testing if a ray intersects a triangle. This problem can be solved geometrically but this is not necessarily the fastest method. Other methods using algebraic solutions have been developed over time. A good ray-triangle intersection routine is one that is both fast and numerically stable. This topic will be studied in the lesson Ray-Tracing: Rendering a Triangle.

Polygonal meshes or other types of surfaces are thus converted to triangles. This can be done either before you load the geometry into the memory of your program (in your modeling tool, before you export the mesh to the renderer), or while you load the geometry at render time. We will use the second option in this series of lessons. Note though that now, not only each camera ray needs to be tested against each object in the scene, but also against each individual triangle making up each polygon object in the scene. In other words, the loops now looks like the following:

for (int j = 0; j < imageHeight; ++j) { for (int i = 0; i < imageWidth; ++i) { for (int k = 0; k < numObjects; ++k) { for (int n = 0; n < objects[k].numTriangles; ++n) { Ray ray = buildCameraRay(i, j); if (intersect(ray, objects[k].triangles[n])) { framebuffer[j * imageWidth + i] = shade(ray, objects[k], n); } } } } }

What this practically means, is that the time it takes to render a scene in ray-tracing is directly proportional to the number of triangles the scene contains. This is not necessarily the case with rasterization. Of course the more triangles you have with a rasterizer the longer it takes to render the frame, though many of these triangles can be thrown away before the render even starts (for example the triangles facing away from the camera if back face culling is on). This is not possible in ray-tracing for a reason we will explain in the shading part of this chapter. All triangles need to be stored in memory and each one of them need to be tested when a ray is cast into the scene. The high-computational cost of ray-tracing (and the fact that render time grows linearly with the number of triangles the scene contains) is one of the algorithm main curses.

### The Trace Function

In ray-tracing the inner loops (the loop over all the objects and all the triangles each object contains) are generally moved to a function which is typically called trace(). The function returns true if an object/triangle was intersected and false otherwise.

bool trace( const Vec3f& rayOrigin, const Vec3f &rayDirection, const Object* objects, const uint32_t &numObjects, uint32_t &objectIndex, uint32_t &triangleIndex ) { bool intersect = false; for (uint32_t k = 0; k < numObjects; ++k) { for (uint32_t n = 0; n < objects[k].numTriangles; ++n)) { if (rayTriangleIntersect(rayOrigin, rayDirection, objects[k].triangles[n]) { intersect |= true; objectIndex = k; triangleIndex = n; } } } return intersect; } int main(...) { ... uint32_t objectIndeix, triangleIndex; for (int j = 0; j < imageHeight; ++j) { for (int i = 0; i < imageWidth; ++i) { if (trace(rayOrigin, rayDirection, objects, numObjects, objectIndex, triangleIndex)) { framebuffer[j * imageWidth + i] = shade(rayOrigin, rayDirection, objects[objectIndex], triangleIndex); } } } ... return 0; } Figure 6: a ray can intersect several objects.

As you can see in figure 6 though, a ray may intersect several triangles. With rasterization, we solve this problem using a depth-buffer. In ray-tracing, we solve this problem by simply creating a variable in the trace() function that keeps track of the closest distance between the ray origin and the intersection point while we iterate over the objects' triangles. When an intersection is found, the distance $t$ to the triangle is tested against this variable (let's call it $t_{nearest}$) which is originally set to infinity (a very large number). If $t$ is smaller than $t_{nearest}$ then $t_{nearest}$ is set to $t$. In the end, when all the triangles of all the objects have been tested, $t_{nearest}$ contains the distance to the closest triangle that the ray intersected if any (we will need to access this triangle again at the shading stage, so we also store some information about that triangle such as the object it belongs to as well as its index in that object's triangle list).

We will talk about rays in more details in the next lesson, though let's just say that a ray is defined by its origin and its (normalized) direction. Then the distance $t$ defines the distance from a point along the ray to the ray's origin. The 3D position of this point on the ray can be found using the following code:
Vec3f P = rayOrigin + t * rayDirection;
bool trace( const Vec3f& rayOrigin, const Vec3f &rayDirection, const Object* objects, const uint32_t &numObjects, uint32_t &objectIndex, uint32_t &triangleIndex, float &tNearest ) { tNearest = INFINITY; // initialize to infinity bool intersect = false; for (uint32_t k = 0; k < numObjects; ++k) { for (uint32_t n = 0; n < objects[k].numTriangles; ++n) { float t; if (rayTriangleIntersect(rayOrigin, rayDirection, objects[k].triangles[n], t) && t < tNearest) { objectIndex = k; triangleIndex = n; tNearest = t; // update tNearest with the distance to the closest intersect point found so far intersect |= true; } } } return intersect; } Figure 7: geometry information such as normals or texture coordinates is used in shading.

At the shading stage, we will need to access some information about the topology of the surface at the intersection point, such as the position of the intersection point itself (generally denoted P), but also the normal of the surface at the intersection point (the surface orientation which is generally denoted N) as well as its texture coordinates (the st or uv coordinates). Other quantities may be needed such as the surface derivatives at the intersection but this is an advanced topic. Finding the normal of the surface at the intersection point is often simple. For triangles, we just need to compute the cross product between two edges of the intersected triangles. For spheres, if we know the position of sphere centre (let's call it C), we just need to compute the vector ||P-C||. More information on this topic can be found in the next lessons. Figure 8: distortion effect is caused by the phenomenon of refraction.

Even though ray-tracing has a high computational cost compared to other approaches such as rasterization, it also has advantages that makes it far more appealing than other rendering techniques. This is particularly true when it comes to shading.

Once we found the object that a ray intersects, we then need to find out what the color of the object is at the intersection point. The color of objects or their intensity generally varies greatly across the surface of any object. This is due to change in lighting as well as the fact that the texture of objects changes across their surface. The color of an object at any point on its surface is "just" the result of the way the object returns or reflects light that impinges on its surface towards the observer. The color at the intersection point depends on: Figure 9: of all the light that is falling on the surface at P, how much of that light is reflected towards the observer?

• How much light (in the real world the concept of light color and intensity are actually one and the same) impinges on the object surface (at the intersection point but also away from the point if the surface is transparent or translucent but that is an advanced topic).
• The direction of that light.
• The property of the surface itself, notably its color.
• The observer position. Most surfaces don't reflect light equally in all directions. Thus the amount of reflected light is likely to change with the camera or observer's position.

We will review this process in detail when get to the lessons on shading. For now, all we want you to note, is that beside the property of the object (such as its color), shading has very much to do with "gathering" information about light that falls on the object surface. In nature, this process can be described as "light rays" emitted from light sources (or indirectly by other surfaces in the scene that reflect light rays of light sources towards other surfaces, a process known as indirect lighting), falling on the surface of objects and being reflected back into the scene. Some of these reflected rays will hit more surfaces and will be reflected again in turn. Other rays, might reach the observer's eyes or the camera. In the real word, light travels from light sources to the eye (figure 8). Though in ray-tracing we generally choose to trace the path of light rays back from the eye to the surface, and then from the surface to the light source, a process which we call backward tracing because it follows the natural propagation of light backward (figure 9).

In an obvious approach to ray tracing, light rays emanating from a source are traced through their paths until they strike the viewer. Since only a few will reach the viewer, this approach is wasteful. In a second approach suggested by Appel, rays are traced in the opposite direction -- from the viewer to the objects in the scene." An Improved Illumination Model for Shaded Display. Turner Whitted, 1980. Figure 10: the function of an illumination model is to define how much light is reflected in the direction $\omega_o$ based on the surface properties such as its color, light intensity and the incident light direction $\omega_i$.

We won't explain how shading works in this chapter or this lesson, though we will provide some information about light transport algorithms in the next chapter. The function of a light transport algorithm is to simulate the way light (in the form of energy) is distributed through the scene. How is a given material reflecting this light back into the environment, is defined by a mathematical model, or illumination model in computer graphics (if you heard the term BRDF before this is more or less what the role of that BRDF is). The Phong model (which is well known) is an example of illumination model. Illumination models are generally implemented in what we call shaders. Shaders are generally implemented as some sort of function that takes as input the incident light direction $\omega_i$ (the Greek letter omega) and intensity, the direction from which the material is looked at $\omega_o$, and return based on the material properties and the illumination model being used, the amount of incident light that is reflected in the direction $\omega_o$. Figure 11: primary and secondary rays (reflection and shadow rays). Note that each one of these secondary rays involves to compute the visibility between two points. For example when we cast a reflection ray from the mirror into the scene, we want to compute the first visible surface that the reflection ray will eventually intersect (which in our example is the diffuse object). When we cast a shadow ray, we want to find if this shadow ray intersects another object before reaching the light. And the easiest way of computing visibility between two points, as you can guess, is of course to use ray-tracing.

Ray-tracing has always been known for being the best solution to generate photo-realistic images of 3D objects, though keep in mind that until recently (up to the late 1990s early 2000s), the technique itself was considered way too expensive for being used in anything else than some research projects. Keep in mind also that computer resources increased but that the complexity of the scenes that are being rendered increases as well. Thus the actual time it takes to render these scenes hasn't changed much over the years. This effect is known as the Blinn's law.

In conclusion we can say that any light transport algorithm that requires heavy computation of visibility (such as when reflection or shadow rays are cast into the scene) is likely to be more efficient with raytracing. Q.E.D. (quod erat demonstrandum, meaning "which had to be proven").

## Ray-Tracing: the Challenges

Ray-tracing is very simple in its principle and "easy" to implement. The most complicated part of the process is to write code or routines to compute the intersection of rays with objects. This problem is often solved using either a geometric or analytical solutions, both implies a lot of mathematics. The main issue though beside the maths, is that computing these intersection tests is also often an expensive process. Though despite this main disadvantage the technique is elegant in its simplicity and make it possible to solve both visibility and shading in one unified framework.

When we say ray-tracing is easy to implement, we should probably precise though that developing a production ray-tracer is a very complex task. Implementing a basic prototype is simple but making the technique robust so that it can be used on large production, is a very challenging problem. A production ray-tracing needs to support many features such as displacement, motion-blur, programmable shading, etc. Many of these features are hard to get working efficiently in ray-tracing. Ray-tracing comes with a whole set of problems the rasterization algorithm doesn't have. These problems are difficult to solve efficiently. One of the main problems with ray-tracing is that it requires all the scene geometry to be stored in memory while the image is being rendered. This is not necessarily the case with rasterization: once we know that an object won't be visibly anywhere else on the screen we can throw it away (and remove it from memory). Triangles facing away from the camera can also be thrown away if back-face culling is on. In ray-tracing, even if an object is not visible in the scene, it might cast shadows on over objects visible by the camera, and thus need to be kept in memory until the last pixel in the image is processed. With production scenes containing millions of triangles, the amount of memory that is required to store the data can become a problem. Figure 13: ray-tracing is very memory consuming.

The ray-intersection step as mentioned in this lesson is the most expensive part of the rendering process in ray-tracing. Making the ray-geometry routine fast is generally not enough. Ray-tracers use what we call acceleration structures to speed up the process. Acceleration structures are spatial structures used to "sort" the geometry in space to avoid having to test all the objects in the scene for each cast ray. Acceleration structures won't be studied in this introduction on rendering, but you will find lessons on the topic in the ray-tracing advanced section. They work well and help to speed up ray-tracing by a great factor but they also use a lot of memory which adds up to the memory that is being already used to store the geometry.

Finally you also need to store information related to the motion of objects. This increase even further the amount of memory needed to store the scene data. All in all, ray-tracers generally use for the same scene far more memory than a rasterizer. Ray-tracing is useful to produce more realistic image but at the expense of being slower and using more memory than a program based on the rasterization algorithm. The algorithm is simple conceptually, but very hard to get working reliably and efficiently on very large and complex scenes with complex shading.

## Ray-Tracing on the GPU?

The ray-tracing algorithm was first successfully ported on the GPU in the mid-90. Quite a lot of research has been made in this area, though the ray-tracing algorithm is not yet natively supported by most consumer graphics card (hardware to accelerate ray-tracing has been around for a very long while). It is very likely that GPUs will support ray-tracing natively in the years to come. Though as mentioned above, it needs to overcome issues such as memory management, etc. which as you know with ray-tracing is a more difficult problem than with rasterization.