A Minimal Ray-Tracer: Rendering Simple Shapes (Sphere, Cube, Disk, Plane, etc.)

News (August, 31): We are working on Scratchapixel 3.0 at the moment (current version of 2). The idea is to make the project open source by storing the content of the website on GitHub as Markdown files. In practice, that means you and the rest of the community will be able to edit the content of the pages if you want to contribute (typos and bug fixes, rewording sentences). You will also be able to contribute by translating pages to different languages if you want to. Then when we publish the site we will translate the Markdown files to HTML. That means new design as well.

That's what we are busy with right now and why there won't be a lot of updates in the weeks to come. More news about SaP 3.0 soon.

We are looking for native Engxish (yes we know there's a typo here) speakers that will be willing to readproof a few lessons. If you are interested please get in touch on Discord, in the #scratchapixel3-0 channel. Also looking for at least one experienced full dev stack dev that would be willing to give us a hand with the next design.

Feel free to send us your requests, suggestions, etc. (on Discord) to help us improve the website.

And you can also donate). Donations go directly back into the development of the project. The more donation we get the more content you will get and the quicker we will be able to deliver it to you.

11 mns read.

Ray-Box Intersection

Figure 1: equation of a line. The equation of a line can be written as y=mx+b. For the oblique line which equation is y=x-1 we have m=1 and b=-1.

In the following example, we will assume that the box is aligned with the axes of our Cartesian coordinate system. Such a box is called an axis-aligned box or an axis-aligned bounding box (AABB). Boxes which are not oriented along the axis of a cartesian coordinate system are called oriented bounding box (OBB).

Computing the intersection of a ray with an AABB is quite simple. All we need is to remember that a straight line can be defined as:

$$y = mx + b$$

In mathematics, the \(m\) term is called the slope (or gradient) and is actually responsible for the orientation of the line and \(b\) corresponds to the point where the line intersects the y-axis. If this 1D line is parallel to the x-axis as shown in figure 1 (with the line whose equation is y=2), then \(m\) is equal to 0.

A ray can also be expressed as:

$$O + Dt$$

where \(O\) is the origin of the ray, \(D\) its direction and the parameter \(t\) can be any real value (negative, positive or zero). By changing \(t\) in this equation, we can define any point on the line defined by the ray's position and direction. This equation is very similar to the equation of a line. In fact they are the same: replace \(O\) with \(b\) and \(D\) with \(m\). To represent an axis-aligned bounding volume, all we need are two points representing the minimum and maximum extent of the box (called bounds in the code).

class Box3 { public: Box3(cont Vec3f &vmin, const Vec3f &vmax) { bounds[0] = vmin; bounds[1] = vmax; } Vec3f bounds[2]; };

The bounds of the volume define a set of lines parallel to each axis of the coordinate system which we can also expressed using the line equation. Let's say we have a line defined by the equation:

$$y = B0_x$$

Were \(B0.x\) is the x component of the bounding box minimum coordinates (that would be bounds[0].x in the code). To find where the ray intersects this line we can write:

$$O_x + tD_x = B0_x \text{ (eq1)}$$

Which can solved by reordering the terms:

$$t0x = (B0_x - O_x) / D_x \text{ (eq2)}$$

The x component of the bounding volume's maximum extent can be used in a similar way to compute the variable \(t1x\). Note than when the values for \(t\) are negative, intersections are "behind" the ray (behind with respect to the ray's origin \(0\) and its direction \(D\)).

Finally if we apply the same technique to the y and z components, at the end of this process, we will have a set of six values indicating where the ray intersects the planes defined by the six faces of the box.

$$ t0x = (B0_x - O_x) / D_x\\ t1x = (B1_x - O_x) / D_x\\ t0y = (B0_y - O_y) / D_y\\ t1y = (B1_y - O_y) / D_y\\ t0z = (B0_z - O_z) / D_z\\ t1z = (B1_z - O_z) / D_z\\ $$

Figure 2: a ray intersecting a 2D box.

Note that if the ray is parallel to an axis it won't intersect with the bounding volume plane for this axis (in this case, the line equation for the ray is reduced to the constant \(b\) and there's no solution to equation 1). The next step is to find which of these six values correspond to an intersection of the ray with the box (if the ray intersects the box at all). So far, we have just found where the ray intersects the planes defined by each face of the box. We know this can be found using each of the calculated \(t\) values. By looking at figure 2, the logic behind this test will become clear (we will just be considering the 2D case for this example). The ray first intersects the planes defined by the minimum extent of the box in two places: \(t0x\) and \(t0y\). However intersecting these planes doesn't necessarily mean that these intersecting points lie on the cube (if they don't lie on the cube, obviously the ray doesn't intersect the box). Mathematically, we can find which one of these two points lie on the cube by comparing their values: for that, we need to chose the point whose \(t\) is the biggest. In pseudo-code we can write:

t0x = (B0x - Ox) / Dx t0y = (B0y - Oy) / Dy tmin = (t0x > t0y) ? t0x : t0y

The process to find the second point where the ray intersects the box is similar however in that case, we will use the \(t\) values computed using the planes defined by the maximum extent of the box and select the point whose \(t\) is the smallest:

t1x = (B1x - Ox) / Dx t1y = (B1y - Oy) / Dy tmax = (t1x < t1y) ? t1x : t1y

Figure 3: a ray intersecting a 2D box.

The ray doesn't necessarily intersect the box. In figure 3, we are showing a couple of cases where the ray misses the cube. Comparing the \(t\) values can help us find these cases. As you can see on the figure, the ray misses the box when \(t0x\) is greater than \(t1y\) or when \(t0y\) is greater than \(t1x\). Let's add this test to our code:

if (t0x > t1y || t0y > t1x) return false

Finally we can extend the technique to the 3D case by computing \(t\) values for the z component and compare them to the \(t\) values we have computed so far for the x and y components:

t0z = (B0z - Oz) / Dz t1z = (B1z - Oz) / Dz if (tmin > t1z || t0z > tmax) return false if (t0z > tmin) tmin = t0z if (t1z < tmax) tmax = t1z

Here is a complete implementation of the method in C++. The variables min and max are the minimum and maximum extent (coordinates) of the bounding box:

bool intersect(const Ray &r) { float tmin = (min.x - r.orig.x) / r.dir.x; float tmax = (max.x - r.orig.x) / r.dir.x; if (tmin > tmax) swap(tmin, tmax); float tymin = (min.y - r.orig.y) / r.dir.y; float tymax = (max.y - r.orig.y) / r.dir.y; if (tymin > tymax) swap(tymin, tymax); if ((tmin > tymax) || (tymin > tmax)) return false; if (tymin > tmin) tmin = tymin; if (tymax < tmax) tmax = tymax; float tzmin = (min.z - r.orig.z) / r.dir.z; float tzmax = (max.z - r.orig.z) / r.dir.z; if (tzmin > tzmax) swap(tzmin, tzmax); if ((tmin > tzmax) || (tzmin > tmax)) return false; if (tzmin > tmin) tmin = tzmin; if (tzmax < tmax) tmax = tzmax; return true; }

Note that depending on the ray direction tmin might be greater than tmax. Considering that all the logic behind the tests we are doing later on relies on \(t0\) being smaller than \(t1\), we need to swap the values if \(t1\) is smaller than \(t0\).

Optimizing the Code

There is a couple of improvements we can make to this code to not only make it faster but also more robust. First, we can replace the swap operation with the following code:

if (ray.dir.x >= 0) { tmin = (min.x - r.orig.x) / ray.dir.x; tmax = (max.x - r.orig.x) / ray.dir.x; } else { tmin = (max.x - r.orig.x) / ray.dir.x; tmax = (min.x - r.orig.x) / ray.dir.x; }

However there is a problem with this code. When the value for the ray x, y or z component is 0, it causes a division by zero. The compiler though should handle this gracefully and return +INF if this happens, so at first glance this doesn't sound like a problem. The problem is that compilers don't make the difference between 0 and -0 (they consider that -0 == 0 and will return true). So if either one of the ray direction's component has a value of -0, the values tmin/tmax will be set to +INF and -INF respectively, instead of -INF and +INF. This is a problem because instead of executing the second block of the if-statement (the value is negative, not positive so the if (ray.dir.x >= 0) should return false), the first block will be executed instead, and consequently it will fail to detect the intersection (if there's one). To fix the problem, we can replace the ray direction by the inverse of the ray direction:

Vec3f invdir = 1 / ray.dir; if (invdir.x >= 0) { tmin = (min.x - r.orig.x) * invdir.x; tmax = (max.x - r.orig.x) * invdir.x; } else { tmin = (max.x - r.orig.x) * invdir.x; tmax = (min.x - r.orig.x) * invdir.x; }

When the ray direction is -0, the inverse value of the direction's component will be set to -INF and the test if (invdir.x >= 0) will now return false as desired.

In the eventuality of testing the intersection of the ray against many boxes, we can save some time by pre-computing the inverse direction of the ray in the ray's constructor and re-using it later in the intersect function. We can also pre-compute the sign of the ray direction which we can use to write the method in a much more compact form. Here is the final version of the ray-box intersection method:

class Ray { public: Ray(const Vec3f &orig, const Vec3f &dir) : orig(orig), dir(dir) { invdir = 1 / dir; sign[0] = (invdir.x < 0); sign[1] = (invdir.y < 0); sign[2] = (invdir.z < 0); } Vec3 orig, dir; // ray orig and dir Vec3 invdir; int sign[3]; }; bool intersect(const Ray &r) const { float tmin, tmax, tymin, tymax, tzmin, tzmax; tmin = (bounds[r.sign[0]].x - r.orig.x) * r.invdir.x; tmax = (bounds[1-r.sign[0]].x - r.orig.x) * r.invdir.x; tymin = (bounds[r.sign[1]].y - r.orig.y) * r.invdir.y; tymax = (bounds[1-r.sign[1]].y - r.orig.y) * r.invdir.y; if ((tmin > tymax) || (tymin > tmax)) return false; if (tymin > tmin) tmin = tymin; if (tymax < tmax) tmax = tymax; tzmin = (bounds[r.sign[2]].z - r.orig.z) * r.invdir.z; tzmax = (bounds[1-r.sign[2]].z - r.orig.z) * r.invdir.z; if ((tmin > tzmax) || (tzmin > tmax)) return false; if (tzmin > tmin) tmin = tzmin; if (tzmax < tmax) tmax = tzmax; return true; }

The optimized version runs on our machine approximately 1.5 faster than the first version (the speedup depends on the compiler).

The code from this lesson returns intersections with the box that are either in front or behind the origin of the ray. For instance, if the ray's origin is inside the box (adjacent image), there will be two intersections: one in front of the ray and one behind. We know that an intersection is "behind" the origin of the ray when \(t\) is negative. When \(t\) is positive, the intersection is in front of the origin of the ray (with respect to the ray's direction).

Check the file raybox.cpp (which you can find in the last chapter of this lesson) for a complete example.


An Efficient and Robust Ray–Box Intersection Algorithm. Amy Williams et al. 2004.