What Will You Learn in this Lesson?
In the previous lesson, we learned about some key concepts involved in the process of generating images; however, we didn't speak specifically about cameras. 3D rendering is not only about producing realistic images by means of perspective projection. It is also about being able to deliver images similar to those of real-world cameras. Why? Because when CG images are combined with live-action footage, images delivered by the renderer need to match images delivered by the camera with which that footage was produced. In this lesson, we will develop a camera model that allows us to simulate results produced by real cameras (we will use real-world parameters to set the camera). To do so, we will start by reviewing how film and photographic cameras work.
More specifically, we will show in this lesson how to implement a camera model similar to that used in Maya and most (if not all) 3D applications (such as Houdini, 3DS Max, Blender, etc.). We will show the effect each control you can find on a camera has on the final image and how to simulate these controls in CG. This lesson will answer all questions you may have about CG cameras, such as what the film aperture parameter does and how the focal length parameter relates to the angle of view parameter.
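As a first taste of how these parameters relate, here is a minimal sketch of the usual trigonometric relation between film aperture, focal length and angle of view. It assumes an ideal pinhole model and that both lengths are expressed in the same unit; the function name `angle_of_view_deg` and the sample values are ours, chosen for illustration only, and are not taken from any particular application.

```python
import math


def angle_of_view_deg(film_aperture: float, focal_length: float) -> float:
    """Full angle of view in degrees (both arguments in the same unit)."""
    # Half the film width and the focal length form a right triangle
    # whose apex sits at the center of projection.
    half_angle = math.atan((film_aperture / 2.0) / focal_length)
    return math.degrees(2.0 * half_angle)


if __name__ == "__main__":
    # Hypothetical 36 mm wide film gate, three focal lengths in mm.
    for focal in (18.0, 35.0, 70.0):
        print(f"{focal:5.1f} mm -> {angle_of_view_deg(36.0, focal):6.2f} degrees")
```

With the film aperture held constant, a longer focal length gives a narrower angle of view, which is the behavior a zoom lens produces.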
While the optical laws involved in the process of generating images with real-world cameras are simple, they can be hard to reproduce in CG, not because they are complex to simulate but because they are potentially very costly to simulate. Fortunately, you don't need very complex cameras to produce images. In fact, it's quite the opposite. You can take photographs with a very simple imaging device called a pinhole camera, which is just a box with a small hole on one side and photographic film lying on the other. Images produced by pinhole cameras are much easier (and less costly) to reproduce than those produced with more sophisticated cameras, and for this reason, the pinhole camera is the model used by most (if not all) 3D applications and video games. Let's start by reviewing how these cameras work in the real world and build a mathematical model from there.
Camera Obscura: How is an Image Formed?
Most algorithms we use in computer graphics simulate how things work in the real world. This is particularly true of virtual cameras, which are fundamental to the process of creating a computer graphics image. The creation of an image in a real camera is actually pretty simple to reproduce with a computer. It mainly relies on simulating the way light travels in space and interacts with objects, including camera lenses. The light-matter interaction process is highly complex, but the laws of optics are relatively simple and can easily be simulated in a computer program. There are two main parts to the principle of photography:
- The process by which an image is stored on film or to a file.
- The process by which this image is actually created in the camera.
In computer graphics, we don't need a physical support to store an image, so simulating the photochemical processes used in traditional film photography won't be necessary (unless, like the Maxwell renderer, you want to provide a realistic camera model, but this is not necessary to get a basic model working).
Now let's talk about the second part of the photography process: how images are formed in the camera. The basic principle of the image creation process is very simple and is shown in the reproduction of this illustration published in the early 20th century (figure 1). In the setup from figure 1, the first surface (in red) blocks light from reaching the second surface (in green). However, if you make a small hole (a pinhole) in the first surface, light rays can pass through it at one point and, by doing so, form an (inverted) image of the candle on the other side (if you follow the path of the rays from the candle to the surface onto which the image of the candle is projected, you can see how the image is geometrically constructed). In reality, the image of the candle will be very hard to see because the amount of light passing through point B is very small compared to the overall amount of light emitted by the candle itself (only a fraction of the light rays emitted by the flame or reflected off the candle will pass through the hole).
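To get a feel for how small that fraction is, here is a rough sketch. It makes strong simplifying assumptions that are ours, not the illustration's: the candle is treated as a single point emitting light equally in all directions, and the hole is a circle directly facing it. The function name and the sample dimensions are hypothetical.

```python
import math


def fraction_through_hole(hole_radius: float, distance: float) -> float:
    """Fraction of the light of an isotropic point source passing through a hole.

    Assumes a circular hole centered on, and perpendicular to, the line
    joining it to the source. Both arguments use the same unit.
    """
    # Half-angle of the cone of directions that reach the hole.
    half_angle = math.atan(hole_radius / distance)
    # A cone of half-angle a covers a solid angle of 2*pi*(1 - cos(a));
    # the full sphere is 4*pi. The ratio (1 - cos(a)) / 2 equals
    # sin(a / 2)**2, which is numerically safer for tiny angles.
    return math.sin(half_angle / 2.0) ** 2


if __name__ == "__main__":
    # Hypothetical setup: a 1 mm wide hole, 200 mm away from the flame.
    print(fraction_through_hole(hole_radius=0.5, distance=200.0))
```

For these sample values, only on the order of a millionth of the emitted light goes through the hole, which is why the projected image is so dim.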
A camera obscura (which in Latin means dark room) works on the exact same principle. It is a lightproof box or room with a black interior (to prevent light reflections) and a tiny hole in the center of one end (figure 2). Light passing through the hole forms an inverted image of the external scene on the opposite side of the box. This simple device led to the development of photographic cameras. You can easily convert your own room into a camera obscura, as shown in this video from National Geographic (all rights reserved):
To perceive the projected image on the wall, your eyes first need to adjust to the darkness of the room, and to capture the effect with a camera, long exposure times are needed (from a few seconds to half a minute). To turn your camera obscura into a pinhole camera, all you need to do is put a piece of film on the face opposite the pinhole. If you wait long enough (and keep the camera perfectly still), light will modify the chemicals on the film and a latent image will form over time. The principle for digital cameras is the same, but the film is replaced by a sensor that converts light into electrical charges.
How Does a Real Camera Work?
In a real camera, images are created when light falls on a surface that is sensitive to light (note that this is also true for the eye). For a film camera, this is the surface of the film, and for a digital camera, this is the surface of a sensor (such as a CCD). Some of these concepts have been explained in the lesson Introduction to Ray-Tracing, but we will explain them again here briefly.
In the real world, light comes from various light sources (the most important one being the sun). When light hits an object, it can either be absorbed or reflected back into the scene. This phenomenon is explained in detail in the lesson devoted to light-matter interaction, which you can find in the section Mathematics and Physics for Computer Graphics. When you take a picture, some of that reflected light (in the form of packets of photons) travels in the direction of the camera and passes through the pinhole to form a sharp image on the film or digital camera sensor. We have illustrated this process in figure 3.
If you remove the back door of a disposable camera and replace it with a translucent plastic sheet, you should be able to see the inverted image that is normally projected onto the film (as shown in the images below).
The simplest type of camera we can find in the real world is the pinhole camera. It is a simple lightproof box with a very small hole in the front which is also called an aperture, and some light-sensitive film paper laid inside the box on the side facing this pinhole. When you want to take a picture, you simply open the aperture to expose the film to light (to prevent light from entering the box, you keep a piece of opaque tape on the pinhole which you remove to take the photograph and put back afterwards).
The principle of a pinhole camera is simple. Objects from the scene reflect light in all directions. The size of the aperture is so small that among the many rays that are reflected off at P, a point on the surface of an object in the scene, only one of these rays enters the camera (in reality it's never exactly one ray, but rather a bundle of light rays or photons composing a very narrow beam of light). In figure 3, we can see how one single light ray among the many reflected at P passes through the aperture. In figure 4, we have colored six of these rays to track their path to the film plane more easily; notice once more, by following these rays, how they form an image of the object rotated by 180 degrees. In geometry, the pinhole is also called the center of projection; all rays entering the camera converge to this point and diverge from it on the other side.
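The 180-degree rotation follows directly from similar triangles. The sketch below is one possible way of writing it down, under conventions that are our own assumptions: the pinhole sits at the origin, the scene lies in front of it at positive z, and the film plane lies behind it at z = -film_distance. The function name and the sample points are hypothetical.

```python
def project_through_pinhole(point, film_distance):
    """Project a 3D point onto the film of an ideal pinhole camera.

    The pinhole (center of projection) is at the origin, the scene is at
    z > 0 and the film plane is at z = -film_distance.
    Returns the (x, y) position where the ray hits the film.
    """
    x, y, z = point
    if z <= 0:
        raise ValueError("the point must be in front of the pinhole (z > 0)")
    # The ray from the point through the origin keeps the same direction,
    # so by similar triangles it reaches the film scaled by -film_distance / z.
    scale = -film_distance / z
    return (x * scale, y * scale)


if __name__ == "__main__":
    # Hypothetical object: its top is above and to the right of the axis,
    # its bottom is below and to the left.
    top = project_through_pinhole((0.2, 0.5, 2.0), film_distance=0.1)
    bottom = project_through_pinhole((-0.2, -0.5, 2.0), film_distance=0.1)
    print("top lands at", top)
    print("bottom lands at", bottom)
```

Both coordinates change sign: what was up and to the right ends up down and to the left on the film, which is the same as rotating the image by 180 degrees.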
To summarize: light striking an object is reflected back in random directions in the scene, but only one of these rays (or more exactly a bundle of these rays traveling along the same direction) enters the camera and strikes the film at one single point. To each point in the scene corresponds a single point on the film.
What we call a point for simplification is in fact a small area on the surface of an object or a small area on the surface of the film. It would be best to describe the process involved as an exchange of light energy between surfaces (the emitting surface of the object and the receiving surface of the film in our example), but for simplification, we will just treat these small surfaces as points for now.
The size of the aperture matters. To get a fairly sharp image, each point (or small area) on the surface of an object needs to be represented as one single point (another small area) on the film. As mentioned before, what passes through the hole is never exactly one ray but rather a small set of rays contained within a cone of directions. The angle of this cone (or more precisely its angular diameter) depends on the size of the hole, as shown in figure 6.
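The dependence of the cone's angle on the size of the hole can be sketched with basic trigonometry. We assume here, for simplicity, that the point lies on the axis of a circular hole that directly faces it; the function name and the sample values are ours.

```python
import math


def cone_angle_deg(aperture_diameter: float, distance_to_pinhole: float) -> float:
    """Angular diameter, in degrees, of the cone of rays entering the hole.

    The apex of the cone is a point located on the axis of the hole, at
    distance_to_pinhole from it. Both arguments use the same unit.
    """
    half_angle = math.atan((aperture_diameter / 2.0) / distance_to_pinhole)
    return math.degrees(2.0 * half_angle)


if __name__ == "__main__":
    # Hypothetical point 1 m (1000 mm) away, hole diameters in mm.
    for diameter in (0.5, 2.0, 10.0):
        print(f"{diameter:4.1f} mm hole -> {cone_angle_deg(diameter, 1000.0):.4f} degrees")
```

For a given distance, halving the diameter of the hole roughly halves the angle of the cone, as long as the hole stays small compared to that distance.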
The smaller the pinhole, the smaller the cone and the sharper the image. However, a smaller pinhole requires a longer exposure time because as the hole becomes smaller, the amount of light passing through the hole and striking the surface of the film decreases. It takes a certain amount of light for an image to form on the surface of photographic paper; thus, the less light it receives, the longer the exposure time. This won't be a problem for a CG camera, but for real pinhole cameras, a longer exposure time increases the risk of producing a blurred image if the camera is not perfectly still or if objects from the scene move. As a general rule, the shorter the exposure time the better. There is a limit, though, to how small the pinhole can be. When it gets very small (when the size of the hole is about the same as the light's wavelength), light rays are diffracted, which is not good either. For a shoe-box sized pinhole camera, a pinhole of about 2 mm in diameter should produce optimum results (a good compromise between image focus and exposure time).
Note that when the aperture is too large (figure 5 bottom), a single point from the scene (for example point A or B in figure 5) appears multiple times on the image, if you keep using the concept of points and discrete lines to represent light rays. A more accurate way of visualizing what's happening in that particular case is to imagine the footprints of the cones overlapping each other on the film (figure 6 bottom). As the size of the pinhole increases, the cones become larger and the amount of overlap increases. The fact that a point appears multiple times in the image is what causes an image to be blurred (or out of focus): the cone's footprint, or spot, becomes larger on the film, so the color of the object at the light ray's origin is spread out over a larger region of the film rather than appearing as a single point as it theoretically should.
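The trade-off between sharpness and exposure time can be put into numbers with a small sketch. It is purely geometric and rests on assumptions of ours: diffraction is ignored, the point lies on the axis of the hole, and the required exposure time is taken to be inversely proportional to the area of the hole. Function names and sample values are hypothetical.

```python
def blur_spot_diameter(aperture_diameter, object_distance, film_distance):
    """Diameter of the footprint left on the film by one scene point.

    The cone of rays has its apex at the point, is aperture_diameter wide
    when it crosses the hole (object_distance away) and keeps widening
    until it reaches the film (film_distance further). Same unit throughout.
    """
    return aperture_diameter * (object_distance + film_distance) / object_distance


def relative_exposure_time(aperture_diameter, reference_diameter):
    """Exposure time relative to the one needed with reference_diameter.

    The light gathered is assumed proportional to the area of the hole,
    hence to the square of its diameter.
    """
    return (reference_diameter / aperture_diameter) ** 2


if __name__ == "__main__":
    # Hypothetical camera: film 100 mm behind the hole, object 1000 mm away.
    for diameter in (2.0, 1.0, 0.5):
        spot = blur_spot_diameter(diameter, 1000.0, 100.0)
        time = relative_exposure_time(diameter, reference_diameter=2.0)
        print(f"hole {diameter} mm: spot {spot:.2f} mm, exposure x{time:g}")
```

Under these assumptions, halving the diameter of the hole halves the size of the spot but multiplies the exposure time by four.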
In photography, this effect is much more visible when you take a picture of very small and bright objects on a dark background such as fairy lights at night for instance (figure 8). Because they are small and generally spaced away from each other, the disks they generate on the picture (when the hole of the camera is too large) are clearly visible. In photography, these disks (which are not always perfectly circular in shape but explaining why is outside the scope of this lesson) are called circles of confusion or disks of confusion, blur circles, blur spots, etc. (figure 8).
To better understand the image formation process we created two short animations showing light rays from two disks passing through the camera's pinhole. In the first animation (figure 9), the pinhole is small and the image of the disks is sharp because each point on the object corresponds to a single point on the film.
The second animation (figure 10) shows what happens when the pinhole is too large. In this particular case, each point on the object corresponds to multiple points on the film. The result is a blurred image of the disks.
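The difference between the two animations can be imitated with a few lines of code. The sketch below works in a 2D slice and uses conventions that are our own assumptions: the hole is centered at the origin, the scene point is at depth point_z in front of it, and the film is film_distance behind it. It traces rays from one scene point through evenly spaced positions across the hole and reports where they land on the film.

```python
def film_hits(point_x, point_z, aperture_diameter, film_distance, samples=5):
    """Film positions hit by rays leaving one scene point (2D slice).

    Each ray goes from (point_x, point_z) through a position `a` across the
    hole (at depth 0) and continues to the film at depth -film_distance.
    """
    hits = []
    for i in range(samples):
        # Evenly spaced positions from -diameter/2 to +diameter/2.
        t = i / (samples - 1) if samples > 1 else 0.5
        a = (t - 0.5) * aperture_diameter
        # Extend the segment point -> a until it reaches the film plane.
        hits.append(a + (a - point_x) * film_distance / point_z)
    return hits


if __name__ == "__main__":
    # Hypothetical point 0.5 units off-axis, 2 units in front of the hole.
    for diameter in (0.0, 0.02):
        hits = film_hits(0.5, 2.0, diameter, film_distance=0.1)
        print(f"hole {diameter}: spread on film = {max(hits) - min(hits):.4f}")
```

With a hole of zero width, all rays land on the same spot, so one scene point gives one film point. As soon as the hole has some width, the same scene point is smeared over a segment of the film, and the smears of neighboring points overlap.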
In conclusion, to produce a sharp image we need to make the aperture of the pinhole camera as small as possible to ensure that only a narrow beam of photons coming from one single direction enters the camera and hits the film or sensor at one single point (or a surface as small as possible). The ideal pinhole camera is one that has an aperture so small that only a single light ray enters the camera for each point in the scene. Such a camera can't be built in the real world, though, for reasons we already explained (when the hole gets too small, light rays are diffracted), but it can in the virtual world of computers (in which light rays are not affected by diffraction). Note that a renderer using an ideal pinhole camera to produce images of 3D scenes outputs perfectly sharp images.
In photography, the term depth of field defines the distance between the nearest and the farthest objects from the scene that appear "reasonably" sharp in the image. Pinhole cameras have an infinite depth of field. In other words, the sharpness of an object does not depend on its distance to the camera (assuming the pinhole itself has the right diameter for the size of the camera). This is generally not the case for photographs produced with lens cameras. Computer graphics images are most of the time produced using a pinhole camera model, and similarly to real-world pinhole cameras, they have an infinite depth of field; all objects from the scene visible through the camera are rendered perfectly sharp. Computer-generated images have sometimes been criticised for being very clean and sharp; the use of this camera model certainly has a lot to do with it. Depth of field, however, can be simulated quite easily, and a lesson from this section is devoted to this topic alone.