First published on 24 September 2009; updated August 2012
Note: This has become a rather technical essay, in which I juggle a lot with equations and units. As such, it doesn't quite meet the general readability standards of the other pages on this site. However, I have tried hard to keep things as straightforward as possible, and I do think the content is worthwhile for those who take the time to digest it.
This essay is the result of an exercise I did after writing the companion essay Sensor formats: does size matter?. In that essay, I argued that every photograph made with a particular sensor size can also be made with an 'equivalent' system having another sensor size, by adjusting the settings for the focal length, ISO value and f-stop. Interestingly, the very existence of this equivalence indicates that one of these quantities can be made redundant by an appropriate choice of parameters.
One way in which this can be done, and one that is used regularly, is to scale all parameters to match a particular sensor size, usually the 135 ('full frame') format. However, this method has two disadvantages. The first is that the numerical values for f-stop, ISO and focal length lose their familiar meaning. This is confusing because it appears that you are redefining a well-defined concept, causing people to reject the whole approach because "f/2 is f/2". The other drawback is psychological: by choosing one format as the (arbitrary) reference point, some users will inevitably feel that this format is somehow preferred, clouding the discussion.
With this in mind, the goal of this essay is twofold: first, to construct a set of photographic parameters that is independent of the sensor format and has a direct real-world interpretation; and second, to show that the familiar relations for exposure, depth of field and diffraction take on a simple form in terms of these parameters.
In working towards these goals, I first re-introduce the traditional photographic parameters. These are then used to construct a new set of parameters that have maximum real-world relevance. The familiar photographic relations for exposure, depth of field and diffraction are expressed in terms of these parameters. Finally, a table is presented that shows their values for a number of common cameras and lenses.
From these familiar parameters, we will now construct the alternative set of parameters, motivating our steps along the way.
Rather than the focal length per se, the photographer is interested in the field of view that is produced by the combination of the camera and lens. This can be expressed through the normalized focal length, defined as

Z = f/s,

where f is the focal length and s is the sensor size (defined such that s² equals the sensor area).
Lenses in the range 1.5 < Z < 2 correspond to 'normal' lenses (around the 50mm mark in the 35mm format). Smaller values of Z indicate wide-angle lenses and larger values indicate telephoto lenses. Excluding a few exotic lenses, Z has a value in the range of 0.5 to 20.
The potential quality of the final image is determined by the amount of light the sensor has to work with, which is the amount of 'information', if you will. For this reason, it makes sense to forego the usual ISO values in favor of the luminous energy (measured in lumen seconds) that the sensor needs to produce a middle gray image. The ISO value for digital cameras is defined in terms of the amount of light per unit area [lux seconds] that needs to be collected for a middle gray exposure. The ISO 12232:2006 Standard Output Specification for digital cameras mandates that ISO = 10/[middle gray [lux s]]. We can then define Q_{gray}, the total amount of light that is collected to form a uniformly middle gray image, by
Q_{gray} [µlm s] = 10 s^{2} [mm²]/ISO [lux^{-1} s^{-1}].
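In code, this definition reads as follows (a small sketch; the function name and the example values are mine, not part of any standard):

```python
def q_gray(sensor_size_mm, iso):
    """Luminous energy Q_gray [µlm s] collected for a middle gray frame.

    sensor_size_mm is the linear sensor size s [mm] (so s**2 is the sensor
    area); follows Q_gray = 10 * s**2 / ISO from the definition above.
    """
    return 10.0 * sensor_size_mm**2 / iso

# A 'full frame' sensor (s ~ 29 mm) at base ISO 100:
print(q_gray(29, 100))   # ~84 µlm s
# The same exposure settings on a small compact (s ~ 9 mm) collect far less:
print(q_gray(9, 100))    # ~8 µlm s
```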
So what does this number tell us? As a general rule, we can expect lower noise levels for larger values of Q_{gray}: more light leads to a better definition of the image. As sensor technology improves, the noise level for a given value of Q_{gray} will decrease, but two cameras using similar technology (similar efficiency and total read noise) can be expected to have similar noise levels for equal values of Q_{gray}. Moreover, as the read noise is further suppressed, only shot noise remains and this rule of thumb becomes exact (assuming equal efficiencies).
Looking at my own cameras (2003-2009 technology level, multiple formats), I find that a luminous energy of approximately 10 µlm s produces files that can handle large contrast adjustments, whereas files down to approximately 3 µlm s can still produce good images, but only if they don't need too much work. Lower values of Q_{gray} lead to images that are severely compromised by noise. Future sensor enhancements will undoubtedly decrease these (personal) thresholds through further suppression of read noise, increased quantum efficiency and the use of multi-layer sensors.
At the other end of the scale, the maximum value of Q_{gray} tends to scale with the sensor area. The reason for this behavior is that the electron storage capacity of sensors is fairly constant per unit area. This is reflected in the fact that most cameras have a base ISO level of approximately 100. Innovative readout techniques that can work around this capacity limitation are being developed, and I fully expect the value of Q_{gray}, and the corresponding maximum attainable image quality, to increase steeply when these techniques enter production.
The f-number is probably the quantity that features most prominently in heated discussions of format comparisons. An important cause is that the f-number helps determine both the exposure and the depth of field, so you cannot discuss one without addressing the other as well.
Instead of the 'speed' of a lens and the size of a sensor, it makes more sense to have a measure for the speed of the system as a whole. In this section we define the length G_{eff} as an effective optical size (cross-section) that describes how much light is gathered by the combination of the lens and sensor. This quantity relates the intensity of the light striking the lens (illuminance) to the flow of light (luminous flux) that reaches the sensor, in the following way:
[luminous flux (µlm)] = G_{eff}^{2} [mm²] [illuminance (lux)]
Therefore, the value of G_{eff}^{2} can be interpreted as the light gathering capacity or 'speed' of the system. We will later see that this number also determines the depth of field, as the two concepts are directly related.
To determine an expression for G_{eff}, we start by calculating the illumination of the sensor in terms of the luminance L [cd/m²] of an object to be photographed. The luminance is an invariant in geometric optics, meaning that we can calculate the light intensity at the sensor plane by integrating L over the light cone reaching the sensor, taking into account a factor cos([angle from sensor normal]) to account for the decrease in intensity for directions away from the normal. Doing the math produces [sensor illuminance] = sin^{2}(theta) π L = π L / (2 N_{eff})^{2}. This result can be multiplied by sensor area s^{2} to obtain the total luminous flux reaching the sensor. Finally, realizing that (π L) equals the illuminance of the (front element of the) lens that is directed at a surface obeying Lambert's law, we get the following expression for the effective optical size
G_{eff} = s/(2 N_{eff}).
Note that G_{eff} can never be larger than the sensor size (and is often significantly smaller).
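As a quick numerical illustration of this definition (a sketch; the function name is mine, and the sensor sizes are taken from the lens table at the end of the essay):

```python
def g_eff(sensor_size_mm, n_eff):
    """Effective optical size G_eff [mm] = s / (2 * N_eff)."""
    return sensor_size_mm / (2.0 * n_eff)

# Two roughly 'equivalent' systems gather almost the same light:
print(g_eff(29, 4.0))   # full frame (s ~ 29 mm) at f/4 -> ~3.6 mm
print(g_eff(15, 2.0))   # 4/3 (s ~ 15 mm) at f/2        -> ~3.8 mm
```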
When discussing whether a photograph is in focus, it is important to specify the viewing conditions. Are we inspecting the image with a loupe or is it hanging on a distant wall? And how large is it? These factors are usually combined into a single number called the circle of confusion. It is the maximum size to which a point can be blurred while still being recognized as a point.
The circle of confusion is commonly defined as a distance on the film or sensor. However, for images printed at the same size this leads to a circle of confusion that scales with the sensor size. For clarity, we will introduce a normalized circle of confusion that is scaled by the sensor size.
C = 1000 CoC/s
An accepted number for the circle of confusion is the sensor diagonal divided by 1500. For common aspect ratios, this corresponds to approximately C=1. This is fine for regular prints and viewing, but if you'd like to make large prints that you inspect from up close, you'll have to decrease the threshold correspondingly.
If one is interested in maximum sharpness at a given pixel pitch, one can set the circle of confusion to be equal to the width of two Bayer pixels, leading to the expression
C = (4/MP_{bayer})^{½},
indicating that a 4MP Bayer image roughly corresponds to the traditional circle of confusion standard. Finally, using the rule of thumb that 1 Bayer pixel is roughly equivalent to half a 'perfect' (full-color, detailed) pixel, we derive the following expression for the normalized circle of confusion as a function of the required resolution:
C = (2/MP_{output})^{½}.
This implies that the traditional definition of C=1 is also a good match for today's high-resolution monitors (close to 2MP). It needs to be noted, however, that there is no evident mapping from circles of confusion to square pixels, and especially Bayer pixels, so depending on the processing details these equations can be off by a constant factor. If you intend to put these equations to critical use, it is advisable to determine this 'fudge factor' for your own workflow.
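The two expressions above are easy to evaluate for common output sizes (a sketch; the function names are mine):

```python
from math import sqrt

def c_from_bayer(mp_bayer):
    """Normalized CoC equal to two Bayer pixels: C = sqrt(4 / MP_bayer)."""
    return sqrt(4.0 / mp_bayer)

def c_from_output(mp_output):
    """Normalized CoC for a required output resolution: C = sqrt(2 / MP_output)."""
    return sqrt(2.0 / mp_output)

print(c_from_bayer(4))    # 1.0: a 4 MP Bayer image matches the C = 1 standard
print(c_from_output(2))   # 1.0: so does a ~2 MP (monitor-sized) output
print(round(c_from_output(24), 2))  # 0.29: a sharp 24 MP print needs a much smaller CoC
```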
Having defined the alternative photographic parameters, we now show that the standard photographic concepts such as exposure and depth of field take on an elegant form in terms of the new parameters.
From the theoretical discussion above, we find the following relation between the quantities Q_{gray}, t and G_{eff}, depending on the illuminance of the lens, or the luminance L of an object to be photographed.
Q_{gray}/(t G_{eff}^{2}) = [illuminance (lux)] = π L [cd/m²]
This theoretical result corresponds almost exactly to the transformed reflected-light exposure equation that is used to calibrate light meters. Assuming 100% transmission, we obtain

Q_{gray}/(t G_{eff}^{2}) = (40/K) L = 3.2 L [cd/m²] = 3.2 × 2^{EV−3},

where L is the luminance of the object to be photographed, EV the corresponding exposure value (ISO 100), and K = 12.5 the standard calibration constant for reflected-light meters. Note that in practice, factors like transmission losses and natural vignetting for off-axis subjects will lead to a slight underexposure compared to the ISO/Q_{gray} rating (see also this page (German only)).
Alternatively, we can write down the analogous expression for incident light measurement:
Q_{gray}/(t G_{eff}^{2}) = (40/C) E = 0.16 E [lux],

where E is the scene illuminance and the calibration constant C is taken to be 250 (for a flat-surface receptor); this C is unrelated to the normalized circle of confusion defined earlier. Assuming a reflectance of 16%, we see that the reflective and incident light measurements are consistent for a surface that follows Lambert's law ([lens illuminance] = [total emittance] = [illuminance] × [reflectance]). Light meters with a hemispherical receptor use a value of C of approximately 330, corresponding to an average reflectance of approximately 12%. For an excellent discussion of light metering, see this document.
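As a sanity check on the reflected-light relation, we can solve it for the exposure time and compare with the familiar sunny-16 rule (a sketch; the function name and the example values for G_{eff} and Q_{gray} are my own assumptions):

```python
def shutter_time(q_gray_ulms, g_eff_mm, luminance_cdm2, k=12.5):
    """Solve Q_gray / (t * G_eff**2) = (40/K) * L for the exposure time t [s]."""
    return q_gray_ulms / ((40.0 / k) * luminance_cdm2 * g_eff_mm**2)

# Bright daylight, EV 15 at ISO 100: L = 2**(15 - 3) = 4096 cd/m2.
# Full frame at f/8 (G_eff ~ 1.8 mm) and ISO 100 (Q_gray ~ 84 µlm s):
t = shutter_time(84, 1.8, 2**12)
print(round(1 / t))   # 506 -- close to the 1/500 s that sunny-16 predicts
```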
In film photography, the rule of thumb for the longest hand-holdable shutter speed was t=1/f. For formats other than the 'full frame' film format, this needs to be adjusted. In terms of the normalized focal length, this rule becomes independent of the format:
1/t_{min} ~ 30 Z
Of course, this rule also assumes a standard measure for the circle of confusion (say, C=1). If we intend to inspect the final image from up close, assuming a constant angular velocity, we can adjust the rule as follows:
1/t_{min} ~ 20 Z (MP_{output})^{½}
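In code, the adjusted rule looks like this (a sketch; the function name is mine):

```python
from math import sqrt

def slowest_handheld_time(z, mp_output=2):
    """Rule of thumb: 1/t_min ~ 20 * Z * sqrt(MP_output)."""
    return 1.0 / (20.0 * z * sqrt(mp_output))

# A normal lens (Z ~ 1.7) viewed at ~2 MP recovers the classic 1/50 s rule:
print(round(1 / slowest_handheld_time(1.7)))        # 48
# The same shot destined for a critically sharp 24 MP print:
print(round(1 / slowest_handheld_time(1.7, 24)))    # 167
```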
When passing through an opening of finite size, such as the aperture of a lens, light gets diffracted. This means that a point source is no longer projected as a point, but as a small disk surrounded by interference rings (the Airy disk). If the aperture opening is small enough, or the required resolution is high enough, the diameter of these disks may get large enough to have a noticeable effect on image sharpness (see also this link).
The diffraction effect for a simple lens can be computed as follows. Take a point P on the optical axis in the plane of focus (called P here to avoid confusion with the sensor size s). Its image is an Airy disk centered around a point on the optical axis. To determine the diameter of the Airy disc, we note that the first minimum occurs where the rays coming from P interfere destructively. Because P lies on the optical axis, this interference must be due to the differences in optical path length from the edges of the exit pupil to the image. We know that destructive interference from a circular opening is obtained if the difference between the minimum and maximum path length equals 1.22 wavelengths. On the other hand, the path length difference is equal to 2 [radius] sin(theta). We thus get
[Airy disc diameter] = 1.22 [wavelength]/sin(theta)
We now take the Airy disc diameter to be equal to the circle of confusion and use a wavelength of 550 nm (the dominant green light) so that we can determine a diffraction limited G_{eff} as a function of C:
G_{eff}^{diff} [mm] = 0.67/C
And finally, expressing C in terms of the required output size gives
G_{eff}^{diff} [mm] = 0.67 (MP_{output}/2)^{½}
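A short calculation shows what this limit means in practice (a sketch; the function name is mine):

```python
from math import sqrt

def g_eff_diffraction_mm(mp_output):
    """Smallest G_eff [mm] before diffraction limits MP_output: 0.67*sqrt(MP/2)."""
    return 0.67 * sqrt(mp_output / 2.0)

print(round(g_eff_diffraction_mm(2), 2))    # 0.67 -> screen-resolution output
print(round(g_eff_diffraction_mm(24), 2))   # 2.32 -> about f/6.3 on full frame (s ~ 29 mm)
```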
Finally, we restate the depth of field equations in the new parameters, starting with a generic version that is correct for all parameters and distances.
DOF = (1+m/P)(w² C/(1000 G))/(1 - ¼(w C/(1000 Z G))²)
Here, w² is the captured area of the focal plane (in object space), e.g. the area of a head for a tight portrait shot, or the size of a soccer field for a landscape shot.
At long distances (when the object distance d » f), we can use the substitution w=d/Z and we obtain the simplified expression
DOF = 2d × [(d/H)/(1 - (d/H)²)]
where H is the (approximate) hyperfocal distance. This very useful quantity has the simple expression
H [m] = 2 G [mm] Z²/C
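For example (a sketch; the function name and example values are mine):

```python
def hyperfocal_m(g_mm, z, c=1.0):
    """Approximate hyperfocal distance H [m] = 2 * G * Z**2 / C."""
    return 2.0 * g_mm * z**2 / c

# A 50 mm lens on full frame (Z ~ 1.7) stopped down to f/8 (G ~ 1.8 mm):
print(round(hyperfocal_m(1.8, 1.7), 1))   # 10.4 m at the standard C = 1
```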
When the object distance is much shorter than the hyperfocal distance, implying w « Z G/C, we can simplify the expression for the depth of field as follows
DOF = (1 + m/P)w² C/(1000 G)
If the object distance d is much larger than the focal length (and still much smaller than the hyperfocal length), we can ignore m and obtain the simple expression
DOF [m] = w² [m^{2}] C/G [mm]
Together with the hyperfocal distance, this may be the most useful equation for everyday use. Roughly speaking, it can be applied whenever your subject is more than a number of focal lengths away from you, and the background is sufficiently blurred.
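A quick worked example of this everyday formula (a sketch; the name and example values are mine):

```python
def dof_m(w_m, g_mm, c=1.0):
    """Everyday DOF [m] = w**2 * C / G, valid when d is much less than H."""
    return w_m**2 * c / g_mm

# Head-and-shoulders portrait (w ~ 0.5 m) at f/2 on full frame (G ~ 7 mm):
print(round(dof_m(0.5, 7.0), 3))   # 0.036 m -- under 4 cm of sharp depth
```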
For some applications, you'd like to maximize the depth of field by stopping the lens down as much as possible. However, stopping down too much will affect the sharpness of the in-focus areas, due to diffraction. This implies that there is an optimal aperture that gives you maximal depth of field for a given permitted circle of confusion. In this section we will compute the diffraction-limited depth of field for two common scenarios: landscape and macro photography.
It is important to realize that the out-of-focus blurring and diffraction are two independent effects, and the blurring from both has to be taken into account. Even though the two types of blur have quite different shapes (uniform disc for defocus; Airy disc for diffraction), it is common to estimate the combined blur by adding the two in quadrature (C_{total} = (C_{diff}^{2} + C_{dof}^{2})^{½}). The diameter of the Airy disc can be written as [Airy disc diameter] = 0.67 [µm] s/G_{eff} and is therefore inversely proportional to G_{eff}. On the other hand, in the limiting cases discussed below, the diameter of the circle of confusion due to defocusing is proportional to G_{eff}. For the total blur C_{total} we thus get an expression of the form
C_{total} = ((A/G_{eff})² + (B G_{eff})²)^{½}
This expression is minimized for G_{eff} = sqrt(A/B), in which case C_{diff} = C_{dof} = C_{total}/sqrt(2).
In landscape photography it is often desirable to get everything in focus, from the foreground up to infinity. Substituting C = C_{total}/sqrt(2) into the expression for the diffraction-limited light gathering size G_{eff}^{diff} yields

G_{eff}^{diff} [mm] = 0.67 sqrt(2)/C_{total} = 0.67 (MP_{output})^{½}
Note that this corresponds to an aperture that is one stop smaller than what you would usually call the diffraction limited aperture, and it is constant for a given format and resolution requirement. Inserting this result into the expression for the hyperfocal distance, and substituting C = C_{total}/sqrt(2) = (MP_{output})^{-½} produces
H^{diff} [m] = 1.3 [m] Z^{2} MP_{output}
which means we can get everything from H/2 up to infinity in acceptable focus at any given resolution and field of view. This is a fairly sobering result, because it indicates that there is a hard limit to the resolution that you can attain whilst having the whole scene in focus. This expression does not depend on the sensor size, so there is no preferred format for diffraction-limited landscape photography, provided that the camera has sufficient resolution and the lens isn't diffraction-limited to begin with.
Alternatively, photographers occasionally need an even larger depth of field, even at the cost of some resolution. The equation above can be inverted to yield the effective resolution of an image for a given lower in-focus bound D=H/2 (note that these are downsampled megapixels, roughly corresponding to double that number of Bayer MPs).
MP_{output} = 1.5 D [m] / Z^{2}
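These two diffraction-limited landscape relations can be evaluated together (a sketch; the function names are mine):

```python
def hyperfocal_diff_m(z, mp_output):
    """Diffraction-limited hyperfocal distance H_diff [m] = 1.3 * Z**2 * MP."""
    return 1.3 * z**2 * mp_output

def mp_at_near_limit(d_m, z):
    """Inverse relation: effective output MP when sharp from D = H/2 to infinity."""
    return 1.5 * d_m / z**2

# A wide scene (Z = 1) that must be sharp from 2 m to infinity, on any format:
print(mp_at_near_limit(2.0, 1.0))              # 3.0 MP output, no more
print(round(hyperfocal_diff_m(1.0, 3.0), 1))   # 3.9 m, i.e. near limit H/2 ~ 2 m
```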
Macro photography is another application where photographers often crave more depth of field. We will now derive an expression for this depth of field, starting from the expression for short distances. For small apertures, the opening angle of the light cone hitting the sensor is small, and we can make the approximation G_{eff} = G/(1 + m/P), so that (see above)
DOF = w² C/(1000 G_{eff})
Again using C = C_{total}/sqrt(2) = (MP_{output})^{-½} and inserting the expression for the diffraction-limited G_{eff}^{diff}, we obtain an expression for the maximum depth of field that can be obtained without significant diffraction losses (note the units).
DOF_{diff} [mm] = 0.15 w² [cm²] / MP_{output}
We see that this expression is neither dependent on the focal length nor on the format size and that the actual magnification is irrelevant. The only thing that matters is the visible area of the focus plane.
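A worked example (a sketch; the function name is mine):

```python
def dof_macro_mm(w_cm, mp_output):
    """Max diffraction-limited macro DOF [mm] = 0.15 * w**2 [cm2] / MP_output."""
    return 0.15 * w_cm**2 / mp_output

# A 4 cm wide subject rendered at 2 MP output, on any format or focal length:
print(dof_macro_mm(4.0, 2))   # about 1.2 mm of usable depth
```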
In photography, it is often desirable to separate an object from the background by making sure the background is blurred to the point where details are no longer visible. Although the concept of background blur is related to the depth of field, the two are not the same.
We can calculate the extent to which the background is blurred by determining how a point source at infinity (a distant light, for example) gets recorded onto the photograph. Using equation (4) from this depth-of-field derivation and substituting object space quantities using (2), sending v' to infinity and realizing that f/(v - f) = m = s/w, we find that the size of the blurred point in the plane of focus is exactly the same size as the aperture diameter. For example, if you were to take a portrait with a lens that has a 25mm aperture (50mm at f/2), lights in the distance end up looking like 25mm light blobs in the same plane as the person to be photographed.
We can also write down an expression for the relative size of these blobs as compared to our subject size w. This is a direct measure of the background blurring ability of the system, with the expression
B = [infinity blur diameter]/w = (1/500) Z G [mm]/w [m]
For large distances, where the object distance d » f, we can use the substitution w=d/Z. Setting B = C/1000 (no noticeable blur), we recover the expression for the hyperfocal distance that was derived above.
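For illustration (a sketch; the function name and example values are mine):

```python
def background_blur(z, g_mm, w_m):
    """Relative infinity-blur B = Z * G / (500 * w)."""
    return z * g_mm / (500.0 * w_m)

# Portrait (w ~ 0.5 m) with a 50 mm f/2 on full frame (Z ~ 1.7, G ~ 7 mm):
print(round(background_blur(1.7, 7.0, 0.5), 3))   # 0.048 -> blobs ~5% of frame size
```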
This essay has covered a lot of ground, much more than I had initially anticipated. First and foremost, I have given an example of how one can define a format-agnostic system of photographic parameters. This system is shown to be consistent with the familiar photographic relations for exposure, depth-of-field, and diffraction.
The fact that all those things photographers really care about can be expressed without explicit dependence on the sensor size has an important implication: for many photos, it really doesn't matter how large or small the sensor in your camera is, as long as you use the appropriate setting. This concept is the core of the companion essay Sensor formats: does size matter?. Of course, there is more to it, and the exceptions to this equivalence manifest themselves very clearly in the format-independent parameters:
The analysis of depth of field, diffraction and background blur in terms of the format-independent parameters has produced a number of interesting results. They clearly indicate that for high-depth-of-field photography all formats are equally affected by diffraction. The often asserted statement that small sensors are somehow better for this type of photography has little basis in physics (of course, an advantage can be created if the smaller camera has a shorter minimum focus distance). The flipside is that there is little need to upgrade to a large-sensor high-resolution camera if you are mainly interested in front-to-back sharpness, unless you find yourself constrained by the image quality of smaller sensors at the base ISO value.
To get a feeling for this approach to comparing camera systems, I have compiled a brief list of values for common cameras, bodies and lenses below. Do I propose that all camera makers should adopt this new system, or that you should memorize these values for your cameras? No, that wouldn't be realistic. However, the topic of sensor sizes can be rather contentious, and I hope that this essay contributes to the separation of facts and wishful thinking.
| Camera model | Resolution [MP] | Z: normalized focal length | G_max: max light gathering [mm; wide - tele] | Q_gray: luminous energy [µlm s] |
|---|---|---|---|---|
| Apple iPhone 4S | 8 | 1.09 | 0.82 (fixed) | 0.19 - 2.4 |
| Nokia 808 (4:3 mode*) | 38.4 | 0.87 - 3.5 | 1.92 - 0.48 | 0.53 - 17 (1.1 @ 4x zoom) |
| Panasonic TZ30/ZS20 | 14 | 0.82 - 16 | 0.80 - 0.13 | 0.043 - 2.8 |
| Panasonic LX7 (4:3) | 12 | 0.80 - 3.0 | 2.1 - 1.3 | 0.027 - 4.3 |
| Ricoh GRD4 | 10 | 0.93 | 1.70 | 0.13 - 5.2 |
| Sony RX100 | 20 | 0.96 - 3.4 | 3.3 - 0.45 | 0.045 - 12 |
Notes:
| Camera model | Resolution [MP] | Q_gray: luminous energy [µlm s] |
|---|---|---|
| Nikon | 10 | 0.18 - 11.6 |
| Olympus E-P1 | 12 | 0.35 - 22.5 |
| Nikon D90 | 12 | 0.58 - 18 |
| Canon 500D | 15 | 0.26 - 33 |
| Pentax K7 | 15 | 0.58 - 37 |
| Nikon D300s | 12 | 0.58 - 37 |
| Canon 7D | 18 | 0.26 - 33 |
| Nikon D3 | 12 | 0.68 - 86 |
| Nikon D3x | 24 | 5.4 - 86 |
| Canon 1Ds Mk3 | 21 | 2.7 - 86 |
| Phase One P65+ | 60 | 27 - 435 |
Again, we note that cameras with larger sensors tend to have a higher maximum luminous energy. The reason for this is that sensors tend to have a constant electron capacity per unit area. However, the minimum luminous energy (corresponding to the highest ISO value) is fairly constant within a sensor generation. An exception to the latter are the high-MP studio cameras, presumably because their users have no interest in borderline usable images.
Instead of producing a listing of the many lenses available for common formats, I simply present tables for the determination of the light gathering ability G_{max} as a function of the f-number of the lens.
| format | Nikon CX | 4/3 | APS-C Canon | APS-C 1.5x | full frame (135 format) | Phase One P65+ |
|---|---|---|---|---|---|---|
| s [mm] | 11 | 15 | 18 | 19 | 29 | 47 |
| G_max [mm] @ f/1.4 | 4.0 | 5.4 | 6.5 | 6.9 | 10.5 | 16.7 |
| G_max [mm] @ f/2 | 2.8 | 3.8 | 4.6 | 4.9 | 7.4 | 11.8 |
| G_max [mm] @ f/2.8 | 2.0 | 2.7 | 3.3 | 3.4 | 5.2 | 8.3 |
| G_max [mm] @ f/4 | 1.4 | 1.9 | 2.3 | 2.4 | 3.7 | 5.9 |
| G_max [mm] @ f/5.6 | 1.0 | 1.3 | 1.6 | 1.7 | 2.6 | 4.2 |
| G_max [mm] @ f/8 | 0.69 | 0.95 | 1.2 | 1.2 | 1.9 | 2.9 |