Alternate photographic parameters: a format-independent approach
First published on 24 September 2009; updated August 2012
Note: This has become a rather technical essay, in which I juggle quite a few equations and units. I have tried hard to keep things as straightforward as possible, and I do think the content is worthwhile for those who take the time to digest it.
This essay is the result of an exercise I did after writing the companion essay Sensor formats: does size matter? In that essay, I argued that every photograph made with a particular sensor size can also be made with an 'equivalent' system having another sensor size, by adjusting the settings for the focal length, ISO value and f-stop. Interestingly, the very existence of this possibility indicates that one of these parameters can be made redundant by an appropriate choice of parameters.
One way in which this can be done, and one that is used regularly, is to scale all parameters to match a particular sensor size, usually the 135 ("full frame") format. However, this method has two disadvantages. The first is that the numerical values for f-stop, ISO and focal length lose their familiar meaning. This is confusing because it appears that you are redefining a well-defined concept, causing people to reject the scaled values because "f/2 is f/2". The other drawback is a psychological one. By choosing one format as the (arbitrary) reference point, some users will inevitably feel that this format is somehow preferred, clouding the discussion.
With this in mind, the goal of this essay is twofold:
- To show that the common system of focal length, f-stop and ISO is not unique. Another self-consistent system can be constructed, and may even prove to be more insightful to the photographer, because it is directly linked to the things that matter for a photographer: image quality, perspective, light gathering and depth of field.
- To prove that in this system we can remove the explicit dependence on sensor/film size from all relevant photographic equations. [note: it can still place design constraints on the sensors and lenses, but that's an engineer's problem; the photographer simply selects the equipment that is available and affordable]
In working towards these goals, I first re-introduce the traditional photographic parameters. These are then used to construct a new set of parameters that have maximum real-world relevance. The familiar photographic relations for exposure, depth of field and diffraction are expressed in terms of these parameters. Finally, a table is presented that shows their values for a number of common cameras and lenses.
The regular parameters and their notation
- f [mm] - actual focal length of the lens
- N - f-number of the lens, defined as N = f/[aperture diameter]
- Neff - effective aperture, defined as Neff=1/(2 sin(theta)), where theta is the angle between the central axis and the rim of the exit pupil of the lens, as seen from the sensor. For relatively small angles, we can approximate Neff=N(1+m/P), where m is the magnification and P the ratio between the exit and entrance pupil sizes. Usually, the subject is much further away than the focal length of the lens, so that m << 1, and we have Neff ~= N.
- ISO - the ISO value of the film, or sensor settings, that determines the amount of light per unit area that corresponds to a medium gray in the output
- s [mm] - sensor size, which I define in an aspect-ratio independent way as s = (width x height)^(1/2). This definition has the convenient property that s^2=[surface area]
- CoC [mm] - circle of confusion, defined as the maximum range of blurring on the imager that does not lead to a visible decrease of sharpness in the output.
The alternative parameters
From these familiar parameters, we will now construct the alternative set of parameters, motivating our steps along the way.
Z - normalized focal length
Rather than the focal length per se, the photographer is interested in the field of view that is produced by the combination of the camera and lens. This can be expressed through the normalized focal length, defined as
Z = f/s
Lenses in the range 1.5 < Z < 2 correspond to 'normal' lenses (around the 50mm mark in the 35mm format). Smaller values of Z indicate wide-angle lenses and larger values indicate telephoto lenses. Excluding a few exotic lenses, Z has a value in the range of 0.5 to 20.
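As a minimal sketch of this definition (the function names are my own, and the Four Thirds dimensions of 17.3 x 13 mm are an assumption for illustration), the normalized focal length can be computed as follows:

```python
import math

def sensor_size(width_mm: float, height_mm: float) -> float:
    """Aspect-ratio independent sensor size s = sqrt(width x height), in mm."""
    return math.sqrt(width_mm * height_mm)

def normalized_focal_length(f_mm: float, width_mm: float, height_mm: float) -> float:
    """Normalized focal length Z = f / s (dimensionless)."""
    return f_mm / sensor_size(width_mm, height_mm)

# 50 mm lens on a 36 x 24 mm sensor: a 'normal' lens, Z is about 1.70.
z_full_frame = normalized_focal_length(50.0, 36.0, 24.0)
# 25 mm lens on a 17.3 x 13 mm sensor (assumed dimensions): also a 'normal' lens.
z_four_thirds = normalized_focal_length(25.0, 17.3, 13.0)

print(f"Z (50 mm on 36 x 24 mm):   {z_full_frame:.2f}")
print(f"Z (25 mm on 17.3 x 13 mm): {z_four_thirds:.2f}")
```

Both combinations land inside the 1.5 < Z < 2 band, even though the focal lengths differ by a factor of two.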
Qgray [µ lm s] - Luminous energy for a medium gray image
The potential quality of the final image is determined by the amount of light the sensor has to work with, which is the amount of information, if you will. For this reason, it makes sense to forego the usual ISO values in favor of the luminous energy (measured in lumen seconds) that the sensor needs to produce a medium gray image. The ISO value for digital cameras is defined in terms of the amount of light per unit area [lux seconds] that needs to be collected for a middle gray exposure. The ISO 12232:2006 Standard Output Specification for digital cameras mandates that ISO = 10/[middle gray [lux s]]. We can then define Qgray as the total amount of light that is collected to form a uniformly middle gray image by
Qgray [µ lm s] = 10 s^2 [mm^2]/ISO [1/(lux s)].
So what does this number tell us? As a general rule, we can expect lower noise levels for larger values of Qgray: more light leads to a better definition of the image. As sensor technology improves, the noise level for a given value of Qgray will also decrease. However, two cameras using similar technology (similar efficiency and total read noise) can be expected to have similar noise levels for equal values of Qgray. Moreover, as the read noise is further suppressed, there is only shot noise left and this rule of thumb becomes exact (assuming equal efficiencies).
Looking at my own cameras (2003-2009 technology level, multiple formats), I find that a luminous energy of approximately 10 µlm s produces files that can handle large contrast adjustments, whereas files down to approximately 3 µlm s can still produce good images, but only if they don't need too much work. Lower values of Qgray lead to images that are severely compromised by noise. Future sensor enhancements will undoubtedly decrease these (personal) thresholds through further suppression of read noise, increased quantum efficiency and the use of multi-layer sensors.
At the other end of the scale, the maximum value of Qgray tends to scale with the sensor area. The reason for this behavior is that the electron storage capacity of sensors is fairly constant per unit area. This is reflected in the fact that most cameras have a base ISO level of approximately 100. Innovative readout techniques that can work around this capacity limitation are being developed, and I fully expect the value of Qgray and the corresponding maximum attainable image quality, to increase steeply when these techniques enter production.
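The definition of Qgray and its inverse can be sketched in a few lines. The function names are hypothetical, and the 10 µlm s figure is simply my personal threshold quoted above, not a standard.

```python
def luminous_energy_gray(area_mm2: float, iso: float) -> float:
    """Qgray in micro-lumen-seconds: 10 * s^2 [mm^2] / ISO."""
    return 10.0 * area_mm2 / iso

def iso_for_luminous_energy(area_mm2: float, q_gray: float) -> float:
    """Inverse relation: the ISO setting that corresponds to a given Qgray."""
    return 10.0 * area_mm2 / q_gray

full_frame_area = 36.0 * 24.0  # s^2 in mm^2

# Qgray at ISO 100 on a 36 x 24 mm sensor.
q_base = luminous_energy_gray(full_frame_area, 100.0)
# ISO setting at which the same sensor collects 10 micro-lumen-seconds.
iso_at_threshold = iso_for_luminous_energy(full_frame_area, 10.0)

print(f"Qgray at ISO 100: {q_base:.1f} micro-lm s")
print(f"ISO for Qgray = 10 micro-lm s: {iso_at_threshold:.0f}")
```

Because Qgray scales with the sensor area, a sensor with a quarter of the area reaches the same Qgray at a quarter of the ISO value.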
Geff [mm] - effective optical size (light gathering capacity)
The f-number is probably the unit that features most prominently in heated discussions related to format comparisons. An important cause is that the f-number is important in determining both the exposure and the depth of field, and you cannot discuss one without addressing the other as well.
Instead of the 'speed' of a lens and the size of a sensor, it makes more sense to have a measure for the speed of the system as a whole. In this section we define the length Geff as an effective optical size (cross-section) that describes how much light is gathered by the combination of the lens and sensor. This quantity relates the intensity of the light striking the lens (illuminance) and the flow of light (luminous flux) that reaches the sensor, in the following way:
[luminous flux (µlm)] = Geff^2 [mm^2] [illuminance (lux)]
Therefore, the value of Geff^2 can be interpreted as the light gathering capacity or 'speed' of the system. We will later see that this number also determines the depth of field, as the two concepts are directly related.
To determine an expression for Geff, we start by calculating the illumination of the sensor in terms of the luminance L [cd/m^2] of an object to be photographed. The luminance is an invariant in geometric optics, meaning that we can calculate the light intensity at the sensor plane by integrating L over the light cone reaching the sensor, taking into account a factor cos([angle from sensor normal]) to account for the decrease in intensity for directions away from the normal. Doing the math produces [sensor illuminance] = sin^2(theta) π L = π L / (2 Neff)^2. This result can be multiplied by sensor area s^2 to obtain the total luminous flux reaching the sensor. Finally, realizing that (π L) equals the illuminance of the (front element of the) lens that is directed at a surface obeying Lambert's law, we get the following expression for the effective optical size
Geff = s/(2 Neff).
Note that Geff can never be larger than the sensor size (and is often significantly smaller).
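To make the 'speed of the system as a whole' concrete, here is a small sketch (hypothetical function names) that evaluates Geff = s/(2 Neff), using the small-angle approximation Neff = N(1 + m/P) from the parameter list. The 18 x 12 mm sensor is an illustrative format with a quarter of the full-frame area.

```python
import math

def effective_f_number(n: float, magnification: float = 0.0,
                       pupil_ratio: float = 1.0) -> float:
    """Neff = N (1 + m/P); reduces to N for distant subjects (m << 1)."""
    return n * (1.0 + magnification / pupil_ratio)

def effective_optical_size(width_mm: float, height_mm: float, n: float,
                           magnification: float = 0.0,
                           pupil_ratio: float = 1.0) -> float:
    """Geff = s / (2 Neff), in mm."""
    s = math.sqrt(width_mm * height_mm)
    return s / (2.0 * effective_f_number(n, magnification, pupil_ratio))

# f/2 on a 36 x 24 mm sensor versus f/1 on an 18 x 12 mm sensor.
g_large = effective_optical_size(36.0, 24.0, n=2.0)
g_small = effective_optical_size(18.0, 12.0, n=1.0)

print(f"Geff, f/2 on 36 x 24 mm: {g_large:.2f} mm")
print(f"Geff, f/1 on 18 x 12 mm: {g_small:.2f} mm")
```

The two systems have the same Geff, so they gather the same total amount of light from a given scene.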
C - Normalized circle of confusion
When discussing whether a photograph is in focus, it is important to specify the viewing conditions. Are we inspecting the image with a loupe or is it hanging on a distant wall? And how large is it? These factors are usually combined into a single number called the circle of confusion. It is the maximum size to which a point can be blurred while still being recognized as a point.
The circle of confusion is commonly defined as a distance on the film or sensor. However, for images printed at the same size this leads to a circle of confusion that scales with the sensor size. For clarity, we will introduce a normalized circle of confusion that is scaled by the sensor size.
C = 1000 CoC/s
An accepted number for the circle of confusion is the sensor diagonal divided by 1500. For common aspect ratios, this corresponds to approximately C=1. This is fine for regular prints and viewing, but if you'd like to make large prints that you inspect from up close, you'll have to decrease the threshold correspondingly.
If one is interested in maximum sharpness at a given pixel pitch, one can set the circle of confusion to be equal to the width of two Bayer pixels, leading to the expression
C = (4/MPbayer)^(1/2)
indicating that a 4MP Bayer image roughly corresponds to the traditional circle of confusion standard. Finally, using the rule of thumb that 1 Bayer pixel is roughly equivalent to 1/2 a 'perfect' (full-color, detailed) pixel, we derive the following statement for the normalized circle of confusion as a function of the required resolution.
C = (2/MPoutput)^(1/2).
This implies that the traditional definition of C=1 is also a good match for today's high-resolution monitors (close to 2MP). It needs to be noted, however, that there is no evident mapping from circles of confusion to square pixels, and especially Bayer pixels, so depending on the processing details these equations can be off by a constant factor. If you intend to put these equations to critical use, it is advisable to determine this 'fudge factor' for your own workflow.
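The three expressions for C can be checked against each other with a short sketch. The function names are hypothetical; the 36 x 24 mm sensor is used only as an example of a common aspect ratio.

```python
import math

def normalized_coc(coc_mm: float, width_mm: float, height_mm: float) -> float:
    """C = 1000 CoC / s, with s = sqrt(width x height)."""
    return 1000.0 * coc_mm / math.sqrt(width_mm * height_mm)

def normalized_coc_from_bayer_mp(mp_bayer: float) -> float:
    """C = (4 / MPbayer)^(1/2): circle of confusion of two Bayer pixels."""
    return math.sqrt(4.0 / mp_bayer)

def normalized_coc_from_output_mp(mp_output: float) -> float:
    """C = (2 / MPoutput)^(1/2): one Bayer pixel counted as half an output pixel."""
    return math.sqrt(2.0 / mp_output)

# Traditional criterion: sensor diagonal / 1500, here for a 3:2 sensor.
diagonal_mm = math.hypot(36.0, 24.0)
c_traditional = normalized_coc(diagonal_mm / 1500.0, 36.0, 24.0)

print(f"C for diagonal/1500 (3:2 sensor): {c_traditional:.2f}")
print(f"C for a 4 MP Bayer image:  {normalized_coc_from_bayer_mp(4.0):.2f}")
print(f"C for a 2 MP output image: {normalized_coc_from_output_mp(2.0):.2f}")
```

All three come out at (or very close to) 1, which is the sense in which the traditional standard matches a 4 MP Bayer capture or a 2 MP display.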
Having defined the alternative photographic parameters, we now show that the standard photographic concepts such as exposure and depth of field take on an elegant form in terms of the new parameters.
From the theoretical discussion above, we find the following relation between the luminous energy Qgray, the exposure time t and the effective optical size Geff, depending on the illuminance of the lens, or the luminance L of an object to be photographed.
Qgray/(t Geff^2) = [illuminance (lux)] = π L [cd/m^2]
This theoretical result corresponds almost exactly to the transformed reflected-light exposure equation used to calibrate light meters. Assuming 100% transmission, we obtain
Qgray/(t Geff^2) = (40/K) L = 3.2 L [cd/m^2] = 3.2 x 2^(EV - 3),
where L is the luminance of the object to be photographed, K = 12.5 is the reflected-light meter calibration constant and EV the corresponding exposure value (ISO100). Note that in practice factors like transmission losses and natural vignetting due to off-axis subjects will lead to a slight underexposure compared to the ISO/Qgray rating (see also this page (German only)). Alternatively, we can write down the analogous expression for incident light measurement:
Qgray/(t Geff^2) = (40/C) E = 0.16 E [lux],
where E is the scene illuminance and C (here the calibration constant of the incident-light meter, not the normalized circle of confusion) is taken to be 250 (for a flat surface receptor). Assuming a reflectance of 16%, we see that the reflective and incident light measurements are consistent for a surface that follows Lambert's law ([lens illuminance] = [total emittance] = [illuminance] x [reflectance]). Light meters with a hemispherical receptor use a value of C of approx 330, corresponding to an average reflectance of approximately 12%. For an excellent discussion on light metering, see http://www.largeformatphotography.info/articles/conrad-meter-cal.pdf .
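As a sanity check of the reflected-light relation Qgray = 3.2 L t Geff^2, the sketch below solves for the exposure time in a 'sunny 16' situation. The function names are hypothetical, and the choice of EV 15 at ISO 100 for a sunlit scene is my assumption, following the usual rule of thumb.

```python
import math

def luminance_from_ev100(ev: float) -> float:
    """L [cd/m^2] = 2^(EV - 3), for EV referred to ISO 100 (K = 12.5)."""
    return 2.0 ** (ev - 3.0)

def exposure_time_reflected(q_gray: float, g_eff_mm: float, luminance: float) -> float:
    """Solve Qgray = 3.2 L t Geff^2 for the exposure time t [s]."""
    return q_gray / (3.2 * luminance * g_eff_mm ** 2)

# 36 x 24 mm sensor at ISO 100 and f/16, distant subject (Neff ~ N).
area_mm2 = 36.0 * 24.0
q_gray = 10.0 * area_mm2 / 100.0          # micro-lumen-seconds
g_eff = math.sqrt(area_mm2) / (2.0 * 16)  # mm

t = exposure_time_reflected(q_gray, g_eff, luminance_from_ev100(15.0))
print(f"Exposure time: 1/{1.0 / t:.0f} s")
```

The result is 1/128 s, in line with the familiar rule that at f/16 in sunlight the shutter speed is roughly the reciprocal of the ISO value. Any other format with the same Qgray and Geff gives the same exposure time, because the sensor size has dropped out of the equation.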
In film photography, the rule of thumb for the longest hand-holdable shutter speed was t=1/f. For formats other than the 'full frame' film format, this needs to be adjusted. In terms of the normalized focal length, this rule becomes independent of the format:
1/tmin ~ 30 Z
Of course, this rule also assumes a standard measure for the circle of confusion (say, C=1). If we intend to inspect the final image from up close, assuming a constant angular velocity, we can adjust the rule as follows:
1/tmin ~ 20 Z (MPoutput)^(1/2)
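Both versions of the hand-holding rule fit in one small function. This is only a sketch of the rule of thumb with a hypothetical function name. It says nothing about image stabilization or individual steadiness.

```python
import math
from typing import Optional

def min_shutter_reciprocal(z: float, mp_output: Optional[float] = None) -> float:
    """Return 1/tmin [1/s].

    Without a resolution requirement the standard rule 1/tmin ~ 30 Z is used
    (C = 1); otherwise the resolution-aware rule 1/tmin ~ 20 Z sqrt(MPoutput).
    """
    if mp_output is None:
        return 30.0 * z
    return 20.0 * z * math.sqrt(mp_output)

z_normal = 1.7  # a 'normal' lens on any format
print(f"Standard viewing: 1/{min_shutter_reciprocal(z_normal):.0f} s")
print(f"8 MP output:      1/{min_shutter_reciprocal(z_normal, 8.0):.0f} s")
```

For a 2 MP output the second rule gives about 28 Z, so the two versions agree at the traditional C = 1 standard up to rounding.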
When passing through an opening of finite size, such as the aperture of a lens, light gets diffracted. This means that a point source is no longer projected as a point, but as a small disk surrounded by interference rings (the Airy disk). If the aperture opening is small enough, or the required resolution is high enough, the diameter of these disks may get large enough to have a noticeable effect on image sharpness (see also http://www.cambridgeincolour.com/tutorials/diffraction-photography.htm ).
The diffraction effect for a simple lens can be computed as follows. Let us take the point s on the optical axis in the plane of focus. This point has an image that is an Airy disk centered around point i on the optical axis. To determine the diameter of the Airy disc we note that the first minimum occurs when the rays coming from s interfere destructively. Because s lies on the optical axis, this interference must occur due to the optical path differences from the edges of the exit pupil to the image. We know that destructive interference from a circular opening is obtained if the difference between the minimum and maximum path length is equal to 1.22 wavelengths. On the other hand, we know that the path length difference is equal to 2 [radius] sin(theta). We thus get
[Airy disc diameter] = 1.22 [wavelength]/sin(theta)
We now take the Airy disc diameter to be equal to the circle of confusion and use a wavelength of 550 nm (the dominant green light) so that we can determine a diffraction limited Geff as a function of C:
Geff;diff [mm] = 0.67/C
And finally, expressing C in terms of the required output size gives
Geff;diff [mm] = 0.67 (MPoutput/2)^(1/2)
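The diffraction limit can be translated back into a familiar f-number for a given sensor, as in the sketch below. The function names are hypothetical, the Four Thirds dimensions (17.3 x 13 mm) are assumed for illustration, and the conversion N = s/(2 Geff) assumes a distant subject so that Neff ~ N.

```python
import math

def diffraction_limited_g(c: float) -> float:
    """Geff;diff [mm] = 0.67 / C (Airy disc diameter equal to the CoC, 550 nm light)."""
    return 0.67 / c

def diffraction_limited_g_from_mp(mp_output: float) -> float:
    """Geff;diff [mm] = 0.67 (MPoutput / 2)^(1/2)."""
    return 0.67 * math.sqrt(mp_output / 2.0)

def f_number_for_g(g_mm: float, width_mm: float, height_mm: float) -> float:
    """N = s / (2 Geff), valid for distant subjects where Neff ~ N."""
    return math.sqrt(width_mm * height_mm) / (2.0 * g_mm)

g_limit = diffraction_limited_g(1.0)  # traditional C = 1
n_full_frame = f_number_for_g(g_limit, 36.0, 24.0)
n_four_thirds = f_number_for_g(g_limit, 17.3, 13.0)

print(f"Geff;diff for C = 1: {g_limit:.2f} mm")
print(f"36 x 24 mm:   about f/{n_full_frame:.0f}")
print(f"17.3 x 13 mm: about f/{n_four_thirds:.0f}")
```

The limiting Geff is the same for both formats. Only the f-number that realizes it differs.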
Depth of field
Finally, we restate the depth of field equations in the new parameters, starting with a generic version that is correct for all parameters and distances.
DOF = (1+m/P)(w^2 C/(1000 G))/(1 - 1/4 (w C/(1000 Z G))^2)
Here, G = s/(2N) is the optical size based on the nominal f-number N (so that Geff = G/(1 + m/P)), and w^2 is the captured area of the focal plane (in object space), e.g. the area of a head for a tight portrait shot, or the size of a soccer field for a landscape shot.
At long distances (when the object distance d >> f), we can use the substitution w=d/Z and we obtain the simplified expression
DOF = 2d x [(d/H)/(1 - (d/H)^2)]
where H is the (approximate) hyperfocal distance. This very useful quantity has the simple expression
H [m] = 2 G [mm] Z^2/C
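A short sketch of the hyperfocal distance and the long-distance depth of field follows. The function names are hypothetical, G is taken as s/(2N), and subjects at or beyond H are treated as having infinite depth of field, which is my own handling of the singularity in the expression.

```python
import math

def hyperfocal_m(g_mm: float, z: float, c: float = 1.0) -> float:
    """H [m] = 2 G [mm] Z^2 / C."""
    return 2.0 * g_mm * z ** 2 / c

def dof_long_distance_m(d_m: float, h_m: float) -> float:
    """DOF = 2d (d/H) / (1 - (d/H)^2), valid for object distance d >> f."""
    if d_m >= h_m:
        return math.inf  # far limit lies at infinity
    ratio = d_m / h_m
    return 2.0 * d_m * ratio / (1.0 - ratio ** 2)

# 50 mm lens at f/8 on a 36 x 24 mm sensor, C = 1.
s = math.sqrt(36.0 * 24.0)
z = 50.0 / s
g = s / (2.0 * 8.0)

h = hyperfocal_m(g, z)
dof = dof_long_distance_m(3.0, h)
print(f"Hyperfocal distance: {h:.1f} m")
print(f"Depth of field at 3 m: {dof:.2f} m")
```

The hyperfocal distance of about 10.6 m agrees with the classical f^2/(N CoC) when the circle of confusion is set to s/1000.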
Short and intermediate distances
When the object distance is much shorter than the hyperfocal distance, implying w [m] << Z G [mm]/C, we can simplify the expression for the depth of field as follows
DOF = (1 + m/P)w^2 C/(1000 G)
If the object distance d is much larger than the focal length (and still much smaller than the hyperfocal length), we can ignore m and obtain the simple expression
DOF [m] = w^2 [m^2] C/G [mm]
Together with the hyperfocal distance, this may be the most useful equation for everyday use. Roughly speaking, it can be applied whenever your subject is more than a number of focal lengths away from you, and the background is sufficiently blurred.
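To see how good the simple expression is, the sketch below compares it with the generic depth of field equation for a tight portrait. The function names are hypothetical, G is taken as s/(2N), all lengths in the generic version are in mm, and the 0.5 m subject size is an arbitrary example.

```python
import math

def dof_general_mm(w_mm: float, g_mm: float, z: float, c: float = 1.0,
                   magnification: float = 0.0, pupil_ratio: float = 1.0) -> float:
    """Generic DOF [mm]: (1 + m/P)(w^2 C/(1000 G)) / (1 - 1/4 (w C/(1000 Z G))^2)."""
    numerator = (1.0 + magnification / pupil_ratio) * w_mm ** 2 * c / (1000.0 * g_mm)
    denominator = 1.0 - 0.25 * (w_mm * c / (1000.0 * z * g_mm)) ** 2
    if denominator <= 0.0:
        return math.inf  # at or beyond the hyperfocal distance
    return numerator / denominator

def dof_simple_m(w_m: float, g_mm: float, c: float = 1.0) -> float:
    """Everyday approximation: DOF [m] = w^2 [m^2] C / G [mm]."""
    return w_m ** 2 * c / g_mm

# 50 mm lens at f/2 on a 36 x 24 mm sensor, subject field of 0.5 m.
s = math.sqrt(36.0 * 24.0)
z = 50.0 / s
g = s / (2.0 * 2.0)
w_m = 0.5
m = s / (1000.0 * w_m)  # magnification m = s/w

general_mm = dof_general_mm(1000.0 * w_m, g, z, magnification=m)
simple_mm = 1000.0 * dof_simple_m(w_m, g)
print(f"Generic expression: {general_mm:.1f} mm")
print(f"Simple expression:  {simple_mm:.1f} mm")
```

The two differ by about 6% here, almost entirely because of the (1 + m/P) factor that the simple expression drops.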
Diffraction limited depth of field
For some applications, you'd like to maximize the depth of field by stopping the lens down as much as possible. However, stopping down too much will affect the sharpness of the in-focus areas, due to diffraction. This implies that there is an optimal aperture that gives you maximal depth of field for a given permitted circle of confusion. In this section we will compute the diffraction-limited depth of field for two common scenarios: landscape and macro photography.
It is important to realize that the out-of-focus blurring and diffraction are two independent effects and the blurring from both has to be taken into account. Even though both types of blur have quite different shapes (uniform disc for defocus; Airy disc for diffraction), it is common to estimate the combined blur by adding the two in quadrature (Ctotal= (Cdiff^2 + Cdof^2)^(1/2)). The diameter of the Airy disc can be written as [Airy disc diameter] = 0.67 [µm] s/Geff and is therefore inversely proportional to Geff. On the other hand, in the limiting cases we will discuss below, the diameter of the circle of confusion due to defocusing is proportional to Geff. For the total blur Ctotal we thus get an expression of the form
Ctotal = ((A/Geff)^2+ (B Geff)^2)^(1/2)
This expression is minimized for Geff = sqrt(A/B), in which case Cdiff = Cdof=Ctotal/sqrt(2).
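The minimization can be verified numerically with a brief sketch. The function names are hypothetical and the values of A and B are arbitrary illustrative numbers, not taken from any particular camera.

```python
import math

def total_blur(g_eff: float, a: float, b: float) -> float:
    """Ctotal = ((A/Geff)^2 + (B Geff)^2)^(1/2)."""
    return math.hypot(a / g_eff, b * g_eff)

def optimal_g(a: float, b: float) -> float:
    """Geff that minimizes Ctotal: sqrt(A/B)."""
    return math.sqrt(a / b)

a, b = 0.67, 0.2  # illustrative diffraction and defocus coefficients
g_opt = optimal_g(a, b)

c_diff = a / g_opt   # diffraction contribution at the optimum
c_dof = b * g_opt    # defocus contribution at the optimum
c_total = total_blur(g_opt, a, b)

print(f"Optimal Geff: {g_opt:.3f}")
print(f"Cdiff = {c_diff:.4f}, Cdof = {c_dof:.4f}, Ctotal/sqrt(2) = {c_total / math.sqrt(2):.4f}")
```

At the optimum both contributions are equal, and moving Geff in either direction increases the total blur.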
In landscape photography it is often desirable to get everything in focus, from the foreground up to infinity. Substituting C = Ctotal/sqrt(2) into the expression for the diffraction limited light gathering size Geff;diff yields
Geff;diff [mm] = 0.67 sqrt(2)/Ctotal = 0.67 (MPoutput)^(1/2)
Note that this corresponds to an aperture that is one stop wider (an f-number one stop smaller) than what you would usually call the diffraction limited aperture, and it is constant for a given format and resolution requirement. Inserting this result into the expression for the hyperfocal distance, and substituting C = Ctotal/sqrt(2) = (MPoutput)^(-1/2) produces
Hdiff [m] = 1.3 [m] Z^2 MPoutput,
which means we can get everything from H/2 up to infinity in acceptable focus at any given resolution and field of view. This is a fairly sobering result, because it indicates that there is a hard limit to the resolution that you can attain whilst having the whole scene in focus. This expression does not depend on the sensor size, so there is no preferred format for diffraction-limited landscape photography, provided that the camera has sufficient resolution and the lens isn't diffraction-limited to begin with.
Alternatively, photographers occasionally need an even larger depth of field, even at the cost of some resolution. The equation above can be inverted to yield the effective resolution of an image for a given lower in-focus bound D=H/2 (note that these are downsampled megapixels, roughly corresponding to double that number of Bayer MPs).
MPoutput = 1.5 D [m] / Z^2
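The landscape relations can be sketched as follows. The function names are hypothetical, and the code keeps the unrounded constant 0.67 throughout, which is why it yields 1.34 and 1/0.67 = 1.49 where the text rounds to 1.3 and 1.5.

```python
import math

AIRY_MM = 0.67  # 1.22 x 0.55 micrometers, expressed per unit of C (see text)

def landscape_hyperfocal_m(z: float, mp_output: float) -> float:
    """Hdiff [m] = 2 Geff;diff Z^2 / C with Geff;diff = 0.67 sqrt(MP), C = MP^(-1/2)."""
    g_diff = AIRY_MM * math.sqrt(mp_output)
    c = 1.0 / math.sqrt(mp_output)
    return 2.0 * g_diff * z ** 2 / c

def landscape_resolution_mp(near_limit_m: float, z: float) -> float:
    """Inverse relation: MPoutput for a required near limit D = H/2."""
    return near_limit_m / (AIRY_MM * z ** 2)

z = 1.0  # moderate wide angle (about 29 mm on a 36 x 24 mm sensor)
h_diff = landscape_hyperfocal_m(z, 8.0)
print(f"8 MP output, Z = 1: sharp from {h_diff / 2.0:.1f} m to infinity")
print(f"Near limit of 1 m, Z = 1: {landscape_resolution_mp(1.0, z):.1f} MP")
```

Neither function takes the sensor size as an argument, which is the point of the section: the trade-off between near limit and resolution depends only on the field of view.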
Macro photography is another application where photographers often crave more depth of field. We will now derive an expression for this depth of field, starting from the expression for short distances. For small apertures, the opening angle of the light cone hitting the sensor is small, and we can make the approximation Geff = G/(1 + m/P), so that (see above)
DOF = w^2 C/(1000 Geff)
Again using C = Ctotal/sqrt(2) = (MPoutput)^(-1/2) and inserting the expression for the diffraction limited Geff;diff, we obtain an expression for the maximum depth of field that can be obtained without significant diffraction losses (note the units).
DOFdiff [mm] = 0.15 w^2 [cm^2] / MPoutput
We see that this expression is neither dependent on the focal length nor on the format size and that the actual magnification is irrelevant. The only thing that matters is the visible area of the focus plane.
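The macro result can be reproduced by inserting C = (MPoutput)^(-1/2) and Geff;diff = 0.67 (MPoutput)^(1/2) into DOF = w^2 C/(1000 Geff). The sketch below does this with a hypothetical function name. The 3 x 2 cm subject area is an arbitrary example.

```python
import math

def macro_dof_diff_mm(area_cm2: float, mp_output: float) -> float:
    """Diffraction-limited macro DOF [mm] for a visible focal-plane area w^2 [cm^2]."""
    w2_mm2 = 100.0 * area_cm2              # convert cm^2 to mm^2
    c = 1.0 / math.sqrt(mp_output)         # C = Ctotal / sqrt(2)
    g_diff = 0.67 * math.sqrt(mp_output)   # diffraction-limited Geff [mm]
    return w2_mm2 * c / (1000.0 * g_diff)

# A 3 x 2 cm subject area reproduced at 2 MP output resolution.
dof_mm = macro_dof_diff_mm(3.0 * 2.0, 2.0)
print(f"Diffraction-limited depth of field: {dof_mm:.2f} mm")
```

The result matches 0.15 w^2/MPoutput to within rounding of the constant, and doubling the required resolution halves the attainable depth of field.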
In photography, it is often desirable to separate an object from the background by making sure the background is blurred to the point where details are no longer visible. Although the concept of background blur is related to the depth of field, the two are not the same.
We can calculate the extent to which the background is blurred by determining how a point source at infinity (a distant light, for example) gets recorded onto the photograph. Using equation (4) from this depth of field derivation [edit 2018: archived site] and substituting object space quantities using (2), sending v to infinity and realizing that f/(v - f) = m = s/w, we find that the size of the blurred point in the plane of focus is exactly the same size as the aperture diameter. For example, if you were to take a portrait with a lens that has a 25mm aperture (50mm at f/2), lights in the distance end up looking like 25mm light blobs in the same plane as the person to be photographed.
We can also write down an expression for the relative size of these blobs as compared to our subject size w. This is a direct measure of the background blurring ability of the system, with the expression
B = [infinity blur diameter]/w = (1/500) Z G [mm]/w [m]
For large distances, where the object distance d >> f, we can use the substitution w=d/Z. Setting B = C/1000 (no noticeable blur), we recover the expression for the hyperfocal distance that was derived above.
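The portrait example and the link with the hyperfocal distance can both be checked with a brief sketch. The function name is hypothetical, G is taken as s/(2N), and the 0.5 m subject size is an arbitrary choice.

```python
import math

def background_blur_ratio(z: float, g_mm: float, w_m: float) -> float:
    """B = (1/500) Z G [mm] / w [m]: infinity blur diameter relative to subject size."""
    return z * g_mm / (500.0 * w_m)

# 50 mm lens at f/2 on a 36 x 24 mm sensor, subject field of 0.5 m.
s = math.sqrt(36.0 * 24.0)
z = 50.0 / s
g = s / (2.0 * 2.0)
w_m = 0.5

b = background_blur_ratio(z, g, w_m)
print(f"B = {b:.3f}, blur disc of {1000.0 * b * w_m:.0f} mm in the subject plane")

# Setting B = C/1000 with C = 1 and using w = d/Z recovers the hyperfocal distance.
w_hyperfocal_m = 2.0 * z * g          # subject size for which B = 1/1000
d_hyperfocal_m = w_hyperfocal_m * z   # corresponding object distance d = w Z
print(f"Distance at which B = C/1000: {d_hyperfocal_m:.1f} m")
```

The blur disc is 25 mm, equal to the aperture diameter of a 50 mm lens at f/2, and the distance at which the blur drops to C/1000 equals H = 2 G Z^2/C.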
This essay has covered a lot of ground, much more than I had initially anticipated. First and foremost, I have given an example of how one can define a format-agnostic system of photographic parameters. This system is shown to be consistent with the familiar photographic relations for exposure, depth-of-field, and diffraction.
The fact that all those things photographers really care about can be expressed without explicit dependence on the sensor size has an important implication: for many photos, it really doesn't matter how large or small the sensor in your camera is, as long as you use the appropriate settings. This concept is the core of the companion essay Sensor formats: does size matter? Of course, there is more to it, and the exceptions to this equivalence manifest themselves very clearly in the format-independent parameters:
- Larger sensors allow for larger values of Qgray, corresponding to a higher potential image quality at base ISO.
- Larger sensors and their lenses can often have a larger effective light gathering size Gmax, opening up more low-light and shallow-depth-of-field opportunities. However, these two are always coupled: you cannot get more low-light sensitivity without a shallower depth of field.
The analysis of depth of field, diffraction and background blur in terms of the format-independent parameters has produced a number of interesting results. They clearly indicate that for high-depth-of-field photography all formats are equally affected by diffraction. The often asserted statement that small sensors are somehow better for this type of photography has little basis in physics (of course, an advantage can be created if the smaller camera has a shorter minimum focus distance). The flipside is that there is little need to upgrade to a large-sensor high-resolution camera if you are mainly interested in front-to-back sharpness, unless you find yourself constrained by the image quality of smaller sensors at the base ISO value.
To get a feeling for this approach to comparing camera systems, I have compiled a brief list of values for common cameras, bodies and lenses below. Do I propose that all camera makers should adopt this new system, or that you should memorize these values for your cameras? No, that wouldn't be realistic. However, the topic of sensor sizes can be rather contentious, and I hope that this essay contributes to the separation of facts and wishful thinking.
Parameter values for various current cameras and lenses
- The sensor size largely determines the maximum luminous energy that can be used to form an image.
- The Nokia 808 uses a combination of a high-density sensor and cropping to zoom. The 'zoomed in' numbers given here correspond to 4x zoom (max for HD video).
Cameras with interchangeable lenses
Again, we note that cameras with larger sensors tend to have a higher maximum luminous energy. The reason for this is that the sensors tend to have a constant electron capacity per unit area. However, the minimum luminous energy (corresponding to the highest ISO value) is fairly constant within a sensor generation. An exception to the latter are the high-MP studio machines - presumably because their users have no interest in borderline usable images.
Instead of producing a listing of the many lenses available for common formats, I simply present tables for the determination of the light gathering ability Gmax as a function of the f-number of the lens.