Linear light and compositing
This section is heavily inspired by the great explanation by Stu Maschwitz on https://prolost.com/blog/aces
Imagine a digital 50% gray card. In 0–255 RGB values, it’s 127, 127, 127.
On the RGB parade scope, the card is a perfect plateau at 50%.
Now imagine increasing the exposure of this scene by one stop. “Stops” of light are an exponential scale: subtracting one stop halves the quantity of light, and adding one stop doubles it. The light in our image is expressed in RGB pixel values, so let’s double the simulated light in this scene by doubling the values of the pixels.
Predictably, the 50% region has doubled to 100%. The perfectly-white regions are now overexposed to 200%, which looks the same as 100% in this non-HDR view. Our idealized pure-black patches remain unchanged.
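Here is a minimal sketch of that naive operation in code (the array and its values are hypothetical, assuming 8-bit display-referred pixels and NumPy):

```python
import numpy as np

# Hypothetical test strip: pure black, 50% gray, and pure white patches,
# stored as 8-bit display-referred values.
image = np.array([0, 127, 255], dtype=np.float32)

# "Add one stop" naively by doubling the stored pixel values,
# then clip back into the 0-255 range of a non-HDR display.
one_stop_over = np.clip(image * 2.0, 0, 255)

print(one_stop_over)  # [  0. 254. 255.] -- middle gray slams to (nearly) pure white
```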
But anyone who has overexposed a camera by one stop knows that it will not slam middle-gray into pure white. And anyone who has shopped for physical camera charts knows that you don’t buy “50% gray” cards. A middle-gray card at a camera store is an 18% gray card.
“Linear” vs “Gamma”
An 18% gray card appears “middle gray” to our eyes because we humans do not perceive light linearly. Human vision has a “gamma” of sorts — a boosting curve that pumps up our perception of darkness and compresses highlights. Presumably this is a survival adaptation — it’s easier to see a predator or prey in the dark if we boost up the midtones.
The non-linearity of our vision closely matches a few historical imaging methods, such as the densities of dyes on a piece of film, and the voltages in a CRT. So by a combination of happy coincidence and clever design, images that “look right” to our eye on modern displays have a gamma that aligns with the way our brains transform light into pictures.
It is not necessary to deeply understand all that. The important takeaway is: linear images, where pixel math aligns well with real-world light phenomena, don’t look “right.”
An 18% gray card looks middle-gray both in person and on our devices because of a shared nonlinearity. Our eyesight has a gamma, and so do the images.
This convenient alignment actually makes it counter-intuitive to imagine working with real-world light values. If a 50%-bright thing on the display looks 50% of the way between black and white to our eyes, where’s the problem?
The problem comes when we want to model the real-world behavior of light.
When creating Virtual Production content, we utilize 3D rendering of course, but we also mix video and graphics through compositing.
That obviously-wrong one-stop-over-is-blown-completely-out gray card example at the top? We call that “working in display-referred space,” and it’s how a lot of computer graphics were created in the early days. It wasn’t right, and it often didn’t look right.
Linear Light & HDR
In both the real world and in gamma-managed image processing, light overpowers dark. When you add the ability to process pixel values greater than 1.0, light has even more opportunity to “win”.
Motion blur, defocus blurs, simple compositing operations, 3D lighting and shading, combining 3D render passes or live-action exposures, even anti-aliasing of text, all look better, more organic, and more realistic when performed in gamma 1.0.
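As a small sketch of why (plain Python, assuming a pure gamma-2.2 display encoding), here is a 50/50 blend of a white pixel and a black pixel done in display space versus in linear light:

```python
GAMMA = 2.2

def to_linear(v):
    """Display-referred value -> linear light (approximated with a pure gamma)."""
    return v ** GAMMA

def to_display(v):
    """Linear light -> display-referred value."""
    return v ** (1.0 / GAMMA)

white, black = 1.0, 0.0

# Naive display-referred blend, as at an anti-aliased edge or in a motion-blur streak.
display_blend = 0.5 * white + 0.5 * black                                   # 0.5

# Linear-light blend: mix the actual light, then re-encode for the display.
linear_blend = to_display(0.5 * to_linear(white) + 0.5 * to_linear(black))  # ~0.73

print(display_blend, round(linear_blend, 3))
# The linear-light result is visibly brighter: the light "wins",
# as it does in a real photograph.
```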
sRGB and Gamma
Strictly speaking, gamma is a power function. Encoding with a gamma of 2.2 is the same as raising the pixel value, on a 0.0–1.0 scale, to the power of 1/2.2 (decoding raises it back to the power of 2.2). But the term gamma has been broadened by some to include any kind of 1D tone curve applied to, or characteristic of, an image. Life is easier with this relaxed definition, so that’s how I use it.
You can absolutely gamma-manage your workflow using the pure gamma-2.2 and its inverse. But if your imagery is sRGB, it’s slightly more accurate to use the sRGB curve. The sRGB tone curve is a very close match to a pure gamma 2.2, but it has a little kink at the bottom to solve an old problem.
A pure gamma curve has a slope of either zero or infinity at its base: as the values in the image approach zero, the decoding curve flattens out completely and the encoding curve becomes infinitely steep. This means that calculations on the darkest pixels in your image could be inaccurate, and those inaccuracies could compound through multiple steps of linearization and de-linearization.
sRGB has a steep, but not infinitely steep, linear slope at the very bottom, and then the rest of the curve uses a gamma of 2.4 squished to fit in the remaining range. The clever result is that the curve is smooth at the transition and robust through multiple generations of processing, even if the processing is not done in floating-point.
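As a sketch, the standard sRGB transfer functions look like this (the constants are the published sRGB ones; the function names are just illustrative):

```python
def srgb_decode(v):
    """sRGB-encoded value (0.0-1.0) -> linear light."""
    if v <= 0.04045:
        return v / 12.92                      # the short linear toe near black
    return ((v + 0.055) / 1.055) ** 2.4       # a gamma-2.4 segment, offset and scaled

def srgb_encode(v):
    """Linear light -> sRGB-encoded value."""
    if v <= 0.0031308:
        return v * 12.92
    return 1.055 * (v ** (1.0 / 2.4)) - 0.055

# The overall curve behaves much like a pure gamma of about 2.2,
# but the linear toe keeps the slope finite near zero.
print(srgb_decode(0.5))   # ~0.214
print(srgb_encode(0.18))  # ~0.46 -- 18% gray encodes to roughly mid-scale
```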
Round Tripping
While the pure gamma curve and the sRGB curve are similar, two values for which they are identical are zero and 1.0. That’s fine, although there’s nothing special about 1.0 in either curve in the sense that the power function extends naturally through 1.0 and operates equally well on “overbrights,” or HDR values greater than one.
What is significant about these curves and their 0.0–1.0 range is that they round-trip cleanly, as I mentioned above. If you linearize with the inverse of these curves, do your thing, and then de-linearize, the pixels that didn’t get blended go right back to their original values. This is convenient, and for some motion-graphic applications, essential.
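A tiny sketch of that round trip, assuming a pure gamma-2.2 workflow and some hypothetical pixel values:

```python
import numpy as np

GAMMA = 2.2

pixels = np.array([0.0, 0.18, 0.5, 1.0])   # hypothetical display-referred values

linear = pixels ** GAMMA          # linearize
# ... do compositing / blurring here; these particular pixels are left untouched ...
back = linear ** (1.0 / GAMMA)    # de-linearize

print(np.allclose(pixels, back))  # True -- untouched pixels return to their original values
```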
Tonemapping
No object is really “white” in the sense of reflecting 100% of the light that hits it. But we often work with synthetic images that have pure white in them (such as logos or text), and of course we expect those values to remain pure white even after round-tripping through an sRGB or gamma 2.2 linear workflow.
But at the same time, we expect our cameras to have that gentle roll-off. We expect a white object to photograph not as pure white, but as some reasonable white-ish shade that is not blown-out. In fact, from modern cameras, we expect enough dynamic range to capture a sun-lit shiny white car, for example, and shadow detail on a person’s face.
When you pass scene values to a simple sRGB lookup, with no other “tonemapping” (a.k.a. gamma management), you get ugly results: low dynamic range, clipped highlights, and posterized colors near areas of overexposure.
Rendering to linear scene values and then converting them to sRGB with a pure sRGB / gamma 2.2 conversion will look ugly, and no modern camera works this way. This is why Unreal has implemented the “Film” tonemapper.
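The sketch below is not Unreal’s Film tonemapper; it uses a generic Reinhard-style rolloff just to illustrate the difference between hard-clipping scene values and giving them a shoulder (the function names and the HDR sample values are hypothetical):

```python
def encode_gamma22(v):
    """Linear light -> display, pure gamma 2.2 (close enough to sRGB for this sketch)."""
    return v ** (1.0 / 2.2)

def naive_display(scene):
    """Clip HDR scene values at 1.0, then encode: highlights slam into flat white."""
    return encode_gamma22(min(scene, 1.0))

def toy_tonemapped_display(scene):
    """Roll highlights off with a simple Reinhard curve before encoding."""
    return encode_gamma22(scene / (1.0 + scene))

for scene in [0.18, 1.0, 2.0, 8.0]:           # hypothetical linear scene values
    print(scene, round(naive_display(scene), 3), round(toy_tonemapped_display(scene), 3))
# With the naive conversion, everything at or above 1.0 becomes identical pure white;
# with the rolloff, 2.0 and 8.0 remain distinguishable bright values below 1.0.
```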
Colored Lights
While tone mapping (gamma management) helps ensure that highlights and shadows look more pleasing, there are some problems it does not solve.
If we use colored lights, things become more complicated.
This does not look good. The very red light seems unable to illuminate the not-quite-pure blue of the billiard ball, instead tinting it a weird green.
In real life, the illuminated portions are purple, not green.
To solve this we need a proper color management system, in addition to the tone mapping feature.
Color Management with ACES
ACES is a color management system
ACES specifies a methodology for converting images among various color spaces. It is specifically designed for the motion picture industry.
ACES is a color space
ACES2065-1, or AP0, encompasses the entire CIE diagram. ACEScg, or AP1, is a carefully-chosen subset.
ACES defines two color gamuts, AP0 and AP1. AP1 is the “working” gamut, and like AdobeRGB and ProPhotoRGB, it is a wide-gamut color space, encompassing more colors than sRGB.
ACES includes color profiles for many popular cameras
ACES ships with profiles for Canon, Sony, ARRI, Red, and more. This means it’s trivial to match the output from various cameras.
ACES includes an evolving set of final lookups for presentation
For that final conversion from the linear-light, wide-gamut working space of AP1, ACES offers a handful of Output Device Transforms, or ODTs. The ones designed for SDR video output have built-in highlight rolloff, a subtle contrast curve, and special handling for bright, saturated colors.
ACES is a gentle prescription for a workflow
The core ACES color profiles are designed to support the phases of a motion picture project:
ACEScg is the linear, AP1 color space designed for 3D rendering and compositing.
ACEScc is a log color space that also uses AP1 primaries. It is designed to be a universal space for color grading.
ACES2065-1 is intended to be a universal mastering color space for sharing and archiving finished projects. This is where that AP0 gamut comes into play — it encompasses every color visible to the human eye.
ACEScg is a linear-gamma working space of course, so it’s ideal for rendering and compositing. But that it is also a carefully-chosen wide-gamut color space is an equally important part of its design. Rendering in a wider-gamut space is one way to combat the green ball problem above.
Output / Display mapping
Once you choose to work in a wide gamut, you then have to figure out how to map that image back to various output formats. As we have established, the simple sRGB transform (and its cousin, Rec. 709) is not good enough. The ACES team performed numerous tests and evaluations in designing their output transforms — and then revised the results several times.
And they are still working on it. The look of these transforms is both studied and subjective, and while many people love the look, others have criticisms (especially around rendering of saturated colors). Remember above where I said that a simplistic linear workflow had left an aesthetic gap to be filled? Well, these Output Device Transforms (ODTs) are the primary way that ACES has stepped up to fill it. This explains why folks are so enthusiastic about the results it gives them, even if it is an ongoing field of development.
The ACES example above shows the power of a properly color-managed workflow and a great ODT. The image has both the pleasing push of contrast we associate with film, as well as the smooth, languorous highlight rolloff. Colors are somehow both rich and restrained. The render looks real, but more importantly, it looks photographed.
This is great, but there is a challenge with round tripping.
By rendering the linearized input video with the photographic contrast and highlight compression of the ODT, we would lose our seamless round-tripping. The video results would probably look dark and dull. Because we know what we expect our video texture to look like at the end of the pipeline, the pleasing, subjective look of the ODT, which works great for 3D graphics, is not the right look for the video plate.
This is meaningful for motion graphics, color grading, and compositing workflows. If “working in ACES” means changing the look of every pixel before you’ve even started to get creative, that’s going to surprise and dismay many artists.
For example, if we rendered the vase above in front of a live video plate, the same post-processing that made the graphics look great would mute out the photographed background.
If we want 3D rendered scenes to look photographed, do we have to let go of round tripping?
Inverted Display Transform
ACES has a solution for this too. You’ll remember that ACEScg is our working space for rendering and compositing.
When we bring a video feed (or texture) into Pixotope, we have to convert it from its native format to ACEScg. If we were to use the simple Rec. 709 transform (with no baked-in tone mapping) for inputs and an ODT with tone mapping for output, the video would look different.
Thankfully, ACES also allows for using the contrasty, soft-highlights Output Device Transform as the “from” in this conversion. In other words, you can invert the output transform for images you want to cleanly round trip.
Inverting all this not only allows for round-tripping, it also has the interesting side effect of plausibly surmising HDR values from an SDR image.
The photographed examples we’ve been discussing all have some kind of “shoulder” baked in. Inverting the shoulder-y ACES Rec. 709 ODT effectively un-shoulders photographed images, putting their compressed highlights back into a reasonable estimation of what scene values might have generated them.
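Here is a toy illustration of that un-shouldering idea. It uses a simple Reinhard shoulder rather than the real ACES Rec. 709 ODT, which has its own carefully designed curve, but the behavior near white is similar in spirit:

```python
def shoulder(linear):
    """Toy forward shoulder: compress HDR scene values into 0.0-1.0."""
    return linear / (1.0 + linear)

def inverse_shoulder(display_linear):
    """Invert the toy shoulder: surmise plausible HDR values from SDR ones."""
    return display_linear / (1.0 - display_linear)

# Near-white SDR values expand into large HDR values...
for v in [0.5, 0.9, 0.94]:
    print(v, round(inverse_shoulder(v), 2))   # 1.0, 9.0, 15.67
# ...which is exactly the behavior we want for round-tripping photographed plates.
```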
The inverted ODT allows us to round-trip video through ACES, but since it does so by creating HDR values, it’s not appropriate for texture maps representing diffuse reflectivity.
Lighting for ACES ODTs
When working with ACES ODTs, you might have noticed that the 3D rendered images can seem flat compared to the input video.
When you invert the input video using the Rec. 709 ODT, the complement to the rolloff curve causes 1.0 white to map to a very bright linear-light value: about 16.3 on a nominally zero-to-one scale. That sounds aggressive, but it represents about 6.5 stops of overexposure on an 18% gray card.
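A quick sanity check of that number, assuming the usual definition of a stop as a doubling of light:

```python
import math

middle_gray = 0.18       # linear value of an 18% gray card
white_maps_to = 16.3     # approximate linear value that SDR white un-tonemaps to

stops_over_gray = math.log2(white_maps_to / middle_gray)
print(round(stops_over_gray, 1))   # ~6.5 stops above middle gray
```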
Artists working with a simple sRGB or gamma 2.2 “linear workflow” have inadvertently trained themselves to use conservative light values, because that workflow lacks the highlight compression that models high-end film or digital recording. If you lit your scene too bright, you’d get ugly highlights. But real scenes have big, broad dynamic ranges — which is part of why they’re so hard to photograph.
The virtual “sun” light that’s illuminating the rendered ball is set to 300% brightness, but the HDR values that light creates in the render get compressed down so much that I now want to push it more. Here’s the same scene with the light at 1,000% brightness.
If you’re not used to it, setting a light’s brightness to 1,000% feels strange — but in this example, that results in reflectance values of around 10.0, right in line with the HDR-ified highlights in the linearized background plate — as you can see in the underexposed version.
Astute readers will note that if inverting the ODT results in white being mapped to 16.3, then an ACEScg linear value of 16.3 is the darkest value that will be mapped to pure white in Rec. 709 — i.e. you need ACEScg scene values of 16.3 or more to clip on SDR output.
Rendering to an ACES ODT encourages artists to create higher-dynamic-range scenes, with brighter lights and more aggressive reflections. When you use brighter lights in a modern global-illumination render, you get more pronounced secondary bounces, for a more realistic overall appearance. ACES encourages artists to create CG scenes that better show off the power of modern CG pipelines, and, quite simply, look better, because they better model how real light works.
Even if that light is red, and the object is blue.
Colored Lights
Remember our blue billiard ball that went green when hit with a red light? ACES will help us with that too.
Our sRGB render failed in this case because of its limited color gamut. The saturated blue of the ball was near the edge of sRGB’s range of available colors. When we hit it with a strong red light, the results were out of gamut, so the closest approximation was returned.
ACES addresses this with the wider gamut of its AP1 color space. When you convert a texture from sRGB to ACEScg, you are both linearizing the gamma and also assigning new, broader color primaries. Visually, this results in a reduction in apparent saturation when viewing the raw pixels, so it’s easy to see how a once-saturated color is no longer dangerously near the edge of the available range.
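A rough sketch of that conversion for a single texture color follows. The 3x3 matrix is the commonly published approximate linear-sRGB-to-ACEScg matrix, and the sample color is hypothetical:

```python
import numpy as np

def srgb_decode(v):
    """sRGB-encoded values (0.0-1.0) -> linear light."""
    v = np.asarray(v, dtype=np.float64)
    return np.where(v <= 0.04045, v / 12.92, ((v + 0.055) / 1.055) ** 2.4)

# Approximate linear-sRGB (D65) -> ACEScg (AP1, D60) matrix, Bradford-adapted.
SRGB_TO_ACESCG = np.array([
    [0.6131, 0.3395, 0.0474],
    [0.0702, 0.9164, 0.0134],
    [0.0206, 0.1096, 0.8698],
])

texture_srgb = np.array([0.10, 0.20, 0.90])        # a hypothetical saturated blue

linear_srgb = srgb_decode(texture_srgb)            # 1) linearize the gamma
acescg = SRGB_TO_ACESCG @ linear_srgb              # 2) re-express in the wider AP1 primaries

print(linear_srgb.round(3))   # roughly [0.01  0.033 0.787]
print(acescg.round(3))        # roughly [0.055 0.042 0.689] -- less extreme channel ratios in AP1
```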
Things to Be Aware Of
Use sRGB to ACEScg for Textures
Or whatever the appropriate input color space is. Color texture maps shouldn’t try to represent more than 100% reflectivity, so don’t use the inverted ODT method for realistic diffuse surfaces.
Carefully Use Rec. 709 to ACEScg for Video Footage
The inverted-ODT-as-input method reconstructs plausible HDR values from an SDR source. Just beware of the aggressive mapping of near-white pixels into extreme HDR values, and the potential for saturated colors to get truncated.
Procedural Color Management is Better than Baking Conversions into Files
If you must bake out your ACEScg texture maps, remember that 8 bpc is not enough to store a linear-light, wide-gamut image. Use 16-bit TIFF, or EXR.
Color Manage All Color Values, Not Just Textures
A proper ACES color management solution includes managing the user-chosen colors for things like untextured objects and light sources. In my examples above, I had to rig up a system to convert my light colors into the color space I was rendering to, for proper apples-to-apples comparisons.
Don’t Try to Do ACES with LUTs
You can’t really emulate an ACES workflow using LUTs. Most LUTs are not designed to map HDR input, for example. It’s possible, but there are lots of gotchas. Native processing is better.