Important: Dolby (Atmos) and the dialogue level (aka dialnorm)

Ever wondered why your mix is quieter when encoded with the AC-3 (Dolby Digital) codec? Ever wondered why Atmos tracks often sound so quiet compared to other audio tracks? Want to know why the external Dolby Atmos renderer can be a quality control issue? For answers to these questions and more, please take a few minutes to read the following.

As I was recently confronted with this topic again, I thought I would take this opportunity to give you some pointers on an important subject.
It concerns Dolby’s dialogue level (dialnorm). As far as I can tell, knowledge of this subject is not very widespread. This may be due to the fact that many people have never had to deal with it, as the dialogue level only appears when encoding consumer tracks with an encoder (DEE, DME, Minnetonka SurCode etc.).
Anyone creating mixes should know a few key facts about this topic, whether you work in music or in post-production.

Dolby’s dialogue level was originally introduced to prevent volume jumps when switching between different audio tracks encoded with a Dolby codec. It was a good idea, and it still is. But in my experience, the dialogue level causes problems in practice, especially in conjunction with Dolby Atmos.
I don’t want to go into too much detail here about how the dialogue level works. If you want more information, you can read Dolby’s official PDF document “A Guide to Dolby Metadata”.

So just this much: Dialogue level is a metadata parameter. Dolby itself writes that the dialogue level is perhaps the most important metadata parameter of all.
The dialogue level setting represents the average loudness of dialogue in a presentation. When received at the consumer’s decoder, this parameter setting determines the level shift in the decoder that sets, or normalizes, the average audio output of the decoder to a preset level. The scale used in the dialogue level setting is from -1 to -31 dB in 1 dB steps. Contrary to what you might assume at first, a setting of “-31” represents no level shift in the consumer’s decoder, and “-1” represents the maximum level shift. The formula is:

31 + (dialogue level value) = attenuation applied by the decoder, in dB. Example: 31 + (-21) = 10 dB of attenuation.
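To make the formula concrete, here is a minimal Python sketch. The function name `dialnorm_attenuation_db` is made up for illustration and is not part of any Dolby tool; the code only restates the arithmetic above.

```python
def dialnorm_attenuation_db(dialnorm: int) -> int:
    """Return the attenuation (in dB) a decoder applies for a given dialnorm.

    Hypothetical helper: it only restates the formula 31 + dialnorm.
    """
    if not -31 <= dialnorm <= -1:
        raise ValueError("dialnorm must be between -31 and -1 dB")
    return 31 + dialnorm


# Print the attenuation for a few dialnorm values mentioned in this post.
for value in (-31, -27, -21, -13):
    print(f"dialnorm {value} dB: {dialnorm_attenuation_db(value)} dB attenuation")
```

A dialnorm of -31 dB gives 0 dB (no change), while the old default of -27 dB gives 4 dB of attenuation.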

As already mentioned, you set the dialogue level during encoding. On many encoders the dialnorm is preset to “-27 dB”. (Why Dolby chose this value is the subject of some speculation. It is probably because Dolby claims to have found that this is the average dialogue level in an average film.) This is why a great many Dolby audio tracks on DVD, Blu-ray, UHD-BD or streaming have a dialogue level of “-27 dB”.
With the new encoders (DME, DEE), this preset value no longer exists. Instead, the encoder performs a loudness analysis during encoding. Depending on the result of this analysis, the “appropriate” value for the dialogue level is selected.

It is this analysis that often leads to major problems, especially with Atmos.
With an Atmos master file (ADM BWF) as the source, the encoder’s analysis is based on the 5.1 presentation. This often results in dialogue level values far from -31 dB, and therefore in strong attenuation. (Mostly around -23 to -21 dB, but I have also had values of -17 and -13 dB.) So a TrueHD with Atmos track or an E-AC-3 JOC track will play back quieter than the original by the corresponding amount, i.e. 8 to 10 dB for the typical values. This is why the Dolby Atmos track on a Blu-ray or UHD BD is almost always the quietest audio track in comparison.

The solution to this problem is very simple:

We set the “Custom dialnorm” or “Dialnorm” in the encoder to “-31 dB”.
This disables this “feature”. If we are using the Encoding Engine (DEE), we have to adapt the corresponding script by adding the following line in the right place:

<custom_dialnorm>-31</custom_dialnorm>

Also important: With the newer encoders, you can select whether the content is music. If you want to (or have to) use the dialogue level function, you should make sure that you really select “Music” for “Content”. Otherwise, in many cases the encode will be assigned a dialogue level value that attenuates it heavily.

Another important note regarding the external Dolby Atmos renderer:

Many people use the renderer’s MP4 export function to export a file for quality control. Some also share this MP4 file with their clients. The external Atmos renderer creates an E-AC-3 JOC track during export. And the renderer also writes the dialnorm metadata. But unlike the DME or the DEE, the external Atmos renderer does not(!) allow you to set your own value for the dialogue level!
You should always keep this in mind, as in most cases the exported MP4 file will be (much) quieter than the original Atmos mix.

Good info to know.

Dialnorm was a great concept with a terrible execution. I remember those old “Dolby Digital vs. DTS” battles in the DVD days that DTS would usually win because they had no dialnorm and were louder as a result.

DTS also has “dialogue normalisation”. It has the same range of values as Dolby (-1 to -31 dBFS). However, unlike Dolby, DTS does not use “-27 dB” as the default value in its encoders, but “-31 dB”. Furthermore, DTS explicitly recommends leaving the default value at -31 dB. As a result, DTS tracks are often 4 dB louder than Dolby encoded tracks. (Because usually nobody cares about the settings during encoding and leaves everything at the default values. Sometimes with serious consequences.)
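The 4 dB figure follows directly from the two default values. Here is a small Python sketch of that comparison. The function name is hypothetical, and it assumes both tracks carry the same mix and differ only in their dialogue normalisation value.

```python
def level_difference_db(dialnorm_a: int, dialnorm_b: int) -> int:
    """Return how many dB louder track A plays back than track B.

    Illustrative only: assumes identical source audio and considers
    nothing but the dialogue normalisation value of each track.
    """
    attenuation_a = 31 + dialnorm_a  # dB removed from track A by the decoder
    attenuation_b = 31 + dialnorm_b  # dB removed from track B by the decoder
    return attenuation_b - attenuation_a


# DTS default (-31 dB) compared with the old Dolby default (-27 dB).
print(level_difference_db(-31, -27))
```

With the defaults above, the result is 4, meaning the DTS track plays back 4 dB louder.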

Oh yes, I remember these discussions. :sweat_smile:
It’s really stupid that “louder” is usually equated with “better”. (I fall for it myself from time to time.) That’s why we take great care to make sure that all the audio tracks on our (UHD) Blu-ray discs have the same volume level. (Including menu and bonus material.)

Hello Lukas, I don’t understand the need to choose “music” for export if there’s a dialog. But perhaps I’ve missed something in the logical continuation of your comment.

At the moment, I’m working on an Atmos audiobook project (7.1.4) with a large soundtrack, like in the movies (dialogue, Foley, ambiences, music). I’m exporting to ADM with Nuendo, then importing the ADM into the Dolby Renderer to export to MP4 (to check the result). It seems to me that I’m better off choosing “Cinema” if I want the processing to take into account that there’s dialogue, don’t you think? If I choose “Music”, there’s no such management. Unless you’re saying that you don’t want this dialogue management, which reduces the volume, and that the “Music” export choice therefore protects the dialogue from volume reduction. For the moment, if I try a “Music” export, it’s pretty much the same as my monitoring in Nuendo. Except for binaural, I think, where the voice sounds louder through the headphones. (I’m translating from French, so I hope it’s all right in terms of vocabulary and syntax.)

The selection “Music” (as an alternative to “General”) mainly refers to the loudness measurement. Based on this result, the value for dialogue normalisation is set.
If you select “Music”, the Speech threshold and Dialogue Intelligence settings for the loudness measurement are deactivated.

Explanation:

Speech threshold
Enabled only if Dialogue Intelligence is selected.
Defines the amount of dialogue in the audio program above which the dialogue loudness is used as a basis for loudness measurement. If the percentage of dialogue is higher than the threshold, the encoder uses speech gating to set the dialogue normalization value (otherwise, the encoder uses level gating).

Dialogue Intelligence
Applies the Dolby loudness measurement technology Dialogue Intelligence, which identifies segments of a program that contain dialogue (speech gating) and measures loudness only on those segments.
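The decision described in these two definitions can be sketched roughly as follows. This is only an illustration of the logic as worded above, not Dolby’s actual implementation; the function name, parameter names and example percentages are all assumptions.

```python
def choose_gating(dialogue_percent: float,
                  speech_threshold_percent: float,
                  dialogue_intelligence: bool = True) -> str:
    """Return which gating the loudness measurement would use.

    Hypothetical sketch: speech gating is used only if Dialogue
    Intelligence is enabled and the share of dialogue in the program
    exceeds the speech threshold; otherwise level gating is used.
    """
    if dialogue_intelligence and dialogue_percent > speech_threshold_percent:
        return "speech"
    return "level"


# Example values are invented for illustration.
print(choose_gating(35.0, 20.0))                               # dialogue-heavy program
print(choose_gating(10.0, 20.0))                               # little dialogue
print(choose_gating(35.0, 20.0, dialogue_intelligence=False))  # e.g. "Music" selected
```

Selecting “Music” corresponds to the last case: Dialogue Intelligence is off, so the amount of dialogue no longer influences the measurement.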

The more heavily the encoder weights the dialogue, the higher the risk of getting a dialogue normalisation value that deviates strongly from the overall loudness of the mix. (Because the level of the speech can differ so much from that of everything else.)

You may want to experiment to see which setting works best for you.
However, if you follow my recommendation to use a custom dialnorm of “-31 dB”, then you can ignore this setting, as it no longer affects the dialogue normalisation, but only the loudness measurement. (The results of this measurement can be exported to a text file, for example.)

With a custom value of “-31 dB” for the dialnorm, the loudness corresponds exactly to that of the original.

Huge thanks. I’ll make a note of it and check it out!
