Can anyone here explain why ATMOS routing is so complicated? I’m trying to understand this from the architectural point of view. In surround, if you want the sound to come from the left top speaker you pan it there. If you want LFE, you pan it there. Same with the FX.
It was my understanding that the difference between regular surround and ATMOS was the calculations necessary to make smoother movements or placement in the sound field. I don’t understand WHY all of this jumping through all of these hoops is required to achieve this.
This system is designed to deliver 4 to 6 height channels. So what’s the logic behind a REQUIRED 7.1.2 bed? Why isn’t that bed 7.1.4? The very first thing the mixer will have to do is make a 7.1.4 bed to offset this omission. If I want my reverb in the top front speakers, why do I have to build two or three groups chained together to make that happen? Why can’t I simply send the reverb there the way I would in normal stereo or surround mixing?
I’m hoping if I can understand the why, it’ll make it easier to understand the way to do this. Things that literally used to take seconds to do are now turned into torturous hours of trying to figure out what kind of labyrinth routing structure will be required to make this same simple function happen.
Also, can we agree that this Ambisonic/Immersive fold down is just enhanced stereo at best? You build this mix using less EQ and dynamics because you’ve got this “bigger canvas” upon which to spread out the sources so they don’t “fight.” Then it gets folded down into 2 channels and all of those issues come right back, which requires a stereo based approach to resolve. The best sounding ambisonic mixes that I’ve heard always had the Dear VR plug-in on the master bus. Without it, it just sounds like the same old stereo mix (to me, anyway).
So can someone explain the science, separated from the hype, to me?
The main point of ATMOS is not the beds, but the objects. The beds are just there to simplify things and allow all this to scale. You could think of the beds just as a group of objects that have fixed panning positions within the group.
The whole point of Atmos is to become independent of specific speaker positions and allow you to make a mix that works on anything from a 32-channel theater to a mono speaker without having to make separate mixes (in theory; practice is not necessarily quite there yet).
So all positions are described in a 3D space as X/Y/Z coordinates and then mapped to whatever speakers are in the room when someone listens to this mix. Because of all of this 3D mapping there are no master or aux buses anymore, because it all gets packaged up in one very large file as independent audio streams with metadata. And then the actual mixing happens during playback, not when you render the file.
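To make the "independent audio streams with metadata" idea a bit more concrete, here is a minimal Python sketch. The class names, field names, and the 0.0 to 1.0 coordinate convention are assumptions for illustration only; this is not Dolby's actual file format.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class PositionKeyframe:
    """Hypothetical metadata entry: where the object sits from time_s onward."""
    time_s: float
    x: float  # assumed convention: 0.0 = left wall, 1.0 = right wall
    y: float  # assumed convention: 0.0 = front wall, 1.0 = rear wall
    z: float  # assumed convention: 0.0 = ear level, 1.0 = ceiling


@dataclass
class AudioObject:
    """One mono audio stream plus positional metadata, kept unmixed until playback."""
    name: str
    samples: List[float]
    keyframes: List[PositionKeyframe] = field(default_factory=list)

    def position_at(self, time_s: float) -> Tuple[float, float, float]:
        """Return the latest keyframe position at or before time_s.

        Assumes keyframes are sorted by time and the list is not empty.
        """
        current = self.keyframes[0]
        for kf in self.keyframes:
            if kf.time_s <= time_s:
                current = kf
        return (current.x, current.y, current.z)


# A sound that starts front left on the ceiling and jumps to rear right.
flyby = AudioObject(
    name="helicopter",
    samples=[0.0] * 48000,
    keyframes=[
        PositionKeyframe(0.0, 0.0, 0.0, 1.0),
        PositionKeyframe(2.0, 1.0, 1.0, 1.0),
    ],
)
```

Nothing in this structure mentions a speaker. That is the point: the speaker decision is deferred to whatever renderer reads the positions at playback.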
The lack of these master and aux buses is what makes your reverbs difficult, because you no longer have a common audio path where you can insert them. Instead you have to keep them on the individual channels and then find other ways of linking them.
It’s not more complicated. It’s just a very different approach to doing a mix, and we have to change our thinking and let go of a whole lot of old ways of doing things, which is why it appears so hard.
You have to stop thinking of which speaker you want to have sound in, and instead start thinking about where in the room a sound should be positioned (horizontal and vertical), how big it is, how it should move within the room, and how it should reverberate in the room. Forget that there are speakers in the room; they are no longer relevant.
In an abstract way you could. But in reality that would be a truckload of objects, and not very practical.
So for your basic stuff like a wind ambience in a mix, it’s a lot easier to pan this to a bed bus where you just think about whether you want it front/rear, left/right, top/bottom. These then all get mixed together and sent to a group of objects that are panned into the traditional 7.1.2 positions.
But with DX always panned to the center, and if you keep the mono reverbs for DX along with all foley - you could skip the DX bed, and put a single object into the C position and use that instead.
The notion that a bed is just like hard panned objects comes from a video I just watched on the subject, need to find it again. As a mixer you certainly think of it as a bed, but the question is, does the Dolby renderer actually think of them as a bed too? Or when it comes down to mapping it to an actual speaker, does it treat it just like any other object.
I am thinking purely of Dolby Atmos Music, where basically there shouldn’t be much, or any, movement in a mix. So my inclination is more for hard positioning at a speaker or between speakers, just like we do in conventional stereo mixes.
I am also interested in immersive binaural Atmos headphone mixes. Does whether a track is assigned to a bed or an object matter for that?
Not sure I have any good answers to that, I don’t work on music mixes. Though since I upgraded my room to 7.1.4 recently I’ve enjoyed listening to immersive mixes on Apple Music and have a playlist where I have saved some good examples. You’re right though that, traditionally at least, instruments don’t move positions, and thus movement is less of an issue, not nearly as much as in film mixes.
The good thing is that Apple Music and Apple TV+ now work well on a Mac that has a 7.1.4 audio interface attached. So it makes it really simple to spot reference material. I’m using the Apollo X8/X16 which allows me to solo speakers from the console. Just yesterday I watched John Wick 4 on Apple TV+ in Dolby Atmos in my room and had both the meter from my interface open as well as soloing various channels. It’s a good way to see how folks are using Atmos in various genres, and a good exercise to get inspired.
I have also listened to some of them on headphones, but I have to be honest: once you’ve heard it in a room, it’s very hard to go back to headphones and even remotely enjoy that. I know, of course, that 99% of the audience doesn’t have that luxury and we do need to make mixes that sound great on headphones.
To your original question - if you have much less movement of your objects, then working with beds (or object beds as some do), may just simplify things. You’re still getting the advantage that the mix will adapt to the actual speaker config during playback (something that an old 5.1 mix wouldn’t), but you’re also not spending twice the time routing objects all the time.
If you want to get a good example of the value of objects in a film mix, see this demonstration by Steinberg. I’ve cued it to the right time where you see an animated film and the Dolby renderer window:
Well, Nuendo is mostly a post/film tool, so most in this forum will think of it in this context. If you work mostly in music mixes, there may be additional useful discussions in the adjacent Cubase forum. Not sure what the differences are between Nuendo and Cubase as it relates to Atmos.
That being said, Atmos is a tool at the end of the day which can get used in many different ways. Just like a compressor. But folks doing music mixes think of a compressor on a drum track very differently than folks working on a dialog track for a film. Still the same underlying tool though.
But you’re right - how beds and objects are used within Atmos is probably quite different between music and film. And it’s useful to frame that up so the conversation is headed in the right direction.
Thank you allklier - I started out as a music mixer, used my music mixing experience to pioneer film mixing in the 70s, and now I’m back to music. As all DAWs are virtual I’m sure N12 will be perfect for Atmos music. I will take your advice and check out the Cubase forums …s
Bear in mind that there is also a bug with the objects where, if you don’t change the distance of an object (Near-Mid-Far), it will play in plain stereo. (I don’t know if that is also happening to you, but it is something I reported here in the forums; nobody has replied to it yet and it is still a problem.)
But yeah, ATMOS is just a new way of thinking about mixing. It is not about “which speaker the sound goes to”, but rather “where in the space” I want my sounds to be and with what characteristics. I am mixing music in ATMOS in Cubase 12 and rendering the binaural downmixes to YouTube so people can enjoy the 3D binaural sound with any pair of headphones.
On Mac you have to open the Audio MIDI Setup utility to set your speaker configuration to 7.1.4 Atmos. Apple Music recognizes that and uses it properly. In the Audio MIDI Setup utility just route the 12 channels to the first 12 inputs of the X16, and use the Apollo console to change the interface into 7.1.4 mode so your control features work in this configuration. Then you get to take advantage of all the speaker calibration, etc.
For Windows, you may have to go to the system settings where you have speaker config. Might work there too, but haven’t tried it yet. Wasn’t on 7.1.4 when I was last with Windows.
PS: On Mac it has to be one of the two available ATMOS configs, otherwise it plays 5.1 JSOC instead, regardless of how many speakers you have.
This discussion didn’t really answer my question. I was asking strictly from the structural standpoint of this platform. I get the mechanical differences between fixed speaker placement and AI now doing the heavy lifting. But the concept of “just thinking about where you want to place the object” is misleading.
You’re thinking about where you want to place the source even as far back as basic stereo. That’s how we got “Phantom Center” (50% split between the left and right speakers). Same thing goes for surround, when you want the sound to not be quite in the left rear. You’re balancing between the speakers to create said phantom effect. But all I have to do is pan the signal to that balanced spot between the speakers. I don’t have to “prepare a space” for placement, which to me seems much more complicated.
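For reference, the phantom center described above can be written down as a pan law. This is a minimal sketch of one common choice, a constant-power (sine/cosine) law. Actual DAW pan laws vary, and the function name and the -1 to +1 pan range are assumptions for illustration.

```python
import math
from typing import Tuple


def constant_power_pan(pan: float) -> Tuple[float, float]:
    """Return (left_gain, right_gain) for pan in -1.0 (hard left) .. +1.0 (hard right).

    Uses a sine/cosine law so that left**2 + right**2 stays at 1.0,
    which keeps the perceived loudness roughly steady across the pan range.
    """
    angle = (pan + 1.0) * math.pi / 4.0  # maps -1..+1 onto 0..pi/2
    return math.cos(angle), math.sin(angle)


# Phantom center: both speakers get the same gain.
center_left, center_right = constant_power_pan(0.0)

# Hard left: (almost) everything goes to the left speaker.
hard_left, hard_left_right = constant_power_pan(-1.0)
```

With this law the equal split at center gives each speaker about 0.707 of the amplitude (roughly -3 dB) rather than a literal 0.5, but the idea is the same: one knob, two gains, and the result is committed to those two speakers.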
From the moment you create an object you have to decide how to manage resources because you only have 118 (after the bed) openings within which to place it. It’s pointless to think of it as a stereo source because everything is mono within the object environment. So now, I have to create a group channel to house as many objects of similar status in order to reserve resources. Then I have to create environs for those objects because, while similar, said objects are not identical. This requires yet another group with which to place those “similar enough” objects. So, just “thinking about where you want to place the object in space” is already way more complicated and you haven’t even touched a knob or fader! “Thinking about object placement” is more like drafting a blueprint to build a house. With regular surround, you’re just panning and you’ve already got a “3D model” in real time. What’s more, you can actually “Sketch” with this technique.
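The resource budget mentioned above is simple arithmetic, sketched here in Python. The 128 total comes from the channel limit mentioned elsewhere in this thread; the helper name is made up for illustration.

```python
TOTAL_CHANNELS = 128  # overall limit as discussed in this thread


def channels_in_layout(layout: str) -> int:
    """Count the channels in a layout string such as '7.1.2' (7 + 1 + 2)."""
    return sum(int(part) for part in layout.split("."))


bed_channels = channels_in_layout("7.1.2")
objects_available = TOTAL_CHANNELS - bed_channels

# Objects are mono, so a stereo source consumes two of them.
stereo_sources_possible = objects_available // 2
```

This gives 10 bed channels and 118 remaining object slots, which matches the figure above, or at most 59 stereo sources if every one of them were sent as an object pair.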
While we’re supposed to avoid the LFE as a general rule for music mixing in ATMOS, I suggest that the real reason for said avoidance is that it’s not worth the stress you’ll endure figuring out how to make it actually come through the renderer! So, my question is still why is all of this extra routing required to get the same placement that used to be so easily achieved with simple panning?
I don’t think there is any AI in ATMOS. This is all procedural, no magic. It’s just basic 3D math of input coordinates mapped to available output coordinates with weights.
For the rest - you are still in the mindset that certain speakers exist in certain locations. And while this may be true in your mix room, that may not be the case in your listener’s case.
Example: you may have panned that sound to the center in terms of front/rear. In a 7.1.4 room you’re thinking, I want that to come out of the sides. But a listener with a 5.1 setup doesn’t have any sides; he needs a phantom center between the fronts and rears. If you mix with speaker-centric thinking, that doesn’t work. He wouldn’t hear your sound at all, unless you did a 5.1 downmix for him. With Atmos you forget that there is a side pair, and just say center of the room, then the renderer will take care of the rest, whether you listen in 7.1.4 or 5.1. The result will be as close as is feasible given the different setups. And with ever more possible speaker configurations, doing various downmixes ahead of time is not practical.
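Here is a rough Python sketch of that idea: one source position, two speaker layouts, and gains computed from distance. To be clear, this is NOT the Dolby renderer's algorithm. It is a simple inverse-distance weighting picked only to illustrate "coordinates mapped to available speakers with weights". The speaker coordinates are assumed, ear-level only, and the LFE and height speakers are left out.

```python
import math
from typing import Dict, Tuple

Position = Tuple[float, float, float]  # (x, y, z), each 0.0..1.0 (assumed convention)

# Hypothetical ear-level speaker positions: x = left..right, y = front..rear.
LAYOUT_7_1: Dict[str, Position] = {
    "L": (0.0, 0.0, 0.0), "C": (0.5, 0.0, 0.0), "R": (1.0, 0.0, 0.0),
    "Lss": (0.0, 0.5, 0.0), "Rss": (1.0, 0.5, 0.0),
    "Lrs": (0.0, 1.0, 0.0), "Rrs": (1.0, 1.0, 0.0),
}
LAYOUT_5_1: Dict[str, Position] = {
    "L": (0.0, 0.0, 0.0), "C": (0.5, 0.0, 0.0), "R": (1.0, 0.0, 0.0),
    "Ls": (0.0, 1.0, 0.0), "Rs": (1.0, 1.0, 0.0),
}


def render_gains(source: Position, layout: Dict[str, Position]) -> Dict[str, float]:
    """Weight each speaker by inverse squared distance, then normalize to unit power."""
    weights = {}
    for name, speaker in layout.items():
        dist_sq = sum((s - p) ** 2 for s, p in zip(source, speaker))
        weights[name] = 1.0 / (dist_sq + 1e-6)  # small offset avoids division by zero
    norm = math.sqrt(sum(w * w for w in weights.values()))
    return {name: w / norm for name, w in weights.items()}


# Same metadata for both rooms: left wall, halfway between front and rear.
left_wall_middle = (0.0, 0.5, 0.0)
gains_7_1 = render_gains(left_wall_middle, LAYOUT_7_1)
gains_5_1 = render_gains(left_wall_middle, LAYOUT_5_1)
```

In the 7.1 layout nearly all the energy lands on the left side speaker, because it sits at that position. In the 5.1 layout the same position is split equally between L and Ls as a phantom image. A real renderer is far more selective about which speakers it uses, but the mix itself never had to know which room it would play in.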
Would it be possible for the software to reverse engineer your intention based on traditional panner location and do the rest? Most likely. Or we can just change how we think about it.
Well, I agree with allklier’s explanations in this thread of course. The thing about panning in both stereo and 5.1 surround is that you’ve baked your cake once you’re done. When you pan a signal to phantom center in stereo then that element, the voice for example, will be literally summed together with the electric guitar that’s left-only. The only way to extract that after the fact is through some fairly complicated software use. So while you’re obviously ‘panning in space’ you’re also committing to summing signals in stereo. And then the same applies to 5.1, 7.1 and of course beds in Atmos.
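A small Python sketch of why the cake is "baked" once sources are summed into fewer channels than there are sources. The function name, the three made-up sources, and the 0.5 center gain are all assumptions for illustration, using a single sample value per source.

```python
import math
from typing import Tuple


def stereo_mixdown(vocal: float, guitar: float, keys: float,
                   center_gain: float = 0.5) -> Tuple[float, float]:
    """Sum three sources into two channels.

    vocal is panned to phantom center, guitar is left-only, keys is right-only.
    """
    left = center_gain * vocal + guitar
    right = center_gain * vocal + keys
    return left, right


# Two clearly different sets of sources...
mix_a = stereo_mixdown(vocal=0.8, guitar=0.2, keys=-0.1)
mix_b = stereo_mixdown(vocal=0.4, guitar=0.4, keys=0.1)

# ...produce the same two-channel result, so the mix alone cannot tell them apart.
same_left = math.isclose(mix_a[0], mix_b[0])
same_right = math.isclose(mix_a[1], mix_b[1])
```

Both `same_left` and `same_right` come out true: three unknowns were folded into two numbers, and the information about which source contributed what is gone. Objects avoid this by keeping each stream separate until playback, which is also where the channel count and the complexity come from.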
Sure it’s more complicated, but that’s because you have far more options. It comes at a couple of costs - signal bandwidth (probably hence the 128 channel limit) and complexity.
So as far as why it’s complicated I think perhaps the better answer is unfortunately a counter-question: How do you wish it would work, and how would that be done technically?
I still maintain that it’s not best practice to route anything to the LFE channel if it already exists in one of the mains. If you do that you double up on the signal level. Additionally the playback setup should be full-range if I remember correctly, so there’s no ‘need’ to split something off into the LFE from an object in order to have it represented acoustically. The other thing is that once you get to the lows you lose directivity, so there’s that as well.
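To put a number on "double up", here is a tiny Python sketch. It assumes the same signal arrives at equal level and in phase from both paths, which real rooms, bass management, and LFE playback gain will only approximate. The function name is made up for illustration.

```python
import math


def level_change_db(amplitude_ratio: float) -> float:
    """Convert an amplitude ratio to a level change in decibels."""
    return 20.0 * math.log10(amplitude_ratio)


# The same low end sent to a main channel AND the LFE: amplitude doubles.
doubled_db = level_change_db(2.0)
```

Under that idealized assumption the low end comes up by about 6 dB relative to the rest of the mix.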
As for pure low end signals - and nothing else - the solution is to simply create an LFE group dedicated to only LFE signals, and to route that to the LFE in the main bed.
So in other words there doesn’t seem to be a reason for ‘splitting’ a signal the way you imply it.
Echoing allklier again: The Atmos ‘trick’ is the scalability. If you went with “simple panning” in a 5.1 setup in your studio then that wouldn’t scale if I have a 9.1 system. I would get just as “poor” definition of sources that you panned to the left side 90 degrees from the listening position, whereas if you used an object it would localize to that specific speaker in my setup.
Wait a minute. I’ve done a 5.1.4 mix in a 7.1.4 template that clearly showed the phantom signals in the side speakers in the renderer. Are you saying that the printout of that 5.1.4 mix will omit this information without a 5.1.4 fold down?