Upgraded to 12-core Mac, have the same ASIO/CPU usage.

Totally agree.

I’m finding similar on a MP 5,1 /3.33GHz /12 core /48GB /SSDs. 1024 buffer for UA Apollo & cards, with ASIO guard at ‘high’ for Nuendo 7 (doesn’t need high for Cubase 8.5). The load seems about the same as before on a six core MP (but the on-board ‘performance meter’ is pretty wimpy /uninformative about all that unlike some other DAWs - it’s an ASIO meter, not a CPU meter). Conversely, the MP upgrade made a huge difference to Pro Tools 12 HD & especially in terms of VIs. Ditto Final Cut Pro rendering and Logic X plugin instances etc. I suspect that Nuendo and Cubase are not as well multi-threaded (or hyper-threaded?) as they might be, perhaps a substantial rev for the future? In parallel, some of that may lie in the way that VIs /plugins are coded & I have compared the above between VST3 instruments and their AAX counterparts in ‘identical’ Nuendo and Pro Tools projects.

See for example:

Still, the excellent suite of render in place etc tends to offset that somewhat. In the Forum requests area I asked for more about that in terms of the performance meters for CPU usage at least - multiprocessor indicators etc & where we also know that VIs may be optimised in various ways (eg, some prefer to run as multiple instances vs. multitimbral mode). Be good at least to have far more professional feedback and diagnostics there on the DAW side - but then, perhaps the glaring absence of this (and the ASIO guard spin) indicates the absence of ‘proper’ multithreaded optimisation at this point in time? Would seem that this varies enormously across the sector, from platform to platform, but my bottom line observation would be: both Pro Tools and Logic use multi-core processors very well indeed. PT 12.4 in particular, I can’t believe the performance improvement there for my own upgrade from 6 to 12 core: the CPUs go on almost forever, plus the performance metering & cache settings give immediate feedback on how I might instantiate or re-jig various track loads.

I upgraded my 6-core 2010 Mac Pro to 2013 trashcan Mac Pro 6-core. Slight improvement in handling plugins. Slightly faster mixdowns (12-15%) in my tests. Steinberg mentions in one of their knowledge base articles that it is better to pick a faster processor over more cores. I would only be pretending if I said I understood why, but if I can save some bucks and get better performance, I’m fine with that.

Similar interesting discussion going on the Cubase forums, and about track /groups routing in particular. Porting my comment across:

Shall check that. It would be fair to say that most of my material is indeed sub-grouped & also making for bounce /export stems ease etc etc.

BTW, also agree with the earlier comment about ‘CPU speed vs. core count’. In my experience that has been true & in an earlier MP 5,1 I used a 3.33GHz 6 core for exactly that reason. On my ‘new’ MP however (a custom refurb) I took this into account and it is a 3.33GHz 12 core (i.e., identical clock speed per core); I also went with 6 x RAM slots (48GB) which is recommended as having performance improvements for this machine (6 RAM slots vs. say 8). Anyways, the point is that the CPU speeds are identical, the cores doubled, but with apparently no significant performance improvement on Cubase or Nuendo.

FWIW, did some tests with a VI-only project: with and without auxs and groups. Can’t say I found any appreciable performance differences (at least, as identified by the Steinberg performance meters or the Apple Activity Monitor). Cubase forum seems to indicate otherwise, but that hasn’t been my experience. I did notice however that Cubase 8.5 seems to do a little better with overall CPU load than Nuendo 7 for the same project. Possibly newer code.

Otherwise, the ‘threading metering’ is a little odd: all 12 cores seem equally engaged (vs. different cores showing different loads, some with none etc, say like Pro Tools), and, when viewing the 24 ‘threads’, the second thread of each CPU shows as doing bugger-all. Re. the CPU readout on Apple Activity Monitor, this would seem to show that there’s still a lot of RAM and CPU left idle in the background. One solution that works fine is to Rewire-slave another DAW as host for VIs etc: Ableton Live or Reaper for example. That certainly puts some serious VI grunt into the system.

Anyways, overall I see this as just an interesting sideline really. Nuendo in general is very pleasing to work with in my experience. BTW, the Rewire implementation is the best I’ve used.

If the second thread in each CPU core is pegging, it sounds like you have a situation where Intel Hyperthreading is not working well with Nuendo on Mac. The second thread on each CPU core is the HT thread for that core.

I’m a PC guy, and Hyperthreading is working great here, but if I wanted to, I could disable it in the motherboard BIOS. Maybe do a little searching and see if you can try disabling Hyperthreading to test. There may be a utility or a script.
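Before going as far as disabling anything, it may help to confirm how many of those ‘threads’ are physical cores and how many are HT. A minimal sketch in Python, assuming macOS, where `sysctl` exposes `hw.physicalcpu` and `hw.logicalcpu`. The helper name `mac_core_counts` is made up for illustration, and on other systems it just returns `None`:

```python
import os
import subprocess


def mac_core_counts():
    """Return (physical, logical) core counts from macOS sysctl, or None if unavailable."""
    try:
        physical = subprocess.check_output(
            ["sysctl", "-n", "hw.physicalcpu"], stderr=subprocess.DEVNULL
        )
        logical = subprocess.check_output(
            ["sysctl", "-n", "hw.logicalcpu"], stderr=subprocess.DEVNULL
        )
        return int(physical.decode().strip()), int(logical.decode().strip())
    except (OSError, subprocess.CalledProcessError, ValueError):
        # Not macOS, sysctl missing, or the keys are not known on this system.
        return None


counts = mac_core_counts()
if counts is None:
    print("sysctl keys not available; logical CPUs seen by Python:", os.cpu_count())
else:
    physical, logical = counts
    print(f"physical cores: {physical}, logical (incl. HT): {logical}")
```

If the logical count is double the physical count, Hyperthreading is active and the second thread per core in the meter is the HT one.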

So forgive me if this is already covered, or just irrelevant. But one thing that came to my mind was that if you have a project that is large in a DAW that is “smart”, it won’t tie up resources unless there’s processing to be done. So the size of the project won’t be the only determining factor as to whether or not you will see the meters change.

In other words; if you have 100 tracks of virtual instruments, and you play from the start of the project, and at most your project has 40 of them playing at the same time, then that is the highest load the system will see. Loading 100 more tracks of VIs won’t change that if they’re not actually playing.

Or in yet other words: Perhaps running DAWbench is the better approach to test this, rather than loading up a project where the load varies over linear project time…

So a “dumb” DAW would automatically allocate resources as soon as you instantiate a plugin and tie those up. So of course with more DSP power it would look as if you got more headroom because the load is static.
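To put rough numbers on that, here is a toy sketch in Python. It is purely an assumption-laden model (every VI costs the same, the region times are invented, and `peak_concurrent` is a made-up helper), not how any particular DAW actually schedules things:

```python
def peak_concurrent(regions):
    """regions: list of (start, end) times during which a track's VI is actually playing.

    Returns the largest number of regions that overlap at any one moment.
    """
    events = []
    for start, end in regions:
        events.append((start, 1))   # a VI starts playing
        events.append((end, -1))    # a VI stops playing
    # Sort by time; at equal times, stops (-1) are handled before starts (+1).
    events.sort(key=lambda e: (e[0], e[1]))
    active = peak = 0
    for _, delta in events:
        active += delta
        peak = max(peak, active)
    return peak


# Hypothetical project: 100 VI tracks. 40 play together for the first minute,
# the other 60 each play alone for one second afterwards.
regions = [(0, 60)] * 40 + [(60 + i, 61 + i) for i in range(60)]

smart_load = peak_concurrent(regions)  # only what is playing at once
dumb_load = len(regions)               # everything instantiated counts

print("smart DAW peak load (tracks):", smart_load)
print("dumb DAW static load (tracks):", dumb_load)
```

In this made-up case the “smart” host never sees more than 40 tracks of load, while the “dumb” one carries all 100 the whole time, so only the second would show a clear change on the meter when cores are added.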

Just a thought…

Hate to say it about Nuendo & Cubase vs. Pro Tools, but have been doing some work with that … short end of the back-story is that once upon a time, I always liked to mix at 96k on Pro Tools (Lavry white papers etc, higher rez for plugs and mix). Gave that away given various issues with both Nuendo/Cubase not being so nice to up-sample and re-render, have settled more recently on 48k projects.

So here’s the thing: have converted a few recent Nuendo 48k projects up to 96k. No VIs, all rendered as committed final tracks to mix: Verbs, plugs, busses, group FX, VCAs and Master FX. Using UA Apollo, 1024 buffer and various settings for ASIO, best as ‘high’. Both Nuendo 7 and Cubase 8.5 can’t handle the 96k sessions (oh, say around 20 tracks, six busses) & continuously spike and shut down the audio. Hopeless.

Exactly the same tracks bounced out to Pro Tools HD 12.4 & with same mixing, routing, plugs etc breeze along at total 10% CPU usage. No contest. Think I’ll be back to Tools for mixing, but certainly prefer composing in Nuendo.

Mac Pro 5,1, 12 core 3.33GHz, 48GB, UA Apollo, OS X 10.11.3
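On the 48k vs 96k point, the basic arithmetic can be sketched in a few lines of Python. The only thing assumed here is the usual relation that one buffer lasts buffer size divided by sample rate; the function name is made up for illustration:

```python
def buffer_period_ms(buffer_samples, sample_rate_hz):
    """Time the engine has to finish one buffer before the interface needs the next."""
    return 1000.0 * buffer_samples / sample_rate_hz


for rate in (48000, 96000):
    period = buffer_period_ms(1024, rate)
    print(f"1024 samples @ {rate} Hz: {period:.1f} ms per buffer")
```

So with the buffer left at 1024 samples, going from 48k to 96k roughly halves the time available per buffer (about 21.3 ms down to about 10.7 ms) for the same 1024 samples of work per track. Whether that alone explains the spikes is a guess; plugins that oversample or scale differently with sample rate would change the picture.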

Have you guys run the DAWbench test yet? Or have you read test results from it being run?

I mean, if this is an issue it seems as if things then have changed. I believe it at least used to scale quite well.

So what you’re saying is that if you run the same content using the same plugins using the same audio interface you get different results?

I would say that for this to be productively beneficial the comparison really needs to be stripped down so that the bottleneck parameter can be isolated. There’s a lot to consider when moving content around between DAWs.

So you had the same experience in that PT was more efficient running natively on your PC compared to Nuendo? With the same issues popping up I mean…?

Well, that was hardly true whenever that was. The respective audio engines probably sound 99.9% the same anyway. I doubt anyone could hear a difference.

I think for the sake of being productive and actually walking away with useful information the comparison needs to involve a test that makes sense and is able to isolate whatever bottleneck parameter is causing the discrepancy. And such a test is hard to construct when you have different host softwares running on different computers (maybe), with different interfaces (?) and/or different interface drivers, with different plugin architectures (i.e. VST versus AU or whatever) etc. The only thing I can think of off the top of my head is:

Test 1:

1: Set up a simple tracks-to-main-stereo-out session.
2: Duplicate tracks with exactly the same audio on it (in respective software) until it ‘breaks’.

Test 2:

Switch audio interfaces/drivers and repeat 1-2 above.

Test 3:

Repeat tests 1&2 with more complex routing.

Test 4:

Pick one of the above as a new base, and add a plugin known to be fairly “equal” between platforms. Add a reasonable amount of them on inserts on that one track and start duplicating the track again.

Test 5:

As the previous, but either with a different plugin or with several plugins.

I’d think that at some point above you’d find just what makes a big difference. If it’s not adding tracks then perhaps it’s the plugin. Or perhaps the change happens when you switch interface/driver. Or perhaps it’s with multiple plugins specifically (I recall different DAWs having different ways of allocating resources on cores depending on if it’s simply always round-robin when instantiating or if is on a track-by-track basis (meaning two very heavy plugs on one track is far worse than on separate tracks)).
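A toy illustration of that last point, in Python. This is only a sketch under stated assumptions: plugin costs are invented percentages of one core, both allocation schemes are simplified guesses rather than any real DAW’s scheduler, and the function names are hypothetical:

```python
def per_plugin_round_robin(tracks, n_cores):
    """Hand each plugin to the next core in turn, ignoring which track it sits on."""
    loads = [0] * n_cores
    slot = 0
    for plugins in tracks:
        for cost in plugins:
            loads[slot % n_cores] += cost
            slot += 1
    return loads


def per_track(tracks, n_cores):
    """Keep a whole track (all of its inserts) on a single core."""
    loads = [0] * n_cores
    for index, plugins in enumerate(tracks):
        loads[index % n_cores] += sum(plugins)
    return loads


# Costs in percent of one core: one track with two heavy plugs, three light tracks.
tracks = [[60, 60], [10], [10], [10]]

print("per-plugin round-robin:", per_plugin_round_robin(tracks, 4))
print("per-track:             ", per_track(tracks, 4))
```

In this made-up case the per-track scheme puts 120% on one core (a dropout) while the other three sit at 10%, whereas spreading plugins individually peaks at 70%. In practice inserts on one track run in series and depend on each other’s output, which is presumably why a host would keep them together, so the tests above would need to put the heavy plugins on separate tracks to tell the difference.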

See what I mean? I think we can pretty much assume that Steinberg won’t ever lift a finger to do any tests like this, so if we want to make sure it’s their issue to solve we probably have to do it ourselves.

Thanks for letting us know. I was on the verge of upgrading my cheese grater, but I think I’ll just work with what I have for now. On a positive side note it actually really copes very well.


I see you’re using Pro Tools HD 12.4. Maybe the timeline cache was on?
That alone would explain the difference in workload both can handle…
I’d love to have timeline cache on Nuendo/Cubase. With the amount of RAM computers have nowadays, it would be a cheap feature boost to add.

Although you say the upgrade is from 4 core to 12 core that is not really true sad to say. The Mac Xeon you mention is actually 6 physical cores with the other 6 being virtual HT cores so in reality the change is from 4 to 6, not 4 to 12.
Additionally, the older 4 core chips (I used to use an old Quad Core CPU) are very efficient compared to any Xeon CPU - I have an E5 8/16 core monster and it is nowhere near as powerful as I expected, with one multichannel plugin actually bringing it totally to its knees unless I set the ASIO buffer up to 4096 samples or greater where an i7 allows the same plugin to work at 2048.
I never did work out why this is.

I believe (but may be incorrect) that not all plugins are created equally and there also seems to be an issue with loading where plugins all run on a single core no matter what - I can see one core maxed out with other (physical, not HT) cores barely doing anything at all yet my CPU performance indicator is bouncing into the red. I stress this seems to be always caused by plugins though, leading me to think not all plugins are multi threaded properly, if at all.
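A quick sketch in Python of why the two readings can disagree. It assumes (unconfirmed for Nuendo) that the performance meter follows the busiest real-time thread while a system monitor averages over all cores; the load figures are invented:

```python
def meter_views(core_loads):
    """Return (average across all cores, load of the busiest core), both in percent."""
    average = sum(core_loads) / len(core_loads)
    busiest = max(core_loads)
    return average, busiest


# Hypothetical 12-core machine: one core carrying a heavy plugin, the rest nearly idle.
loads = [98] + [5] * 11
average, busiest = meter_views(loads)

print(f"average across cores: {average:.2f}%")
print(f"busiest core:         {busiest}%")
```

With these numbers the machine looks about 13% busy overall, yet the one loaded core is on the edge of missing its deadline, which would match a meter in the red while most cores do next to nothing.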

Thank you for posting this Headlands. I hope that Steinberg is paying attention. It would be great to hear some acknowledgement of this issue.


I’m pretty sure the clock of your CPU is what matters. More cores with lower clocks aren’t really an asset.

Do you guys use ASIO-Guard 2 in your tests?
It makes a huge difference for me.
For my TV post sessions my hexacore cheese grater is really still fast enough,
even when I have very large sessions with foley and various individual reverbs for each stem.
So this is really a rather theoretical problem for me. Sometimes it’s not quite snappy anymore and it could bounce faster, but I don’t think I can recall having stayed with a computer so many years before.

Last month I did a big test with every available DAW (I own several and I tried some demos) for Mac. Talking about performance, my results are:

  1. Reaper. This is a beast: great performance, quick and responsive. The interface is cryptic and it lacks a lot of things that I want to have. It’s also dirt cheap.
  2. Logic Pro X: almost as good as Reaper, very quick, but it can be less responsive with a high count of plugins.
  3. Pro Tools 12: almost as good as the others. The buss system pisses me off, that’s why I don’t use it for my mixing duties.
  4. Studio One 3: Yes, it works, but the performance is far from Pro Tools.
  5. Nuendo. The worst performance of the list.

I did not try Live, because I know its performance is worse than Nuendo.

By the way Digital Performer announced a BIG improvement in performance in this new release.

So, in my opinion, Nuendo’s performance is far from being competitive.

Interesting, related ‘interview’ on the Steinberg site, ‘Let’s Talk about Cubase’ at http://www.steinberg.net/en/artists/stories/2016/cubase_story.html “Georg Conrads, Technical Lead Audio Engine, Christian Dettner, Product Planning Manager, and Clyde Sendke, Director for Product Planning, to talk about its creative tools, the much lauded audio engine and more”.

  1. ‘Georg: At the moment, Cubase and Nuendo share their source code for the most part’.

  2. There are many reasons why Pro Tools has become the industry standard, but two of the main reasons seemed to be the “guaranteed processing capacity” and “I/O latency for professional use” that came with the use of the dedicated DSP card.

  3. My main point is that I always find it strange that manufacturers of native processing DAW software do not seem to be making any serious efforts to resolve issues related to latency. In that regard, they still rely on direct monitoring of the audio interface. In terms of processing power, computers have become more than fast enough, and there is always the option of selecting Universal Audio’s UAD if processing power is still insufficient. If only issues related to latency can be overcome, native processing DAW could become a system that could take on Pro Tools|HDX…

  4. Clyde: This is a very interesting question indeed. The first point I would like to highlight is that as of Cubase Pro 8, we have introduced ASIO-Guard 2 that minimizes input and output latency to a minimum 32 samples. Minimizing latency down to this level was simply not possible with previous versions.I do need to clarify one point here, and that is about not taking issues related to latency seriously. Overcoming this problem is one of the topics that we have focused on the most over the past few years. Unfortunately I am unable to provide any more details today. Please wait to see what the future has in store for Cubase and Nuendo.

FWIW, I find this somewhat misdirected: ASIO guard is about input latency and as they indicate, most now use input monitoring and audio interfaces that increasingly allow much control of that process, e.g. RME, UAD etc with ‘print to tape’ FX if required etc etc. Chasing the necessity to lower input monitoring latency through the DAW would seem well off topic these days & monitoring off-source is old school (to vs from tape), worked well then, works well now. I’d get over that one and concentrate on the mix, routing, output and CPU overheads associated with pretending that a DAW is a studio.

Back to the point of this thread: less about ASIO & input buffer then; more focus on multithreading and mix down power, and clearly this is where Cubase /Nuendo lag against Pro Tools, Logic, Reaper and the rest.