How does Cubase utilize multiple cores?


Let’s talk about CPU audio performance.

I have always been under the impression that more cores matter more than clock speed.
The amount of work actually done per clock cycle varies from system to system, but generally a 2.4 GHz quad-core will give you better performance than a 3.2 GHz dual-core, right?

The term “multiprocessing” just says that using multiple cores is possible. How the cores are actually used is up to both the system and the DAW, so different combinations give different results.

A higher clock speed definitely makes calculations faster, but there are many situations where faster calculation matters less than processing multiple tasks in parallel. VSTis make use of SSE (2 or higher), a SIMD instruction set that processes several data values per instruction, independent of the number of cores (SSE has existed since 2000 or so). For a single VST instrument, clock speed is more important than core count, since it already works in parallel on a per-cycle basis.

It depends on whether your DAW, its included plugins, and/or third-party plugins are written to utilize multiple cores. For instance, in Logic the core count is detected automatically (or entered manually) and the application does its best to use all the power available. However, the limits of how the program or plugins were written become apparent when you start piling plugins onto a single channel or bus. That’s where the CPU’s single-core speed matters more than the number of cores.
That’s how I understand it at least. Is this right?

What are your thoughts about this?
CPU audio performance: processors with faster cores vs. a higher core count?

And most important: How does Cubase utilize multiple cores?

A process has a main thread. At least on Windows, the GUI is not guaranteed by Microsoft to be thread-safe, meaning that many threads in the same process should not write to the screen simultaneously. You actually can, but whether it causes problems depends to a degree on the graphics drivers.
The application can then run many more threads for various purposes.

Threads can be distributed across cores by the OS or, to some degree, by the developer. A developer can request a particular core for a thread, but I don’t think the OS always honors that.

So if you look in Task Manager on Windows (I don’t know about Mac), you can see the computer running maybe hundreds of threads or more. One can assume that each core has to deal with many threads, and how many is affected by how much detailed control the developer exercises.

Samplers and multi-instrument synths often have a setting for how many cores may be used.

The BIOS (at least on PC) has a setting for whether to allow hyperthreading or not. That is a way to present two logical cores on a single physical core. It is also more efficient than switching threads on a CPU without hyperthreading.

So consider that many threads are running on the same core, and don’t overestimate multi-core and its importance.
With more cores there is less overhead from switching between threads (called context switches) to give each one a time slice (like 20 ms or so), and that is an advantage.
But it’s not as if a quad-core delivers double the performance of a dual-core.

But it’s always interesting to compare benchmarks and see how CPUs and GPUs have progressed.

There is a tab for video graphics comparisons and such as well.
And looking at prices you can see what gives more bang for the buck, kind of.

You can see that CPU benchmark scores pretty much correlate with clock speed. Within the same family, like an i7-7700 or an i7-7700K, a 4.2 GHz part is so-and-so percent faster than a 3.6 or 2.8 GHz one.
Xeon is the most efficient, then i7, then i5 and so on, but the lower tiers can give better bang for the buck when you look in the tables.

About Cubase, one interesting thing I learned on the forum is that multi-instrument samplers may benefit from running multiple instances instead of many instruments in one instance. Each instance then gets a separate thread and may thereby run on a different core. But I have not tested how true this is, whether it is significant, or under what circumstances you benefit from it.

One good thing about separating instruments into their own instances might be that only the instrument that is needed gets rendered. I think all the multi outs are rendered otherwise.

Not completely true if we are talking about plugins in parallel. On this six core CPU I can have 6 plugins that max out a single core for example.

What I meant is that in your case, six cores and six plugins, running each plugin on a separate core does not mean processing is six times faster.
There are still hundreds of threads using those six cores, some in Cubase and some from the computer as a whole, if you see what I mean.

Consider a track with a number of inserts: in the end those have to be processed in series. Even if they run in separate threads, you have to introduce a wait call (WaitForSingleObject, as it is called in the Windows API) to synchronize them.

The point is just that doubling the number of cores does not mean double the performance overall.
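
To put a rough number on that: the usual way to estimate it is Amdahl's law, which is my addition here and not something Cubase documents. The sketch below assumes a made-up figure of 80% of the work being parallelisable; the function name and that percentage are purely illustrative.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Theoretical speedup when only part of the work can run in parallel."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part always takes the same time; only the rest is divided up.
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# Assumed figure for illustration only: 80% of the work parallelises.
for cores in (1, 2, 4, 6):
    print(cores, "cores ->", round(amdahl_speedup(0.8, cores), 2), "x")
```

With those assumed numbers, going from two to four cores gives noticeably less than double, which matches what people report in practice.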

OK, but can someone explain which is the better choice: more cores or a faster core?
For example, a quad-core i7 at 2.0 GHz vs. a dual-core i5 at 2.6 GHz?

If you look at benchmarks you can see that some i5s give better results than an i7 if they have a higher clock.
The drop down to dual-core is rather large though, and I don’t think you will see any dual-cores up among the best-performing CPUs (just check whether they turn up even among the mid-rated CPUs)…

A computer benefits from less overhead when switching between threads and processes if those are on separate cores. They all share the same memory, but there are fewer things to save away and then restore in order to continue another thread or process.

A CPU core has a very limited number of internal registers, and its own cache of instructions prepared for the next time it gets clock cycles.
If this state can stay put to a larger degree, without a lot of data having to be saved away and then restored, that gives a performance boost overall.

The hierarchy of running things is basically processes, each with its own threads. A process has its own memory space, so there is more to save away and restore when switching between processes than between threads.

A higher clock gives better low-latency performance.
More cores give more tracks.
But that is a really general way of putting it.
Take the AMD Ryzen: it has a decent clock rate but doesn’t perform very well at low-latency buffer settings. It is a really great setup for mixing huge projects, though.
Cubase, like most DAWs, uses one thread for every track; having that in mind can be helpful when creating templates.
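
The reason clock speed matters at low latency is simple arithmetic: each buffer has to be fully processed before the next one is due. A quick sketch of that time budget follows; the function name is made up for illustration and the buffer sizes are just common interface settings, not anything specific to Cubase.

```python
def buffer_deadline_ms(buffer_samples: int, sample_rate_hz: int) -> float:
    """Time available to process one audio buffer, in milliseconds."""
    if buffer_samples <= 0 or sample_rate_hz <= 0:
        raise ValueError("buffer size and sample rate must be positive")
    return 1000.0 * buffer_samples / sample_rate_hz


# Typical buffer sizes at 44.1 kHz (illustrative values only).
for size in (64, 256, 1024):
    print(size, "samples ->", round(buffer_deadline_ms(size, 44100), 2), "ms")
```

At a small buffer every track's thread has only a millisecond or two to finish its whole plugin chain, so a faster core helps more than an extra idle one.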

I think that is the most important point. If you have a single track loaded with CPU-intensive plugins, routed through a group loaded with plugins, at low latency, it can bring your CPU to its knees because it all has to be calculated on a single core. You may have many cores sitting idle but still have audio dropouts. In this situation, you would want a higher clock speed rather than additional cores. It’s also dependent on your audio interface and the quality of its drivers.

So, the amount of plugins you can run will vary greatly depending on how they are arranged, what routing you have going on etc.
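
Here is a toy model of why the arrangement matters. It assumes each track's insert chain runs in series on one core and that tracks are handed to the least-loaded core, biggest first. That is my simplification for illustration, not how Cubase's scheduler is documented to work, and it ignores group and bus routing entirely. The function name and the per-plugin costs are made up.

```python
import heapq


def estimate_block_time(track_chains, cores):
    """Estimate time to process one buffer.

    track_chains: one list per track holding the cost of each insert (ms).
    cores: number of cores available for audio threads.
    """
    if cores < 1:
        raise ValueError("cores must be at least 1")
    # Inserts on a track run in series, so a track costs the sum of its chain.
    track_costs = sorted((sum(chain) for chain in track_chains), reverse=True)
    core_loads = [0.0] * cores
    heapq.heapify(core_loads)
    for cost in track_costs:
        lightest = heapq.heappop(core_loads)  # least-loaded core so far
        heapq.heappush(core_loads, lightest + cost)
    # The buffer is done when the busiest core is done.
    return max(core_loads)


# Same six plugins (1 ms each, assumed) on six cores, arranged two ways.
stacked = [[1.0] * 6]    # all six on one track
spread = [[1.0]] * 6     # one per track
print("stacked:", estimate_block_time(stacked, 6), "ms")
print("spread: ", estimate_block_time(spread, 6), "ms")
```

In this model the stacked arrangement takes as long as a single core would, no matter how many cores are sitting idle.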

Cubase is definitely capable of utilising all cores to the maximum IN THE RIGHT SITUATION, which means high buffer settings and plugins spread out over many different tracks. I’ve just started using an AMD 1920X Threadripper, which has 24 logical cores (12 physical cores with SMT), and in the right situation you can see all of them fully loaded.

“I’ve just started using an AMD 1920X Threadripper which has 24 cores and in the right situation, you can see all cores fully loaded”
Don’t forget to let Steiny know how you did that.
They can’t get more than 14 threads working when using Windows 10.

The cores are probably being parked which makes it look like they’re all utilized.

Every application works with cores differently. Clock speed indeed helps latency; core distribution and workload are really application dependent, and then there are also virtual and physical cores. I’ve never experienced quadrupled performance; there seems to be considerable overhead in keeping it all synchronized.

Well I’m not a PC expert. I just know enough to get by. I’ve seen the Steinberg support info about core limits which I presume you’re referring to.

An app called ParkControl says all 24 cores are active, and they all appear to be active in Windows Resource Monitor (see attached picture). But perhaps I’m missing something?

Playing back 24 audio tracks takes that much of your CPU…? Do they have very heavy insert effects?

To see if there’s something special about your setup, you can run the test on the page you linked (mmcss-test.exe).

They all have 8 x Waves Kramer Tape Stereo, with one instance on the Stereo Out, so 193 instances in total.

Oh I see. So can you run that test and see if it gives a different result than mine?

Looks exactly the same as yours…

But it does appear that all the cores are in use, which seems to contradict what Steinberg are saying?

It does appear to be the case! It would be interesting to get in contact with Steinberg over this.

No, all cores can be in use; it’s just that some threads are not running with MMCSS priority. That can lead to dropouts, depending on the buffer size and what else the core/thread is doing.
There must be a reason why MS decided to reduce the number of MMCSS threads supported, or they are simply unaware that there is software that can use a huge number of MMCSS priority threads.
I hope to hear back from MS’s Pete Brown; he promised to give some feedback on the Gearslutz forum once he knew more.

I wonder how much of an issue this is, if you are not running any application other than Cubase?

Pretty sure it’s irrelevant whether you are running any other applications.
Cubase alone will exceed the realtime thread limit unless you use one of the fixes they have recommended.

EDIT: With your particular CPU the simple option to test would be to try with and without HT enabled and see which performs better before pops and clicks kick in.