We have GPUs for video, but what about specialized hardware for MIDI programming?

I’ve been around computers for over 30 years, so I know my fair share when it comes to them and their components. So this may very well be a stupid question, but I can’t get it out of my head.

For over a decade now, we have seen GPUs (Graphics Processing Units, for those who don’t speak nerd) take over more and more functions from the CPU (Central Processing Unit) and make amazing leaps in performance, not just for games, but for video editing, 3D rendering, and so many other things that involve computer graphics of any kind.

For example, if you’re a video editor, it used to be that you needed the latest and greatest machine, Mac or PC, to edit video even in HD. But the huge advances in GPU technology mean that these days you can take a machine from 2012, add a $200 Nvidia RTX card, and edit 4K video just fine. You can also render that video at speeds the old machine on its own couldn’t come close to matching.

But when it comes to MIDI programming, even the latest and greatest is not enough for totally smooth playback if you’re working on a project with 50 or so tracks and heavyweight sampled libraries like OT Berlin Strings. Obviously a lot of this depends on the engine, and a bad engine can bring down the best DAW, but generally speaking, even with the fastest Mac or PC, with the fastest NVMe drives and so on, you still get hiccups until you have played the song or section several times.

So I guess my question is, for those of you who know Cubase but also all the super geeky tech stuff, there isn’t anything for MIDI programming that would be equivalent to the GPU for video editing, right? Like a PCIe card that has an “APU” (Audio Processing Unit), and offloads some of the processing?

I’m pretty sure the answer is no, and if it is, who wants to make one? Just kidding, but I bet if someone had the brains to come up with a card like that, they would make good money on it. Well, not as much as Nvidia, but still good money.

I’m editing to clarify something. I suppose an APU would be a good idea for everything that involves Cubase and other DAWs. I just mentioned MIDI programming specifically because it’s the thing that really puts a lot of strain on Cubase. I was working on a project that was 8 minutes long and had about 130 tracks, and it was barely playable on my Mac Studio M1 Ultra, which is a very fast machine. Even with all the inserts and FX disabled, it was still a nightmare. So I worked the MIDI part as much as I could without worrying about the sound tweaking.

When I was done, I used the excellent feature Cubase has that exports a project by doing a mixdown of each individual track and placing all the tracks in a new project. Once it did that, the new project was a breeze. I was piling up audio effects like there’s no tomorrow: Soundtoys, FabFilter, Steinberg, Cinematic Rooms Pro, Plugin Alliance, Waves, a bit of everything. And yet, pressing the spacebar kept playback going without any hiccups, even with 130 tracks and FX on almost all of them, plus the groups and the FX tracks. So the bottleneck is in the MIDI, and it makes sense. Whereas an audio track has to load one audio file and play it back with effects added, a VSTi MIDI track has to load thousands of tiny audio files that get triggered with each MIDI note. That, times 130, is obviously going to put a lot of strain on any CPU.
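
Just to put rough numbers on that, here’s a back-of-envelope sketch; the voices-per-track figure is a pure guess for illustration:

```python
# Rough streaming load for a big sampled template.
# The per-track voice count is an assumption, not a measurement.

SAMPLE_RATE = 48_000      # Hz
BYTES_PER_SAMPLE = 3      # 24-bit audio
CHANNELS = 2              # stereo sample files

TRACKS = 130              # the project described above
VOICES_PER_TRACK = 8      # guess: articulations/round robins sounding at once

bytes_per_voice = SAMPLE_RATE * BYTES_PER_SAMPLE * CHANNELS  # per second
voices = TRACKS * VOICES_PER_TRACK
print(f"{voices} voices -> ~{voices * bytes_per_voice / 1e6:.0f} MB/s raw")
# ~1040 voices -> ~300 MB/s: trivial on paper for NVMe, but it arrives as
# thousands of small scattered reads plus per-voice envelopes, crossfades
# and resampling, which is where the CPU time actually goes.
```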

I may not be understanding your post, as it’s gone 3 am where I am and I’m ready for bed, however…

MIDI itself requires minimal muscle, which is why an Atari could handle it.

And this is why we need more muscle from our CPUs.
Actual hardware MIDI programmers were all the rage in the early 1980s; I even had one, a Yamaha QX5, I think. But you still needed synths to receive the data.

Right, but you’re talking about the old MIDI from the ’80s and ’90s, which was all synth-based. I’m speaking of virtual libraries with sampled instruments, where you have individual tiny audio files for articulations, round robins, and a lot more things depending on the instrument and the library.

That’s the part that makes MIDI really taxing on the machine, and I’m wondering if there is some specific card or even an audio interface that accelerates that part of the process.

For those of you who don’t know the basics (and you’re musicians and sound engineers, so it’s not like you have to), the thing that makes GPUs so fast is that while CPUs have to handle a wide variety of instructions, GPUs narrow that scope to graphics-type processing and throw enormous parallel power at that one kind of work. So a 3D render that would take an hour on a CPU might take two minutes on a decent GPU, for example.
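
Here’s a tiny sketch of that “narrow scope, many lanes” idea, using NumPy vectorization as a small-scale stand-in for what a GPU does massively in parallel:

```python
# Applying a gain to a million samples: a general-purpose loop versus one
# uniform operation handed to optimized data-parallel code.
import time
import numpy as np

samples = np.random.rand(1_000_000).astype(np.float32)
GAIN = 0.5

t0 = time.perf_counter()
scalar = [s * GAIN for s in samples]   # one sample at a time
t1 = time.perf_counter()
vector = samples * GAIN                # whole buffer in one operation
t2 = time.perf_counter()

print(f"scalar loop: {t1 - t0:.3f}s, vectorized: {t2 - t1:.4f}s")
```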

Well, what I’m wondering is if that type of processor exists but for MIDI and VSTis, to offload some of the heavy processing to that specific chip instead of relying on the CPU for all of it.

Simple sample playback isn’t very processor intensive. Other aspects of an instrument plugin can be CPU hungry, but triggering samples alone usually isn’t.
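
To sketch the difference in per-sample work (both voices below are deliberately minimal toys, not real instrument code):

```python
import math

def render_sampled(sample_data, start, n, gain):
    # sample playback: per output sample, a memory read and a multiply
    return [sample_data[start + i] * gain for i in range(n)]

def render_modeled(n, freq, sr=48_000, coef=0.1):
    # toy modeled voice: per output sample, oscillator math plus a
    # one-pole lowpass; real analog emulations do far more than this
    out, phase, state = [], 0.0, 0.0
    for _ in range(n):
        osc = math.sin(2.0 * math.pi * phase)   # transcendental call
        state += coef * (osc - state)           # filter recursion
        phase = (phase + freq / sr) % 1.0
        out.append(state)
    return out
```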

If you have huge orchestral sample libraries in something like Kontakt, HALion, Opus, Vienna, etc., adding RAM will probably get you more bang for the buck than more/faster CPUs (or a hypothetical APU). From there, having uber-fast SSD storage on a format with lots of bandwidth can help speed up project load times, and it might offset the need for more RAM a bit. These aren’t so much ‘processor’ demands.
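
Some rough RAM math shows why memory is usually the first wall you hit; the preload size and counts below are illustrative assumptions, since every engine and library differs:

```python
PRELOAD_KB = 64          # preload buffer per sample zone (engine setting)
ZONES_PER_PATCH = 3_000  # articulations x round robins x velocity layers
PATCHES = 60             # instruments loaded across the template

ram_gb = PRELOAD_KB * ZONES_PER_PATCH * PATCHES / 1024 / 1024
print(f"~{ram_gb:.0f} GB just for preload buffers")  # ~11 GB before a
# single note plays; double the patches and you double the RAM.
```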

Another option is to use something like AudioGridder, Vienna Ensemble Pro, Bidule, or even another instance of Cubase (synced via VST System Link) to host plugins on a totally different system; OR, go back to dedicated hardware for hosting instruments (Montage, MOTIF, Fantom, Kronos, etc.).

Where the processing power of a typical DAW comes into play is in all the real-time effect chains (reverb, chorus, distortion, etc.), or modeled synths that are trying to emulate analog circuit networks in real time (stuff from Roland/Korg Cloud, Arturia synth emulations, Falcon, HALion ‘synth’ patches, etc.). E.g., a ‘modeled/synthesized’ piano from Arturia on my system pulls 12% DSP from a single core. A rather huge ‘sampled’ piano running in EW Opus doesn’t even pull 2%. The Arturia modeled piano needs way more CPU, but is far more versatile as a ‘live/playable’ instrument and only takes a few gigs on the hard drive. The EW pianos are HUGE on my disk drive, and are fine-sounding pianos, but they’re like concrete…there’s not much I can do to get different/varied piano sounds out of them. They do better with a lot of RAM, but use very little CPU.

Some of the emulation plugins are terribly inefficient too, e.g., Roland Zenology, or some of the Korg plugin lineup. That’s not because they use samples; it’s more a matter of them emulating all sorts of custom hardware in the plugin. Such sound engines tend to have a bunch of effects and stuff in-line that can’t be disabled without destroying the sound/character of the engine. So the CPU requirements get higher…they are emulating ‘entire machines’.

With HALion and Kontakt, I notice most of the programs/patches I use in them are very efficient…barely use CPU, but there ARE heavy synth patches for those engines that can be pretty demanding. Especially if you layer a bunch of them together.

Even with modern multi-core processors, there are limits to what can be passed off to the different cores, as it all ultimately has to be ‘put back together’ at some point, in real time (well, near real time; that last stage of reassembling everything into your main bus outputs is part of what introduces latency). So it’s not uncommon to see a DAW project ‘choke’ when the main CPU isn’t even breaking a sweat overall (one core maxes out, while others might not even get touched).
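
The shape of that bottleneck, sketched (the track count and buffer size are just example numbers):

```python
# Tracks render in parallel, but one final loop still has to sum every
# buffer into the master bus before the deadline. That serial fan-in
# step can choke while most cores sit idle.
from concurrent.futures import ThreadPoolExecutor

BLOCK = 512  # frames per audio buffer

def render_track(track_id):
    # stand-in for one track's DSP for this buffer
    return [0.001 * track_id] * BLOCK

with ThreadPoolExecutor() as pool:
    buffers = list(pool.map(render_track, range(130)))  # parallel part

master = [0.0] * BLOCK
for buf in buffers:              # serial part: the final mixdown
    for i, s in enumerate(buf):
        master[i] += s
```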

There are sometimes steps you can take to get better CPU core management out of a project, but there are usually a couple of stages that are going to introduce latency no matter how many cores and ‘other processors’ you have in the mix. In general, if you use lots of shared aux-send type effects across many plugins, that’s going to force all the different threads to have more in common, so they’ll have to ramp up one core and rely on it more, and you might even need bigger buffers to buy the system time to get all these ‘ducks in a row’ (latency).
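
The buffer/latency trade-off itself is simple arithmetic:

```python
# Each ASIO buffer of N frames adds N / sample_rate seconds before audio
# reaches the output (drivers and plugins can add more on top).
SAMPLE_RATE = 48_000
for frames in (64, 256, 1024, 2048):
    print(f"{frames:5d} frames -> {frames / SAMPLE_RATE * 1000:5.1f} ms")
# 64 -> 1.3 ms, 256 -> 5.3 ms, 1024 -> 21.3 ms, 2048 -> 42.7 ms
```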

It’s not something to get overly bogged down thinking about tho’. Just make music. Disable effects and use small buffers when you need to do things in real time with remote controls. Otherwise, use large buffers; make them as large as they need to be in order to render things without hearing glitches. Since you’re working with draw-style tools and have visual cues in front of you, the ‘latency’ added by big ASIO buffers isn’t as big of a problem at that point.

This takes us back to your question about custom chips and processors (DSP) for audio purposes. They DO EXIST. Dedicated processors for audio have been around almost as long as solid-state computers. Right off the bat, the typical audio interface already has some degree of DSP onboard. Specialized DSP cards have existed for rendering audio and such for a long time, though they’re less common for consumer needs, and the typical musician’s DAW isn’t going to support them (they’re post-production features; Nuendo ‘might’ support some of that sort of hardware?).

For today’s ‘musician/performer’ DAWs, some things exist that are pretty well supported.

E.g., you can buy consoles and audio interfaces that take on much of the ‘real-time effects’ load and do that processing apart from the rest of the computer. You can get huge commercial consoles with loads of offboard processing built into each channel, and you can also find more modest interfaces from outfits like UA that can host plugin effects internally on their own DSP power.

Are they any good? Certainly, but they can be pricey, and they also come with rules and restrictions on setting them up and using them properly. They’re more about running effect chains, and aren’t intended to run ‘instrument plugins’. Depending on what you’re trying to do, the benefits might not be worth the costs.

I think people are possibly going to be confused when you say “MIDI” since literal MIDI takes pretty much no effort at all. It’s really the processes that are triggered by MIDI that are resource intensive.

As for actual GPU use for audio, I have seen one recent project for it, but it seems to be focused on running the company’s own audio-processing plugins rather than virtual instruments.

I don’t think there’s anything out there that offloads VSTi work to GPUs. If there were, I bet it would be widely used by now.

Others have already pointed out that you seem to be confusing certain terms. Pure “MIDI programming” most certainly puts the smallest amount of workload onto a computer system.
The system components most affected by working with sampled instrument libraries are the storage units (HDD, SSD), computer memory (RAM), the address and data buses on the mainboard (more importantly, how other computer components may interfere with data throughput) and, to a lesser extent, the CPU.
In order to create a specialized “APU” to tackle your issues you basically need to build an entire computer.

If you feel that the “graphics people” have received much better technological advancements in the past, here is a small reminder of how audio systems got more potent for “orchestra people”: RAM got cheaper, RAM got faster, and SSDs arrived. Current SSDs can move far more data per second than first-generation SSDs.
Our beloved friend Hans Zimmer had an entire room stacked with hardware sampler units in order to have a virtual orchestra available through the computer some 20 years ago. Nowadays you can have the same musical power on a laptop, sitting somewhere in a café.

Graphics is where you see dedicated hardware most prominently because it’s central to computing: almost every computer system has a screen. Not only is the way we interact visually with computers important to us, many of the things we’ve liked doing with computers were historically of a visual nature: playing games, or working with images and film. This has driven the evolution of dedicated graphics hardware, and you can see how that computing power has been repurposed for other applications, like AI, but also some experiments in audio (GPU Audio).

For someone to create dedicated hardware/software solutions that become a central part of the technology, you need the application to be that universal. Virtual instrumentation is just way too niche, although one could imagine such a thing. It would likely be some form of processor with large amounts of memory and storage. That is something computer systems are already capable of, though they’re not specialized for it, so there’s really not much of a need. With enough RAM you can do orchestration well enough.

There was, however, one audio-related area where computers did struggle: recording and playing back time-critical audio. So there is dedicated hardware for that: your audio interface.

Wouldn’t Render in Place take care of this issue?

That’s a cool discussion going on here.
I’d like to emphasize the point about numerical and algorithmic efficiency. That does indeed still seem to be a problem today for many pieces of software.

Hmm, this has nothing to do with MIDI. It’s the instruments that are getting CPU- and memory-heavy. MIDI takes very little power at all.

The title is confusing to most.

Fair enough. I think it’s because people who have been doing this for a while have a different notion of MIDI, and I started studying this two years ago, so my idea of MIDI might be mistaken.

I’m old, but I’m not familiar with the MIDI of the ’80s and ’90s, with the 5-pin connectors and the old Ataris and all that. Until recently, to me the word MIDI was associated with little Casio keyboards, or tiny files that played notes with a rather boring electronic sound that had no substance to me beyond the actual music, if it was any good.

Then several years ago I realized that MIDI had grown to a point where those same notes in a tiny MIDI file could be loaded in a current DAW and they would trigger actual recorded instruments. I remember being blown away by watching videos on YouTube of people playing a whole string section with Symphobia, and I couldn’t believe it, but this was in 2014 and I didn’t have the money to buy expensive stuff like that.

So my idea of MIDI is basically MIDI but associated with VSTis.

Perhaps I have the wrong notion of what “MIDI Programming” means. Based on my short time studying this, MIDI programming is not like programming an app by writing code. It’s all the things involved in creating and manipulating a MIDI track. For example, changing the length of the notes, the velocity, editing the CC1 and CC11 curves, same for any other CCs that a specific instrument might need. Would that be called “MIDI Programming”?

MIDI is a communication protocol. The MIDI you are using today hasn’t changed since 1983.
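
For instance, a note-on is still the same three bytes the 1983 spec defined:

```python
# A MIDI note-on: status byte (0x90 = note-on, channel 1), then note
# number and velocity, each 7 bits. Three bytes per note, same as 1983.

def note_on(channel: int, note: int, velocity: int) -> bytes:
    return bytes([0x90 | (channel & 0x0F), note & 0x7F, velocity & 0x7F])

msg = note_on(channel=0, note=60, velocity=100)  # middle C
print(msg.hex(" "))  # "90 3c 64" -- all the data one note ever sends
```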

And the mighty General MIDI shall remain in power for generations to come :grin:

Nobody has picked this username so far. Tempting…

That’s all OK, but MIDI data is just a few bytes per event. In a big project it might occupy 200 kilobytes, where one of your sample libraries is probably more than 2,000,000 kilobytes (roughly 2 GB). MIDI is not bringing a computer down; it’s the plugins.
That’s why “MIDI programming” is an unfortunate formulation.
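
To make that concrete, here’s roughly what a drawn CC11 swell amounts to in data terms (the tick resolution and event spacing are just assumptions):

```python
PPQ = 480  # ticks per quarter note (a common sequencer resolution)

def cc_ramp(cc, start_val, end_val, start_tick, ticks, step=10):
    # a drawn controller curve is just a list of tiny events
    n = ticks // step
    return [(start_tick + i * step, cc,
             round(start_val + (end_val - start_val) * i / n))
            for i in range(n + 1)]

swell = cc_ramp(cc=11, start_val=40, end_val=100,
                start_tick=0, ticks=PPQ * 4)  # swell over one bar
print(len(swell), "events, 3 bytes each on the wire")  # 193 events
```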

Well, in my experience it’s the opposite. It’s like I was saying yesterday about this project that has over 100 tracks; I did it in part as a way to push the Mac Studio to the limit, so it has several instances of SINE with the Berlin series, Cinesamples via Kontakt, some Musio too, plus EastWest and VSL, a bit of everything. On top of that, I just started adding audio FX as I wanted, from a lot of different vendors. The thing was barely playable, sometimes hanging the machine for several minutes.

So one day I thought I’d export all the tracks to a new project: basically keep the audio FX but take out the MIDI/VSTi part. And it was a breeze. Almost every track had one or more audio FX, and then the group tracks plus the FX tracks had lots of them. And it all played without a hitch.

But the project that still had the MIDI tracks remained unplayable, even after removing every single audio FX from each track and all the other tracks.

So at least in my experience, the triggering of notes is the bottleneck. The more notes per bar, the more chances of choking. The SINE engine is especially bad at handling violin runs; perhaps it doesn’t load them into RAM when it sees them coming. One thing I can tell you without any doubt, because I’ve watched it happen for over a year: when I see the playhead getting close to a violin run, especially if it’s on several tracks at the same time, that’s when playback pauses and the audio performance monitor goes into the red for a moment.

This usually happens the first and even second time I play that same section, until it has loaded into RAM completely. And that’s even if I let the project load and walk away for a while. Because as you all know, a project showing the main window and mixer doesn’t mean it’s fully loaded. It keeps loading all the instruments in the background, and the larger the project, the longer that takes.
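
From what I’ve read about how streaming samplers are typically designed, that first-play stutter makes sense; here’s the general idea sketched in code (the class and numbers are made up for illustration):

```python
# Only the head of each sample lives in RAM; the tail streams from disk
# on demand. First playback is rough because the tail reads are cold;
# replays are smoother once the OS file cache and the engine's buffers
# are warm.

class StreamedSample:
    def __init__(self, path, preload_frames=16_384):
        self.path = path
        self.head = self._read_frames(0, preload_frames)  # kept in RAM

    def render(self, start, n):
        if start + n <= len(self.head):
            return self.head[start:start + n]  # served from RAM instantly
        return self._read_frames(start, n)     # disk read; can stall cold

    def _read_frames(self, start, n):
        # stand-in for an async, buffered disk read of n frames
        return [0.0] * n

s = StreamedSample("Violins_run_C4.wav")  # hypothetical file name
chunk = s.render(0, 512)                  # instant: comes from the head
```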

And while that other project was on the Mac Studio, I just built a new PC mostly to deal with this, because I don’t want to waste time bouncing, rendering in place, freezing tracks and so on. So I built it with 192 GB of RAM, an i9-14900KF, and four 4 TB NVMe drives that benchmark at 7 GB/s. And I didn’t install all the libraries to one of them (4 TB wouldn’t be enough anyway); I spread them across the drives so that they don’t all load from the same SSD.

And with this beast of a machine, I still get some hiccups. I can tell it’s a lot faster than the Mac Studio, and I can see some projects going well over the 64 GB I had in the Mac Studio. But it seems to me that it’s not just up to the machine; it’s up to how well the software is written to properly use all that extra power. And by this I mean Cubase and all the engines. It seems to me that SINE is not very efficient when it comes to RAM.

I see what you mean. 100+ tracks through VSTi instances is huge.

Instrument servers apart from the DAW hosting machine will probably be very helpful for projects that large.

Sometimes it can help even if the server (VE Pro (paid) / AudioGridder (free)) is running on the same system, but of course it’s best if you host some instruments on a totally different machine.

How many typically are ‘sounding’ at the same time?

That’s a lot of ‘live instrument polyphony’ to ‘mix’ together in a time frame of milliseconds.
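
Concretely, assuming a 512-frame buffer at 48 kHz and around a thousand voices (both just example figures):

```python
SAMPLE_RATE, FRAMES, VOICES = 48_000, 512, 1_000

budget_ms = FRAMES / SAMPLE_RATE * 1000   # time until the next buffer is due
per_voice_us = budget_ms * 1000 / VOICES  # share of that budget per voice
print(f"{budget_ms:.1f} ms per buffer -> ~{per_voice_us:.0f} us per voice "
      f"on a single core")
# ~10.7 ms per buffer, ~11 us per voice: miss the deadline once and you
# hear a dropout.
```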

To take advantage of different CPU/GPU cores, discrete processing has to be done (it needs a different kind of code and compiler to free it from the cores locked to the audio clock), and that still jacks up at least one CPU core somewhere that has to wait around for the right moment to inject its two cents. The crosstalk needed to time everything right would be intense.

I’m not saying it’s impossible to do…just that it’s not easy, and probably needs custom chips and ‘instruction sets’ for compilers with real-time music/streaming in mind.

Would a GPU help? Maybe…it’s a good question. I don’t know of any instruments that use it though.

The DAW is likely choking because one CPU core is maxing out…trying to ‘mix together’ all the data being processed by other cores for the ultimate ‘mix’ you can ‘hear’.

And yet you don’t seem to understand the difference between MIDI and VST instruments. They are two very different things.
There is no difference between the MIDI of old and the MIDI of today.
The MIDI of old was used to trigger hardware units, and the same MIDI today is used to trigger both hardware units and virtual ones.
Maybe Google “MIDI” to answer your original question.

“We” are nitpicking, but it is not “the triggering of notes” that is the bottleneck. All MIDI does is tell something to do something. It is very light work.

It is the actual creation and processing of sound that is heavy work. VST instruments that model or play back samples, along with processing within those instruments (and whatever you add).

When you say “MIDI tracks” perhaps you mean “instrument tracks” which is not the same thing.

You are proposing a specialized processor to help with the load. The computer industry now has the NPU (Neural Processing Unit) in addition to the CPU and GPU. A current problem with NPUs is that software needs to be tailored to make use of them. We also have the concepts of CISC (Complex Instruction Set Computer) and RISC (Reduced Instruction Set Computer). Historically we have seen a lot of CISC devices (x86), but more recently with ARM we are seeing RISC. Again, this is a software-making-use-of-the-hardware type of problem. With the advent of the latest Qualcomm Snapdragon X Elite ARM processors on Windows, not only are they RISC but they have an NPU. One advantage RISC has over CISC is that the way you compile software can allow for greater optimization. Optimization is the key word in all cases. So, you are right in that having specialized processors can allow for optimization and more efficiency in the software workflow. Cubase is currently leading the effort on the above-mentioned processors, and people like Dom Sigalas have put out videos speaking to the marked power and throughput they’re noticing.
