Using the GPU on Apple M-series

Hi from the north,

I just tried out and bought the Wavemachine ReBeat, a drum stem separator that solved a problem I had with a buried hi-hat.

Been following ReBeat on Gearspace and came across this post:
" – Just for some context, running ReBeat on a Mac M-series computer is about 12x faster than running on a PC (even a very fast one) or an Intel Mac. It has to do with the built in GPU which ReBeat takes advantage of."

As it stands today, Mac M-series machines are much slower running SpectraLayers than PCs with good graphics cards.

Can SpectraLayers benefit from this same technology?

Kind regards, Terje

4 Likes

I have no affiliation with Steinberg, but since you posted on the public forum, I’ll give my opinion, which is “probably, if not likely.” I also think there’s a slight generalization in your post - as I understand it, SpectraLayers’ GPU acceleration uses only CUDA, an NVIDIA-only solution. So a machine isn’t faster or slower simply because it’s “a PC” or has “an Intel chip”; it comes down to whether the GPU path the software actually supports is present.

I have no doubt that the industry in general is going to begin investing dev resources into supporting the built-in GPU capabilities of Apple Silicon, particularly given that it will benefit ALL Mx users going forward without concern for the PC “build-dependent” model.

I would go as far as to say that current plug-in developers would be smart to move in that direction for the same reason, and because this is now entering the realm of “competitive advantage.”


Yes Solstudio, I reacted to that too when I read it on Gearspace. It clearly says that ReBeat uses the GPU part of the Apple Silicon processor to make it fast.

As I understood it from what Robin has said, Apple has not been willing to give programmers outside Apple full access to the GPU facilities built into the M-series processors to accelerate stem-separation tasks.

ReBeat programmer Rim Buntinas says the opposite, I think.

3 Likes

Yeah, I saw that Google AI said that too, but if that’s true, then how did ReBeat “gain access” to the GPU functions? Apple has a fully documented Metal API platform for GPU-based development, and I’ve seen references that it’s “better” than OpenGL or CUDA, but I have no idea. In any case, the API is fully supported, so I think it’s just a matter of adoption. It really wouldn’t make much sense for Apple to build an entire integrated processor solution with an associated development platform only to turn around and limit adoption by walling off the API. In my opinion, anyway.

I agree. Nevertheless, they seem to obstruct third-party developers.
Maybe they’re trying to get people to use Logic and its built-in stem separation.

I think he’s talking about how 3rd-party frameworks don’t yet support the Metal model, not Apple “obstructing” anyone. Apple just wants developers who were using stuff like PyTorch and CUDA to use PyTorch and Metal instead. I mean, I see the guy’s point, and I’m sure it’s frustrating, but it sounds like they’re just waiting for the 3rd-party frameworks to catch up and provide support for other backends, like adding Metal support alongside CUDA.

None of this really matters, as the framework devs are going to do what the framework devs are going to do, and devs like Steinberg are most likely going to keep using their existing frameworks. But none of that has to do with the statement that Apple is “not willing to open up for programmers outside Apple to get access to GPU facilities.” You can Google for Metal integration with PyTorch, as well as the Metal API itself, and see that’s not the case. Again, it doesn’t matter, but to me there’s quite a big difference between the industry working through adoption iterations and tool migration, and classifying this as “Apple stopping people from doing it.”
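
For what it’s worth, here’s a minimal sketch of what I mean (assuming a reasonably recent PyTorch, 1.12 or later, which is when the MPS backend was added): the same PyTorch code can target the Apple Silicon GPU through Metal rather than CUDA simply by picking a different device.

```python
# Minimal sketch: pick Apple's Metal backend (MPS) when available,
# fall back to CUDA on NVIDIA hardware, otherwise plain CPU.
# Assumes PyTorch 1.12+ (when the MPS backend was introduced).
import torch

if torch.backends.mps.is_available():
    device = torch.device("mps")    # Apple Silicon GPU, via Metal Performance Shaders
elif torch.cuda.is_available():
    device = torch.device("cuda")   # NVIDIA GPU, via CUDA
else:
    device = torch.device("cpu")

x = torch.randn(4, 1024, device=device)  # tensors land on whichever device was found
print(f"running on: {x.device}")
```

Same code path either way; the only difference is which backend the framework dispatches the work to.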

OK, “obstructing” was perhaps too harsh a word, but in the end that’s the net result of Apple not being willing to cooperate when people like SpectraLayers’ developer ask them to. But all this is beyond my horizon; I don’t know anything about coding.
I’m just a frustrated Apple Silicon user watching PC users with an appropriate graphics card achieve far faster results, even though my computer should have the ability to be just as effective.

I totally get it, and I’m right there with you. “Intent” means a lot to me, and where I see what I feel is a mischaracterization of the source of an issue, I at least say something. Again, not that it matters, but Steinberg was talking about an AI framework model, not accessing the GPU itself.

By way of example, I use TouchDesigner for audio-based component visualizations. It’s amazing. I spent more than I’m willing to admit on my M3 Max MBP and associated monitors. I’m in the same boat as you in that if I want to take advantage of faster renderings and optimizations, I would have to get an NVIDIA chipset, as it too uses CUDA. Here I am sitting on a 40-core GPU, and I can’t do jack about it. It’s frustrating for sure. But then I realize that it really doesn’t impact my life. So I sit there for a few seconds, or minutes, more waiting for something to render. It doesn’t matter, really. If it DID matter, then I would have done my research beforehand, said to myself “Self, you need a PC with an Nvidia card to use this,” and laughed at myself for considering anything else. But after-the-fact FOMO set in and it gets to me.

I too use SpectraLayers (not much, but I’m expanding my use case) and have not run into any issues with something taking too long. If I did, and it mattered, then I would examine my options. The internet tells me that CUDA has been around since 2006. I think it’s perfectly reasonable to weigh my options and wait for Metal/Apple Silicon support to permeate the market as it gains adoption. And it will.

Just thought I’d say that as from a “feelings” perspective I’m right there with you.

1 Like

Apple Silicon’s main technology for hardware acceleration of the kinds of array processing needed for source separation / inference is Metal Performance Shaders (MPS); Nvidia’s is CUDA. But it all depends on what types of array processing are required. Some processing maths simply cannot happen using MPS, because the equivalent maths available on CUDA isn’t available on Silicon, and so those processes fall back to slower CPU processing on Silicon. Sometimes the quality of the inference (the separation quality) is lower on MPS than on CUDA too, so you might drop down to CPU processing in order to increase quality at the cost of speed. It all depends, though. At the end of the day, Nvidia is probably where source-separation technology is best suited, because historically AI training and inference, gaming, audio processing, video processing, rendering and so on have generally been done on Nvidia hardware rather than Apple hardware, so that’s where the focus has been for far longer than Apple Silicon has existed.
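
To make the “falls back to slower CPU processing” part concrete, here is a hedged sketch (not any particular product’s code) of how that typically looks in PyTorch: an environment flag lets operators that have no Metal/MPS kernel silently run on the CPU instead of erroring out.

```python
# Sketch of the CPU-fallback behaviour described above, assuming PyTorch.
# PYTORCH_ENABLE_MPS_FALLBACK=1 tells PyTorch to run any operator that has
# no MPS (Metal) kernel on the CPU instead of raising an error.
import os
os.environ.setdefault("PYTORCH_ENABLE_MPS_FALLBACK", "1")  # set before importing torch

import torch

device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")

# Ops with an MPS kernel run on the Apple GPU; unsupported ones are copied to
# the CPU, computed there, and copied back -- correct, but noticeably slower.
audio = torch.randn(2, 48000, device=device)      # pretend stereo audio buffer
window = torch.hann_window(48000, device=device)  # analysis window
energy = (audio * window).pow(2).mean(dim=-1)     # simple, MPS-supported maths
print(energy.device)
```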

Thanks to both of you for clarifying.
The situation seemed a bit different though, to me as a plain user at least, when the M1 computers arrived (I’m on a Mac Studio M1 Max), and it sure looked promising with the new fast and awesome processors. Little did I know that graphics cards were used in sound processing.
But I have learned… :stuck_out_tongue_winking_eye:

1 Like

I disagree with this. However, I don’t feel I need to give a full explanation, so to keep my response short I’ll just say that there doesn’t need to be an exact equivalent of the mathematics, just a simplified version of it (a bit like reducing a huge algebraic equation to a simpler form).

In regard to the main topic: Apple in general is a difficult collaborator, and they only collaborate with others when it suits their agenda. My advice to Steinberg is to reach out to Apple and then, if there’s no response, shift the blame back to Apple; if enough consumers start complaining and moving to other products (like Windows), Apple will eventually come around.

There is nearly always a simplified version of a function at a mathematical level, because most code has to fall back to working without hardware acceleration too. But I can give a real-world example: just last week I was programming a source-separation function that makes use of torch.fft, which provides very efficient Fourier transforms and works on both CPU and GPU tensors. While it runs very well on CUDA, on Apple Silicon its performance was worse than SciPy’s fftconvolve, so on Silicon systems the code switches over (see the sketch below). That path is not quite as good as the Torch one, and obviously not as fast as on CUDA, but the performance is better, so a compromise is made. At some point torch.fft might become faster on Silicon/MPS, and then performance can be improved. The game of source separation is simply more developed on Nvidia, that’s all: the hardware acceleration is already there to use, and you can keep things running as tensor operations rather than converting back to arrays for the CPU.
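
To illustrate, here is a hedged sketch of that kind of switch (not my actual production code): FFT-based convolution stays entirely on the GPU via torch.fft when CUDA is available, and drops to scipy.signal.fftconvolve otherwise, which is the compromise that has worked better for me on Apple Silicon.

```python
# Hedged sketch: torch.fft-based convolution on CUDA, SciPy fallback elsewhere.
import torch
from scipy.signal import fftconvolve


def convolve(signal: torch.Tensor, kernel: torch.Tensor) -> torch.Tensor:
    """Full linear convolution along the last axis."""
    if signal.device.type == "cuda":
        # Stay in tensor land: FFT, multiply, inverse FFT, all on the GPU.
        n = signal.shape[-1] + kernel.shape[-1] - 1
        spec = torch.fft.rfft(signal, n=n) * torch.fft.rfft(kernel, n=n)
        return torch.fft.irfft(spec, n=n)
    # On Apple Silicon (or plain CPU), convert to NumPy arrays, let SciPy do the
    # work, then move the result back onto the original device.
    out = fftconvolve(signal.cpu().numpy(), kernel.cpu().numpy(), axes=-1)
    return torch.from_numpy(out).to(signal.device)


dry = torch.randn(48000)   # one second of audio at 48 kHz
ir = torch.randn(4096)     # short impulse response
wet = convolve(dry, ir)    # length 48000 + 4096 - 1
```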

2 Likes