I’m probably wrong, but live preview from a GPU process is probably not going to be part of the GPU patch, which I guess is primarily about offline processing on the GPU to speed up wait times. You could try Goyo; it’s free and its preview is relatively good. I usually just use the preview feature of UVR5 on a few seconds and tweak until I’ve got the settings I want, then run it at full length.
Well, this thread quickly went into GPU acceleration which is very cool but I believe has nothing to do with this feature - or at least not directly.
When previewing in RX10, my CPU is bored and stays at around 15% load. It should be the same here - maybe a bit more because of the more advanced algorithms. Hey, it can even max out the CPU so that the preview is generated faster but it should never overload it or require any kind of extra GPU power.
I’m fine, I’ve worked out I can use the Dereverb preview on files that are shorter than 43 seconds, so for now, until there’s some solution, I can just make a small edit of the files, work out the settings required and apply them to the full file.
It’s not as though I use the feature every day, so it won’t cause me harm. It would be nice to have, but I can work around it if need be.
@ColdSteel I just tested Reverb Reduction on a 44100Hz Stereo file with an i7 12700K (so slower than yours), and had no problem with real-time preview… Would you be able to capture a video of the issue (with the entire screen visible)?
Your GTX 1650 will be able to benefit from GPU acceleration too.
GPU acceleration should be available both for offline (apply) and preview mode.
Pretty sure RX10 is still using a flavor of Spleeter, so a much lighter-weight model and processing load on the CPU than hybrid transformer types. It also has a much lower SDR! RX/Spleeter is currently ranking 17th best for vocal separation. SpectraLayers is definitely the market leader now for a pro-level audio editor, having switched to a Hybrid Transformer type model.
Fantastic news. I know that with the new in-depth processing SL will take a lot more resources; that’s the price you pay for using pioneering technology, but this is very welcome. Thanks
BTW the files I was trying were 48 kHz
Yep, completely agree. I’m on Gearspace and have that webpage bookmarked I think. So we’re on the same page.
PS: Kudos on “advise” vs “advice”. I’m assuming it wasn’t an accident
The i9 12900K is not a slow processor, this is a programming and processing issue of usage of the AI/ML models.
AI, aka ML-trained, models can be run on a CPU or a GPU/TPU; the problem is that running AI models on CPUs pushes them hard and takes much, much longer, even if the workload is distributed correctly over all CPU cores. Both will fail if there’s not enough RAM available.
To any programmer who has worked with AI/ML models this is day-one-on-the-job knowledge. Having a GPU with CUDA available (NVIDIA RTX series) and not using it when running any AI/ML model is just stupidity, or laziness about working out whether it’s being run on a Mac or a PC with or without a CUDA-capable graphics card.
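Just to illustrate how simple that detection is, here’s a minimal PyTorch sketch (not how SpectraLayers actually does it, just the generic pattern):
import torch

def detect_device():
    # CUDA on a PC with a capable NVIDIA card, Apple's "mps" backend on recent Macs,
    # plain CPU otherwise.
    if torch.cuda.is_available():
        return torch.device("cuda")
    if torch.backends.mps.is_available():
        return torch.device("mps")
    return torch.device("cpu")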
I use various LLMs locally on my PC. If I set them to CPU mode they will run as long as I’ve got enough RAM, but they’re painfully slow, and I have an i9 9900K overclocked on all cores to 4.7 GHz, so your i9 12900K runs circles around mine, but the result would be much the same, if the difference is even noticeable. If I use my RTX graphics card to run the AI/ML model, the speed increase is night and day.
RipX, which has some of the features found in RX and SpectraLayers, can use the CPU or GPU CUDA. In CPU mode you might as well go out for lunch or watch a movie while it does its thing; in GPU CUDA mode, you might have time to boil the kettle.
Also, if an AI/ML model and what it’s processing are too big for the available RAM on a CUDA-capable graphics card, it will start to use system RAM, and you’re back to almost CPU processing speeds.
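If you want to guard against that spill-over in your own scripts, something along these lines works (a rough PyTorch sketch; the 1.5x headroom factor is just a guess, not a rule):
import torch

def choose_device(model_bytes_estimate):
    # Only pick the GPU if the model (plus some headroom) fits in free VRAM;
    # otherwise it would spill into system RAM and lose most of the speed advantage.
    if torch.cuda.is_available():
        free_bytes, _total = torch.cuda.mem_get_info()
        if model_bytes_estimate * 1.5 < free_bytes:
            return torch.device("cuda")
    return torch.device("cpu")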
So make sure you’ve got a half-decent CPU, at least 64 GB of RAM and an NVIDIA RTX 4070 Ti or, better, an RTX 4090. If you’ve got a Mac and it’s the £15,000 workstation with a lot of unified memory/RAM, you might be in with a chance of competing with a low-end RTX card.
And hope Steinberg programs everything to use whatever hardware you’ve got, and to use it correctly. CPU mode for everything = lazy mode.
@Darcey Just activate GPU acceleration (Edit > Preferences > System > AI Processing Device). It works with NVIDIA, AMD and Intel GPUs.
It’s using the CPU by default because not all GPUs are equal in terms of capabilities, drivers and stability, so to avoid running into too much trouble it must be set by the user in the Preferences.
Nice to know that the software at least has the ability to turn on GPU use; odd that it can’t detect the different types of NVIDIA cards, RTX, GTX etc.
But at least “use GPU” can be turned on. I might actually be interested in buying SpectraLayers just because of that, so thank you for the information.
NOTE: AMD GPUs don’t use CUDA but have something similar which is in use and in active development (I think they even open-sourced it, whereas CUDA is mostly closed source).
As for CPUs, AMD or Intel, it doesn’t matter; what works on one will typically work on the other.
SpectraLayers is not bound to CUDA, and can use brands other than NVIDIA to equal effect.
Many DAW PCs do not use discrete graphics cards (until recently there was no need to), but instead use integrated graphics on the CPU, which offers little advantage in terms of ML workloads. That’s why it’s important to have CPU as the default and allow the user to decide whether to switch to GPU processing in SpectraLayers.
Whether the DAW uses the CPU or GPU or not is not really the issue. It’s whether the separate piece of software (running within the DAW or standalone) is using the GPU or not.
Any software (not just VSTs) doing any ML processing should by default detect whether a GPU is available, detect its capabilities and use it accordingly. Defaulting to CPU is fine for small, light ML models (which SpectraLayers is probably using), but it will be a lot slower than using the GPU.
Web browsers even do this: most of Chrome runs on the CPU, while video encode/decode and WebGL/WebGPU can and will use the GPU.
If I were to run the following on the CPU: ollama, Code Llama 7B, DeepSeek Coder V2 Lite, Codestral 22B, Meta Llama 3 8B, Stable Code Instruct 3B, Stable Diffusion (SD1.5, SDXL or SD3) etc., they would be next to unusable. Only the ML processing part should use the GPU, if one is available and meets the necessary requirements; other parts of the software can carry on doing their thing in whatever way they were programmed to.
It would be illogical to not use the GPU if available for ML processing tasks.
PyTorch example:
import torch
# Check if GPU is available, otherwise use CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")
# Example: move a tensor to the selected device
tensor = torch.randn(3, 3)
tensor = tensor.to(device)
print(tensor)
TensorFlow example:
import tensorflow as tf
# Check if GPU is available
physical_devices = tf.config.list_physical_devices('GPU')
if physical_devices:
    try:
        # Use the first GPU
        tf.config.experimental.set_memory_growth(physical_devices[0], True)
        print("Using GPU")
    except RuntimeError as e:
        print(e)
else:
    print("Using CPU")
# Example: create a tensor
tensor = tf.random.normal([3, 3])
print(tensor)
It’s not complicated.
I agree. When I first discovered SL11 wasn’t really making use of the GPU even though it was selected in settings, I thought it was a bug, because I’ve done some coding with CUDA and Demucs/MDX-Net etc. on some source separation projects and it seemed trivial to run them on CUDA with whatever GPU the user has; it’s mostly just a memory allocation question of how far you can push the shifts and higher-end settings…
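For what it’s worth, the usual pattern in those projects was just a try-on-GPU, fall-back-to-CPU wrapper, something like this sketch (separate_fn stands in for whatever separation call you’re making; it’s not a real Demucs API):
import torch

def run_with_fallback(separate_fn, audio, shifts=2):
    # Try the separation on the GPU first; if memory runs out (too many shifts,
    # too long a segment), retry on the CPU instead of crashing.
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    try:
        return separate_fn(audio, device=device, shifts=shifts)
    except RuntimeError:
        # At this point it's typically a CUDA out-of-memory error.
        if device.type == "cuda":
            torch.cuda.empty_cache()
            return separate_fn(audio, device=torch.device("cpu"), shifts=shifts)
        raise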
I’m sure there’s a valid reason SL11’s demixing is a CPU technology at launch and then gets patched to work on the GPU later, but it does make me wonder: why would you ever begin a new model/algorithm targeting the CPU today, especially as many SL10 users will have CUDA available anyway?
Can’t go into all the technical details, but long story short - you don’t embed the whole Python/PyTorch ecosystem in an audio production application (it takes several gigabytes, and it’s super painful to maintain across a wide variety of different systems). It’s great for R&D, but not for deployment.
For deployment, lighter systems are used (in this case, ONNX / ONNX Runtime), which are more stable and easier to maintain, and support a wider range of GPUs than Python/PyTorch, but this comes with some restrictions regarding the neural network nodes you can convert and which nodes can be GPU accelerated. This needs lots of tweaking for each model so it fits the ONNX ecosystem.
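To give a rough idea of what that looks like from the ONNX Runtime side, here is a generic sketch (the model path is a placeholder, not an actual SpectraLayers model, and this is not the actual SpectraLayers code):
import onnxruntime as ort

# Prefer the CUDA execution provider when the ONNX Runtime build and drivers
# support it, otherwise fall back to the default CPU provider.
available = ort.get_available_providers()
if "CUDAExecutionProvider" in available:
    providers = ["CUDAExecutionProvider", "CPUExecutionProvider"]
else:
    providers = ["CPUExecutionProvider"]

# "model.onnx" is a placeholder path for whatever converted model you load.
session = ort.InferenceSession("model.onnx", providers=providers)
print("Running on:", session.get_providers())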
Lots of models changed with SpectraLayers 11, and Unmix Song requires no less than 3 different AI models to reach that level of performance. Unfortunately, one of these models did not work properly when executed on the GPU; it requires more fine-tuning. When/if that remaining model works properly on the GPU, then the whole Unmix Song module will be fully GPU accelerated.
Appreciate the reply, it makes sense, especially with the preview side also a requirement/feature. Thank you for everything you do btw, it is appreciated.
… except in cases where one does not want the GPU used, because one’s particular CPU is faster.
It’s not complicated.
I’m not aware of any model capable of GPU inference that would get through it faster on a CPU than a GPU; however basic a GPU is, it is designed for massive parallel processing across many more cores than are available on any normal high-end consumer CPU.
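If anyone wants to sanity-check that on their own machine, a quick and very unscientific comparison along these lines (plain PyTorch, nothing SpectraLayers-specific) usually makes the gap obvious:
import time
import torch

def time_matmul(device, n=4096, repeats=10):
    # Crude average time for a large matrix multiply on the given device.
    x = torch.randn(n, n, device=device)
    if device.type == "cuda":
        torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(repeats):
        x @ x
    if device.type == "cuda":
        torch.cuda.synchronize()
    return (time.perf_counter() - start) / repeats

print("cpu :", time_matmul(torch.device("cpu")))
if torch.cuda.is_available():
    print("cuda:", time_matmul(torch.device("cuda")))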
This is an interesting conversation. I do have an i7 6-core (running 12 threads) with 64 GB (3200 DDR4) of RAM, and an older AMD RX570 video card. I am not having these reverb problems in SL11P that are being reported here. I can audition the DeReverb module with either and then process with either at about the same speed.
However, currently my stem separation is definitely quicker with the RX570, that is true. However, do any of you here (using Windows) think that, with all DAW software, there needs to be a repetition of a process to somewhat educate the system? For example, with the DeReverb, have all of you tried using the CPU at least 10 times with various songs, not just the same song?