Allow Cubase to export 44.1 kHz projects at 96 kHz FOR REAL

Just for the record, in case anybody is still scratching their heads, the 4 clicks you refer to are the ones mentioned here, right?

1 Like

Well, I guess it's time to contribute to this discussion again. I have read every comment carefully.

So here it goes, sorry for the long reply:

  • I think we can all agree that the average customer is unable to appreciate the difference between a standard recording and a high-quality recording, so I think it is unnecessary, and does not contribute to the discussion itself, to re-use that argument over and over again.

  • As I put it in a more personal comment, I wasn't talking about how customers would react to a difference in quality; I was talking about FEELING THAT YOU HAVE DONE EVERYTHING POSSIBLE TO GET THE MAXIMUM OUT OF A PRODUCTION. As a personal motivation, I always fight for the best quality; we all do. For that reason we record numerous takes of vocals and other instruments, just to get the best out of them.

  • In digital audio, a higher sample frequency has multiple impacts:

  1. more high-frequency content (the most obvious point for everyone)

  2. more precision on what has been digitalized, as we have more samples and therefore more information. Compare it with the pixels of a JPG: more pixels = more information.

  3. Point 2 also leads to the following: information that isn't there cannot be considered in the processes that follow. Hence, we have a LOSS OF INFORMATION. Yes, 44.1 kHz may be close enough, but I still want the maximum quality possible.

  4. audio processes ALWAYS modify the audio signal, but at higher sample frequencies the results of these (undesired) modifications have LESS IMPACT, so here we do get an IMPROVEMENT IN AUDIO QUALITY. This is actually one of the main reasons why HD production and 64-bit audio processing became relevant at some point.

  5. Following on from point 4, audio processes are indeed cumulative, so applying x processes to a 44.1 kHz file will yield a worse result than applying the same processes at 96 kHz. This is definitely one of the major points in terms of audio quality (see the sketch after this list).

  6. The input (anti-aliasing) filter and the reconstruction filter are usually not brickwall filters but have a slope, as gentler filters introduce less coloration to the sound. This means that we actually lose high frequencies from the moment we record at 44.1 kHz: the frequency range isn't represented linearly up to 22 kHz, but starts rolling off well before that frequency. Recording/producing at higher sample frequencies therefore moves the cutoff points of both filters towards higher frequencies, which represents another IMPROVEMENT IN AUDIO QUALITY.

  7. The argument "You cannot hear the end result when exporting at a higher sample rate" is firstly relative, and secondly a problem audio producers have faced since the beginning of recording. When recording to tape, we inevitably had tape hiss on the recording, which meant I could not amplify the signal without making the noise audible; a limitation that was always there. Nevertheless, this limitation is theoretically gone when recording at 24 or 32 bit. So, with today's technologies, who would intentionally simulate tape dropouts, unaligned record or play heads, tape distortion/saturation, or background noise? If you can avoid that, YOU RECORD WITHOUT THOSE ERRORS, and, if desired, you apply them in the mixing process. But you should always try to reach for the maximum/best result, and this begins with the recording.

  8. If a plugin is incapable of handling a change of sample frequency, or applies undesired changes, then that is simply a programming error. I say this from my perspective as a software tester, which is what I do for a living. So this is a technical question which, when unresolved, becomes a quality issue. We shouldn't simply accept that a plugin throws out a (drastically) different result.

  9. We have a resurgence of old audio distribution formats. Vinyl records are selling more and more, and can have a frequency range of up to 30 kHz, so we have more room for high frequencies. But converting my stereo file for vinyl cutting brings other limitations, such as the amount of bass, the stereo width and others, and I can't hear those limitations when exporting my audio file from my DAW. The same goes for selling music on cassette: we are unable to anticipate with certainty how it will sound on tape. And the same applies when creating an MP3; just hearing the result as an MP3 wouldn't make me export the mix as an MP3, nor cut high frequencies that are eliminated anyway.

  10. There is also a market for HD recordings.
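As an aside on points 4 and 5, here is a minimal numpy/scipy sketch (my own illustration, not anything from Cubase or from this thread) of how a non-linear process can behave differently depending on the rate it runs at: the same tanh "saturation" is applied to a 15 kHz tone, once directly at 44.1 kHz and once at 4x the rate before downsampling back, and the energy that folds back into the audible band is compared. The test frequency, drive amount and measurement band are arbitrary choices for the demo.

```python
# Toy sketch: aliasing from a non-linear process at 44.1 kHz vs. 4x oversampled.
import numpy as np
from scipy.signal import resample_poly

fs = 44100
t = np.arange(fs) / fs                    # 1 second of audio, 1 Hz FFT bins
x = np.sin(2 * np.pi * 15000 * t)         # 15 kHz test tone

def aliased_energy_db(y, fs):
    """Energy below 14 kHz, where only fold-back products can land
    for a 15 kHz tone and its harmonics."""
    spec = np.abs(np.fft.rfft(y)) / len(y)
    freqs = np.fft.rfftfreq(len(y), 1 / fs)
    band = (freqs > 20) & (freqs < 14000)
    return 10 * np.log10(np.sum(spec[band] ** 2) + 1e-20)

direct = np.tanh(4 * x)                                   # saturate at 44.1 kHz
up = resample_poly(x, 4, 1)                               # 4x oversample
oversampled = resample_poly(np.tanh(4 * up), 1, 4)        # saturate, then back down

print("aliased energy, direct 44.1 kHz :", round(aliased_energy_db(direct, fs), 1), "dB")
print("aliased energy, 4x oversampled  :", round(aliased_energy_db(oversampled, fs), 1), "dB")
```

Running it, the directly processed version reports substantially more aliased energy in the audible band than the oversampled one, which is the kind of process-dependent difference being argued about here.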

I am really grateful for any input. Also, I did not expect so much discussion on this topic.
Keep it coming.

2 Likes

Comparing it with a JPG etc. is a bad analogy. And the “more information” you end up with is completely wasted if you’ve band-limited the signal beforehand, as you should do according to the theorem. So really you’re just back to your first point, which is that you describe higher frequencies with a higher sample rate if you also move the cutoff frequency up. If you don’t, then you gain no more valuable information. In this case more precision = higher frequencies, not better-described frequencies.

Also, there are numerous plugins that use oversampling to overcome the issues you bring up. So a case can be made for simply using better sounding oversampling plugins.

Well, as I put it in my last post, this is not about information being wasted or not wasted. This is about higher accuracy when processing the audio.

Also, the argument that I should change the plugin is beside the point. The whole idea is not to have to change the workflow and still get the best result. Swapping one plugin for another is simply unacceptable; you wouldn't simply swap one synth for another, because they sound different.
You usually don't get a 100% equivalent plugin with oversampling just like that.
And as an analogy: I certainly wouldn't ask a painter to use a different colour or a brush from a different manufacturer, because for the painter it's not only the colour, but also the texture of the paint.

I hope this helps to make my point clearer.

It’s not easy to know what people mean when just reading text. You wrote “more precision on what has been digitalized”, not “when processing the digitalized audio”. In terms of understanding digital audio those are two very different things. “Digitalize”, to many people, I think, implies conversion, meaning that the recorded/digitalized/converted audio has more precision, whereas “processing” does not imply that.

1 Like

I can’t agree with this sentiment, and I bring it up because the whole feature request is in some ways based on the sentiment that higher fidelity should be prioritized at any cost.
In a real-world scenario, recording studios, production houses, engineers, composers… all work with time and budget constraints. Every business looking to be profitable needs to consider its return on investment.
This is why no mix engineer is going to spend an extra 8 hours on that snare drum track for an overall 1% increase in quality. It simply is not worth it. You assign your resources to successfully complete the project. No more, no less. (In most cases.)

This is an incorrect analogy. A higher sampling rate does not improve any qualities of lower-frequency content (already explained very eloquently by @MattiasNYC, @RichardTownsend, et al.).

1 Like

Of course you are free to disagree with whatever I write and publish here, but we are going around in circles. I think it is necessary to explain (one last time) what this feature request is about, and what it is not about.

What it is about:

  • When my DAW gives me the option to export at a higher sample rate, then of course I want the higher/broader frequency response. What else would be the purpose of that option? It's useless to have a 96 kHz file with frequency content only up to 22 kHz (and not even flat up to there). So if I work at 44.1 kHz (because I can use more instruments, for example), when I export to 96 kHz I want that audio to contain frequencies up to the new Nyquist frequency (half the export sample rate). The technical implications have already been mentioned, as has the fact that there are DAWs which actually do support this. Also, audio processes are more exact at higher sample frequencies, and it is necessary to remember that every alteration of an audio file automatically implies a (minimal) quality loss; this loss is simply smaller at higher sample frequencies. Please read about the current situation when exporting audio in Cubase/Nuendo at the very beginning of this thread.

What it is not about:

  • How the Nyquist theorem works… I guess we all know enough about that.
  • Whether or not the client notices the difference. This is not about a client, but about me.
  • Whether it is worth the effort; I have already decided that for myself.
  • Whether or not I spend 8 hours on a snare… if I decide to do so, I will do it (and I already have done it, not only on snares… and yes, there are others who would also do it for the sake of personal perfection and satisfaction). And you will too, if the client demands it and is willing to pay for it.
  • Whether it's a business or not… I am well aware of cost and time limits, and exactly because of that I built my own studio (as many of us here have), so this is simply not the point either.

I am talking about what is technically possible.

The feature request is about being able, WHEN PRODUCING WITH VIRTUAL INSTRUMENTS, to work at 44.1 kHz but to export at a higher sample frequency. There are sample-based instruments with 96 kHz samples, so of course I want to take advantage of that, but if I work at 96 kHz instead of 44.1 from the very first draft, then I will only be able to use less than half as many instruments, because a higher sample frequency obviously means more CPU load.

The benefits of higher sample frequencies lie not only in a gain of higher frequencies, but in the PRECISION of the software algorithms that create and process this audio, and the GAIN IN QUALITY that comes with it. This has also been discussed, and it is what I want to take advantage of too.

There are probably (at least) two groups of users: those who care about every detail of their production, and those who don't. I am in the first group; a commercial studio is by nature usually in the second, as time is money. I respect both positions. Whatever makes anyone happy…

But… as with every long conversation, the thread gets too long and the essence gets lost. So, as long as Steinberg doesn't implement this function, I will have to adjust my workflow… which is what I wanted to optimize. Thinking about how to resolve a technological challenge distracts from the creative process, and for me, that is what this is about.

As a special comment to @MattiasNYC:
Sorry if I haven't been clear enough in my explanation. But I am glad to have been able to clarify.

A special THANKS goes to @cparmerlee, @fese and @Adonde.

Idk. I hear a big difference when I mix at 96k with plug-ins which generate saturation and other harmonics. That’s totally different than listening to music at 96k which is more debatable. Anyway just saying that being able to render at 96k does have advantages especially if you use lots of saturation.

That’s an interesting observation. I wonder if we are all saying mostly the same thing but getting stuck on terminology. I believe what you say about being able to discern saturation/harmonic effects at 96 vs 48, whereas it may not be so easy to discern a difference with “regular” music.

In this thread, the term “frequency” has been used often, and I have suggested the word “complexity” instead. In your example, I hope we can all agree that few human adults can hear above 18 kHz – maybe dogs can manage 25 kHz, but humans can’t. Therefore, if we were talking about “frequency” LITERALLY, your observation would be mathematically impossible, as 48 kHz can represent a “simple sine wave frequency” way above human hearing.

So if your observation is true, and I don’t doubt it, we are talking about effects that are not literally “frequency”. That’s why I mention the word “complexity”, meaning that when we mix a lot of sounds together we have a complex wave, not a simple sine curve. And when we stack dozens of these mixing/processing operations, we can get cumulative errors… Or, if you don’t like the word “error”, how about “cumulative differences” in the results when done at different precision levels.
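Taking “different precision levels” literally as numeric precision, here is a toy numpy sketch (my own, not from this thread) of how small per-step differences accumulate when the same chain of operations is run in 32-bit and 64-bit floating point. The gain/soft-clip chain and the 100 repetitions are arbitrary choices for the demo.

```python
# Toy sketch: the same 100-step processing chain run in float32 and float64,
# then compared. Each step is just a small gain change plus a soft clip; the
# only point is that tiny per-step differences accumulate over many steps.
import numpy as np

rng = np.random.default_rng(0)
audio64 = rng.standard_normal(48000) * 0.1     # 1 second of noise at 48 kHz
audio32 = audio64.astype(np.float32)

for _ in range(100):                           # 100 stacked "plugins"
    audio64 = np.tanh(audio64 * 1.01) * 0.99
    audio32 = np.tanh(audio32 * np.float32(1.01)) * np.float32(0.99)

print("peak cumulative difference:", np.max(np.abs(audio64 - audio32)))
```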

I think much of the conventional wisdom here is wrong (or at least a little off track), mainly because it was learned in an era before it was normal to layer 15 effects on a guitar track.

Perhaps it is a semantic issue as you mention? :slight_smile: But I’m mostly referring to the audible aliasing that happens on some plugins which generate harmonics. Quick read about this here: https://www.newfangledaudio.com/post/introducing-saturate-1-10

But actually the worst offender was the Steinberg Mystic (or Spectre) VSTs which sounded way different after I switched my project from 96k to 48k. Almost like there was some lo-fi effect on them.

FWIW, 96k audio is super useful for sound design as it stays brighter during extreme time stretching and other intense processing, precisely because its ultrasonic frequency content can be shifted into the audible range. Of course it is definitely a waste of CPU in some situations. But not all. :slight_smile:

1 Like

I can well imagine that could make a big difference. And once again, to the extent that most of our conventional wisdom comes from the tape & analog console days, time stretching just wasn’t a thing then – certainly not a thing where people expected to maintain fidelity.

Once again, that points this thread toward special cases and away from the more mainstream usage. But I would argue the special cases have a way of becoming mainstream as time passes. For example few people would have done vocal time alignment (which may involve a lot of stretching) a decade ago. But it is pretty common today.

Again, a complex waveform can be reduced to just a bunch of sine waves at different frequencies and amplitudes. You’re right, it’s not “a simple sine wave”, it’s many simple sine waves. That’s all that complexity is.

Neither technology nor physics nor math has any clue whether a complex signal is the result of one distorted tuba or 15 effects on a guitar. It doesn’t matter what the signal is, it’s just a bunch of sine waves.

I’m not saying there aren’t benefits to processing audio at a higher sample rate, just that complex waveforms are the same as sine waves added together.

Right. But then they are no longer sine waves. That’s the point. Look at them on an oscilloscope.

Any signal can be broken down into some number of sine waves with known frequency, phase, and amplitude. That is one complete basis. You can express any possible time-varying signal in these terms. These terms are kind of useful, because our ears quite literally (physically) are made to detect frequency and amplitude, with some additional capability of relative phase, mainly between the two ears, for stereo.

There exist other bases. Wavelets are one, and the sequence of lifted Haar cascades is even somewhat tractable/useful – but those aren’t something your ears can detect directly, because your ears detect frequency and amplitude. And transforms exist that go between representation A and representation B, with full preservation of precision.
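For reference (my own addition, not quoted from anyone here), the decomposition being described is just the Fourier representation: any finite-energy signal can be written as a sum/integral of sinusoids, each with its own amplitude and phase.

```latex
% Fourier transform pair: x(t) is fully described by the amplitudes |X(f)|
% and phases \arg X(f) of its sinusoidal components.
x(t) = \int_{-\infty}^{\infty} X(f)\, e^{\,j 2\pi f t}\, df,
\qquad
X(f) = \int_{-\infty}^{\infty} x(t)\, e^{-j 2\pi f t}\, dt
```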

What someone most likely hears, when a plugin sounds different at 96 kHz than at 48 kHz, is “bugs,” or more charitably expressed, “design limitations.”

There are three kinds of such design limitations that come up with some regularity:

  1. A plugin may be tuned for a particular sampling frequency, such that when the knob says “1 kHz,” it really means “48 samples.” When you run it at a different frequency, 48 samples means something else, but the knob may still say “1 kHz.” Same thing for time delays, and a few other time-varying parameters (feedback, etc.) I would classify this almost squarely in the “bug” category, but it’s still something that happens often enough that you may notice.

  2. A plugin, especially distortion, wave shaping, and other non-linear plugins, will generate additional harmonics (frequencies.) That’s in fact the whole point of distortion/saturation/“fattening” plugins! What’s problematic is when those frequencies are generated above the Nyquist frequency you’re processing at. At that point, those higher frequencies will be folded back below Nyquist, as aliasing. If you sweep a “perfectly sharp” square wave in frequency, you can easily hear this aliasing as squishy noises that counter-sweep in a different direction. (And this is why “perfectly sharp” square waves don’t actually sound good.)
    A plugin that runs at 96 kHz instead of 48 kHz will generate much less aliasing – not just half as much, but frequently much, much less, depending on how fast the generated harmonics fall off. Thus, once you get back down to the audible range, there’s not a lot of aliasing left for the ear to hear.
    This is also why we use anti-aliasing filters in analog recording gear, and brick wall reconstruction filters even in oversampling D/A converters.

  3. Some plugins will be careful to anti-alias-filter before any frequency hits Nyquist, but if you run at 48 kHz (Nyquist at 24) and want to let through signals up to 20 kHz, you only have 4 kHz – one-quarter of an octave at that point – to cut out 60+ decibels, and most filters that can even get close instead generate noticeable phase distortion, or pretty long additional latency.
    Meanwhile, if you run at 96 kHz, you have the entire range from 20 kHz up to the Nyquist of 48 kHz (and then folded back down again!) to cut out audible overtones. That’s nearly two octaves’ worth!
    Thus, the effect of anti-aliasing filters, even when the plugin correctly scales parameters, and correctly applies anti-aliasing, can be noticeable, and is many times more drastic with 48 kHz sampling rate than 96 kHz sampling rate.
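To put rough numbers on point 3 (my own sketch with arbitrary specs, not taken from the post above): a linear-phase FIR lowpass designed with a Kaiser window needs far fewer taps, and adds far less latency, when the transition band runs from 20 kHz up to Nyquist at 96 kHz than when it has to do the same job at 48 kHz.

```python
# Sketch: FIR length needed for ~60 dB stopband attenuation with a transition
# band from 20 kHz up to Nyquist, at 48 kHz vs. 96 kHz sampling rates.
from scipy.signal import kaiserord

def taps_and_latency(fs, passband_hz=20_000.0, atten_db=60.0):
    nyquist = fs / 2.0
    width = (nyquist - passband_hz) / nyquist      # transition width as a fraction of Nyquist
    numtaps, _beta = kaiserord(atten_db, width)    # Kaiser-window design estimate
    latency_ms = (numtaps / 2) / fs * 1000.0       # group delay of a linear-phase FIR
    return numtaps, latency_ms

for fs in (48_000, 96_000):
    numtaps, latency_ms = taps_and_latency(fs)
    print(f"{fs} Hz: about {numtaps} taps, about {latency_ms:.2f} ms of latency")
```

The exact numbers depend on the design method, but the trend – a much gentler filtering job at the higher rate – is the point being made above.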

Thus, the desire to record and mix at 96 kHz can be totally legitimate and rational, even though every signal is fully described by frequency plus amplitude, and even though the ears cannot hear frequencies above 20 kHz (or, in the case of most modern humans, less than that.)
But, again, because the sound changes when you do that, you really don’t want to be mixing/monitoring at 48 kHz, and then re-instantiate all plugins at 96 kHz just for rendering something you’ll send off – whatever gets rendered, won’t be what you heard.
Thus, it’s actually the conservative, safe, and sane thing to do, to require changing the project sampling rate to something new if you want to render to a new sampling rate. Like, even if the Steinberg/Cubase developer work was literally free and guaranteed to be without bugs, I would not want the export dialog to render at a frequency other than the project frequency. That would be a bad thing.

2 Likes

I don’t really know how to convince you of this - but the conventional wisdom is mathematically proven to be true! It’s not even physics, it’s maths.

Any signal really can be represented by the sum of sine waves of different frequencies.

That’s useful, thanks.

Then look at it ‘backwards’: Ask yourself how you could possibly both capture and reconstruct a sine wave at 4 kHz with a sample rate of 48 kHz. How could that possibly work? Why don’t you end up with a different output? Why don’t you see a complex waveform on your oscilloscope when measuring the output?

Or if you add 500 Hz and 1200 Hz to a 100 Hz sine wave and record that at 48 kHz - why would that be stored correctly and then output correctly, verified with an oscilloscope, if ‘anything is possible’ because ‘complex waveforms’?

The sampling theorem accounts for all of this using math. It works. It doesn’t stop working because you think the waveform is squiggly rather than a smooth sine wave. The system itself has no awareness of whether the signal is a sine wave, three sine waves summed, a square wave, a trumpet or an elephant farting - it just does what it does with no regard to just what the waveform is.

Why would that work with some simple signals (sine waves) and some complex signals, but (supposedly) not with others?
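As a quick check of the 100/500/1200 Hz example above (my own sketch, not posted by anyone in the thread): sample the summed signal at 48 kHz, take an FFT, and the only components present are exactly the three that went in, even though the time-domain waveform looks ‘complex’.

```python
# Quick check: a "complex" waveform made of three sines, sampled at 48 kHz,
# contains exactly those three frequency components and nothing else.
import numpy as np

fs = 48_000
t = np.arange(fs) / fs                           # 1 second -> 1 Hz bin spacing
x = (np.sin(2 * np.pi * 100 * t)
     + np.sin(2 * np.pi * 500 * t)
     + np.sin(2 * np.pi * 1200 * t))

spectrum = np.abs(np.fft.rfft(x)) / (fs / 2)     # a full-scale sine reads ~1.0
freqs = np.fft.rfftfreq(fs, 1 / fs)
print(freqs[spectrum > 0.5])                     # -> [ 100.  500. 1200.]
```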

You should probably just read the math involved in the theorem and study the technology used at this point if you don’t agree with this basic concept of ‘modern digital’.

PS: You could also maybe consider that we’ve had digital audio based on this theorem for decades by now, so you’d maybe think that those involved in both theory and practice would have agreed with you at this point - because you can’t both be right. So it seems a bit… “unlikely” that you would be right, and everybody else doing this for a living would be wrong.

1 Like

But everyone was against Copernicus! And jet fuel can’t melt steel beams! And my cell phone reception is so much better after I got the vaccine! And math can’t describe what I feel in my heart (which is totally an organ that can perceive sound!)

Also, I have some $1000/meter speaker cables for sale. They look really nifty, and therefore will make your music listening experience more enjoyable. (You don’t even need to plug them in to get the perception benefit!)

… alright, it’s probably time for me to stop contributing to this thread … :wink:

2 Likes

I am not debating whether or not a complex waveform can theoretically be decomposed into 100 or 1000 constituent sine waves, or even whether they “perfectly” match the original real-world complex waveform.

I am simply saying that is irrelevant. It is like claiming the world is loaded with fractals. Fine, but what can you do with that information? Not much.

In the digital realm, we deal with sampled sound, not sine waves. Does this look like a sine wave to you?
[image: waveform screenshot labelled “Not Sine”]
That is what today’s software deals with – complex waves, not some theory about breaking them down into individual sine waves. Now, perhaps there are some VSTs that indeed do “reverse engineer” the underlying sine waves. That’s fine. I don’t care one way or another. I simply observe that in a world where the processing begins and ends with high granularity samples of ever-changing wave shapes, the accuracy of that sampling surely can be a significant factor.

You simply don’t understand this.

We’re not saying that there’s some processing going on that literally deconstructs complex waveforms into sine waves; we’re saying that because that is a possible way of describing a complex waveform, the process that is employed works. The nature of complex waveforms is what makes this work - not actual deconstruction. If you don’t understand that difference, and keep saying that because our waveforms are complex it doesn’t matter that they can be described as sine waves, then I really don’t know what else to say.

Read the theorem. Like, literally read the actual theorem and related theory.