Frequency selection based on custom parameters in SpectraLayers Pro 11

nottwentyseven · June 10, 2025, 5:45pm

Hello,
I would like to make a quick selection based on frequency range and level. My goal is to select everything breathy and sibilant in a vocal track and move it to a new layer. In short, I want to split the tonal and noise parts so I can de-ess manually or just tweak the tonal part.
So far I couldn’t find anything to make a selection based on custom parameters.

Thank You in advance!
Niklas

ctreitzell · June 10, 2025, 10:53pm

What do you mean by “custom parameters”?

Anyway, sorry to say this, but short of an RTM (read the manual) response:
The Frequency Range Selection tool is what you want.
Paste the selection to another layer.
Then you can run the “Unmix Levels” module, where you can set a custom level threshold/crossover.

You might want to run “unmix noisy speech” depending upon how clean your recording is

To “preview” your selections, I often use the Rectangular Selection tool

You could also use the creation of Spectral Regions to help guide your freq range selection

nottwentyseven · June 11, 2025, 6:51am

Thank you for your response.

To clarify my perspective, I have often heard SpectraLayers referred to as the Adobe Photoshop of audio, primarily due to its layered workflow. As a video editor, I frequently employ keying techniques to separate specific objects or colors from scenes. Since color is essentially frequency and gain (brightness), I primarily adjust the keyer’s input and tolerance.

However, I can define a mask by first using a luma keyer (threshold set at varying brightness levels) and subsequently adding or subtracting from this mask using a chroma keyer (threshold set at varying color/frequency) or other available keyers. From my understanding, it is also possible to define a mask selection on audio based on multiple set parameters. For instance, I could select all audio within the frequency range of 3 to 5 kHz, within a range of level from -5 to -9 dB.
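
To make the keying idea concrete, here is a minimal sketch in Python with NumPy of a mask that combines a frequency range with a level range. It is not how SpectraLayers works internally; the function names `stft` and `key_mask`, the FFT size, the hop, and the test signal are all assumptions made for illustration. The dB reference is also an assumption: levels are measured relative to the loudest bin in the spectrogram, since the thread does not say what the -5 to -9 dB range is relative to.

```python
import numpy as np

def stft(x, n_fft=1024, hop=256):
    """Hann-windowed short-time Fourier transform, shape (frames, bins)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * window for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def key_mask(spec, sr, n_fft, f_lo, f_hi, db_lo, db_hi):
    """True where a bin lies inside [f_lo, f_hi] Hz AND inside [db_lo, db_hi] dB."""
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)
    mag = np.abs(spec)
    # Assumed reference: 0 dB is the loudest bin of the whole spectrogram.
    level_db = 20.0 * np.log10(np.maximum(mag, 1e-12) / mag.max())
    in_band = (freqs >= f_lo) & (freqs <= f_hi)            # "chroma key" on frequency
    in_level = (level_db >= db_lo) & (level_db <= db_hi)   # "luma key" on level
    return in_level & in_band[np.newaxis, :]

# Hypothetical test signal: a loud 500 Hz tone plus a quieter 4 kHz tone.
sr, n_fft = 16000, 1024
t = np.arange(sr) / sr
x = 1.0 * np.sin(2 * np.pi * 500 * t) + 0.45 * np.sin(2 * np.pi * 4000 * t)

spec = stft(x, n_fft=n_fft)
mask = key_mask(spec, sr, n_fft, 3000, 5000, -9.0, -5.0)

keyed = np.where(mask, spec, 0)   # the "selected" layer
rest = spec - keyed               # everything else; the two layers sum to the original
```

Here the 4 kHz tone sits about 7 dB below the 500 Hz tone, so its bin falls inside both ranges and is the part that gets keyed, while the 500 Hz tone is rejected by the frequency range.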

Given my familiarity with programs that separate tonal and noise information from a signal (Melodyne, iZotope RX, SpectraLayers, etc.), I believe there could be a similar approach for audio work. Separating the desirable portion of a vocal from the sibilant noise into two distinct layers would enable not only individual adjustments in level but also the application of compressors or other effects.

While I am capable of manual de-essing on a few vocal tracks, when dealing with lengthy dialogues or even podcasts, it is advantageous to have a „detect sibilance tool“ and two separate layers of sounds to work with. This approach provides a quick means of identifying and attenuating problematic audio segments.

I will implement the method you suggested and keep this thread updated.

Sunnyman · June 11, 2025, 7:05am

This is an interesting point you make: SpectraLayers ↔ Photoshop. I made this analogy myself some time ago. But there might be a catch that does not seem obvious at first. When working with pictures (say black and white, to make it easier), the x- and y-coordinates are of the same kind, while in sound the x-coordinate is time and the y-coordinate is frequency (the intensity is the brightness). And the nature of the FFT, the mathematics behind it, might make things a bit more complicated. It’s like a partial differential equation with variables of different kinds, like space and time, not space alone.

So, working in and on the spectrum might not be exactly like working on a picture in Photoshop.
Working in the time domain seems much easier than working in the spectral domain, let alone both on the same object (like a selection in x and y). For working in the time domain, one uses a well-established program like WaveLab or Audacity or any other wave editor. Spectral editing is a bit more complicated.

nottwentyseven · June 11, 2025, 7:25am

Indeed, your observation is entirely valid.

In accordance with the method suggested by @ctreitzell, I proceeded as follows:

Initially, I conducted a broad frequency-based separation. (Splitting the audio into two frequency layers)

[Screenshot: Bildschirmfoto 2025-06-11 um 09.19.24]
Subsequently, I refined the selection process by utilizing the „Unmix Levels“ module

[Screenshot: Bildschirmfoto 2025-06-11 um 09.20.21]
Finally, I obtained a distinct layer containing highly resonant high-frequency components, which closely resembled the sibilance I intended to separate

[Screenshot: Bildschirmfoto 2025-06-11 um 09.20.34]

Upon further processing of the individual layers, I encountered the typical artifacts, such as smearing and frequency holes. These artifacts resemble those produced by resonance suppressors, such as soothe2.

Perhaps adjusting the FFT size or threshold type in the “Unmix Levels” module can mitigate the impact of artifacts, ultimately bringing me closer to achieving my “advanced de-esser”.

Again, Melodyne’s ability to separate the tonal and noise parts of audio without any noticeable artifacts is truly impressive, and it lets me attenuate just the nasty part.

Therefore, I have two options: either attempt mixing my processed layer in parallel or conclude that manually clipping the vocal sounds may be more natural than solely attenuating a specific resonant part of the spectrum.

nottwentyseven · June 11, 2025, 7:59am

Here’s an update: changing the FFT to a broader setting helped reduce artifacts, but setting it to the finest setting actually made them worse.

Comparing these settings: The fine FFT setting sounds much like soothe2 on a very aggressive and precise setting, while the broad FFT setting sounds more smooth like a dynamic EQ that reduces the resonances with a more natural sound.
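
One plausible reason the FFT setting changes the character of the artifacts is the usual time/frequency trade-off: a larger FFT gives narrower frequency bins but longer analysis frames. The arithmetic can be sketched as below. The 48 kHz sample rate and the helper name `fft_resolution` are assumptions, and the thread does not say which FFT sizes SpectraLayers labels as "fine" or "broad".

```python
def fft_resolution(fft_size, sample_rate):
    """Return (bin width in Hz, frame length in ms) for one analysis frame."""
    bin_hz = sample_rate / fft_size
    frame_ms = 1000.0 * fft_size / sample_rate
    return bin_hz, frame_ms

# Assumed sample rate of 48 kHz; the sizes are just common powers of two.
for n in (512, 2048, 8192):
    bin_hz, frame_ms = fft_resolution(n, 48000)
    print(f"FFT {n:5d}: {bin_hz:6.2f} Hz per bin, {frame_ms:6.1f} ms per frame")
```

At 48 kHz, a 2048-point FFT works out to about 23.4 Hz per bin and about 42.7 ms per frame, so any edit made on one bin is spread over a frame of that length in time.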

Sunnyman · June 11, 2025, 8:23am

It would be nice if SL explicitly showed which operations depend on the FFT size and which do not.
This could be done with a different text color or a tooltip, for instance. It should be optional via the settings.

ctreitzell · June 11, 2025, 2:12pm

For 95% of my work I set FFT to 2048

@nottwentyseven again, I have said this over and over and over on this forum:
as @Phil_Pendlebury says in his tutorials: “with SL, you are best served with a “re-balancing” of levels”. At this time, we aren’t getting crystal clear solo stems from recordings…maybe someday…maybe soon…but not currently. Nevertheless, my goodness can we do a lot with SL that wasn’t possible before.

When unmixed sections of the program material are completely removed or muted, there will be artefacts most of the time…sometimes you can get away with it, but usually not. Thus, leave the removed sections in but decrease the level…you might have to go so far as to write Envelopes (I certainly have done a lot of envelope editing in SL).
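
The re-balancing idea can be sketched numerically, assuming the two layers sum back to the original: instead of muting the unwanted layer, mix it back in at a reduced level. This is only an illustration in Python with NumPy, not a SpectraLayers API; `rebalance` and the sample values are hypothetical.

```python
import numpy as np

def rebalance(kept, removed, reduction_db):
    """Mix the 'removed' layer back in, turned down by reduction_db decibels.

    reduction_db may be a single number or a per-sample array (an envelope).
    """
    gain = 10.0 ** (-np.asarray(reduction_db, dtype=float) / 20.0)
    return kept + gain * removed

# Hypothetical layers that sum to the original signal.
kept = np.array([0.5, -0.25, 0.1, 0.3])
removed = np.array([0.2, 0.2, -0.4, 0.0])

flat = rebalance(kept, removed, 6.0)                       # constant 6 dB reduction
shaped = rebalance(kept, removed, [0.0, 6.0, 12.0, 0.0])   # envelope-style reduction
```

A reduction of 0 dB returns the original mix unchanged, which is what makes this approach easy to back out of.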

In my workflow for dialog NR, I run a DeClicker and de-ess before I put any audio into SL; that’s just my idea of most effective chain of events in my work. Then use SL for things other software is not capable of…like mixing transitions where a comp of VO might be so tight that a consonant is happening at the same time as a vowel. The consonant will be HF and the vowel LF and SL is wonderful for these types of things.

I actually think applying visual workflow to audio is a distraction, but that’s me…I’m an audio focused person.

All this said, music and dialogue are different mediums, so there is that to consider as well.

nottwentyseven · June 11, 2025, 8:41pm

Thanks a bunch for your input! @ctreitzell

I’ve been in the audio post-production field for over 10 years myself, and I don’t think audio and video are the same. As I mentioned in my first post, it’s just an analogy I’ve heard a few times. I’ve worked with spectral audio editing for a couple of years using iZotope RX, but the idea of separating audio into different layers like SpectraLayers does is still new to me. I’m mostly curious about how these processes can be non-destructive.

Do you have any other suggestions for a non-destructive way to work with de-essing on long audio tracks? I know about manual clip gain and fancy de-esser plugins, but I’m looking for a way to separate the sibilance from the rest of the audio to get a better overview of really long voice-overs, for example. I found the SpectraLayers De-esser module to be quite limited. Also, being able to listen to the delta signal while previewing a module would be great (only hearing what’s being attenuated by the de-essing, since it is a destructive change in audio).
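
For reference, the delta signal mentioned here is simply the original minus the processed audio, as long as both are sample-aligned. A minimal sketch in Python with NumPy follows; the function name and the sample values are hypothetical, and the "processing" is just a stand-in that halves two samples.

```python
import numpy as np

def delta_signal(original, processed):
    """What the processing took away: original minus processed, sample by sample."""
    return original - processed

# Hypothetical audio: the stand-in "de-esser" halves samples 2 and 3 only.
original = np.array([0.1, 0.2, 0.8, -0.6, 0.05])
processed = original.copy()
processed[2:4] *= 0.5

delta = delta_signal(original, processed)
```

The delta is zero wherever the processing left the audio untouched, and adding it back to the processed signal restores the original.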

Regarding WaveLab Pro, I really like their Rainbow waveform feature. It’s great for setting custom colors for different frequency ranges, and it makes it easy to see sibilance in voice tracks that run 2+ hours.

Thanks again!

ctreitzell · June 12, 2025, 9:20am

I’ve explained quite a bit on this forum my workflow for non-destructive editing in SL. I create an unwanted noise layer and mute that…I always use the same color scheme

What I find missing in SL for my workflow is some way to track and then be able to undo clone stamp editing when covering audio that has been removed to the unwanted realm. Clearly some specific history of certain tools would create quite bloated files.

@Sunnyman has previously asked for the rainbow WF from WaveLab IIRC
I have not used WaveLab too much…several years ago