Sound Separation

Jessica_Johnson · July 2, 2023, 4:18pm

I have a question. I know that when there are two channels you use the unmix tool, but I see that it does not separate the sound if both channels contain the same sounds. The same could apply to a single channel. Select similar does not really do a good job of this. Would there be a way to use noise reduction to pick a sound and split it into another layer without removing the other layer? I don’t want to remove the sounds just yet; I just want to choose which sounds I want to separate. It would be great if noise reduction, or these other tools, offered an option to split into a new layer in addition to options such as signal and reduce noise. Could this happen? If noise reduction can already output either the signal or the noise, can this easily be added?

Unmixing · July 3, 2023, 12:19pm

I’ve noticed a couple of posts about this which leads me to believe that either users are not aware of how select similar functions or are using it incorrectly or are not aware of tips and techniques to use when using select similar. Can you post an example of select similar not working for you so I can understand what you are experiencing (you can send me a message privately). I would prefer you send me the project file with selections saved that way I can load the project file and apply the selections (that you attempted to use for select similar) and understand what you are experiencing when you say that select similar doesn’t work.

I would be more than happy to do a tutorial demonstrating how powerful select similar is.

Sam_Hocking · July 3, 2023, 1:47pm

You can duplicate a layer and use a selection as your noise profile and denoise it out of your layer, then phase invert and merge that denoised layer into the original layer which will leave your demixed noise on its own. Really depends on the type of audio how well that works. Select Similar can work but not always as you’ve found so then it’s a manual selection task through the length of what you want separated.
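For anyone who wants to see the arithmetic behind the duplicate / denoise / phase-invert / merge trick, here is a minimal NumPy sketch. It is not how SpectraLayers does it internally. The function name `extract_removed_layer` and the synthetic "voice" and "hum" tones are made up for illustration, and the "denoised" layer is faked with an ideal result, so real material will leave a less clean residual.

```python
import numpy as np

def extract_removed_layer(original: np.ndarray, denoised: np.ndarray) -> np.ndarray:
    """Return what the denoiser took out: original + phase-inverted denoised copy."""
    if original.shape != denoised.shape:
        raise ValueError("layers must be sample-aligned and have the same shape")
    inverted = -denoised        # phase inversion flips the sign of every sample
    return original + inverted  # summing the layers cancels everything they share

# Synthetic stand-ins (assumptions for illustration only).
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
voice = 0.5 * np.sin(2 * np.pi * 220 * t)  # content we want to keep
hum = 0.1 * np.sin(2 * np.pi * 60 * t)     # content the denoiser targets
original = voice + hum

# Pretend the denoiser was perfect and left only the voice.
denoised = voice.copy()

removed = extract_removed_layer(original, denoised)
```

Because the two resulting layers sum back to the original, nothing is lost: `removed + denoised` reproduces `original`, which is the "split into a new layer" behaviour asked about in the first post.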

Unmixing · July 3, 2023, 5:43pm

Maybe I’m the one that is missing something here. Maybe I’m not understanding something. Maybe I am not comprehending something because users keep saying that select similar doesn’t always work and I never ran into a situation where select similar has failed me (if anything, select similar tends to over deliver).

Is there a project file or an audio example that you (or someone here) can post where select similar doesn’t work or has failed? I would like to at least come to some sort of understanding of why it is failing for everybody and offer help (tips/techniques) if I can.

Sam_Hocking · July 3, 2023, 7:01pm

Any low-mix-level, frequency-bound pattern moving in and out of the stereo field will not match, because it’s a repeating pattern of varying tonal, transient and power content. Selecting only similar content also might not be what is wanted. As you see in all the videos covering its use, they’re always using easy audio examples with plenty of space and high tonal and transient difference from the surrounding audio, so select similar easily works in those circumstances. Take any audio where you can only just hear what you want to select, and select similar will either not work or you’ll have to lower the similarity window to the point where it’s actually selecting dissimilar sound.

Unmixing · July 4, 2023, 1:39am

I have just tested some audio examples and every single audio I tested with select similar works. Even with bad stereo field.

I must be missing something or not understanding what other members are talking about because select similar is functioning properly for me and I cannot find an example where it failed me. I even attempted to use extreme situations and it always worked.

I wish someone would post an example of select similar not working so I can get an insight into exactly why it doesn’t seem to work for them and help. The best way to demonstrate that select similar doesn’t work is to post a SpectraLayers project file, or send it to me, with the selections you are using for select similar saved inside it. That way I can open the project file, apply the saved selections and try the select similar process to gain insight into why it is failing for some users.

Jessica_Johnson · July 4, 2023, 8:57am

So, I am actually not working on music entirely. I am working on audio of my child being abused in his PreK special education class. One of the issues is that the music is too loud to hear everything they are saying. It is Mozart and a few other classical pieces. I have been pulling each frequency out individually because I am afraid that if I use remove, I may remove a whisper in the process. I have been working on taking out static and humming too. I just got done reviewing Phonexia, which is artificial intelligence software for law enforcement, government, and military. I have a degree in criminology, so I wrote a paper in return for being allowed to test their software. The police didn’t really help me, as they said they could not take audio as evidence. They, DCS, and the school failed my child. He is 6 years old with PTSD. It has been unimaginable. That is a whole other story, so I started learning how to do audio forensics myself. I have tested any program I can get my hands on. I know this copy can’t be used in court anyway. The police have the original, so when I do finally share it with the news, and I plan to, if anyone disagrees with what is said on the recording, they can have my metadata/my hash of it, and they can send the original to the TBI, which is our state law enforcement, to review. So far I have found SpectraLayers to be much better at pulling out sounds, as I can do it manually. When I select a sound, like the same note in a song, using the similar tool, it is hit or miss. It has honestly been faster for me to just pull it out myself. It is exhausting because I have to listen over and over to the recording of my child being hurt. I placed the audio recorder in his leg braces, so it does have feedback. I have been working on this for a few months now. I had to stop working to home school him full time, because I will never trust them again. I do the audio when my son is at therapy away from the house.
I don’t want him hearing it. I just want to get it finished and transcribed, experts were out of my pocket range, because my son is in therapy 5 days a week now. So, I am doing the best I can and when I share I hope that someone will actually reach out to help. I hope the FBI because my son has autism, and it was abuse off and on all day. I am a behavioral therapist so I noticed his behavioral change really quick, but not quick enough.

Puma0382 · July 4, 2023, 10:32am

Hi - in the meantime, whilst waiting, maybe post some work/examples demonstrating your expertise that others might take on board?

Instead of just project files and screen-grabs, if you could make a short video of the workflow used, even better! I, for one, would be really keen on seeing some ‘real world’ results of this functionality. Sounds like it could significantly speed up the workflow…

Sorry I can’t be much help myself, since I only own Elements (v9) edition. ‘Select Similar’ is Pro only.

(PS - My thinking with all this is, once you’ve posted yours, it might kick-start others into providing their examples, for further debate or comparison etc, etc…!)

Unmixing · July 4, 2023, 1:19pm

@Jessica_Johnson

Ahhhh okay, I sort of understand now. In that scenario select similar can still work, however I would recommend having only tonal selected (as the main power of vocals is their tonal aspect).

However, because there are so many variations when it comes to vocals/voices, select similar might not be the best option for that scenario. I have used select similar to find similar tonal elements of vocals (for example, background vocals that repeat in every chorus of a song), and I have used the power option plus the tonal option to find vocals with similar overtones that don’t necessarily repeat (as far as lyrics go).

I can’t speak for the main developer of SpectraLayers, but (to me) select similar was designed to automatically scan the entire spectrum and find spectral content with characteristics similar to what you have selected. For example, a snare drum in a song or audio recording has a consistent shape throughout the song; select similar takes that geometrical information into account and more or less finds similar shapes/patterns/sequences that match the original selection and selects them. The same idea applies to a kick drum, a hihat, a bass line that plays in a consistent sequence, or any consistent tonal element.

With vocals/voices (for example a podcast with 5 different speakers speaking simultaneously or talking over each other), the select similar feature can still work (like I said, by selecting tonal and playing around with it), but it is less efficient and is limited to patterns and shapes (for example similar geometrical shapes like a circle, a triangle or a square). For example, say you record your voice for 2 minutes and view that audio in SpectraLayers. You say the word “drink” at the 30 second mark and then say “I drink water to keep myself cool” at the 1 minute 45 second mark. If you select the word “drink” at the 30 second mark and apply select similar, it should be able to select both occurrences of “drink”. However, it wouldn’t be able to select the words “I, water, to, keep, my, self, cool”, because the spectral content of those words is not similar to the spectral content of the word “drink”. Although it is doable (like I said, by playing around with some of the settings, like deselecting the noise option and having tonal selected), it is not efficient to unmix voices/vocals that way.

If anything, I would try the unmix voices feature or try to use voice denoise and try the other voice option.

To me (the way I understand your scenario), it appears that you’re applying a selection to one section of the voice (one section of the audio recording where the voice is isolated) and then applying select similar, expecting it to find everything related to that voice throughout the entire spectrum. If that is the case, then my response is that it is doable but not efficient and would take a very long time (because there are so many variations with voices). My other response is that the main developer of SpectraLayers is active in these forums, reads all of these topics/posts, and could consider improving select similar to find similar content outside of the normal geometrical patterns, shapes and consistent sequences that it is currently bound to. For example (alongside the match tonal, transient, power and noise options within select similar), the main developer could add an option to match voice prints. Each one of us has a unique voice fingerprint, and selecting my voice (fundamental frequency + overtones + noise) in SpectraLayers would give much different data than selecting your voice. With a match voice option you could select a small portion of a voice print (fundamental frequency + overtones + noise) and have select similar scan the entire spectrum and match all voice prints to your original selection (that way you are not bound by geometrical shapes/patterns and sequences).

Sam_Hocking · July 5, 2023, 12:48pm

I think the best approach for this use case is actually to run the audio through the highest-SDR vocal algorithm ensemble that is freely accessible. The highest-SDR option at the moment that I think would work best is an ensemble of MDX-Net: 292, 496, 406, 427, Kim Vocal 1, Kim Inst + Demucs ft. This is a two-stem process, Vocal + Instrument (non-vocal audio), and is non-destructive, i.e. the two together null back to the input. It does require a strong GPU and lots of VRAM to run, but I’d be happy to pass this file through for you if you like. You could also use UVR5. You will need to subscribe to get all of these models easily and save time searching for them. To put the vocal ensemble SDR into perspective: SpectraLayers V10 sees a vocal SDR of around 9.1, but MDX V2 ensembles see 10.6 or even higher, and in UVR you will be able to tweak to find the best balance of separation parameters. You will need a decent GPU with probably 8GB of VRAM to run them relatively quickly at high settings.
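To make the SDR figures and the "null back to the input" property a bit more concrete, here is a rough Python sketch. It assumes the plain signal-to-distortion ratio, 10·log10(reference energy / error energy); the exact metric behind the numbers quoted above may differ (there are several SDR variants). The helper names `sdr_db` and `stems_null_to_input` and the random test signals are hypothetical, for illustration only.

```python
import numpy as np

def sdr_db(reference, estimate, eps=1e-12):
    """Plain SDR in dB: reference energy over the energy of (reference - estimate)."""
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    signal_energy = np.sum(reference ** 2)
    error_energy = np.sum((reference - estimate) ** 2)
    return 10.0 * np.log10((signal_energy + eps) / (error_energy + eps))

def stems_null_to_input(mixture, stems, atol=1e-6):
    """True if the stems sum back to the mixture, i.e. the split is non-destructive."""
    return bool(np.allclose(np.sum(stems, axis=0), mixture, atol=atol))

# Made-up signals standing in for a real vocal and a real backing track.
rng = np.random.default_rng(0)
vocal = rng.standard_normal(48000)
music = rng.standard_normal(48000)
mixture = vocal + music

# A fake "separator" whose vocal estimate is off by 10% in amplitude.
est_vocal = 0.9 * vocal
est_music = mixture - est_vocal  # two-stem split: the remainder goes to the other stem

vocal_sdr = sdr_db(vocal, est_vocal)                             # about 20 dB
is_lossless = stems_null_to_input(mixture, [est_vocal, est_music])  # True

# Under this definition, a 1.5 dB gap (10.6 vs 9.1) means the better estimate
# has roughly 0.71x the error energy of the other one.
error_ratio = 10 ** (-(10.6 - 9.1) / 10)
```

The takeaway is that SDR is logarithmic, so a gap that looks small in dB is still a noticeable reduction in leftover error, provided both numbers were measured the same way on the same material.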

Jessica_Johnson · July 5, 2023, 5:47pm

Sam, that would be amazing! I am actually about to order a new computer because the one I currently have does not have a very powerful GPU. Do you have suggestions for computers? I am looking at the mini desktop computers. I may need to zoom to learn these. I am willing to take any help I can get, especially from people more experienced than I am.

Jessica_Johnson · July 5, 2023, 6:01pm

I keep looking for when technology will finally be released that can not only identify a song, but also use that identification to remove the song into a separate layer. There has to be a way to do this, because so many programs can quickly identify a song. Software like that would be of so much use to the criminal justice system. So many crimes, especially ones like my son’s, happen with music playing because it blocks out the cries, and psychology has also shown that children who feel they will not be heard will rationalize that there is no point in screaming or yelling for help. They feel enclosed. So, I have looked at MATLAB and so many other programs. Even software like Phonexia was unable to really take out the music frequencies, which is surprising because the software is used only by military, government, and law enforcement. It didn’t have much capability to reduce noise other than noises like humming or static. I know there is phonological software like Praat that will pick up on pronunciation, but I have not found a one-stop shop for combined software with that type of capability.

Sam_Hocking · July 5, 2023, 6:25pm

I understand. I think what I would suggest you try first is the following page.

Online music/voice separator based on neural nets (mvsep.com)

This will allow you to upload the audio and demix it separating to 3 stems. You want to select:

Separation Type: MVSep Demucs4HT DNR (dialog, sfx, music)
Model Type: Try both options.
This is a very good performing model for separating dialogue from music considering it’s free.

If you want me to take a look with the tools here, I have created a samply.app upload folder for you to share the file. There you enter your email, and the upload is only between you and me.

Jessica Johnson Upload Portal

Jessica_Johnson · July 6, 2023, 3:42am

Hey I signed up and was not sure if you got anything?

Sam_Hocking · July 6, 2023, 9:52am

Yes, all received. Left a message for you on samply, you should get a notification. Will look into it as soon as I can.