Unmix dialogue and SFX from film soundtrack

Sub3OneDay · October 11, 2024, 9:32am

Is it possible to unmix the dialogue and SFX as layers from a mixed film audio track? I have extracted the fully mixed audio track from the video file and have managed to extract most of the dialogue using unmix vocals. I wondered if it were possible to separate out just the music as a layer, maybe by using the soundtrack music as a reference file, and then tell SpectraLayers to extract everything else, so that I end up with dialogue and SFX on a layer?

RadioTal · October 11, 2024, 10:13am

You cannot, but you can attempt it using several passes and the new transfer feature.
Eventually post-production features may come: dialogue stems, Foley stems, SFX stems, ambience/background stems, and music stems, in full 7.1.4 surround glory.

ctreitzell · October 12, 2024, 12:23pm

Well, it depends upon exactly what @Sub3OneDay wants to achieve.

Assuming you want to do zero manual separation work, I agree with @RadioTal. And further, I agree with @RadioTal that even if you attempted the separation manually, remnants of the overall final mix/master will remain. At the end of the day, no matter how much work goes into the unmixing, there will be some areas that contain the freqs from more than one source. Case in point: a string section, which has been deemed currently impossible to unmix…I repeat, “currently”…

That said, some incredible results can be realized. Rebalancing the source stems back into a “remix” is certainly attainable.

But if someone is trying to unmix an electronic synth version of some symphonic piece from a soundtrack into “original channel stems”, working from mastered film audio, even with years of manual work you might never get there.

If you expect to get results into solo’d tracks with no remnant “bleed” with only modules and no manual work, it’s not going to happen (yet).

Just look at the Unmix Noisy Speech module alone; it works wonders (IME it is absolutely incredible), yet the majority of inhales, sighs, chuckles…certain high-freq consonant material and low-freq energy end up in the noise layer. Incidental ambient noises like traffic, animal sounds and various foley will also find their way into the noise layer. Thus, there is a lot of manual work to separate the unmixed noise layer into wanted and unwanted “noise”, and possibly to move voice remnants back to the speech layers or onto their own layers.

In all of my documentary IV work I have the following layers:

Speech
GoodNz (wanted)
UNz (unwanted)
Speech Rems

If there are particularly stubborn areas (like a mic blast) I will Unmix Levels and use groups named with the content.

SL11 is simply incredible for what I have been doing: documentary film audio.

Jari_Junttila · October 12, 2024, 1:21pm

There is, or at least was, a model for SFX in UVR, but from what others have written it wasn’t really good. You should check the Audio Separation community on Discord; there is a lot of information there.

Sub3OneDay · October 14, 2024, 12:08pm

I’m basically not really looking to do it super accurately, I just want to remove the music from the film sound so that I have only dialogue and SFX and therefore give myself an unscored film to practice writing a soundtrack to.

ctreitzell · October 14, 2024, 12:36pm

oh, you should be able to do that…the FX will be the hardest to unmix, I should think

Sub3OneDay · November 3, 2024, 8:05pm

Managed to do most of it - as you say SFX was the hardest and didn’t get it all but as it’s just for practice/learning it’s enough.
Multiple denoising passes with remove crowd noise seemed to help a lot.

ctreitzell · November 4, 2024, 2:15pm

My film distributor is working on his own film. I introduced him to SL and he spent a day or two watching me. I didn’t hear from him for a while, cuz he was busy with the SL learning curve, and he told me the same thing that you are saying…that multiple passes of Unmix Crowd Noise do a lot of heavy lifting. I still haven’t tried it…guess I should.