Enhanced Training for Instrument/Voice Extraction

I think this is a feature request but I may have missed something…

I’m trying to extract a clavinet from a spiky guitar track. Fortunately, there is a short intro of just the clav playing the riff it repeats throughout the tune. I’ve registered the intro but it’s done a poor job of separating the two instruments.

There doesn’t seem to be a way of training the AI on a range of clav parts to give it a fuller idea of the range of sound it’s looking for. But I’m not really sure what ‘training the AI’ involves so I may be barking up the wrong tree.
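From what I can gather, ‘training’ for separation models in general means showing the model pairs of (mixture, clean target) and nudging its weights until the masked mixture approaches the target. A purely illustrative PyTorch toy (every name and number here is invented; nobody outside Steinberg knows what SL actually does internally):

```python
# Toy sketch only: a tiny mask-based separator being fine-tuned on one
# reference clip. Nothing here is SpectraLayers code; every name is invented.
import torch
import torch.nn as nn

class ToyMaskNet(nn.Module):
    """Predicts a 0..1 mask per spectrogram bin for the target instrument."""
    def __init__(self, n_bins=513):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_bins, 256), nn.ReLU(),
            nn.Linear(256, n_bins), nn.Sigmoid(),
        )

    def forward(self, mix_mag):               # (frames, bins)
        return self.net(mix_mag)

model = ToyMaskNet()
opt = torch.optim.Adam(model.parameters(), lr=1e-4)

# Stand-ins for real data: the magnitude spectrogram of the clav+guitar mix
# and the clav-only intro. Random tensors keep the sketch self-contained.
mix_mag = torch.rand(100, 513)
clav_mag = mix_mag * torch.rand(100, 513)

for step in range(200):
    opt.zero_grad()
    estimate = model(mix_mag) * mix_mag       # masked mixture
    loss = nn.functional.l1_loss(estimate, clav_mag)
    loss.backward()
    opt.step()
```

If that’s roughly right, it would also explain the poor result from one registered intro: these models are normally trained on hours of isolated examples, so a single short riff is a tiny dataset, hence my wish to train on a range of clav parts.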

I’m sure you’ve got some thoughts on this so let’s 'ave 'em!

I agree. The whole implementation of the instrument separation function could and should be improved, both in workflow and in the solidity of the results, which aren’t really of much use right now. I think you need some super-clean core instruments (and just two instruments mixed together) to make it work at all well.

Man, I really thought this could work wonders for separating an instrument from the rest of the source material. But no matter what I did, the results were unusable. I would have thought that if I could hear the instrument clean, with nothing else playing, I could record the footprint of the instrument and then separate it. But it doesn’t work on mixed material of all kinds. So yes, I wish for a better implementation all round. As I said: the workflow (the documentation doesn’t say much about how to use the function, and it needs to) plus, most importantly, the results need to be a lot better.

I think all of Steinberg’s extraction demo videos use the same audio file for separation: audio that really suits the extraction routine and therefore gives decent results.

I also hope the manual/instructions will improve. If you ask me, the manual needs to be much better; right now it only gives very basic information. To learn the different features in SL you have to hunt for YouTube videos, which is okay… if you can find any videos that actually describe the function and workflow.

I think Cubase is a lot better in this respect; its functions are described more fully in the manual.

I cross my fingers, that SL will be updated for the better :slight_smile:

Yes indeed! Some explanation of concepts would be a help.

You are using the cutting edge! At this time in history, this is it, guys! SL12 was just released! If you want better, you will have to wait, pure and simple. There are discussions on this forum citing “better” unmixers and comparisons on YT and the web. We’ve also had plenty of discussion about possible stem “creators” which might regenerate whatever cannot be unmixed.

This certainly helps, thanks for testing things and reporting back; I and others deeply appreciate all you music unmixers testing SL.

Agreed…and that is marketing rub…hence many of us are waiting for the trial and reading y’all’s feedback. My current focus is not music unmixing, and I see many new things in SL12 which will benefit my current tasks.

This is clearly an issue, I agree…I’ve been complaining about the Owner’s Manual since day one. I started with the SL10 trial last year. The manual has not evolved; I seriously doubt it will. I learn new stuff every day with SL…because I work in it for hours on end. There is A LOT of manual selection and trial and error, and eventually success if you keep at it long enough.

To get started, I agree, go watch @Phil_Pendlebury and peers. That said, the majority of how I personally work in SL is not covered in any video I have ever seen online. I see a lot of testers investigating how effective the unmix modules are and not much else. There are some vids on advanced uses, yet I haven’t seen any of them doing what I’m doing. The SB how-to vids use source material that is rather tailored to the outcome they want to show, like you say about the SB launch vids. Again, we are at a point in time where the unmixing of stems is a starting point; then, get yer head down and manually edit.

As I’ve been saying about unmixing of music since I crept in here last year, the tech is evolving and not really there yet…so for unmixing of music, I have been biding my time using SL for what it excels at.

Well, have you considered making a video to share your discoveries? That would be appreciated, I’m sure.

Of course I’ve considered making videos, but that’s a time drain I can’t cover at the moment. Heck, screenshots are too much effort for me.

I have written a lot about my journey here on this forum

Yes, I would also love video tutorials, but ones that also cover the limitations of SL.

For example, that fully mastered tracks (saturated, with the drums glued by bus compression, limited to the max) also have their limits somewhere.

Along with the best ways to break them down into individual layers, and what can realistically be expected at this time.

Well, although I’m not spending much time on music, I did spend the past weekend getting to know SL12 and unmixed quite a bit of music.
Stereo mixdowns of (all original music, mostly homebrew produced by me/my friends):
3pc (prog) rock band (vox/bass/egtr/drums)
4pc (prog) rock band (vox/bass/2 egtr/drums)
multitrack original pop rock songs (8-track)
3pc live recorded to DAT with PZM mics (Roland Octapad/gtr/cello or bass)

I’ve never even tried unmixing a master or mix by someone else

aaaaand there are a lot of options for unmix starting points, so I don’t foresee a concrete recipe for unmixing at this time. A great deal of time and effort would be required to arrive at some form of repeatable workflow.

By contrast, NR of on-location interviews and “multi”-mic audio editing is pretty straightforward.

What we really need for unmixing is software that asks the editor what the sounds are that it’s attempting to process.
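Purely hypothetical, but the interaction I have in mind is something like this (invented names and numbers; not a real SL API, just the pattern):

```python
# Toy sketch of the "ask the editor" idea. All names and numbers invented;
# this is the interaction pattern, not a real SpectraLayers API.
detected_layers = [
    {"id": 0, "guess": "bass",   "confidence": 0.54},
    {"id": 1, "guess": "vocals", "confidence": 0.91},
]

corrections = {}
for layer in detected_layers:
    prompt = (f"Layer {layer['id']}: guessed '{layer['guess']}' "
              f"({layer['confidence']:.0%}). Enter to accept, "
              f"or type the correct instrument: ")
    answer = input(prompt).strip()
    if answer and answer != layer["guess"]:
        corrections[layer["id"]] = answer    # feed back into future training

print("Corrections to learn from:", corrections)
```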

You should definitely try. Success depends on the quality of the engineering so it’s a useful indicator of how us amateurs are doing! :wink:

For an unmixing workflow I’ve hit on starting with Vocal/Drums/Bass: the algorithms do a pretty good job due to their distinctiveness (but it’s also worth setting all the options in Unmix Song, just to see what it makes of it). This leaves Other containing a mish-mash of instruments fighting for the same space, which is why specific training becomes necessary.
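If anyone wants to experiment with the same staged idea outside SL, the open-source Demucs happens to separate into exactly those four stems. A minimal sketch, assuming a stereo input file, with no claim that this resembles SL’s internals:

```python
# Minimal four-stem separation sketch using the open-source Demucs library
# (pip install demucs torchaudio). A stand-in only; not SpectraLayers code.
import torchaudio
from demucs.pretrained import get_model
from demucs.apply import apply_model

model = get_model("htdemucs")             # stems: drums, bass, other, vocals
wav, sr = torchaudio.load("song.wav")     # hypothetical stereo input file
wav = torchaudio.functional.resample(wav, sr, model.samplerate)

# apply_model takes (batch, channels, samples) -> (batch, stems, channels, samples)
stems = apply_model(model, wav[None])[0]

for name, stem in zip(model.sources, stems):
    torchaudio.save(f"{name}.wav", stem, model.samplerate)
# "other.wav" is the leftover mish-mash described above.
```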

I envisage building a library of presets of ‘trained’ instruments that can be shared between users and even built upon. But, like I said, I’ve no idea how to get to the point where there’s something to post.
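Just to make the idea concrete, a shared preset might be little more than the trained weights plus a small metadata file. The schema below is completely invented for discussion; nothing like it exists in SL today:

```python
# Hypothetical shareable "trained instrument" preset. The whole schema is
# made up for discussion purposes; no such SpectraLayers format exists.
import hashlib
import json

weights_bytes = b"...placeholder for the actual trained weights..."

preset = {
    "instrument": "clavinet",
    "source_notes": "trained on 8 bars of solo clav intro, 44.1 kHz",
    "weights_file": "clavinet_weights.pt",   # shipped alongside this JSON
    "weights_sha256": hashlib.sha256(weights_bytes).hexdigest(),
    "version": 1,
}

with open("clavinet.preset.json", "w") as f:
    json.dump(preset, f, indent=2)
```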

It’s not always just Other with a mash-up…my electric guitar can often be found in the vocal layer.

As far as me unmixing a popular song goes…I have so much of my own material mixed down to stereo, or recorded directly to stereo, that needs various things…like rebalancing…and NR…but I can certainly see the point of trying a mass-produced piece as an exercise.

OK, looking over some unmixing of music I have been doing with SL12 over the past week, something that really stands out to me is that the tech just isn’t there yet.

Case in point is this section in one of my previous band’s recordings. This screenshot is essentially a solo bass part. Drums have been muted here and the guitar is green (as the layers panel says).

Granted, this is a big gliss of an octave on an electric fretless bass…so, yeah, I want SL to ask me if these are different instrument components across layers or if it’s one instrument. I realize I can manually merge, but that’s not really what we’re talking about here :slight_smile:

I haven’t tried Unmix Instrument for this particular bass part…I probably should.

If you want to hear the song it is here:

I had a listen to that bit you mention and tbh the bass isn’t particularly bass-like at that point. Looking at the spectrum, it might be the style (the gliss) that’s throwing the algorithm off. It seems to be handling the more conventionally-played notes.
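For anyone who wants to make the same kind of spectrum check outside SL, a minimal librosa sketch (the file name is made up):

```python
# Plot a log-frequency spectrogram of the bass section to eyeball the gliss.
# pip install librosa matplotlib; the input file name is hypothetical.
import numpy as np
import librosa
import librosa.display
import matplotlib.pyplot as plt

y, sr = librosa.load("bass_gliss_section.wav", sr=None)
S = librosa.amplitude_to_db(np.abs(librosa.stft(y, n_fft=4096)), ref=np.max)

# hop_length matches librosa.stft's default of n_fft // 4
librosa.display.specshow(S, sr=sr, hop_length=1024, x_axis="time", y_axis="log")
plt.colorbar(format="%+2.0f dB")
plt.title("Fretless bass gliss")
plt.show()
```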

Really? Isn’t it though?…and SL12 does find some of these glisses as bass in busier sections.

Granted, this guy is by far the best bass player I have known in my entire life…and that’s all bass right there.

That’s why we do need to be looking at future functionality that asks for timbre confirmation from the user and then learns from the responses…which is the point of this thread :slight_smile:

Again, I’m not complaining, just discussing findings :slight_smile:

The tech is indeed there, but this problem is being approached the wrong way. I tried to help the main developer as much as possible, and I already knew beforehand that this was going to turn into a never-ending cycle of feature improvement, where the developer ends up chasing down training data for every instrument in order to implement it in SpectraLayers.

What does that involve exactly? Is it something we can do at our end?