Can I train SpectraLayers Pro 7 to recognize and isolate guitar in an audio recording?

My particular interest is in isolating the guitar parts in instrumental recordings (e.g. guitar with orchestra or with a band) so I can pick up subtle clues that help me determine the exact notes being played as well as the position on the fingerboard. Generally, the guitar recordings I transcribe are clean, with few effects other than a small amount of delay or reverb. They may be either acoustic (classical) or electric (typically without distortion).

I’m wondering whether it is possible to create a unique layer or track for the guitar, either by using the AI features in Pro 7 or by manually selecting the guitar part in the spectrum with the available selection tools?

It seems like it would be easy enough to select the fundamental frequency of the guitar part in a song using the manual selection tools, but is there a way to simultaneously select the associated harmonics that may overlap with other instruments in the mix?

Or is there a way to reverse the unmix process (for example, to teach Pro 7 to recognize the components of a specific instrument)?

To be clear, I have not yet tested the product and have not yet purchased it. I currently use Audacity to slow the tempo of a song without changing the pitch. I also have used the spectral display to look at the progression of notes in a run or slides, etc. This has been very useful. But I’m wondering if I might be able to use the Pro 7 functionality to go to a new level?


The easiest way to find out is to download and run the demo. I’ve never used the stem separation feature. As with all stem separation algos, though, its effectiveness is probably dependent on the source material. You can only find out how effective it is by direct testing.

Thank you, Pionzy. I agree that the best way to find out about a product is to try it. And I hope to do that in the very near future.

The reason for asking my question on this forum is that I am trying to learn not how well SpectraLayers Pro 7 does what it is advertised to do, or whether it does that better than similar products in the marketplace, but whether it can do something it was not designed to do and is not advertised as being able to do. Usually, those are the sorts of things that only the developers and super-users have insight into.

From the tutorials I’ve watched, Pro 7 gives users a simple, automated way to unmix audio into five layers: vocals, piano, drums, bass, and "other" (with "other" being everything that is not vocals, piano, drums, or bass). And from the YouTube videos I’ve seen, it appears to do that quite well. My interest would be in separating guitar out of the "other" category. I can see that there is no automated way to do that, but I am wondering if there is a manual way to do it.

Say, for example, I have a CD of a guitarist playing with an orchestra with no vocals. The program would automatically produce four layers - piano, drums, bass, and other. The other layer would include the rest of the orchestra as well as the guitar.

My initial impressions are that it probably can’t be done - at least not without more access to the AI engine. But, I wouldn’t want to come to that conclusion without asking the question.

Danfromlittleton
For the AI process (the unmix feature), it will depend on each piece’s guitar timbre and how it contrasts with the background.
There are other factors that affect how well you can apply the extraction tools to achieve what you want.
For the developer’s own words on this, see this thread.

I have also copied the link to a successful experience from a user who was just starting out with SpectraLayers for the first time, on an early version of SL7 (before important fix updates were applied).

Thanks Nspace for reposting that earlier thread. I was thinking along the same lines as Robbie D who wrote:

"Often, when I want to do this kind of work, I may have a small sample of the same instrument I want to isolate/extract/unmix (maybe from a intro, break, or different take).

Is it possible you could add the option to “feed” the AI a clean snippet of the instrument you want to unmix to give a more accurate result?"

That would be the case for me as well. I too was wondering if it would be possible to select a short passage where the guitar was the only instrument playing and have the program analyze that section and then apply the spectrum or the tonal, transient, and noise components to isolate that instrument for the rest of the song? This would be similar to the approach used for reducing noise by analyzing a small section where there is only noise and then applying that spectrum to remove noise from the rest of the piece.
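Just to make the analogy concrete (and with the caveat that this is almost certainly nothing like what SpectraLayers does internally), here is a rough sketch of the idea in Python using the open-source librosa library: average the magnitude spectrum of a guitar-only passage and apply it as a frequency-dependent gain across the whole mix, much like a noise print used in reverse. The file names are placeholders:

```python
# Rough sketch of a "noise print in reverse": learn an average spectral
# profile from a clean guitar-only snippet, then use that profile as a
# soft frequency mask over the full mix. Very crude, since a single
# averaged spectrum ignores how the notes change over time.
import numpy as np
import librosa
import soundfile as sf

mix, sr = librosa.load('full_song.wav', sr=None)             # placeholder
snippet, _ = librosa.load('guitar_only_snippet.wav', sr=sr)  # placeholder

# Magnitude spectrum of the clean snippet, averaged over time and
# normalized to a 0..1 gain curve.
profile = np.abs(librosa.stft(snippet)).mean(axis=1)
profile /= profile.max()

# Apply the profile as a frequency-dependent gain to the full mix.
S_mix = librosa.stft(mix)
guitar_estimate = librosa.istft(S_mix * profile[:, None])

sf.write('guitar_estimate.wav', guitar_estimate, sr)
```

In practice a static profile like this mostly just EQs the mix toward the guitar’s register, which hints at why real unmixing needs something trained on many examples rather than a single snippet.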

Imo, all of this is dependent on Spleeter. I believe the Spleeter engine is being trained to extract all sorts of stuff via contributing developers… at least from what I read on their forums. I know that guitars are hard… in Spleeter-based programs like SLP, they always end up as "other". :) And sub-extraction then doesn’t work, although I can draw parts out via Isolate/Melodyne/SLP… not always easily, of course.

Imo, the Spleeter guitar-learning/extraction algorithms are gonna be a few years out.
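For what it’s worth, the open-source Spleeter library itself is easy to try directly; here is a minimal sketch of its pretrained 5-stem model, which maps straight onto the vocals/piano/drums/bass/other split discussed above. This is Spleeter’s own Python API rather than anything inside SLP, and the file names are just placeholders:

```python
# Minimal sketch: run Spleeter's pretrained 5-stem model (vocals, piano,
# drums, bass, other) on a file. File paths are placeholders.
from spleeter.separator import Separator

# Load the 5-stem model; the pretrained weights are downloaded on first use.
separator = Separator('spleeter:5stems')

# Writes vocals.wav, piano.wav, drums.wav, bass.wav and other.wav into
# output/<track name>/ -- a clean guitar typically ends up in other.wav.
separator.separate_to_file('guitar_with_orchestra.wav', 'output/')
```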

DosWasBest - Thanks for your response. I was guessing it might be a while before the technology gets to the point where you can extract guitar from an audio track. I’ve seen articles, though, where people are working on extracting individual instruments from orchestral recordings, so I know people are pushing the envelope. Can you say more about drawing out parts via Isolate/Melodyne/SLP?

I believe the problem with guitars is simply this: when does a guitar sound like a guitar? Unlike many instruments, guitar audio is incredibly versatile, and usually significantly processed through overdrive, fuzz, and many other possibilities, all of which complicate the harmonic content of the audio.

I believe this is why we don’t see a standard ‘guitar unmix’ option.

I could be wrong, cos I’m quite new to spectral editing, but this was a question I had and this seemed to be the answer.

But the geniuses behind this software may surprise us with a solution, given time.

Yes, for that reason I’d say they could start by concentrating on (steel-string) acoustic guitars.

Thanks to everyone for your comments.

When you don’t know the ins and outs of how something is done inside the black box, everything seems possible. But it seems like it should be feasible to give the user a way to input a clean sample of the instrument they want to isolate, as a means of training the program to recognize and isolate that spectrum. I might want to isolate guitar; someone else may want to isolate the saxophone. Either way, that would be a nice capability to have.

If that isn’t possible, would it be possible to use the draw tools to select the fundamental frequency of the part you want to isolate and then train the program to pick up both the fundamental and the harmonics of that spectrum?

danfromlittleton,
It’s good that you acknowledge the outsider’s-view problem; that is a good start for a conversation.

That said, novel ideas often come from out-of-the-box views.
The ideas shared here are things we long-time users have thought about at some point as ways to improve this tool, and they do look interesting:
i. Concentrating on distinctive guitar timbres as a start (like the steel-string example) might result in better unmixing outcomes.
ii. A process that takes a user-selected sample, reads it, and automatically selects the best unmixing algorithm, then applies certain values at critical points of that algorithm for best results.
iii. Others have suggested user-selectable scripts that would permit sequenced passes of certain algorithms for best results, applied with SpectraLayers’ selection tools before or after user preparation.

And then there are other automated, user-applied, or in-between ideas that some of us have discussed or thought about, all from outside specialized AI research and far removed from people like the SpectraLayers developers, who both do the research and have long experience applying it.

Will any of these proposed methods see the light of day and make it into SL at some point? We don’t know, and nobody can say for sure. Still, it is interesting to share this conversation and to stay open to learning about this rapidly evolving field.

*One point: when you mention ‘training’, an AI process uses many readings and examples to learn from and later generalize for best results. The noise-sample reading that denoisers use is a very simplified take on machine learning (though useful in straightforward cases), and the quantity of examples is a major factor in the quality of the resulting model, so a single sample is not enough to “teach” an AI process.

There are some pretty amazing advancements being made in so many fields. I’m sure the things we are talking about will eventually happen. I was thinking back to the first talk-to-text program that I bought many years ago. I had to read one of several canned passages to “train” the program to recognize my accent. Even with that, the accuracy was low and it took more time than it was worth to do the manual edits. I recently used the free Google Docs voice typing feature to create Word documents from several recorded interviews and was quite impressed with how well it worked with no training at all. I’ve also been impressed with how some of the companies I do business with use voice recognition to verify my identity. That works really well these days. So no telling what we will see in the future for recognizing other instruments.

Going back to the idea of manually selecting and isolating a part, I’m thinking it should be possible for a program to automatically identify the harmonics of a signal based on correlation with the fundamental frequency (assuming you select the fundamental using the selection tools). I’ve messed around a little with photo editing (I use GIMP, which is like Photoshop). There is a selection tool called the magic wand, which is useful for selecting contiguous pixels based on color. Could something like that be used to select a note from the attack through the decay? That could be especially useful in removing or isolating a violin or another sustaining sound. I’m also wondering if it might be possible to manually use the tonal, noise, and transient components to recognize and isolate a new kind of instrument.
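Just to sketch that harmonic idea in concrete terms (purely as an illustration using the open-source librosa library, not anything SpectraLayers exposes; the file name, the ten-harmonic count, and the ±30 Hz band width are all assumptions, and it only makes sense for a single-note line, not chords):

```python
# Toy illustration of a "magic wand for harmonics": track the fundamental
# frequency over time, then keep only narrow bands around its integer
# multiples. Assumes a monophonic (single-note) passage.
import numpy as np
import librosa
import soundfile as sf

y, sr = librosa.load('solo_passage.wav', sr=None)   # placeholder file name
n_fft, hop = 2048, 512

# Frame-by-frame fundamental frequency estimate (pYIN pitch tracker),
# limited here to the guitar's rough range (E2 to E6).
f0, voiced, _ = librosa.pyin(y, fmin=librosa.note_to_hz('E2'),
                             fmax=librosa.note_to_hz('E6'),
                             sr=sr, frame_length=n_fft, hop_length=hop)

S = librosa.stft(y, n_fft=n_fft, hop_length=hop)
freqs = librosa.fft_frequencies(sr=sr, n_fft=n_fft)

# Binary mask passing the fundamental and its first ten harmonics.
mask = np.zeros(S.shape, dtype=float)
for t in range(min(len(f0), S.shape[1])):
    if not voiced[t] or np.isnan(f0[t]):
        continue
    for h in range(1, 11):
        band = np.abs(freqs - h * f0[t]) < 30.0   # +/- 30 Hz around each harmonic
        mask[band, t] = 1.0

sf.write('harmonics_only.wav', librosa.istft(S * mask, hop_length=hop), sr)
```

It’s crude (fixed band width, no handling of other instruments sharing those bins), but it gives a feel for how a fundamental-plus-harmonics selection could work.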