I want to try something: unmixing multiple voices (using Spectralayers). Can people here share ideas or links to examples of multiple voices that are mixed together? I’m looking for multiple voices talking over each other or singing together. It can be anything from a podcast to an opera or symphonic performance of multiple voices singing in unison or in harmony as triplets/quartets.
OK, here’s a hot mess for you. This is a mix of 100 tracks of TED Talk lectures. The levels & pan positions are randomly changed throughout, often by large amounts.
To confirm I’m able to do this I’ll “hint” out certain words I heard.
“You can’t be serious” why would you try to embarrass me by “compromising” my skills? With the vocals being heavily modulated (whether the formants are being shifted around or played in reverse at different pitches), this seems doable.
Believe it or not (I know this was meant as a joke, but) this is actually unmixable. The problem is “time”. I don’t want to spend too much “time” unmixing this. I’m looking for good examples (not easy examples, but REAL WORLD examples) where I can demonstrate unmixing capabilities.
I actually took the liberty of unmixing the whole audio.
All the algorithms I ran it through failed (they couldn’t pick out the mid bass because the algorithms mostly consider sub-bass as bass), so I had to manually rebalance the mix and was able to unmix every component (strings, guitar, mid bass, kick, snare, cymbal, etc.).
Unfortunately, I had forgotten that this forum has a strict copyright policy, so I am not able to upload it here. Hopefully the first few seconds don’t violate anything (as they are mainly for demonstration purposes and not copyright abuse).
Pretty much the same results we all get.
Another example of currently impossible situations…a future benchmark for code-writers/trainers…is the situation with the first five seconds of this-
What’s happening here is a mono-mixed Gibson acoustic guitar and Gretsch 6122 Country Gentleman guitar. The goal…clean isolation of each. Not just at the intro, but all the way through…and not simply by one guitar having its level canceled by the other…which causes dropouts…not to mention would take hours to reconstruct the resulting holes.
Impossible to pull apart stuff like this. Spectrally or otherwise atm. Maybe within the decade…but not circa 2023.
I’ve tried (not this song, but countless other similar gtr mono mixes), I’ve paid people all around the world to try, I’ve monitored the various demucs groups to observe status of separation models.
In time, we’ll all get there.
So after careful consideration, I went back and tried to unmix the vocals and came to the same conclusion. Before I uploaded the unmixings I started to doubt if I unmixed it correctly (because it does sound kind of strange and it sounded like there were more vocalists there).
At first I believed I had it wrong, because when you play audio repeatedly for long periods of time you start to get ear fatigue (and that’s what I initially believed had happened when I replayed the final result and noticed it sounded strange). Sometimes when you come back to something a day or 2 later, you hear it differently (or start noticing oddities/artifacts you didn’t notice before). Even so, I felt pretty confident in the final result and went ahead and posted it. When you commented, I got the impression that you had the same impression I initially had, so I felt it was necessary to explain my theory.
First of all, I believe the audio (with the many vocalists) gives the illusion that there are more vocalists than there really are. I believe there is a chorus/flanger effect on both of the vocals, and that the effect is multiband, split between highs and lows (making it appear there is another harmony there).
I am 99.99999% sure that there are 2-3 vocalists (I believe it’s the same vocalist twice, with one take being an overdub) and that there is a flanger/chorus effect on both of those vocals (making it seem like there are more than 3 vocalists).
I’ve compared this audio to other audio, and the harmonics (for 5 singers) don’t add up. I’ve studied triplet and quartet unison harmonies/melodies and, as a musician myself, I know what a triplet looks like on a spectrum and I know what a quartet looks like on a spectrum. I believe whoever mixed this wanted to give the illusion that there are more than 3 singers, but there aren’t. The harmonics in “first train to cali” don’t add up to 5 separate singers (the math doesn’t add up).
It does indeed sound like there are more than 3 singers, but I am confident that there are 2-3 singers with a chorus/flanger effect on the vocals, giving the appearance that there is another vocalist amongst them. When I played back the audio (especially the high-pitched unison harmony) it does sound like there is another overdubbed vocal take, but I believe that is an effect.
There are only 2 scenarios. If there is indeed a 4th vocalist, then whoever mixed the audio intentionally cut off the mid-to-low end of the audio and left the high-mid to high end intact (because there are no harmonics in the higher-pitched audio that connect to the lower-end harmonics), which wouldn’t make any sense because that wouldn’t sound right on any instrument. The second scenario (which I believe) is that there is a multiband chorus/flanger effect applied on both the lower end and high end.
I also read about it here Flanging - Wikipedia and “The Beatles” were notorious for using the flanging effect (which you can see/read for yourself in that article).
You can (like I did) experiment for yourself. Play something, record it, listen to it and then study it on a spectrum. Then add a flanger/chorus effect to it and look at it on a spectrum again. I would also advise studying triplets and quartets on a spectrum and looking at the harmonics. What you hear may not be entirely what is true: you might hear 5 separate vocalists, but I believe it is 2 separate vocalists (maybe 3 maximum) and the chorus/flanger effect gives the illusion there are more vocalists amongst them.
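For anyone who wants to try that experiment without a DAW, here is a minimal Python sketch (standard library only). It is not how Spectralayers or any particular plugin works; the function names `sine` and `flanger` and the parameters `max_delay_ms`, `rate_hz` and `mix` are made up for illustration. The idea is that a flanger mixes a signal with a copy of itself delayed by a few milliseconds, with the delay swept by a slow LFO, and that this carves moving notches into the spectrum.

```python
import math

def sine(freq_hz, sr, n_samples):
    """Plain sine test tone as a list of floats."""
    return [math.sin(2 * math.pi * freq_hz * n / sr) for n in range(n_samples)]

def flanger(x, sr, max_delay_ms=3.0, rate_hz=0.5, mix=0.7):
    """Add a delayed copy of x to itself; the delay is swept by a sine LFO.

    Illustrative only: real flangers usually add feedback and stereo tricks.
    """
    out = []
    for n, sample in enumerate(x):
        # LFO moves between 0 and 1, so the delay moves between 0 and max_delay_ms.
        lfo = 0.5 * (1.0 + math.sin(2 * math.pi * rate_hz * n / sr))
        delay = lfo * max_delay_ms * 1e-3 * sr  # delay in samples (fractional)
        pos = n - delay
        if pos < 0:
            delayed = 0.0  # nothing recorded yet before the start
        else:
            # Linear interpolation between the two neighbouring samples.
            i = int(pos)
            frac = pos - i
            nxt = x[min(i + 1, len(x) - 1)]
            delayed = (1.0 - frac) * x[i] + frac * nxt
        out.append(sample + mix * delayed)
    return out

# With rate_hz=0 the LFO sits at its midpoint, so the delay is fixed at
# half of max_delay_ms (1 ms here, i.e. 48 samples at 48 kHz).
sr = 48000
tone_500 = sine(500, sr, 4800)
tone_1000 = sine(1000, sr, 4800)
fixed_500 = flanger(tone_500, sr, max_delay_ms=2.0, rate_hz=0.0, mix=1.0)
fixed_1000 = flanger(tone_1000, sr, max_delay_ms=2.0, rate_hz=0.0, mix=1.0)

# Peak level after the delay line has filled up.
print("500 Hz peak:", max(abs(v) for v in fixed_500[100:]))
print("1000 Hz peak:", max(abs(v) for v in fixed_1000[100:]))
```

With a fixed 1 ms delay the 500 Hz tone is cancelled almost completely and the 1000 Hz tone is roughly doubled, because the notches of this kind of comb filter fall at odd multiples of 1/(2 × delay). Once `rate_hz` is above zero the notches sweep up and down, so different harmonics of a voice get boosted and cut over time. Whether that is what happened on the recording discussed above is still only a theory.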
I’d like to hear other people’s thoughts and what they think because this is indeed an interesting conversation.
Great to read such critical thinking and analysis, thanks @Unmixing !
Not important at all to your awesome analysis, and I don’t even know if you were trying to say otherwise, but I don’t think “First Train to California” was sung by the lads from Liverpool.
(PS, I hear at least three voices (the two saying “first train” at the start, and a bass one in there at the end), and with a gun to my head I’d say there was a fourth one in there somewhere by the time the last note was sung)
I think in 2023 that 99% of the times the word “lad” is used, it is in reference to four specific people.
In the fun universe of demixing, a new Beatles topic has appeared on the www this week having to do with the 1966 “Paperback Writer” single.
Seems someone has discovered…quite by accident…that in the fadeout at the end of the track (and also maybe earlier) …there are the sounds of a slow, hunt & peck manual typewriter.
Could be. Some say it’s a hihat. Some say it’s random out-of-time clicking.
Only a dedicated demixing sleuth will be able to reveal for sure
A very universal four voices unison, 7:46 to 8:10
A very distinctive counterpoint in four voices, plus Benny’s marvelous composition and playing in his GX-1 Synthesizer mostly. In unison 1:58 to 2:14
Thank you for providing this example. The unmixing was the easy part. Repairing the stereo field (which was severely damaged) without interfering with the phase was the difficult part.
I just tried uploading the example but could not upload file because the file size is too large. Sorry.
Ah, you may try uploading just the unison seconds: 7:46 to 8:10 for Beethoven’s 9th and 1:58 to 2:14 for Chess’ Quartet.
Looking forward to hearing what can be achieved in 2023 with unmixing dexterity and SL9.
See DM/Private message.
Your work on the 9th is very interesting.
How you were able to separate what you call the deep male voice from the soprano is surprising in itself. Also, the Voice 2 (baritone) extracted track has several quite good parts.
And then the melodic instrumental as it is, might serve for its MIDI notes extraction.
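As a rough illustration of that last step (not a Spectralayers feature, just the standard equal-temperament mapping), a detected fundamental frequency can be turned into a MIDI note number like this. The function names and the A4 = 440 Hz default are assumptions for the sketch, and detecting the pitch in the first place is a separate problem it does not address.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def freq_to_midi(freq_hz, a4_hz=440.0):
    """Nearest MIDI note number for a frequency (A4 = note 69)."""
    return round(69 + 12 * math.log2(freq_hz / a4_hz))

def midi_to_name(note):
    """Note name with octave, using the convention where note 60 is C4."""
    return f"{NOTE_NAMES[note % 12]}{note // 12 - 1}"

print(midi_to_name(freq_to_midi(261.63)))
```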
In all, the 9th’s fourth-movement unison extract is quite a difficult part to unmix, with several voices at times on similar timbres and hitting the same notes.
I wasn’t aware the work you did was attainable, at least not without minute work and various tools. It inspires one to delve more into Spectralayers’ possibilities. Thank you
Thanks. Like I said, the unmixing (surprisingly) wasn’t the hard part; the restoration of the stereo field was the difficult part (well, not really difficult but extremely TEDIOUS). If you look at the left channel, you can see that the left side is so damaged that the harmonics are blurred (heavy reverb). However, I wanted to demonstrate that I can work with any material and can restore it without reconstruction or interfering with the phase. If I had spent more time on it, I could’ve unmixed the noise layer, broken that down further and merged/compiled those bits back into the other layers.
Unmixing is like breaking the audio down to its bare components and then putting the pieces of the puzzle back together (trying to figure out what belongs where). That’s why I like Spectralayers so much, because it allows you to solve that puzzle. It’s kind of like using legos/lego blocks to build a structure or unique architecture. When you switch out of composite view and start merging/compiling the bits back together, you get a sense you’re putting a puzzle back together. It’s kind of fun.