I finally got some time this afternoon to give Omnivocal a (relatively) quick test. I decided to use the U.S. national anthem, “The Star-Spangled Banner”, both because I wanted something in the public domain and because it is one of those songs that can be a challenge for real singers.
I started by playing a quick one-take piano accompaniment, unfortunately on my 61-key keyboard controller and playing to a click. Not great for a piano part, and I made a few mistakes that I had to edit by hand since it would be impossible to overdub based on how rough everything was. I specifically picked the key I would do the song in, which led to my creating the Omnivocal part with the male “singer” for range reasons.
After that, I overdubbed the melody, again just playing live, to get the notes in. I made a few mistakes, but cleaned those up in editing and also tweaked some of the timings later on to try and improve how lyrics sat against the melody.
It was decidedly challenging getting the lyrics in because Omnivocal couldn’t transliterate some of the words to its phonetic spellings, and, sometimes, even when it could, it wasn’t the pronunciation that worked for me. Thus, I spent a fair amount of time entering phonetic spellings directly, including working around some things that looked right, but didn’t sound right. For example, on a number of occasions, I ended up adding an extra consonant (e.g. a double “T”) on the front or back of a word because the obvious “spelling” felt like it was dropping the consonant. Other times, that didn’t work, so I’d do something else to work around it, like force a breath afterward.
Melismas are a part of the song, and I did at least find that Omnivocal’s use of the hyphen in its phonetic spellings (mostly) worked to cover this need. But, of course, melismas totally messed up Omnivocal’s positioning of phrases (a reasonable thing since it couldn’t know the intentions as to what syllables got stretched across notes).
Once I had the melody and lyrics working, I overdubbed automation of Omnivocal’s various controls, really just based on feel. I also turned the formant “down” (a bit lower) as it felt a little better on the front of the vocal tone for this song. (This is all working with the male vocal thus far.)
The last thing I did on the recording front was overdub pitch bend to scoop up to some of the notes, setting the pitch bend resolution to a whole step. Mistakes in that area made me edit the pitch bend data in a few areas, which is a real pain in the you-know-what. But I do think the pitch bends helped the overall expressiveness of the part.
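For anyone curious about the numbers behind those scoops: a standard MIDI pitch bend message carries a 14-bit value (0 to 16383, with 8192 meaning "no bend"). Here is a rough sketch of the conversion, assuming the "whole step" setting means a bend range of plus or minus 2 semitones and that the mapping is linear. The function names are mine and purely illustrative; nothing here comes from Omnivocal or the DAW.

```python
CENTER = 8192       # 14-bit pitch bend value meaning "no bend"
MAX_VALUE = 16383   # highest possible 14-bit value


def bend_to_semitones(value, bend_range=2.0):
    """Convert a raw 14-bit pitch bend value to an offset in semitones.

    bend_range is the assumed bend range in semitones (2.0 = a whole step).
    """
    if not 0 <= value <= MAX_VALUE:
        raise ValueError("pitch bend value must be in 0..16383")
    return (value - CENTER) / CENTER * bend_range


def semitones_to_bend(semitones, bend_range=2.0):
    """Convert a semitone offset to the nearest raw 14-bit pitch bend value."""
    raw = round(CENTER + semitones / bend_range * CENTER)
    # Clamp, since a full upward bend would otherwise land on 16384.
    return max(0, min(MAX_VALUE, raw))


# A scoop that starts a half step under the note and rises to pitch:
start = semitones_to_bend(-1.0)
end = semitones_to_bend(0.0)
```

Under those assumptions, a scoop from a half step below would run from the `start` value up to the `end` value (the center), which is the kind of curve I was cleaning up by hand in the editor.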
As for mixing, I kept it reasonably simple. I used the UADx Ocean Way Deluxe for both the piano (IK Multimedia’s PianoVerse NY Grand) and the vocal, and added Waves CLA Vocals, just using a preset, to process the vocal. I then used an SSL-style compressor (from UAD), a tape emulator, and IK’s Lurssen Mastering Console, followed by a limiter on the stereo bus to tame things a bit.
The result of the process thus far is this recording:
I also wanted to try the female vocal, but I didn’t want to do any significant work to do that, so my idea was to duplicate the tracks, transpose them, switch the voice to the female, then render the result. I used a MIDI Modifier on the piano track to bring the piano up 7 semitones (from C major to G major). I tried doing the same on the Omnivocal track, but it was out of tune. I learned that I also had to apply the same shift in Omnivocal itself, and that both parts of the shift were necessary: with only the Omnivocal shift, it was out of tune, and with only the MIDI Modifier shift, it was out of tune too. This doesn’t make sense to me – why wouldn’t the MIDI Modifier shift have been enough?
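The MIDI Modifier side of that transposition is just arithmetic on note numbers: every note goes up by 7, which turns C into G. A minimal sketch in Python, with hypothetical helper names of my own; it does not model whatever Omnivocal does internally, so it can't explain the tuning puzzle above.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def transpose(notes, semitones):
    """Shift a list of MIDI note numbers by the given number of semitones."""
    shifted = [n + semitones for n in notes]
    if any(not 0 <= n <= 127 for n in shifted):
        raise ValueError("transposed note falls outside the MIDI range 0..127")
    return shifted


def note_name(note):
    """Name a MIDI note number, using the convention where 60 is C4."""
    return f"{NOTE_NAMES[note % 12]}{note // 12 - 1}"


c_major_triad = [60, 64, 67]                 # C4, E4, G4
g_major_triad = transpose(c_major_triad, 7)  # up a perfect fifth
names = [note_name(n) for n in g_major_triad]
```

Here the C major triad comes out as G4, B4, D5, which is the same kind of shift the MIDI Modifier applied to the piano track.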
Anyway, once that was done, I didn’t even attempt changing any other parameters, but rather just rendered a mix with the female voice:
The piano is obviously not optimal here since I was just transposing that up quite a bit rather than playing what I might have played in that key, but c’est la vie!
My thoughts? It turned out better than I’d expected, but it’s hard to extrapolate from this to my “real life” use cases. There’s definitely a lot of work on the pronunciation front that needs to be done manually. Not necessarily a problem, but there were issues I could not resolve, despite trying multiple workarounds. Perhaps if it were buried in a mix, and layered with my own background vocals, it might not be a big issue – and that is probably my main potential use for this product. But this wouldn’t fool me into thinking it was a human singer, either – or, if it was, it would have had robotic processing on it.
(I know that can be in vogue, but it’s decidedly not my style.)
FWIW, I probably spent on the order of 3 hours on this altogether. Most of that was prior to getting to the point of mixing.