OT: Vocal synthesis, integration with Dorico

I’m currently exploring AI vocal synthesis options and considering either ACE Studio or Synthesizer V Studio Pro. I’d really appreciate insights from anyone who’s worked with either (or both) in terms of:

  • Vocal quality and realism
  • Workflow efficiency, especially when integrating with Dorico

One specific question I have is about tempo synchronisation. Has anyone managed to sync either ACE Studio or Synth V with Dorico — for example, using ARA Bridge Mode or any other workaround?

I understand Dorico doesn’t currently support ARA directly, but I’d love to hear how others are handling vocal timing alignment and integration between these tools.

Thanks in advance for any advice or experiences you can share…

I tried Ace Studio about 6 months ago and it was not ready yet for classical voices and choirs. At the time there was no trial version. I had to pay and request a refund (within 90 days, if I remember right).

Choirs were simply made up of lots of solo voices, so it really didn’t work.

In terms of workflow, it was an offline process, but it worked quite smoothly. I didn’t try syncing. I loaded audio from Dorico and Ace Studio into Cubase, which is where I finalise Dorico renders anyway.

My position on this matter is that it is urgent to be patient. There is Cantamus, which some people here already use. There is Cantai.app, which is in development and uses the same engine as ACE but is still in alpha, and new solutions will probably appear in the coming months…

Personally I think Cantamus is the best of the bunch, but it looks as if development has stopped.

I had tried ACE Studio’s PDF to MusicXML a while back and found it completely worthless. Just this week they hit up a good friend of mine to try to get her to collaborate on an AI Trombone. She told them to eff off.

Well, I can only speak for SynthV, here is what it doesn’t do:

  • Read MIDI or notation directly from the host; instead, MIDI needs to be dropped into its own timeline.
  • Run in true real time, though it is faster than, say, NotePerformer: the timeline re-renders automatically within fractions of a second of any change.
  • Not really suited for “Classical” styles, although there is one voice with an “operatic” mode, which sounds decent, but nothing exists for a regular choir or something…

What it does do:

  • Sound terrific, especially when you take the time to tweak the modes a little bit for pronunciation or intensity purposes.
  • Transcribes vocal audio into a track that then plays back. Lyrics are hit & miss, but easily corrected. Again, no lyrics are taken from the host. If you are familiar with EZDrummer by Toontrack, it’s just like that.
  • Syncs happily with a constant tempo in Dorico (Dorico doesn’t yet transmit tempo changes to VST instruments).
  • Large selection of voices
  • You could fairly easily build a choir using multiple voices and more subdued modes. I’d wager 4 instances with 6-8 tracks each, and maybe 4-5 different voices should do the trick nicely… :wink:

Off the top of my head, that’s all for now. I’m happy to answer more questions…

B.

P.S.: Here’s a rather old experiment I did using V1; in hindsight, it could have used a little more humanization editing:

I’m primarily interested in musical theatre, which I feel aligns well with the voice types available in Synth V. That said, tempo changes are quite common in the genre, and until Dorico can transmit tempo data, or Synth V can interpret a tempo map, this remains a significant limitation for my current needs. Still, your mock-up is incredibly impressive and shows great potential.

I just found the one “operatic” mode I was referring to: https://dreamtonics.com/en/synthesizerv/
Then, scroll down to “Felicia”. Should suffice for a demo…

Just got confirmation from ACE Studio this morning that their software supports importing tempo maps, preserving any tempo changes, and it handles embedded lyrics as well. I’m hoping to set aside some time in the next week or two to give it a proper trial run.

If the results are on par with Synth V, ACE Studio could be a compelling alternative, especially considering the cost advantage when working with multiple voices.

Just to be clear, the tempo issues are on Dorico’s side, not SynthV’s…

If Synth V were able to import a tempo map exported from Dorico, wouldn’t that effectively solve the synchronisation issue, at least for non–real-time workflows?

I’m not sure if SynthV can import tempo maps, actually…
And I just checked: so far, full-blown sync is available only in ARA mode as well!

I think you’re absolutely right. It would be fantastic if Synth V could import a tempo map directly, or if Dorico added support for real-time sync via ARA mode. Exciting times ahead, let’s hope…

I DEEPLY hope that some developer (perhaps Cantai? We’ll see) will take seriously the development of a choral vocal synthesis VST. Meaning NOT “pop” (with all the obligatory scooping), and NOT operatic (with all the obligatory, heavy vibrato). Many developers seem not to appreciate what so many choral conductors and composers would like. The market for this is absolutely enormous: universities and church choirs alone would be a huge share of it.

Re: Synth V and Dorico, that can be effectively done; however, due to non-support of ARA in Dorico, tempo changes need to be straightforward:

  • aligned to timing you can snap to in Synth V. The settings available vary between Synth V versions 1 and 2.
  • linear, non-gradual tempo changes: accels and rits are not your friend, and they will drive you nuts. You can sort of get them to work by figuring out what the tempo in Dorico is at every 1/4 beat, then setting those tempo values in Synth V. The results are passable, but not great, and it definitely takes a lot of effort to pull this off for longer passages. If you change your tempo or add measures, making things work again takes additional effort.
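
For anyone who wants to avoid working out those quarter-beat values by hand, here is a rough Python sketch of the arithmetic. It assumes the rit or accel moves in a straight line between its start and end tempo when measured in beats; that is only an assumption, and Dorico's actual curve may differ, so treat the numbers as a starting point to adjust by ear. The function name and parameters are made up for illustration and are not part of Dorico or Synth V.

```python
def stepped_tempo_ramp(start_bpm, end_bpm, length_beats, step_beats=0.25):
    """Approximate a gradual tempo change with a series of instant tempo steps.

    Assumption: the tempo changes linearly with beat position between
    start_bpm and end_bpm. Returns a list of (beat_offset, bpm) pairs,
    one per step, plus a final pair that lands on end_bpm.
    """
    if length_beats <= 0 or step_beats <= 0:
        raise ValueError("length_beats and step_beats must be positive")

    steps = int(round(length_beats / step_beats))
    points = []
    for i in range(steps):
        position = i * step_beats            # beats since the ramp started
        fraction = position / length_beats   # 0.0 at the start, approaching 1.0
        bpm = start_bpm + (end_bpm - start_bpm) * fraction
        points.append((position, round(bpm, 2)))

    # Finish exactly on the target tempo at the end of the ramp.
    points.append((float(length_beats), float(end_bpm)))
    return points


if __name__ == "__main__":
    # Hypothetical example: a rit from 120 to 100 bpm over two beats.
    for beat, bpm in stepped_tempo_ramp(120, 100, 2):
        print(f"beat +{beat:.2f}: {bpm} bpm")
```

Each pair is a tempo marking to enter by hand in Synth V at that offset from the start of the rit. It does nothing about the re-work needed when measures are added; the list just has to be regenerated and re-entered.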

In terms of gradual, non-linear tempo changes, I remember Ace Studio being a lot more flexible within Dorico, though it has been a while since I last experimented with it.

Since I use a lot of rits and accels in my writing, I plan to switch to using Ace Studio with Dorico. When I move the project to my ARA-compatible DAW, I may switch the song to Synth V.

Though programs like Ace Studio, Synth V, and Cantai offer a lot of potential, they definitely have their limitations. They are great for mock-ups and personal projects, but the switch from Dorico to DAW can be pretty hairy, especially if your workflow requires a lot of back-and-forth iteration. Getting the phoneme pronunciation correct is often ‘a bridge too far’. Ultimately, human singing surpasses what singing synthesis can do. It is much easier to direct a good singer than to iterate and reiterate singing in an interface.

I’ve seen comments indicating that streaming services have issues with songs that use AI-aided singing.

As an aside - Great turn of phrase, MarcLarcher: “urgent to be patient” :grinning_face_with_smiling_eyes:

Being patient but starting to doubt…
