No. I am in the process of trying to re-unify all license types back into a single global lifetime license for all platforms. Please bear with me.
Such amazing news. You’re really an honorable man.
Latest news from Discord:
Text version:
Hey @everyone,
Cantai is coming to your DAW! (pre-alpha version shown; features and look may change)
A standalone editor bringing AI vocal synthesis directly into your production workflow. A revolutionary lyrics-first approach, all the same voices found in the notation-based Cantai offerings, with fine-grained expression controls — phoneme editing, pitch envelopes, and energy curves are all available for advanced users, but handled automatically if you just want to write and go.
Works with Ableton, Logic Pro, FL Studio, Cubase, Reaper, Pro Tools, Studio One, and other DAWs with VST/AU support.
But that’s only half of it.
If you’re using Sibelius, Dorico, or MuseScore Studio, the Cantai DAW plugin also acts as a rendered audio playback link. One plugin instance, each vocal part automatically routed to its own mixer channel — fully mixable, fully independent.* Render updates from your notation software are instant. No import. No export. Just a live link between your score and your DAW.
Two modes. One plugin. Editor if you’re working in a DAW. Playback link if you’re working in notation.
Alpha testing begins March 20. React if you want in.
*Automatic multi-output routing supported natively in Logic Pro, Cubase, Reaper, Pro Tools, Studio One, and FL Studio. Ableton requires manual output routing.
The DAW version is included in the Cantai lifetime license.
— Richard
Slightly OT, I had a listen to an official choir demo in Synth V version 2 dating from about a month ago.
All I can say is that Richard de Costa has plenty of room to deliver something good.
After a couple of minutes of listening, I felt ill. It was really horrible.
It really depends on what you tell it to do, because the defaults are hardly optimal. I have found that it is way too generous in terms of pitch variability and timing… Selecting all notes and tightening up those two parameters alone will drastically improve the default output. Altos are also particularly prone to overly wide vibrato, at least in the “Choir voices three” collection. That said, with a little bit of tweaking, you can get something that is astonishingly good, given the fact that it’s completely synthetic.
Isn’t that the truth!
Exactly! Certain singers have a very generous pitch and timing variability but you can easily change all that – I find reducing to around half usually works well, though occasional leaps need extra treatment. Increasing smoothness and roundedness globally is also usually desirable. My first hour or so with SynthV didn’t really impress until I realised that the default settings are rather strange for most tastes. And you MUST use the Fx settings to introduce some reverb and arguably also increase the contrast with the Equaliser sliders.
It will be interesting to see what Cantai come up with for the promised new choir, but as things stand, there is a long way to go to bridge the gap on a musical level (if indeed that is what’s planned, as it seems the main focus will be on ease of use).
Still, if @RichardTownsend would care to listen to one of the many things I’ve done with the choirs (several links in the made-with-Dorico subsection), then he might change his mind – but perhaps not as we’re all different.
Cantai’s drawback is its developer’s limited power and resources. Cantai’s Richard is a practitioner who knows exactly what we (musicians, composers, conductors, etc.) want (they’d be his wishes too, I suppose). The problem is how, and how soon, he can realize them within his constraints.
That’s why he missed his own deadlines many times.
That’s why he collaborated (had to, perhaps) with his current partner (which then added further complexity to Cantai’s pricing).
Thanks David @dko22!
I had a listen to your most recent posting.
I still feel the same - it’s very much a personal thing, I guess. For proofreading purposes it’s great, for assessing the emotional impact of a piece it’s just awful IMO. (I’m one of those people who doesn’t have much in the way of a visual or auditory imagination).
It’s a kind of uncanny valley phenomenon I think - the sound is close enough for me to assess it as being sung, but the performance, being so inhuman, makes it sound terrible.
As things stand, I still prefer the Ah sounds from Hollywood Choirs via NPPE - they sound much more like real people to my ears.
Fair enough. Although almost everyone who has listened to my new versions prefers them to what I previously did with EWQL Wordbuilder choirs, I myself quite like the EWQL choirs in certain situations (I particularly like the boys’ choir in Symphonic Choirs) and there are a small handful of works I won’t convert for that reason. And just occasionally SynthV does get on my nerves – though not necessarily for the same reasons you don’t like them.
I did make one hymn demo using full lyrics, which were actually quite understandable, but then I went back and simplified it to one verse of each voice type on “la”. It sounded simpler, more real, but in a good way; less uncanny valley as you say. Seemed an appropriate compromise, that would help the singers learn their part without the creepier-feeling version.
Dev Log: Building a German Voice
We just wrapped a studio session with a singer to start building Cantai’s first German vocal library — and I wanted to share a bit about the process, because it’s genuinely fascinating how this works.
Pre-order Cantai for Dorico now!
Why German, and Why Now?
German is one of the most requested languages from our users, and for good reason. Between lieder, opera, choral works, and contemporary composition, there’s an enormous body of music written for German text. If you’re scoring anything in that tradition, you need a voice that can handle it.
The Session
Here’s the thing about building a singing voice: the training data is the voice. Every stylistic detail the model learns comes directly from what the singer actually performed in the studio. So the repertoire choices matter a lot.
We built a setlist of German lieder and arias — real songs, chosen to give us broad phonetic coverage across the language. German has some sounds that don’t exist in English (the “ch” in ich, the Umlauts), and the best way to capture those naturally is to record them in context, embedded in actual musical phrases, rather than running through clinical phonetic exercises.
We tracked across a full dynamic range — quiet, intimate passages through to full operatic power — in a dry studio environment. Clean signal is everything here; the less room sound baked into the recording, the more flexibility the model has downstream.
How It Becomes a Voice
This is the part people are usually most curious about, so here’s the short version:
First, we take the recordings and create a precise time-alignment between the audio, the lyrics, and the notes. Every syllable gets mapped to exactly where it sits in the performance — when it starts, when it ends, how it transitions into the next sound.
Then we extract the acoustic detail: the pitch contour (not just “what note,” but the continuous movement of pitch — the vibrato, the subtle slides between notes that make a voice sound alive) and the spectral texture of the singer’s tone.
All of that feeds into a diffusion model. If you’re not familiar with the term: imagine starting with pure noise — like static — and then gradually, iteratively refining it, guided by the score and the acoustic data from our session, until what’s left is a clear, natural vocal performance. The model learns how this singer sounds when singing German, and can then generate new performances from any score you give it.
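The three steps above can be pictured with a toy sketch. This is only an illustration under stated assumptions, not Cantai's actual pipeline: the `AlignedSyllable` schema, the vibrato parameters, and the `refine_from_noise` helper are all hypothetical names invented here. In particular, a real diffusion model uses a trained network to predict each refinement step from the score, whereas this toy "cheats" by blending directly toward a known target, purely to show the idea of starting from static and refining step by step.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class AlignedSyllable:
    """One syllable mapped to its place in the recording (hypothetical schema)."""
    text: str
    midi_note: int
    start_s: float  # when the syllable starts, in seconds
    end_s: float    # when the syllable ends, in seconds


def midi_to_hz(note: int) -> float:
    """Convert a MIDI note number to frequency (A4 = MIDI 69 = 440 Hz)."""
    return 440.0 * 2.0 ** ((note - 69) / 12.0)


def pitch_contour(syl, frame_s=0.01, vibrato_hz=5.5, vibrato_cents=30.0):
    """Continuous pitch for one syllable: the written note plus sinusoidal vibrato.

    The vibrato rate and depth are illustrative defaults, not measured values.
    """
    n_frames = round((syl.end_s - syl.start_s) / frame_s)
    base = midi_to_hz(syl.midi_note)
    return [
        base * 2.0 ** (
            vibrato_cents / 1200.0
            * math.sin(2.0 * math.pi * vibrato_hz * i * frame_s)
        )
        for i in range(n_frames)
    ]


def refine_from_noise(target, steps=50, seed=0):
    """Toy stand-in for diffusion sampling: begin with noise, refine iteratively.

    Each step moves the current guess a little closer to `target`; the final
    step uses a blend of 1.0, so the result lands on the target.
    """
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in target]  # pure noise, like static
    for step in range(steps):
        blend = 1.0 / (steps - step)
        x = [xi + blend * (ti - xi) for xi, ti in zip(x, target)]
    return x


# Step 1: alignment. Step 2: acoustic detail. Step 3: iterative refinement.
syllable = AlignedSyllable(text="ich", midi_note=69, start_s=0.0, end_s=0.5)
contour = pitch_contour(syllable)
generated = refine_from_noise(contour)
```

The point of the sketch is the shape of the data flow: timed syllables, then a frame-by-frame pitch curve rather than a single note value, then a process that turns noise into something matching that curve.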
What’s Next
We’ll be evaluating the first renders soon and filling any gaps in the phonetic coverage. More updates to come — and if there are other languages you’re hoping to see, let us know.
This is really neat to read. Thanks for sharing!
Interesting! As I’ve written a few things in German, I’d like to hear the results. Incidentally, the “ch” sound does exist in some forms of English, such as Scots, even if it doesn’t in American English. By the way, is it a male or female voice, or both?
While I’ll hazard a guess that a female voice will be recorded as well, it is clearly a gentleman in the picture above.
ah, that’s true – I’d forgotten to scroll up for the photo
Good point! Handy if one ever wants to set the (ahem) “poems” of Ewan McTeagle.
https://montypython.fandom.com/wiki/The_Poet_Ewan_McTeagle
Or, more seriously, in my case, the “Loch Tay Boat Song”. You have to respell it “Loh”. The same goes for German (even though German uses the Spanish language model, as US English is hopeless – I’m referring here to the SynthV implementation).
German!
I have lots of works auf Deutsch, and more coming, so THANK YOU SO MUCH.
This is a different “ch” sound, produced in the throat. It is also very common in German („Dach“, „Lachen“, „ach“, „doch“…).
The “ch” sound referred to above by @turingopera
the “ch” in ich
is produced at the top of the mouth with the help of the tongue („Ich“ liebe „dich“, „flüchtig“, „nichtig“).
Both combined in one phrase:
„Ach wie flüchtig, ach wie nichtig“
That’s a song (from around 1650) by Michael Franck


