[Spoiler alert: this is not a definitive answer]
As the Dorico team do not announce features before an update, I suspect all they will say is that they cannot rule it out for the future. They might be vague simply because they do not yet know themselves. (The future is a continually moving feast, so to speak, or famine; we do not always get what we want, or cannot always afford it.)
The Dorico team is very small, as you know (even within Steinberg as a whole, including the integration with Cubase). Developing such a feature would take considerable time (languages, pronunciation, dialects, genres).
Another question is when we will be able to upload a MusicXML file to an AI process and have it rendered with vocals (if that is not possible already), and whether it would be better than Cantai.
Cantai, as you have been following, already has a render process you can try, and it has improved over time. It is crafted by Richard and his team. Any future AI process may be largely automated (scanning scores and matching them to different performances by different singers), but any "errors" might take longer to correct precisely because the process is automated. With Cantai (currently, at least), we can simply ask the Cantai team to look at the pronunciation of a word or syllable in a given language, which is what has been happening, if you have been able to follow some of the Discord discussions with Richard and the team. As you know from following its development, it is a long process of commitment and dedication by a human team.
Perhaps for you it will come down to how long you want to (or have to) wait: for Cantai (which is already making progress, can be evaluated now, and we assume will improve), for some other AI process (presumably many are already in development), or perhaps for Steinberg itself.
Thinking of sample libraries, most of us are still searching for our "ultimate" sample library, moving between NotePerformer, Spitfire, VSL, Cinesamples… The same might eventually be said of what could become hundreds of AI vocal processes for music, perhaps including one from Steinberg/Yamaha. At some point we might prefer one AI process for some kinds of music we are creating or arranging and another for other genres, or we might like to mix the singers from two or three different AI programs for our ultimate rendering.
Aside: every performance of any (human) choral or vocal work is nuanced, with the conductor, players, instruments, and studio or location all reacting at that time and place (and then the microphones and the recording and mastering engineers…). We find we prefer one recording of a piece to another, or like more than one and cannot decide which is the "best".
What do you want to achieve with the vocals using Dorico: an acceptable demo, an output rendered directly for a film score, a pop ballad for release on an album, or something just for entertainment?
How close does it have to be to something you would wish to hear as a recording… as humans might have performed it?