Speech to text functionality

petervanrees · September 10, 2022, 7:36pm

Hi all,
The latest RX10 editor has quite sophisticated speech-to-text functionality built in; see “What’s New in iZotope RX 10: Background Noise Removal Software” on YouTube. I never cared much for the RX editor, as Wavelab is far superior. But this would be a very compelling reason for me to switch to the RX editor, my work being primarily voice and text orientated.

Are there any plans for something similar in Wavelab?
Best, Peter

PG1 · September 10, 2022, 9:50pm

Maybe. What is your audio field?

petervanrees · September 11, 2022, 7:40pm

Thanks for responding. My main audio activity is educational audio, audiobooks and e-learning.

Puma0382 · September 12, 2022, 1:35pm

Forgive me, but I don’t see this ‘Speech-to-Text’ facility? I must have missed a trick… Your video link does not reveal any ‘Speech-to-Text’ function.

Anyway, be careful to read what’s included in each edition’s tool set - depending on your needs, you could be shelling out £282 (for Standard) or £754 (for Advanced). Although granted, there’s quite a bit you get for your money! Prices from Plugin Boutique.

petervanrees · September 12, 2022, 2:11pm

Have you actually watched that video? From 0:30 up to 2:20 is all about this functionality.

Puma0382 · September 12, 2022, 2:32pm

Yes, I did. I watched the entire thing, waiting eagerly for this ‘Speech-to-Text’ functionality to appear.

What I see in the time span you mention is regions of audio material already marked out and (I’m guessing, manually) labelled with descriptive text. I am not yet ready to believe that the program itself generated those labels intelligently, from the audio signal alone. That is my understanding of a true ‘Speech-to-Text’ capability.

I’m ready to be quite wrong here - but please show me where/how the program does that trick…

AlexBarton · September 12, 2022, 5:06pm

Have you tried it in RX10 yet? It’s not as good as you’d expect, though a few more revisions and it’ll be there.
It’s quicker to edit in Wavelab and send out to Descript to process in the background.

RX seems to take about 15 minutes to mark up the text on a 45-minute file. I haven’t found the wildcard to search for a phrase rather than an individual word yet… but it is very clever.

What may be more useful, once RX gets further along with it, is being able to open it as normal as an external editor, and also to copy and paste from RX to Wavelab to edit the syllables and small word changes… Personally I think that would be much, much quicker (and would allow use of marker points etc.) than digging around in RX and then re-saving.

Puma0382 · September 12, 2022, 5:31pm

Oh! I see… so it does try to automatically transcribe the audio, labelling with text as it goes - it just takes a while and the accuracy could be better? Ok, good to know…

In which case, I stand corrected @petervanrees - I’ll have a look for other info/video that demos this part of the process…

AlexBarton · September 12, 2022, 5:38pm

Download the trial and take a look. Yes, that’s exactly what it tries to do.
And yes, it should be quicker and more accurate. It starts to populate the words from the audio as soon as you press the navigation word lane.

Puma0382 · September 12, 2022, 6:16pm

Ok, will do…! (when I next think about it; it’s nothing I’m in need of - or can afford - anytime soon)

darioagrillo · October 15, 2022, 11:43am

How can I export the text file after the recognition?
It seems it’s only for audio purposes (it would be agreeable), like word substitutions.