Speech to text functionality

Hi all,
The latest RX10 editor has quite sophisticated speech-to-text functionality built in; see What’s New in iZotope RX 10: Background Noise Removal Software - YouTube. I never cared much for the RX editor; Wavelab is far superior. But this would be a very compelling reason for me to switch to the RX editor, since my work is primarily voice and text oriented.

Are there any plans for something similar in Wavelab?
Best, Peter

Maybe. What is your audio field?

Thanks for responding. My main audio activity is educational audio, audiobooks and e-learning.

Forgive me, but I don’t see this ‘Speech-to-Text’ facility. I must have missed a trick… Your video link does not reveal any ‘Speech-to-Text’ function.

Anyway, be careful to read what’s included in each edition’s tool set. Depending on your needs, you could be shelling out £282 (for Standard) or £754 (for Advanced). Although, granted, you get quite a bit for your money! Prices from Plugin Boutique.

Have you actually watched that video? From 0:30 to 2:20 it is all about this functionality.

Yes, I did. I watched the entire thing, waiting eagerly for this ‘Speech-to-Text’ functionality to appear.

What I see in the span you mention is regions of audio material already marked out and (I’m guessing, manually) labelled with descriptive text. I am not yet ready to believe that the program itself generated those labels intelligently, from the audio signal alone. That is my understanding of a true ‘Speech-to-Text’ capability.

I’m ready to be quite wrong here - but please show me where/how the program does that trick…

Have you tried it in RX10 yet? It’s not as good as you’d expect, though a few more revisions and it’ll be there.
It’s quicker to edit in Wavelab and send out to Descript to process in the background.

RX seems to take about 15 minutes to mark up the text on a 45-minute file. I haven’t found the wildcard to search for a phrase rather than an individual word yet… but it is very clever.

What may be more useful, once RX gets further along with it, is being able to open it as normal as an external editor, and then copy and paste from RX to Wavelab to edit the syllables and small word changes. Personally I think that would be much, much quicker (and allow the use of marker points etc.) than digging around in RX and then re-saving.

Oh! I see… so it does try to automatically transcribe the audio, labelling it with text as it goes; it just takes a while and the accuracy could be better? OK, good to know…

In which case, I stand corrected @petervanrees - I’ll have a look for other info/video that demos this part of the process…

Download the trial and take a look; yes, that’s exactly what it tries to do :+1:
Yes, it should be quicker and more accurate. It starts to populate the words from the audio as soon as you press the navigation word lane.

OK, will do! (When I next think about it; it’s nothing I need, or can afford, anytime soon.) :face_with_monocle:

How can I export the text file after the recognition?
It seems it’s only for audio purposes, like word substitutions (a text export would be welcome).