Translation has traditionally been done by typing text in a text editor or a CAT tool. The arrival of speech-to-text technology made it possible to translate by voice as well. How does that work, and does it affect quality and productivity?
From text-to-text to speech-to-text
Ever since the introduction of the computer, the keyboard has been the most important input method. All over the globe people type texts in well-known programs like Microsoft Word. Translators use this input method as well, translating a whopping 600,000 words per year or more in CAT tools that boost their productivity.
Not long after the invention of the computer a new technology was introduced: speech-to-text, which changes the input method from keyboard to voice and enables people to enter commands and text simply by speaking.
That technology, however, never gained the ground keyboards had won earlier. First of all, the technology was initially not good enough to recognize speech correctly and render it into good text. In addition, for people used to typing, it required too many changes in workflows and habits to be used effectively.
Ongoing development and some persistence on the side of software developers have, however, ensured that speech-to-text technology is nowadays very effective and easy to use. Falling prices for peripherals, like microphones, and the increasing capacity of computers have worked to the advantage of speech-to-text technologies as well. Today many people all over the world, including many translators, use Dragon Naturally Speaking to make their voice an input method for emails, documents and other kinds of digital processes.
Speaking for myself, I had never used Dragon Naturally Speaking, for a couple of reasons. The price tag of the software is a bit high, I was afraid it would not be very effective, and, given my average productivity, I worried that the software would decrease my daily translation volume. It was only a couple of months ago that I bought Dragon’s speech-to-text software to see if my fears were justified.
Starting to use Dragon Naturally Speaking
Dragon Naturally Speaking works quite simply, and in the past months it has proven to do a great job. Users download a (massive) software installer, which then installs the speech-to-text software on their computer. After several configuration steps they can simply start talking to their computer, and the software converts the spoken words to typed text. Apart from the software, only a microphone is needed. It is worth investing in a good one, as not every microphone is usable: a cheap microphone of poor quality leaves much of the speech unrecognized, which renders Dragon practically useless. A more expensive microphone, however, yields great results, which reduces the amount of correction needed and speeds up the work.
In fact, translating with Dragon Naturally Speaking is as easy as 1, 2, 3. First, users download the software, which is enormous at about 1 gigabyte. Next, they install and set up the software: configuring the main usage scenario, testing the microphone, and training the software to recognize their voice. Soon after that they can start dictating. The software records the spoken words and uses the recordings to train the recognition engine, so the speech-to-text output continually improves.
Command bar of Dragon Naturally Speaking (Dutch version)
How does speech-to-text translation work?
Avid readers of this blog might remember that SDL Trados is my most used translation software. It may therefore come as no surprise that in my office Dragon Naturally Speaking is mostly used in combination with Trados Studio. That is not the best match at the moment, however, as Trados Studio has no native integration for Dragon Naturally Speaking (although MemoQ does). Nevertheless, the combination yields great results, because the recognition is done by the Dragon engine itself, which does a proper job of converting spoken text to typed text.
Once Dragon Naturally Speaking is started, it waits in the background until it registers speech. The speech is then recorded and converted on the fly to typed text. In the case of SDL Trados, the text is displayed in a dialogue, where users can correct any discrepancies or incorrectly recognized words. Users can then insert the converted text into Trados Studio by pressing the “Insert” button or using Alt + O, and confirm the translation with the general shortcut (Ctrl + Enter).
You can view a short video recording of that process below:
How does speech-to-text software affect productivity?
As mentioned above, I was afraid that speech-to-text software would negatively affect my productivity, given my high translation output. Indeed, you need to speak clearly, the software needs some time to convert the spoken text to written text, and you need to give some commands to confirm the text. Yet using Dragon Naturally Speaking has not decreased my productivity. So far it has not increased it either, but that may come once the engine is better trained and I know all the available shortcuts. Speaking the translation aloud, then editing and confirming the converted text, takes about the same amount of time as manually keying the translation into Trados Studio.
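To make that comparison a bit more tangible, here is a rough back-of-the-envelope model in Python. It is only a sketch: the function names and every number in it (typing speed, speaking speed, recognition delay, correction time) are hypothetical placeholders, not measurements, so substitute your own timings.

```python
from dataclasses import dataclass


@dataclass
class DictationProfile:
    """Hypothetical timings for dictating one segment (all values are assumptions)."""
    speaking_wpm: float           # words spoken per minute
    recognition_delay_s: float    # seconds the engine needs to convert a segment
    correction_s_per_word: float  # average seconds spent correcting, per word
    confirm_s: float              # seconds to insert and confirm the segment


def typing_seconds(words: int, typing_wpm: float) -> float:
    """Time to key a segment of `words` words at `typing_wpm` words per minute."""
    return words / typing_wpm * 60


def dictation_seconds(words: int, profile: DictationProfile) -> float:
    """Time to dictate, convert, correct and confirm the same segment."""
    speaking = words / profile.speaking_wpm * 60
    correcting = words * profile.correction_s_per_word
    return speaking + profile.recognition_delay_s + correcting + profile.confirm_s


# Placeholder values only: a 20-word segment, typed at 40 words per minute,
# versus dictated at 120 words per minute with some correction afterwards.
profile = DictationProfile(
    speaking_wpm=120,
    recognition_delay_s=2,
    correction_s_per_word=0.8,
    confirm_s=2,
)
print(f"typing:    {typing_seconds(20, 40):.1f} s")
print(f"dictation: {dictation_seconds(20, profile):.1f} s")
```

With these placeholder numbers both routes take roughly 30 seconds per segment: speaking is much faster than typing, but the conversion delay, the corrections and the confirmation eat up the difference. Lowering the correction time, which is what a better-trained engine should do, is the quickest way to tip the balance in favour of dictation.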
Speech-to-text software can therefore be used in normal, everyday scenarios, but it is also useful in particular situations: after a poor night’s sleep, when typing isn’t as assured as on better days, or when the client requires a more fluent or more speech-like translation. And as the learning curve for mastering the ins and outs of this kind of software is gentle, speech-to-text software is well suited to every translator who wants a backup method of translating texts.