When translators say they use translation software for their work, it often raises eyebrows. ‘So you are using Google Translate?’ is not an unfamiliar question. Translation software, however, does a completely different job. In this blog post I explain what translation software actually is.
Computer aided translation
My blog post about the future of CAT tools raised a couple of questions from people outside the industry. ‘What is a CAT tool?’ an acquaintance asked. ‘Is CAT a brand?’ While much has already been written about CAT tools (see my colleague, Anuschka Schutte, here), in this post I will explain how a CAT tool actually works and how it can improve the quality of translations.
CAT is an acronym for Computer Aided Translation. It is simply a piece of software that supports translators while they translate. Basically, translators open a document to be translated (the ‘source file’), do their work and export the translation (the ‘target file’) in the same formatting as the source file. A CAT tool is thus software that enables a translation, but does not do the translation. That makes a CAT tool different from Google Translate (and competing products like DeepL), which translate the text themselves.
There are many CAT tools out there, as I already showed in my blog post ‘How many CAT tools should you need?’ In that post I list about 25 different tools, most of which work offline on a computer. Recent technologies, however, have also made it possible to build online CAT tools, which allow translators to work on the go (hence the blog post about the future of CAT tools).
Of all the CAT tools, SDL Trados and MemoQ seem to be the most popular. While there are great differences under the bonnet in algorithms, features and compatibility, they basically work in the same way: source file in → translation → target file out.
Project management interface of SDL Trados Studio 2017
The advantages of a CAT tool
The advantages of using CAT software to translate a source text are many, because a CAT tool does much more than simply provide a work environment for translators. The translation software breaks the whole text into segments of one sentence each, which structures the text and makes it easy for translators to focus their thoughts and translate sentence by sentence. At segment level, however, some sentences or word groups may occur more than once in a text.
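To make the idea of segmentation concrete, here is a minimal Python sketch. It is not how SDL Trados or any other CAT tool actually segments text (real tools use configurable segmentation rules); it simply assumes that a segment ends at a full stop, exclamation mark or question mark followed by whitespace. The function names and the sample text are made up for illustration.

```python
import re
from collections import Counter


def split_into_segments(text):
    """Split text into sentence-like segments.

    Simplifying assumption: a segment ends at '.', '!' or '?'
    followed by whitespace. Abbreviations such as 'e.g.' would
    be split incorrectly by this rule.
    """
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def find_repetitions(segments):
    """Return the segments that occur more than once, with their counts."""
    counts = Counter(segments)
    return {segment: n for segment, n in counts.items() if n > 1}


sample = "The file is ready. Please check it. The file is ready. Thank you!"
segments = split_into_segments(sample)
repeated = find_repetitions(segments)

for number, segment in enumerate(segments, start=1):
    print(number, segment)
print("Repeated segments:", repeated)
```

In this sample, ‘The file is ready.’ occurs twice, so it is reported as a repetition that only needs to be translated once.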
In the example below (taken from the poem ‘Annabel Lee’ by Edgar Allan Poe), the lines marked in red are repeated text.
The CAT tool recognizes that these lines match 100%, and may already have filled in the translation of the first occurrence to ensure consistency in the translation. However, apart from the 100% matches there can also be sentences that are very close to each other, but still differ in some respects. These are called ‘fuzzy matches’. CAT tools can fill them in to improve the productivity of translators, but they need to be amended to make sure that they are translated correctly. The image below with an orange background shows the sentences that are close to each other, but still differ in wording (and can therefore have a completely different meaning from the 100% matches).
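The difference between a 100% match and a fuzzy match can be sketched in a few lines of Python. This is only an illustration: commercial CAT tools use their own matching algorithms, which will give different percentages. The sketch assumes a simple character-based similarity from Python's standard `difflib` module, and the 75% threshold and the function names are my own assumptions.

```python
from difflib import SequenceMatcher


def match_percent(stored_source, new_source):
    """Return the similarity of two source segments as a whole percentage.

    The value is rounded down, so only identical segments reach 100.
    """
    ratio = SequenceMatcher(None, stored_source, new_source).ratio()
    return int(ratio * 100)


def classify(score, fuzzy_threshold=75):
    """Label a score; the threshold of 75 is an assumed default."""
    if score == 100:
        return "100% match"
    if score >= fuzzy_threshold:
        return "fuzzy match"
    return "no match"


stored = "In a kingdom by the sea."
for candidate in (
    "In a kingdom by the sea.",
    "In this kingdom by the sea.",
    "Thank you!",
):
    score = match_percent(stored, candidate)
    print(f"{candidate!r}: {score}% ({classify(score)})")
```

A fuzzy match is only a suggestion: the stored translation is offered as a starting point, and the translator still has to adapt it to the new wording.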
To ensure the consistency of translations and improve the productivity of translators, all translated segments are stored in a translation memory (TM). This database contains all translations for a particular document or project, and CAT tools will look up the best possible match in the database for each segment. Apart from the TM, translators can also attach a terminology database (TB) to a project. This file can contain terminology for a particular field or industry, offering suggestions when a term occurs in the source document to make sure that the translation fully reflects the correct terminology.
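As a rough sketch of how a translation memory and a terminology database work together, the Python example below stores confirmed segments and looks up the best match for a new segment. Everything in it is an assumption made for illustration: the `TranslationMemory` class, the `suggest_terms` function, the 75% minimum score and the English to French sample data. Real CAT tools store far more metadata and use their own matching logic.

```python
from difflib import SequenceMatcher


class TranslationMemory:
    """A toy translation memory: confirmed source segments and their targets."""

    def __init__(self):
        self._entries = {}

    def store(self, source, target):
        """Save a confirmed translation for a source segment."""
        self._entries[source] = target

    def best_match(self, source, min_score=75):
        """Return (score, stored_source, stored_target) for the closest
        stored segment, or None if nothing reaches min_score."""
        best = None
        for stored_source, stored_target in self._entries.items():
            ratio = SequenceMatcher(None, source, stored_source).ratio()
            score = int(ratio * 100)  # rounded down: 100 means identical
            if score >= min_score and (best is None or score > best[0]):
                best = (score, stored_source, stored_target)
        return best


def suggest_terms(source, termbase):
    """Return the termbase entries whose term occurs in the source segment."""
    lowered = source.lower()
    return {
        term: target
        for term, target in termbase.items()
        if term.lower() in lowered
    }


tm = TranslationMemory()
tm.store("The file is ready.", "Le fichier est prêt.")

termbase = {"source file": "fichier source", "target file": "fichier cible"}

print(tm.best_match("The file is ready."))
print(tm.best_match("The file is ready!"))
print(suggest_terms("The source file is ready.", termbase))
```

The translation memory answers the question ‘have I translated something like this before?’, while the termbase answers ‘which approved term belongs here?’.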
Modern CAT tools already save particular terms or word combinations as terms in their translation memories, making it possible to ensure consistency without a terminology database. In the example of Annabel Lee below, the first occurrence of the ‘term’ ‘Annabel Lee’ is stored in the translation memory. Afterwards it is suggested in every sentence containing ‘Annabel Lee’, which is why these terms are highlighted in green.
In short, a CAT tool:
- structures a document by breaking it up into different sentences
- improves productivity by looking up sentences that are close to each other
- enables translators to store their work in a translation memory
- improves consistency by enabling a terminology database for particular terms
The disadvantages of a CAT tool
Over the past years I have translated millions of words with a variety of CAT tools, and they have saved me time and energy while greatly improving the quality of my work. Yet CAT tools are not flawless. One of the biggest problems is their compatibility with different source files. All CAT tools can handle major file types like Microsoft Office files and web pages. More complicated file types, however, are often not supported. Out of all the major CAT tools, only MemoQ supports InDesign design files (INDD), and even then only after conversion in the cloud. XML files are supported by almost all CAT tools, but they require advanced knowledge of regular expressions, which is beyond the skill set of many professional translators. And in practice, PDF files often turn out to be problematic as well. While many CAT tools can convert them faithfully to Microsoft Word files, and even preserve the formatting, some tools get stuck when saving the translation, making it a pain to deliver a file on time.
These problems may be some of the reasons why many professional translators have not embraced CAT tools until now, and still translate without them.
Using CAT tools in real life
However, that brings us to one of the major advantages of CAT tools that I haven’t mentioned in detail until now: CAT tools have a knack for preserving the formatting of the original document, making it possible to deliver translations with exactly the same formatting as the source document.
To demonstrate this, I will describe the process of translating the poem I used above (but please note that it is only a literal translation for demonstration purposes, without the rhyme and linguistic quality of the source text).
When I open the file in my favourite CAT tool, SDL Trados, I can add a translation memory, and Trados then splits the file into segments, each exactly one sentence long. As you can see below, the formatting of the original text is already present in the CAT tool, which is a first major advantage.
The source document opened in SDL Trados Studio 2017
I can now translate the file and, after confirming each segment, it is stored in the translation memory. The second advantage shows up when I am translating the segment containing ‘Annabel Lee’: Trados recognizes that I have already translated the name in a previous segment, and automatically suggests it when I hit the first letter of the name.
Automatic suggestion from a previous translation
That advantage becomes even clearer when I reach the segment that mentions ‘kingdom by the sea’: I previously translated a similar segment, and Trados calculates that the new segment is an 89% match and already supplies the earlier translation, so that I only need to amend it.
Example of fuzzy match (89%) in SDL Trados Studio 2017
Once I have confirmed it, Trados automatically applies my translation to the other segments in the poem containing the same sentence:
Example of 100% match and 99% match of “In this kingdom by the sea”
And when I make a mistake, the built-in spellcheck (Hunspell or MS Office) underlines it in red, warning me so that I can deliver a correct translation.
The last sentence is not exactly a 100% match, as it contains a comma instead of a full stop. Trados therefore shows it with a suggestion, so I know what to edit and where.
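How a tool might pinpoint such a small difference can be sketched with Python's standard `difflib` module. This is an illustration under my own assumptions, not the way Trados computes or displays its suggestions; the function name and the two sample strings are hypothetical.

```python
from difflib import SequenceMatcher


def describe_differences(stored, new):
    """List the parts where the new segment differs from the stored one.

    Each item is (operation, text_in_stored, text_in_new), where the
    operation is 'replace', 'delete' or 'insert'.
    """
    matcher = SequenceMatcher(None, stored, new)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            changes.append((tag, stored[i1:i2], new[j1:j2]))
    return changes


stored_segment = "In this kingdom by the sea."
new_segment = "In this kingdom by the sea,"

changes = describe_differences(stored_segment, new_segment)
print(changes)  # [('replace', '.', ',')]
```

With only the final punctuation mark reported as different, the translator knows that the stored translation can be reused after changing a single character.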
Suggestion for the last sentence: the comma should be replaced with a full stop
Now that I am done, I can save the file, and then another wonder happens: the translation is saved with the same formatting as the source file, so I can send it to my client without further editing and the client can use it without delay.
Side-by-side comparison of source and target file
The power of a CAT tool
A CAT tool is therefore not an automatic translation tool or translation engine, but a tool that helps translators in the translation process. It ensures a real human translation, made by professionals, while saving clients the time and resources needed to re-format a translation and shortening the time to market for books, leaflets and manuals. Translators are not designers, however, so do not expect them to solve technical design matters. Design and linguistic errors in a source text cannot be solved by translators (although a good translator will point them out to you), but having CAT tools at hand is a first step towards improving the quality and consistency of translations, and it opens up opportunities for translations that are even better than the original source text.