Feeding the translation robot

Posted on 5 April, 2017

Anybody who has ever worked on a machine translation task knows that the output of a translation engine in most cases compares badly with the work of a human translator. But although the poor quality of translation engine output can be a source of amusement, translators can still influence future output. Considering the path towards better results is not only useful for later but offers some insight into our current translation practices as well.

Machine translation as a resort for the future

One of the arguments in the discussion among translators on the rise of machine translation is that translation engines are unable to deliver creative translations. If translation engines can translate documents at all, the argument goes, they can only handle short documents or repetitive tasks. Until now they simply could not cope with marketing texts or any other texts whose translations cannot be logically or statistically determined.

It is understandable why translation engines have not yet been able to convince the translation community of their proficiency. Today most of them are based on statistical machine translation (SMT), a technology that uses algorithms to calculate the most probable match for a sentence. Put simply, in SMT the translation engine first builds an inventory of all words, word combinations, and their translations in a massive corpus. It then uses those input data to calculate a translation. This mathematical model offers no room for creativity: if a = 1 in math, it cannot be 2. So if the corpus always renders euro as dollar, the engine will never translate it as pound (just to give an example that everyone can follow). SMT will therefore always stick to the rules and never add the flair and puns that can be found in perfect creative translations.
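To make the idea concrete, here is a minimal Python sketch of the counting-and-lookup principle described above. It is a deliberately simplified illustration, not how any real SMT engine (or Slate Desktop) is implemented: real systems work with phrases, alignment models and a language model. The toy English-Dutch pairs and the names `ALIGNED_PAIRS`, `build_phrase_table` and `translate` are assumptions made up for this example.

```python
from collections import Counter, defaultdict

# Hypothetical toy "corpus": (source word, target word) pairs as they might
# be extracted from aligned English-Dutch sentences.
ALIGNED_PAIRS = [
    ("price", "prijs"),
    ("price", "prijs"),
    ("price", "tarief"),
    ("in", "in"),
    ("euro", "euro"),
    ("euro", "euro"),
    ("euro", "euro"),
    ("euro", "pond"),
]


def build_phrase_table(pairs):
    """Count how often each source word was translated as each target word."""
    table = defaultdict(Counter)
    for source, target in pairs:
        table[source][target] += 1
    return table


def translate(sentence, table):
    """Word-by-word lookup that always picks the most frequent translation."""
    output = []
    for word in sentence.lower().split():
        candidates = table.get(word)
        if candidates:
            # The statistically dominant option wins every time.
            output.append(candidates.most_common(1)[0][0])
        else:
            # Unknown words are passed through unchanged.
            output.append(word)
    return " ".join(output)


if __name__ == "__main__":
    table = build_phrase_table(ALIGNED_PAIRS)
    print(translate("price in euro", table))
```

Because the lookup always returns the most frequent option, the less common renderings in the corpus ("tarief", "pond") can never surface, however well they might fit a particular text. That is the lack of room for creativity in miniature.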

Researchers and companies alike are now putting their money on Neural Machine Translation (NMT). NMT is a hot topic nowadays, with Google claiming that ‘in some cases human and [Google’s Neural Machine Translation] translations are nearly indistinguishable’.
Still, the majority of translation engines do not use NMT for financial and logistical reasons, and Google’s claim is highly contested by industry professionals. As long as MT professionals keep disputing these results among themselves, human translators remain in the lead.

The lack of promising results in terms of creativity and the current lead of human translators in most fields do not mean that we should rest on our laurels, however. Working with many technical clients and agencies, I see that more and more companies are experimenting with translation engines. Some of them are starting from scratch, investing in a machine translation framework, using a cloud provider or building one themselves, while feeding it all the content they can find. Others have already progressed somewhat, fine-tuning their engines with ever fresh input content and feedback from users. Although it will still take years before MT has a massive market share and outpaces human translators, I cannot deny that it is happening and slowly progressing.

It is therefore not a bad idea to start looking at translation engines and learning how to use them. Boarding the MT train at a later stage is still possible without taking too much risk, but the earlier you get involved in and used to machine translation, the better you can influence future output.

Influencing the output of translation engines

One way to influence the output results is by taking care of the input yourself. Only by translating content that is used to train translation engines can you make sure that you will reach the quality level you wish to. If the translation engine is set up properly you can ensure that future translations by a machine will stick closely to your style and won’t require much in the way of costly and time-consuming edits.

That said, training translation engines to suit your style and needs is only fully possible if you own a translation engine or if you are the only translator for a particular client and/or domain. If you are part of a team of translators, it is much more difficult to influence the engine, so you will benefit far less from the translation engines you helped to train. You also risk seeing your client walk away with your valuable input and never use your translation services again.

So translating for translation engines has the most benefit and the brightest future when you own one. If you are considering adding machine translation to your skills in the future, it might be best to consider investing in your own engines. That is why I invested in Slate Desktop (review) last year. With Slate Desktop I can create my own translation engines offline and retrain them again and again to improve the quality of the machine translations until that quality is close to my own translation style.

Of course there are alternatives, like SDL Language Cloud. At the time I preferred Slate Desktop because it does not rely on the cloud and therefore seems to me less vulnerable to data theft.

Lessons from a translation engine

Starting to use Slate Desktop taught me some important lessons. The first was that an algorithm renders creative translations useless and that engines cannot easily cope with creativity in translations. In the past I had always translated creative texts directly in my CAT tools, while now I use my translation memories to generate a translation engine. That engine, however, did not know how to make use of those texts. It was simply unable to translate creativity and flair into a fluent sentence because it only used a mathematical model for translation. This resulted in useless sentences with multiple nouns and exclamation marks that completely missed the point (or puns).

The second lesson I learned was that using synonyms did not work well either. Whereas in ‘translations with flair’ a synonym was sometimes the best option to convey the message, my engines got stuck on that flair.

A third lesson Slate Desktop taught me was that you can only train engines well with highly specialized translation memories. I had often used one general translation memory to store all my creative translations, and for engine training that way of working proved to be completely useless. Creative translations are simply too varied to be useful for a mathematical model. On the other hand, the machine translations sometimes helped in that the generated sentences contained words I had not come up with myself and which I sometimes had not used for years. In that way the engine came up with alternatives I could use to bring my translation to even higher levels.

Translating for machines

The only way to circumvent the above problems is to adapt your translations to the logic of a translation engine, which is actually a 180-degree change in the approach to translation. After all, translations should have style and should read as if they were not translations. Translating your texts so that mathematical algorithms can deal with them therefore feels a bit awkward. Yet it is a great way to get the most out of translation engines.

If you are using a translation engine (or plan to use one), it would be good to take note of how it produces translations. Are they awkward, illegible, or utter nonsense? You then have a chance to influence future machine translation results by adapting your translations to the logic underlying the engine. Basically it boils down to the following points:

Make sure you do a literal translation. Literal does not mean that the syntax of the translation should mirror the syntax of the source text, but that you leave out as much flair and creativity as possible.
Do not split translations into more sentences than is strictly necessary. Sometimes it is unavoidable, and your translation engine will learn how to deal with it, but it may still have difficulties with this even after long training.
Translate each word in its own way. Many languages have synonyms and other words that in particular contexts can be translated either identically or differently. Give every source word its own fixed translation to avoid nuanced differences in machine translations.
Make sure that every tag is in the right place. Slate Desktop does not place tags in the translation, but other translation engines do. By positioning tags in the right way you make sure that the engine will do it itself in the future.
Use a specific translation memory for each and every domain (or client, but domain seems to work better). Every domain has its own specialties and oddities, and training an engine with a specific memory will avoid confusion in its ‘brains’. Indeed, a ‘nut’ in technical documentation is entirely different from a ‘nut’ in a food recipe, isn’t it?
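
Several of the points above can be checked mechanically before a translation memory is used for training. The following Python sketch is one possible way to do that, under stated assumptions: translation units are modelled as plain (source, target, domain) records rather than read from a real TMX file, tags are assumed to look like `<b>` or `{1}`, and sentence counting is a crude punctuation count. The names `Unit`, `audit` and `split_by_domain` are hypothetical and not part of Slate Desktop or any CAT tool. The tag check only compares which tags occur, not their position, and the consistency check works on whole segments as a rough proxy for word-level consistency.

```python
import re
from collections import Counter, defaultdict
from typing import NamedTuple


class Unit(NamedTuple):
    """One translation unit; a hypothetical stand-in for a TM entry."""
    source: str
    target: str
    domain: str


# Assumed tag formats: HTML-like tags and numbered placeholders.
TAG_PATTERN = re.compile(r"</?[A-Za-z][^>]*>|\{\d+\}")
# Crude sentence boundary: closing punctuation followed by space or end.
SENTENCE_END = re.compile(r"[.!?]+(?:\s|$)")


def tags(text):
    """Return a multiset of the tags found in a segment."""
    return Counter(TAG_PATTERN.findall(text))


def sentence_count(text):
    """Count sentence-ending punctuation marks in a segment."""
    return len(SENTENCE_END.findall(text))


def audit(units):
    """Return (issue, unit) pairs for units that may confuse an engine."""
    issues = []
    targets_by_source = defaultdict(set)
    for unit in units:
        targets_by_source[(unit.domain, unit.source)].add(unit.target)
        if tags(unit.source) != tags(unit.target):
            issues.append(("tag mismatch", unit))
        if sentence_count(unit.target) > sentence_count(unit.source):
            issues.append(("extra sentence split", unit))
    for unit in units:
        # Same source segment, several different targets within one domain.
        if len(targets_by_source[(unit.domain, unit.source)]) > 1:
            issues.append(("inconsistent translation", unit))
    return issues


def split_by_domain(units):
    """Group units per domain so each engine is trained on one memory."""
    by_domain = defaultdict(list)
    for unit in units:
        by_domain[unit.domain].append(unit)
    return dict(by_domain)


# Made-up English-Dutch sample data for illustration only.
SAMPLE_UNITS = [
    Unit("Tighten the <b>nut</b>.", "Draai de <b>moer</b> vast.", "technical"),
    Unit("Tighten the <b>nut</b>.", "Zet de moer vast.", "technical"),
    Unit("Add one nut.", "Voeg een noot toe.", "recipes"),
]

if __name__ == "__main__":
    for issue, unit in audit(SAMPLE_UNITS):
        print(f"{issue}: {unit.source!r} -> {unit.target!r}")
    for domain, members in split_by_domain(SAMPLE_UNITS).items():
        print(domain, len(members))
```

In this sample the second unit is flagged for dropping the tags, and both "technical" units are flagged because the same source segment has two different translations. Keeping "nut" in separate technical and recipe memories avoids mixing "moer" and "noot" in one engine.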

Back to creativity

One might argue that this approach is a genuflection to translation engines. Indeed, if this is how to approach translation for the future, creativity will die and machines will win. That, however, is only a part of the truth. As with each translation, a creative translation – or even transcreation – has to be checked and edited after it has been translated. That is the same approach you should use after starting to use machine translations. After you have trained your translation engine and obtained satisfying results (i.e. legible sentences without too many editing requirements), you can safely let it loose on your translations. As soon as it is ready for it, you can output your translation and start the creative process. Of course, for small tasks avoiding the machine translation step can save you time, but for larger tasks this approach may work well. Simply edit and adapt the robot’s output to give it your human touch.

And never feed your creativity back into the robot. Robots simply cannot cope with creativity. Period.

by Pieter Beens

7 Comments

Caroline P
Thank you! A very interesting and forward-looking article.:-)
9 April, 2017 at 15.48
Tom Hoar
Pieter, thank you for this landmark perspective. You’re a pioneer among the growing number of translators who are benefiting from this new desktop technology. It’s nice to see you experienced the same conclusions as the academic researchers describe in their reports with “all those lines and numbers” (George Carlin’s Hippy Dippy Weatherman). That is, some types of work, like creative writing in literary works and marketing campaigns, yield inferior results with this technology. We have a longer list on our website under the “Domains” section.
As you said, companies are experimenting. I created Slate Desktop so translators can fight fire with fire. I’m starting a whole new section on our support site with articles that describe improvement strategies for expert users. Fortunately, we’ve designed the software for beginners to experience good results from the start.
12 April, 2017 at 11.58
Terence Lewis
Hi Pieter,
Yes, I think Slate Desktop is an excellent way for translators to use proposals presented by an MT engine to advantage in their professional work. I’ve been doing a lot of practical work building Dutch-English/English-Dutch engines with an open source Neural MT toolkit over the past four months. If you ever want to discuss how this might help you please don’t hesitate to contact me at terence.lewis@language-engineer.co.uk or support@mydutchpal.com.
23 April, 2017 at 10.03
Yifan
How about mixing human and MT? Submit to MT only a portion/section of text that the human translator judges MT can do? Then MT is merely a reference tool and it doesn’t even matter if MT can do well with sentences of complicated structures.
I am a user of GT4T. It enables me to submit a portion of text to MT I choose and pastes translation. It gives me translation ideas when I am stuck, or most of time at least can save me some keystrokes.
Dallas
25 July, 2017 at 16.40
- Pieter Beens
  Sounds great, but to what extent does it improve/hinder your productivity? Selecting parts of sentences, submitting them and copying/pasting them back sounds like an intense job.
  26 July, 2017 at 07.12
Pingback:Longread: Alice in Machine Translation Land - Vertaalt.nu
Pingback:Longread: Alice in Machine Translation Land – Mr-Translator
