Neural networks and the future of translation
A conference with big brands like Apple, Google and Facebook as its main sponsors does not happen often, and certainly not in the translation industry. Nevertheless, that is exactly what happened last weekend in Copenhagen, where the 2017 Conference on Empirical Methods in Natural Language Processing (EMNLP) was held. You guessed it: participants in this weekend conference with the long and difficult title dived into machine translation (MT). The sponsorship by the big brands clearly shows a huge interest in MT from companies that spend billions of dollars on research that will probably pay off some day. What is all the fuss about?
The facts
Machine translation is a hot potato that has been stirring the emotions of translation professionals for many years. While it often leads to fierce discussions in the industry, not many companies have shown their interest openly. So far, only a few players in the technology market have offered publicly available machine translation services, like Google with Google Translate and Microsoft with Bing Translator. However, the emergence of the cloud, together with powerful computers that can perform complex calculations within nanoseconds, is changing the landscape. After Google and Microsoft, other major tech companies are trying to secure a market share as well. This interest can be explained by the forecasts of market research companies: in April this year Global Market Insights, Inc. announced that the machine translation market will hit $1.5 billion by 2024. More recently, in August, P&S Market Research claimed that the market will reach $2,275 million by 2023.
In order to harvest some of the low-hanging fruit, Facebook has been experimenting with machine translation of its content for a couple of years now. Earlier this summer, Amazon announced that it is planning a service to rival Google Translate. Then there is Baidu, the Chinese counterpart of Google, which is applying machine learning to languages in order to offer machine translation. And finally, Apple is also in the race to create technologies for machine translation. The company even launched its own blog about machine learning.
While all these companies may have various reasons to bet on deep learning and machine learning, all their steps show that there is money to be made in these fields. At a basic level, deep learning and machine translation can reduce the cost of translations. (Microsoft already offers many knowledge base articles in different languages by human translators, and has many blog posts that are [sometimes poorly] translated by its own machine translation technology as well.) A second benefit is that these companies can offer instant translations of their own content and the content of their clients, removing the need to hire translators and wait until the content is translated into several more languages. On a higher level, however, the results of deep learning technologies can be used for other applications as well, some of which probably do not even exist at the time this blog post is published.
The emotions
The announcement that Amazon is preparing for a battle with Google Translate and the publication of the sponsors of the 2017 EMNLP conference raised interest among tech watchers and language industry professionals alike. That billion-dollar companies are diving into a completely new or different field is exciting for some people, while others regard the trend with mixed feelings. Overall, tech lovers seem to love the new path these companies are walking and cannot wait to see the first results, or even a total shake-up of the market. Language professionals, however, seem sceptical and, above all, uncertain about what these developments will bring. Both feelings are fed by the secrecy with which these companies shield their business activities. Amazon did not comment on the news about its plans, while Apple is speaking only in general terms and not explaining its targets clearly.
The developments
It is nevertheless a fact that companies have more and more technologies at their disposal that can help them use artificial intelligence in their business, and machine translation is one of the applications. While statistical machine translation was the hot potato among translators in recent years, neural machine translation is now at the heart of the debate, often leaving translation professionals confused about what it all means.
In short, statistical machine translation (SMT) is a paradigm for machine translation that generates its translations on the basis of statistical models. The data behind these models comes from an analysis of bilingual text corpora. The model analyses the texts to estimate which words and phrases in one language correspond to which in the other, and then uses those statistics to translate new sentences.
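To make the idea of learning from a bilingual corpus concrete, here is a minimal sketch in Python. It uses the classic IBM Model 1 word-alignment procedure, which is only one building block of SMT and not necessarily what any of the products mentioned here use. The three-sentence corpus and the function names train_word_model and best_translation are made up for illustration.

```python
from collections import defaultdict

def train_word_model(pairs, iterations=10):
    """Estimate P(target word | source word) with IBM Model 1 style EM.

    `pairs` is a list of (source_words, target_words) tuples.
    """
    src_vocab = {w for src, _ in pairs for w in src}
    tgt_vocab = {w for _, tgt in pairs for w in tgt}
    # Start with no knowledge: every target word is equally likely.
    prob = {(t, s): 1.0 / len(tgt_vocab) for s in src_vocab for t in tgt_vocab}

    for _ in range(iterations):
        count = defaultdict(float)  # expected (target, source) co-occurrences
        total = defaultdict(float)  # expected occurrences of each source word
        for src, tgt in pairs:
            for t in tgt:
                # Spread each target word over the source words of its sentence,
                # in proportion to the current probabilities.
                norm = sum(prob[(t, s)] for s in src)
                for s in src:
                    share = prob[(t, s)] / norm
                    count[(t, s)] += share
                    total[s] += share
        # Re-estimate the probabilities from the expected counts.
        for (t, s) in prob:
            prob[(t, s)] = count[(t, s)] / total[s] if total[s] else 0.0
    return prob

def best_translation(prob, source_word):
    """Return the target word with the highest probability for source_word."""
    candidates = {t: p for (t, s), p in prob.items() if s == source_word}
    return max(candidates, key=candidates.get)

# Toy English-German corpus (illustrative only).
corpus = [
    ("the house".split(), "das Haus".split()),
    ("the book".split(), "das Buch".split()),
    ("a book".split(), "ein Buch".split()),
]

model = train_word_model(corpus)
for word in ["the", "house", "book", "a"]:
    print(word, "->", best_translation(model, word))
```

No dictionary is given to the program: it works out the likely word pairs purely from which words keep appearing together in aligned sentences. Real SMT systems add phrase tables, a language model and far larger corpora on top of this.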
Neural machine translation (NMT), on the other hand, is a technology that uses a large neural network. It departs from phrase-based statistical machine translation and relies on deep learning instead. This approach makes use of so-called neural networks, a type of network with programs and data loosely patterned on the operation of the human brain, which learns from examples and adapts its initial settings accordingly. Neural machine translation is therefore considered more intelligent and adaptive than statistical machine translation, while also using fewer resources than its predecessor. In an interview on eMpTy Pages, Systran's Jean Senellart explains that NMT also needs less data to learn. Companies can therefore benefit from faster machine translation results for a lower investment. Nevertheless, professionals have mixed feelings about the results of NMT compared to SMT. Language technology provider Tilde published results of a comparative evaluation claiming that ‘NMT systems are up to five times better at handling word ordering and morphology, syntax and agreements (including long distance agreements) than the SMT systems’ and ‘Translations from NMT systems are more fluent and also more precise than SMT translations’. In a presentation on SlideShare (published in July 2016), however, two professionals from the language industry claimed that translators are still more enthusiastic about SMT results.
The future
Whatever the feelings of translators and sceptical professionals, NMT currently seems to hold the winning hand. The promise of technological advancement and the prospect of more AI and lower costs (apart from the investment in NMT itself) make companies more inclined to place their bets on NMT than on SMT. If so many billion-dollar companies are investing in NMT, SMT will lose for sure, even if its results are good enough to use. For translators, however, there is not much to fear. Translators who invest in SMT can generally only afford software like Slate Desktop, which has the added benefit of secure local translation engines. These translators therefore do not risk losing their data in the cloud. Most translators, however, do not invest in MT technologies and only choose to use them when it is safe, probably free, and demanded by the client. They only need to work with the output of the translation engine, whatever technology underlies it. From that perspective, it does not matter which road tech companies head down. Translators will ultimately follow that path.
Tom Hoar
Hi Pieter. Thank you for pulling together SMT/NMT ideas from many sources, and thanks for mentioning Slate Desktop. I want to point out that all of the evaluation statistics in the TAUS SlideShare presentation compare the changes in “big data” cloud systems when they migrate from SMT to NMT. They do not reflect a translator’s experience with personalized SMT.
For that comparison, readers can first read Isabella Massardo’s recent blog about her first EN-IT experience with Slate Desktop. Readers can see evaluation statistics for her Slate engine similar to those in SlideShare. Here’s my re-post: https://slate.rocks/review-who-is-a-translators-new-best-friend/
Then I compared Isabella’s Slate engine to Google’s NMT using Isabella’s own 2,353 human translated segments. She published my guest article on her blog and I re-blogged it here: https://slate.rocks/practical-mt-evaluation-for-translators/
In short, the SlideShare evaluation statistics show NMT generically improves 10% to 20% over SMT in “big data” cloud systems. Our evaluation statistics using Isabella’s translations with her Slate engine show 200% to 700% improvements over Google’s new-improved NMT, depending on which scoring system you use.
Finally, you missed a small but impressive “big data” cloud systems newcomer. Linguee recently launched the DeepL Translator. Translator reviews have declared a significant subjective perception of improved quality, but I can’t find any objective reports. In a few days, I hope to publish the same evaluation statistics results using the DeepL Translator on Isabella’s 2,353 segments. This will be an objective apples-to-apples-to-apples comparison. Stay tuned!
Pieter Beens
Hi Tom,
This comment matters. Thank you very much.
Leaving DeepL out of scope was a conscious choice: I heard the rumors and did a quick check but did not dive into the technology yet. Perhaps I will write about it later. Waiting for your comparison first 🙂
Andre Hagestedt
Hi Pieter,
DeepL is definitely worth a look. I was using Google NMT and was impressed but DeepL is definitely better, at least for English-German.
Josephine Bacon
Human translators are no more replaceable than human authors, in most circumstances. Granted, a robot can translate a weather forecast, but most translations require translator ingenuity and creativity. There is only one reason to use MT: to cheat translators out of money. Many translation agencies claim “we have to use CAT tools, our clients want it”. Their clients wouldn’t know a CAT from a dog! It is their own greed and they will come a cropper and serve them right.
Ioan Prislopeanu
Jo, there is a girl to my heart ! But – alas – there is a way around it – a narrow path that I have just discovered last week – after 35 years of being convinced that TRANSLATORS and MACHINES DON’T MIX ! Well, Josephine, they MIX and can positively interact ! But you know what? Machines – and the guys up there, the Puppeteers – need US – the Old School bastards to do the trick. Not for translating “avec panache” the recipe of a tropical cocktail or another – but for Will Shakespeare, for Goethe, for Baudelaire – even for Ernest Hemingway, the undisputed bastard of the XX-th Century…
Ioan Prislopeanu
Yes, my friends ! There is a way for the Machine to steal our immortal souls from us – and this moment is not so distant as I thought; maybe a couple of decades or so. I was convinced that mechanic translations will never ever exceed the quality and complexity encountered when you ask for the Service Handbook of your new Japanese car….
There IS a way which I saw against my will – and NOW I would very much like to see the guys from Google, Microsoft and the whole bunch – to negotiate with them. Because I am a translator and a software developer. Tell me – if you know where they are ! ! prislopeanu@yahoo.com