Why I Don’t Think Supporting Machine Translation Systems Is A Good Idea
I’ve been thinking about machine translation (MT) for a while, and the recent TriKonf in Germany finally sharpened my thoughts on the topic. Machine translation was discussed there, and here are my comments.
(Please note: by “supporting” in the headline I don’t mean “using”.)
First, I would like to say that I’m no big fan of MT. I don’t care or mind too much about MT as such, and I can live happily without it as a translator; on the other hand, I think that once it crosses certain lines it could have a negative impact on our work. It is how and for what purpose it is used that makes the difference in my eyes. As long as MT systems are used to support pro bono efforts, or to share medical knowledge that saves lives where this would otherwise not happen because of budget restrictions, I have no problem.
What concerns me (also given how easily some colleagues might accept help in the form of semi-artificial pre-translations from MT systems without realizing the potential risks of the whole chain) is the following:
1) Some data from translators helps improve MT mechanisms used for commercial purposes, i.e. to partially or fully replace paid translators
While some suggest “all is fine, they predicted miracles already 50 years ago and still nothing”, experts in TM/MT processes such as Emmanuel Planas or Philipp Koehn have indicated that the situation is changing (maybe slowly, but surely). P. Koehn said at the TriKonf that high quality translations are “currently almost exclusively done by human translators”. This looks great at first sight, but note the “currently” and the “almost exclusively”. I think the recent boom in extensive data sharing, cloud systems etc. changes the situation because translators (or their work) are no longer isolated from each other.
Some users of MT are (depending on their settings and agreements) submitting their own translations back into MT systems, which can learn and improve based on this input. One might counter that input from a single translator has no real impact. Fine. But imagine having input from 10% of all translators to improve the MT algorithms. (As was said at the TriKonf, more is better.) In my opinion that changes the perspective.
Saying “your own input has no effect” (read “harm” for “effect”, in my eyes) sounds somewhat utilitarian to me, i.e. “it is only the large scale that makes the difference, and you can profit from what 99.99% of translators have been or will be doing, so don’t care about how your 0.01% input can affect you or the profession as a whole”. Well, that large scale had to start somewhere, right? Honestly, I don’t like this kind of justification.
(I’m aware that MT systems are loaded with plenty of data from existing translations anyway, and I know that the pool of publicly accessible translations is huge, but I see a difference between accepting or ignoring a status quo on the one hand and getting directly involved in the system on the other.)
The problem is that any data provided to an MT system is, or might be, used to teach it to create better constructions, evaluate prevalence etc., eventually leading (especially with a large pool of segments) to a better “conglomerate” than using the same words and grammar at random would produce. If these conglomerates, based on a translator’s earlier input, are then used to replace or reduce his/her own services, then providing the input means “feeding the enemy” (even if on a very small scale).
2) MT is to some extent supported through post-editing machine translation (PEMT) jobs
The other potential issue is post-editing of machine translation. While the benefit might be faster work, the risks are:
- Lower price per word (this is quite certain, unlike the benefit). Of course this would ideally be compensated by higher output, resulting in the same hourly rate, but that is not guaranteed: there are quality issues, and certain language pairs are more “demanding” than others because of their grammar, flexibility etc. Also, unless there is significant excess demand to keep the translator busy for the same number of working hours, this potential compensation is questionable, as the monthly income would decrease.
- Reinforcing the idea among price-focused clients that MT is a way to make professional translation cheaper, or even unnecessary. (A kind of vicious circle.) Imagine you come to a restaurant and, instead of ordering a steak, you hand the waitress a piece of ready-to-cook frozen meat of unknown quality and origin and ask her to warm it up. Do you like this idea? In my eyes, clients requiring PEMT act just like this.
- As above, potentially helping improve the MT algorithms (this depends on how the client treats the final translation).
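To make the arithmetic behind the first risk more tangible, here is a rough sketch in Python. All rates, speeds and volumes in it are made-up assumptions for illustration only, not data from any real project, and the function names are hypothetical. It shows two things: how much faster a translator would have to work to keep the same hourly income at a lower per-word price, and what happens to monthly income when the amount of available work stays the same.

```python
def hourly_income(rate_per_word: float, words_per_hour: float) -> float:
    """Income per hour of work at a given per-word rate and speed."""
    return rate_per_word * words_per_hour


def break_even_speedup(full_rate: float, pemt_rate: float) -> float:
    """Factor by which speed must grow so the hourly income stays the same."""
    return full_rate / pemt_rate


def monthly_income(rate_per_word: float, words_per_hour: float,
                   available_words: float, max_hours: float) -> float:
    """Monthly income when work is limited by demand and by working hours."""
    capacity = words_per_hour * max_hours        # what the translator could do
    words_done = min(available_words, capacity)  # what there actually is to do
    return rate_per_word * words_done


# Hypothetical numbers, chosen only to illustrate the reasoning.
FULL_RATE, PEMT_RATE = 0.10, 0.06      # price per word
FULL_SPEED, PEMT_SPEED = 400, 700      # words per hour
DEMAND, HOURS = 40_000, 160            # words available per month, hours per month

speedup_needed = break_even_speedup(FULL_RATE, PEMT_RATE)
translation_month = monthly_income(FULL_RATE, FULL_SPEED, DEMAND, HOURS)
pemt_month = monthly_income(PEMT_RATE, PEMT_SPEED, DEMAND, HOURS)
```

With these assumed numbers, the translator would need to work about 1.67 times faster just to break even per hour. Even at 700 words per hour, where the hourly income is slightly higher (42 instead of 40), the monthly income falls from 4000 to 2400 because the volume of work has not grown. The faster work only pays off if there is enough extra demand to fill the freed hours.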
With this post, I have tried to outline some facts and risks I see in relation to MT. I decided earlier to act on what they mean (or might mean) for me and my colleagues. I say openly that I won’t be supporting this system: I won’t wilfully participate in projects that explicitly expect me to “feed” MT databases to improve the learning curve of MT, and I won’t do post-editing of machine translations. Surely I’m a single drop in the ocean, but I’m open to the idea of forming a group with other drops: MT-resistant translators.
PS I do understand that MT won’t succeed for a long time to come in certain fields or language pairs that are more sensitive to the “human touch”. I just prefer early caution to a late unpleasant surprise.