
I’m going to write again about the specialised field of translation, and only about a small part of it, so my topic should really be read as “What does the percentage score of agreement mean when machine translation is used as the basis?” I would like to invite fellow translators to comment on the issue. I also hope to provide further evidence, or at least a further warning, against using the ‘grammar-translation method’ in language teaching.

The main reason I’m writing today, plainly put, is that I’m not very experienced in working with CAT tools and machine-supported translation. So far I’ve received one contract under which I would be paid only a percentage of my full rate whenever the pre-translated version (by Google, for example) partially agrees with my corrected version. I haven’t actually received any jobs from this client yet, but I keep wondering what it means in practice. I can imagine a meaningful percentage of agreement between Dutch and German, or English, or Swedish, but my feeling is that between languages of a very different nature the estimates may be very rough, or even wrong. I explain why below.

Another reason is that I have to prepare a long word list of my own so that I can convert the whole thing into an auto-suggest dictionary or term base. I use a translation tool for this, and a global service helps me, but one feature of the software identifies and applies translations I’ve already done, drawing on the translation memory I’m creating along the way. There’s the rub. After I had translated “take note”, I was not automatically offered a similar answer when the next phrase was “take note of”. However, when the following item was “taken notes of” (I had originally made a typo by adding the -n), I was given a 71% score relative to the previous item, together with a suggested translation based on mine. On the other hand, after I did “throw (threw, thrown), hurl, fling”, I was given 85% when I came to “throw (threw, thrown), yield”. I wonder how: ‘yield’ has nothing to do with hurling or flinging, yet the similarity was rated higher here than when I accidentally used the third form of ‘take’ with the plural of ‘note’.
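I don’t know what formula my tool uses for these percentages. As a rough sketch only, assuming a plain character-level comparison (Python’s standard `difflib.SequenceMatcher`, which is my stand-in and not the tool’s actual algorithm), one can see how such a score depends purely on shared characters and not on meaning. The function name `surface_similarity` is made up for this illustration.

```python
from difflib import SequenceMatcher


def surface_similarity(a: str, b: str) -> float:
    """Return a 0..1 score based only on shared character runs.

    Assumption: this is a stand-in for a CAT tool's fuzzy-match score,
    not the formula any particular tool uses.
    """
    return SequenceMatcher(None, a, b).ratio()


pairs = [
    ("take note of", "taken notes of"),
    ("throw (threw, thrown), hurl, fling", "throw (threw, thrown), yield"),
]

for old, new in pairs:
    # The score reflects overlapping characters; it knows nothing about meaning.
    print(f"{surface_similarity(old, new):.0%}  {old!r} vs {new!r}")
```

The figures need not match the ones my tool reported. The point is only that a long shared beginning such as “throw (threw, thrown),” keeps the score high however unrelated the rest of the entry is.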

On the text level, I don’t think anyone could come up with anything better than the famous Karinthy story with the cross-translations between Hungarian and German. On a lower level, I have an idea to show what I mean by asking about this problem. Here it goes.

Let’s suppose someone has been murdered, a knife has been found next to a pool, and the identity of the victim is not yet revealed at the beginning of the news report. In Hungarian, the text could go like this:

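“Az áldozatot valószínűleg késsel ölték meg. A dikicset közvetlenül a medence mellett találta meg a rendőrség. Az elkövetés időpontja még bizonytalan, de a száraz vér okán a dikics már napok óta ott fekhetett.”

(In English, roughly: “The victim was probably killed with a knife. The police found the dikics right next to the pool. The time of the crime is still uncertain, but judging by the dried blood, the dikics may have been lying there for days.” ‘Dikics’ is a colloquial word for a knife.)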

Now, since the target language is English, the native English translator may not immediately remember what a ‘dikics’ is. He may be a tennis fan and fail to look the Hungarian word up because it reminds him of the name of Ms Dokic, the former tennis player. There is really not much difference between the two, so he may easily come up with the following translation of the second sentence:

“Dokic was found dead right next to the pool by police.” Deepest apologies to the living person, but the translation tool could well find a 90% or 95% similarity between this and my correction, on the basis of which I would have to give up quite a lot of my earnings on this sentence. However, the difference in meaning between the two sentences couldn’t be greater. The corrector would have to recreate the sentence and give it a completely different meaning, revealing that what was found next to the pool was neither a person nor dead. Not to mention the problem of defaming a person who is very much alive. And not to mention that if it really were Ms Dokic, the Hungarian text would read ‘Dokic-ot’, not ‘A dikicset’. Very different, but no MT system would understand the difference. I would venture to add that anyone checking my translation with a CAT tool would also overlook it. I would deem the original translation almost worthless, yet for the correction the corrector would receive perhaps only 10% or 20% of his full fee.
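To make the point concrete, here is a minimal sketch that compares the mistranslation with a possible correction word by word. Both the corrected sentence and the scoring method are my assumptions: the correction is just one plausible rendering, and `difflib.SequenceMatcher` over word lists merely stands in for whatever a real CAT tool computes. The name `word_overlap` is hypothetical.

```python
from difflib import SequenceMatcher


def word_overlap(a: str, b: str) -> float:
    """Return a 0..1 score from the words two sentences share in the same order."""
    # Split on whitespace; punctuation stays attached to words for simplicity.
    return SequenceMatcher(None, a.split(), b.split()).ratio()


machine_output = "Dokic was found dead right next to the pool by police."
# Assumed correction, written for this illustration only.
correction = "The knife was found right next to the pool by police."

score = word_overlap(machine_output, correction)
print(f"similarity {score:.0%}, yet the meaning is entirely different")
```

Most of the words are identical and in the same order, so a measure of this kind rates the two sentences as close, although one reports a dead person and the other a knife.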

Working on my own word list, I am also continually perturbed by the fact that English verbs are identified as nouns: “bark” as a verb invariably comes out as “kéreg” (tree bark). What’s more, the results are mostly suffixed, so “snore” becomes “horkolással” (with snoring), while “blare” becomes “Katonazenét hallottunk” (We heard military music). It boggles the mind.

It is also nice when a simple list of verbs is turned into a completely wrong sentence: from “spot, catch sight of, descry”, I get “Helyszíni, mikor a pusztaságot,” just like that, with a comma at the end. Similarly, when I use the tool for texts, it regularly upends the translation by adding a negative in Hungarian to an originally positive sentence. Why, one wonders incessantly.

For more fun for Hungarian speakers, let me quote two machine-translated Hungarian terms from TermWiki, the aspiring definition provider:

“fűszer (egészben vagy őrölve) leírás: a szerecsendió fa szürkésbarna, ovális magot. Buzogány a spice, nyert a magokat a membrán. Diós, meleg, fűszeres, édes. Felhasználás: Italok (esp. tojás nog), sütemények, cookie-kat, szószok, édes burgonya, tejsodó és kenyerek”

Folyó Georgina

“Georgina folyó ez Észak-három fő folyók, a csatorna ország nyugati Queensland áramlását rendkívül nedves években, hogy a Eyre-tó.”

No comment. But if someone were given such texts to correct, I am afraid that the fee, being based on how many of the original words resemble the correction, would not reflect the fact that the whole text would have to be redone.

Aspiring translators of unrelated languages: beware! Students swotting words of a foreign language: beware!

On the other hand, I would gladly receive any kind of feedback from En-Hun or Hun-En translators, or any others, even if they strongly disagree with the above.

by P. S.