Serious efforts to develop machine translation systems were under way soon after ENIAC was built in 1946, and the first known public trial, the Georgetown–IBM experiment, took place in January 1954. We've made remarkable progress in the past fifty years, but machine translation involves so many complex tasks that current systems give only a rough idea of the topic and content of the source document. These systems tend to hit a quality barrier beyond which they cannot go, and they work best when the subject matter is restricted, free of ambiguities and straightforward; computer manuals are the typical example of this kind of text. We'll need far more advanced systems to handle the ambiguities and inconsistencies of real-world language. It's no wonder that translation, whether performed by machine or by human, is often regarded as an art rather than an exact science.
Today we have an entire range of translation methods with varying degrees of human involvement. At the two extremes of the spectrum sit fully automatic high-quality translation, with no human involvement, and traditional human translation, with no machine translation input. Between these extremes lie human-aided machine translation and machine-aided human translation: the former is primarily a machine translation system that requires human assistance, while the latter is a human translation process that uses machine translation as an aid or tool. The term Computer Assisted Translation is often used to cover both human-aided machine translation and machine-aided human translation.
So, what's so hard about machine translation? To begin with, there's no such thing as a perfect translation, even when performed by a human expert. A variety of approaches to machine translation exist today, direct translation being the oldest and most basic. It translates text by replacing source-language words with target-language words, with the amount of analysis varying from system to system. Typically such a system contains a correspondence lexicon: lists of source-language words, patterns and phrases mapped to their target-language equivalents. The quality of the translated text depends on the size of the system's lexicon and on how smart the replacement strategies are. The main problem with this approach is its lack of contextual accuracy and its inability to capture the real meaning of the source text.
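The lookup-and-replace idea can be sketched in a few lines. This is a minimal, hypothetical illustration, not any real system: the tiny English–Spanish lexicon, the phrase table and the longest-match-first strategy are all invented for the example.

```python
# Hypothetical sketch of direct (dictionary-based) translation.
# The lexicon and phrase mappings below are invented for illustration.

# Multi-word phrase mappings are tried before single-word lookups.
PHRASES = {
    ("good", "morning"): ["buenos", "dias"],
}
LEXICON = {
    "the": "el",
    "computer": "ordenador",
    "good": "bueno",
    "morning": "manana",
}

def direct_translate(sentence):
    """Replace source words/phrases with target equivalents, left to right."""
    words = sentence.lower().split()
    out = []
    i = 0
    while i < len(words):
        # Longest match first: try a two-word phrase before a single word.
        if i + 1 < len(words) and (words[i], words[i + 1]) in PHRASES:
            out.extend(PHRASES[(words[i], words[i + 1])])
            i += 2
        else:
            # Unknown words pass through unchanged -- a common fallback.
            out.append(LEXICON.get(words[i], words[i]))
            i += 1
    return " ".join(out)

print(direct_translate("good morning the computer"))  # buenos dias el ordenador
```

Even this toy version shows the approach's weakness: it never looks at context, so word order, agreement and ambiguous words are handled only as well as the phrase list happens to cover them.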
Going a step further, syntactic transfer systems use software parsers to analyse source-language sentences, then apply linguistic and lexical rules (transfer rules) to rewrite the original parse tree so that it obeys the syntax of the target language. Interlingual systems, by contrast, translate via a central, language-neutral data representation called an interlingua; because this representation is neutral with respect to all the languages in the system, it breaks the direct word-to-word relationship that a bilingual-dictionary approach relies on. Statistical systems also translate through a correspondence lexicon, but their lexicons are constructed automatically, using alignment algorithms run over large amounts of text in each language, usually drawn from online databases.
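A transfer rule can be illustrated with one classic case: English puts adjectives before nouns, while French usually puts them after. The sketch below is hypothetical; the tuple-based parse-tree format, the single NP rule and the three-word lexicon are invented for the example.

```python
# Hypothetical sketch of a syntactic transfer rule.
# Tree format: ("LABEL", child, child, ...) with plain strings as leaves.
# The lexicon and the single rule are invented for illustration.

LEXICON = {"the": "la", "red": "rouge", "car": "voiture"}

def transfer(node):
    """Recursively rewrite a source parse tree into target-language order."""
    if isinstance(node, str):                 # leaf: translate the word
        return LEXICON.get(node, node)
    label, children = node[0], [transfer(c) for c in node[1:]]
    # Transfer rule: NP(Det, Adj, Noun) -> NP(Det, Noun, Adj)
    if label == "NP" and len(children) == 3:
        det, adj, noun = children
        return (label, det, noun, adj)
    return (label, *children)

def linearise(node):
    """Flatten a (possibly rewritten) tree back into a word string."""
    if isinstance(node, str):
        return node
    return " ".join(linearise(c) for c in node[1:])

tree = ("NP", "the", "red", "car")
print(linearise(transfer(tree)))  # la voiture rouge
```

Real transfer systems carry hundreds of such rules plus morphological processing, but the principle is the same: the parse tree, not the word sequence, is what gets rewritten.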