How does it work? | Machine translation
The idea of using electronic computers to translate text was first put forward in 1947 in the United States, shortly after the first computers appeared. The first public demonstration of machine translation took place in 1954. That system was very primitive: it had a vocabulary of only 250 words and 6 grammar rules, and could translate only a few simple phrases. Even so, the experiment drew wide attention, and research began in countries around the world, including the Soviet Union. How does a modern machine translation system work? That is the subject of today's issue.
Modern systems are built on a translation algorithm that combines formal grammar with language statistics. To learn a language, the system compares thousands of parallel texts, that is, texts that contain the same information in different languages. For each text it studies, the system builds a list of distinguishing features, for example rarely used words and special characters that appear in the text with a certain frequency.
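As a rough illustration of such a feature list, here is a minimal Python sketch. The function names, the `rare_max` cutoff, and the choice to treat every non-alphanumeric, non-space character as "special" are assumptions made for this example, not details of any particular system.

```python
import re
from collections import Counter

WORD_RE = re.compile(r"[^\W\d_]+")  # runs of letters, Unicode-aware


def corpus_word_counts(texts):
    """Count how often each lowercased word occurs across all texts."""
    counts = Counter()
    for text in texts:
        counts.update(WORD_RE.findall(text.lower()))
    return counts


def text_features(text, corpus_counts, rare_max=1):
    """Build a feature list for one text.

    A word is treated as rare when it occurs at most `rare_max` times
    in the whole corpus (an assumed cutoff). Words missing from the
    corpus counts are treated as rare too.
    """
    words = WORD_RE.findall(text.lower())
    rare_words = Counter(w for w in words if corpus_counts[w] <= rare_max)
    special_chars = Counter(
        ch for ch in text if not ch.isalnum() and not ch.isspace()
    )
    return {"rare_words": rare_words, "special_chars": special_chars}


if __name__ == "__main__":
    corpus = ["The cat sat.", "The dog sat!", "The zebra ran: fast"]
    counts = corpus_word_counts(corpus)
    print(text_features(corpus[2], counts))
```

Two texts in different languages that share many such features (the same numbers, punctuation patterns, or rare names) are plausible candidates for being parallel, though a real system would use far richer signals.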
A machine translation system usually has three main parts: a translation model, a language model, and a decoder. The translation model is a table that lists, for every word and phrase in one language, its possible translations into another language, together with the probability of each translation. The system compares not only individual words but also phrases made up of several consecutive words. The translation model for each language pair contains millions of word and phrase pairs. The language model is built by the system while it studies the texts, and it reflects how frequently words and phrases are used in the language. The translation itself is done by the decoder. It carries out a morphological and syntactic analysis of the text and, for each sentence, lists all candidate translations in decreasing order of probability. The decoder then scores every candidate against the language model for frequency of use and selects the sentence with the best combination of probability and frequency.
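The interplay of the three parts can be sketched in a few lines of Python. Everything here is a toy assumption for illustration: the phrase table entries, the bigram counts, the add-one smoothing, and the equal weighting of the two scores. Real systems use much larger models and also handle word reordering, which this sketch ignores.

```python
import math
from itertools import product

# Hypothetical translation model: source phrase -> {translation: probability}.
PHRASE_TABLE = {
    "il fait": {"he makes": 0.6, "it is": 0.4},
    "froid": {"cold": 0.9, "chilly": 0.1},
}

# Hypothetical language model: how often word pairs occur in target-language text.
BIGRAM_COUNTS = {
    ("it", "is"): 50,
    ("is", "cold"): 30,
    ("he", "makes"): 20,
    ("is", "chilly"): 5,
    ("makes", "cold"): 1,
}


def candidates(source_phrases):
    """Return (translation, probability) pairs in decreasing order of probability."""
    options = [PHRASE_TABLE[phrase].items() for phrase in source_phrases]
    results = []
    for combo in product(*options):
        text = " ".join(translation for translation, _ in combo)
        prob = math.prod(p for _, p in combo)
        results.append((text, prob))
    return sorted(results, key=lambda item: item[1], reverse=True)


def lm_logprob(sentence):
    """Score a sentence by the add-one smoothed frequency of its word pairs."""
    words = sentence.split()
    total = sum(BIGRAM_COUNTS.values()) + len(BIGRAM_COUNTS)
    return sum(
        math.log((BIGRAM_COUNTS.get(pair, 0) + 1) / total)
        for pair in zip(words, words[1:])
    )


def decode(source_phrases):
    """Pick the candidate with the best combined translation and language score."""
    best = max(
        candidates(source_phrases),
        key=lambda item: math.log(item[1]) + lm_logprob(item[0]),
    )
    return best[0]


if __name__ == "__main__":
    print(candidates(["il fait", "froid"])[0][0])  # top candidate by translation model alone
    print(decode(["il fait", "froid"]))            # choice after language model scoring
```

In this toy data the translation model alone ranks "he makes cold" first, but the language model knows that "it is" and "is cold" are far more frequent, so the decoder settles on "it is cold".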
Machine translation systems can be used not only to translate whole texts but also to translate individual words. For this they contain a complete dictionary with detailed entries for words and expressions. The dictionary is built from statistical data combined with the rules of the language. For the machine dictionary, the system selects only the dictionary forms of words and set expressions. It carries out a morphological and syntactic analysis and determines the part of speech, the dictionary form of each word, and the boundaries of phrases. This information helps weed out incomplete phrases. To avoid errors, an algorithm based on machine learning checks all candidate translation pairs and discards the unreliable ones.
The system groups translations that are close in meaning using dictionaries of synonyms. These dictionaries contain words that are often translated into the other language in the same way, or that form the same phrases with the same words. As a result, the machine dictionary holds everything it needs to know about each word and expression: its dictionary form, part of speech, meanings, and synonyms. For clarity, some systems also attach usage examples taken from parallel texts to the translations.
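A minimal Python sketch of this grouping step follows. The function name `group_translations` and the representation of the synonym dictionary as a list of sets are assumptions made for the example.

```python
def group_translations(translations, synonym_sets):
    """Group candidate translations that share a synonym set.

    `synonym_sets` is a hypothetical synonym dictionary given as a list
    of sets of words with similar meaning. A translation joins the first
    existing group that contains one of its synonyms; otherwise it
    starts a new group of its own.
    """
    groups = []
    for word in translations:
        for group in groups:
            if any(
                word in syn_set and member in syn_set
                for syn_set in synonym_sets
                for member in group
            ):
                group.append(word)
                break
        else:
            # No existing group shares a synonym set with this word.
            groups.append([word])
    return groups


if __name__ == "__main__":
    synonyms = [{"big", "large", "huge"}, {"great", "grand"}]
    print(group_translations(["big", "large", "great", "adult"], synonyms))
```

Each resulting group can then be shown as one meaning in the dictionary entry, with its members listed as synonyms.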
Because they rely on statistics, machine translation systems can change along with the language. If people start to spell a word differently, the system notices as soon as the new spelling appears in the texts it receives. To improve translation quality, the system is regularly updated and checked. High-quality machine translation of texts is still out of reach, but it already makes the work of translators much simpler and faster.