Alignment of Parallel Texts at the Level of Sentence

Dr. Sergei Potemkin & Dr. Galina Kedrova,

Faculty of Philology, Lomonosov Moscow State University, Moscow, Russia

In this article, we propose a procedure for aligning parallel texts at the sentence level. The procedure uses an online machine translation (MT) system (Google Translate), which makes it possible, in the absence of a bilingual machine-readable dictionary, to translate the source or target text and then compare the machine translation with the translation made by a professional human translator. The two translations are aligned using a dynamic programming procedure. As a measure of proximity between sentences, the number of coinciding or similar words can be used, without resorting to morphological analysis of word forms. The dynamic programming procedure finds the optimal path (in the sense of the largest number of matching words) from the beginning of the texts to their end. The method was tested on parallel corpora of Chekhov’s stories translated into English, German, French, Italian, Portuguese and Armenian. With this procedure, 85% of all sentences are matched. The remaining gaps are caused, as a rule, by the translation of one sentence as two or more, or vice versa, where two or more sentences of the source language (SL) are translated by one sentence of the target language (TL). In these cases, the affected segments are merged. Future work involves the fragmentation of sentences into phrases and words.

Keywords: Parallel Text, Chekhov, Alignment, Corpora
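
The dynamic programming step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`words`, `overlap`, `align`) and the sample sentences are assumptions made for illustration. The machine translation is assumed to have been obtained already, so no translation service is called, and only exactly coinciding words are counted ("similar" words are not handled).

```python
import re


def words(sentence):
    """Lowercased word tokens of a sentence as a set (no morphological analysis)."""
    return set(re.findall(r"\w+", sentence.lower()))


def overlap(a, b):
    """Proximity measure: number of words the two sentences share."""
    return len(words(a) & words(b))


def align(mt_sentences, human_sentences):
    """Return (i, j) index pairs on the path with the most coinciding words."""
    n, m = len(mt_sentences), len(human_sentences)
    # score[i][j]: best total overlap when aligning the first i MT sentences
    # with the first j human-translated sentences.
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sim = overlap(mt_sentences[i - 1], human_sentences[j - 1])
            score[i][j] = max(score[i - 1][j - 1] + sim,
                              score[i - 1][j],
                              score[i][j - 1])

    # Trace the optimal path back from the end of both texts.
    pairs = []
    i, j = n, m
    while i > 0 and j > 0:
        sim = overlap(mt_sentences[i - 1], human_sentences[j - 1])
        if sim > 0 and score[i][j] == score[i - 1][j - 1] + sim:
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j]:
            i -= 1  # MT sentence i-1 stays unmatched (a gap)
        else:
            j -= 1  # human sentence j-1 stays unmatched (a gap)
    pairs.reverse()
    return pairs


# Hypothetical data: MT output of the source text vs. a human translation
# in which the second sentence was split into two.
mt = ["The weather was fine.",
      "He went to the garden and sat down.",
      "Nobody came."]
human = ["The weather was lovely.",
         "He walked to the garden.",
         "He sat down on a bench.",
         "Nobody came to see him."]

print(align(mt, human))  # [(0, 0), (1, 1), (2, 3)]
```

In this toy example the human sentence at index 2 remains unmatched, which corresponds to the case of one sentence translated as two. Merging such a gap with its neighbouring segment is not shown in the sketch.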

The above abstract is part of an article accepted at The Eighth International Conference on Languages, Linguistics, Translation and Literature (WWW.LLLD.IR), 14-15 February 2023, Ahwaz.