ETAP-3

ETAP-3 is a proprietary linguistic processing system focusing on English and Russian.[1] It was developed in Moscow, Russia, at the Institute for Information Transmission Problems (Russian: Институт проблем передачи информации им. А. А. Харкевича РАН). It is a rule-based system that uses the Meaning-Text Theory as its theoretical foundation. ETAP-3 has several applications, including a machine translation tool, a converter for the Universal Networking Language, an interactive learning tool for learners of Russian, and a syntactically annotated corpus of the Russian language. Demo versions of some of these tools are available online.

Machine translation tool

The ETAP-3 machine translation tool can translate text from English into Russian and vice versa. It is a rule-based system, unlike most present-day systems, which are predominantly statistical. The system performs a syntactic analysis of the input sentence, which can be visualized as a syntax tree.
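To illustrate what a syntax tree of an input sentence can look like, here is a minimal Python sketch. It assumes a dependency-style tree, which is the usual choice in Meaning-Text Theory. The `Node` class, the `render` function, and the relation labels are hypothetical names chosen for this example and do not reproduce ETAP-3's actual output format.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """One word in a dependency-style syntax tree (illustrative layout only)."""
    word: str
    relation: str  # label of the link to the head word; names are assumptions
    children: List["Node"] = field(default_factory=list)


def render(node: Node, depth: int = 0) -> List[str]:
    """Return the tree as indented text lines, one word per line."""
    lines = ["  " * depth + f"{node.word} ({node.relation})"]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


# Hand-built tree for "The student reads a book"; not produced by ETAP-3.
tree = Node("reads", "root", [
    Node("student", "subject", [Node("the", "determiner")]),
    Node("book", "object", [Node("a", "determiner")]),
])

print("\n".join(render(tree)))
```

In this sketch the verb is the root and every other word hangs off its syntactic head, so the indentation depth shows how far a word is from the root.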

The machine translation tool uses bilingual dictionaries which contain more than 100,000 lexical entries.

UNL converter

The UNL converter based on ETAP-3 can transform English and Russian sentences into their representations in UNL (Universal Networking Language) and generate English and Russian sentences from their UNL representations.

Russian language treebank

A syntactically annotated corpus (treebank) is part of the Russian National Corpus.[2] It contains 40,000 sentences (600,000 words) that are fully syntactically and morphologically annotated. The primary annotation was produced by ETAP-3 and then manually verified by linguists, which makes the corpus a reliable tool for linguistic research.

Lexical functions learning tool

The ETAP-3 system makes extensive use of the lexical functions explored in the Meaning-Text Theory. For this reason, an interactive tool has been developed to help learners of Russian acquire lexical functions. Similar learning tools are being created for German, Spanish and Bulgarian.[3]
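In Meaning-Text Theory, a lexical function maps a keyword to the words that conventionally express a given meaning with it, for example Magn (intensifier) or Oper1 (support verb). The Python sketch below shows the idea with a small hand-written table of textbook English examples and a simple answer check of the kind a learning tool might perform. The table contents, `apply_lf`, and `check_answer` are assumptions made for illustration and are not taken from ETAP-3's dictionaries or interface.

```python
from typing import Dict, List, Tuple

# Hand-written illustrative entries: (lexical function, keyword) -> values.
LEXICAL_FUNCTIONS: Dict[Tuple[str, str], List[str]] = {
    ("Magn", "rain"): ["heavy"],
    ("Magn", "applause"): ["thunderous"],
    ("Oper1", "decision"): ["make"],
    ("Oper1", "attention"): ["pay"],
}


def apply_lf(function: str, keyword: str) -> List[str]:
    """Return the known values of a lexical function for a keyword."""
    return LEXICAL_FUNCTIONS.get((function, keyword), [])


def check_answer(function: str, keyword: str, answer: str) -> bool:
    """Check a learner's answer against the table, ignoring case and spaces."""
    return answer.strip().lower() in apply_lf(function, keyword)


print(apply_lf("Magn", "rain"))
print(check_answer("Oper1", "attention", "pay"))
```

A learner asked for Oper1 of "attention" would be marked correct for "pay" and incorrect for a literal but unidiomatic choice such as "give", provided the table lists only "pay".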

References

[1] "МНОГОЦЕЛЕВОЙ ЛИНГВИСТИЧЕСКИЙ ПРОЦЕССОР ЭТАП-3" [Multipurpose linguistic processor ETAP-3]. Iitp.ru. Retrieved 2012-02-14.
[2] "Search the Corpus. Russian National Corpus". Ruscorpora.ru. Retrieved 2012-02-14.
[3] "Лаборатория № 15" [Laboratory No. 15]. Iitp.ru. Retrieved 2012-02-14.

External links

Official website with demo versions of the linguistic tools

This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.
