Russian National Corpus
Национальный Корпус Русского Языка
This website hosts a corpus of the modern Russian language comprising over 300 million words. The Corpus is a reference system based on a collection of Russian texts in electronic form.
The Corpus is intended for anyone interested in the Russian language and associated fields: professional linguists, language teachers, school and university students, and foreigners learning the language.
The Russian National Corpus includes the following subcorpora:
- The Deeply Annotated corpus, containing sentences with full morphological and syntactic markup.
- The Parallel Corpora (English, German, Ukrainian, Belorussian, and multilingual).
- The Dialectal corpus, which includes recordings of dialectal speech from various regions of Russia.
- The Poetry corpus, which supports searches not only by lexical and grammatical features but also by specifically poetic features, such as meter and rhyme type.
- The Educational corpus, a corpus of texts with disambiguated grammatical homonyms, adapted to the Russian school curriculum.
- The Corpus of Spoken Russian, which includes recordings of public and spontaneous spoken Russian and transcripts of Russian movies (1930-2007).