Russian National Corpus

September 14, 2016

Национальный Корпус Русского Языка


This website contains a corpus of the modern Russian language incorporating over 300 million words. The corpus of Russian is a reference system based on a collection of Russian texts in electronic form.

The Corpus is intended for everyone interested in the Russian language and associated fields: professional linguists, language teachers, school and university students, and foreigners learning the language.

The Russian National Corpus includes the following subcorpora:

  • The Deeply Annotated corpus, containing sentences with full morphological and syntactic markup.
  • The Parallel Corpora (English, German, Ukrainian, Belarusian, and multilingual).
  • The Dialectal corpus, which includes recordings of dialectal speech from various regions of Russia.
  • The Poetry corpus, which supports searches not only by lexical and grammatical features but also by specifically poetic features, such as meter and rhyme type.
  • The Educational corpus, a corpus of texts with disambiguated grammatical homonyms, which was adapted for the Russian school teaching program.
  • The Corpus of Spoken Russian, which includes recordings of public and spontaneous spoken Russian and transcripts of Russian movies (1930-2007).