Almaty Corpus of Kazakh
Алматы қазақ тілі корпусы
This is the first version of the Kazakh National Corpus (KNC), a tool based on a large collection of annotated texts in literary Kazakh, the official language of the Republic of Kazakhstan. The corpus will be updated regularly, in terms of both quality and quantity.
The KNC aims to be:
- a linguistically representative corpus;
- a powerful search engine which allows for complex lexical and morphological queries;
- a convenient tool for studying the Kazakh language, in which most words are accompanied by a morphological analysis and English/Russian translation equivalents;
- a diachronically oriented corpus which covers different periods in the history of the modern Kazakh language;
- a diversified corpus which includes written and oral texts of various genres;
- an annotated corpus with grammatical and metatext markup;
- an open access corpus;
- an online library with access to more than 100 works of classical Kazakh literature.