Linguistic corpus of the Crimean Tatar language

September 14, 2016

Къырымтатар тилининъ лингвистик корпусы

The Corpus contains texts written in the Cyrillic orthography, and queries should also be written in Cyrillic. For systems without a Cyrillic keyboard, an on-screen virtual keyboard can be used to insert Cyrillic characters into the query form. It is possible to search for single tokens (words), for sequences of tokens, or for tokens matching a given regular expression. The Corpus consists mostly of newspaper texts published at the beginning of the 21st century.
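
The three kinds of search can be illustrated with a minimal Python sketch. This is not the Corpus's own query engine or query syntax, which are not described here; the helper names `match_tokens` and `find_sequence` are hypothetical, and the sample tokens are simply the words of the Crimean Tatar title above, lowercased.

```python
import re

# Sample tokens: the words of the Crimean Tatar title, lowercased.
# They stand in for a tokenized corpus sentence (an assumption for illustration).
tokens = ["къырымтатар", "тилининъ", "лингвистик", "корпусы"]


def match_tokens(pattern, tokens):
    """Return the tokens that the regular expression matches in full."""
    compiled = re.compile(pattern)
    return [t for t in tokens if compiled.fullmatch(t)]


def find_sequence(sequence, tokens):
    """Return the start indices at which the token sequence occurs."""
    n = len(sequence)
    return [i for i in range(len(tokens) - n + 1) if tokens[i:i + n] == sequence]


# Single token: a literal word is a regular expression that matches only itself.
single = match_tokens("корпусы", tokens)

# Regular expression: every token beginning with "ти".
by_regex = match_tokens("ти.*", tokens)

# Sequence of tokens: two adjacent words in a fixed order.
seq = find_sequence(["лингвистик", "корпусы"], tokens)
```

Python 3 strings are Unicode, so Cyrillic patterns need no special handling. The sketch uses `fullmatch` so that a pattern must cover the whole token rather than a substring of it; whether the Corpus applies the same rule is an assumption.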

