Georgian Language Corpus

September 15, 2016

ქართული ენის კორპუსი


The Georgian Language Corpus (GLC) was developed at the Institute of Linguistic Studies of Ilia State University from 2009 to 2016. At present, the corpus includes two main sections: monolingual and bilingual. The monolingual section consists of:

  • New and Modern Georgian;
  • Old and Middle Georgian.

The New and Modern Georgian Corpus contains linguistically annotated texts from 1832 to 2012, with each word tagged with its lemma and morphosyntactic description. The annotation was carried out with the Morphological Analyzer of Modern Georgian Language, developed within a project financed by the Shota Rustaveli National Science Foundation. At present, the analyzer is being adapted to literary works in Old and Middle Georgian.
