Language Resources
The Impact Centre of Competence provides historical and named-entity lexica for the languages listed below. We also offer access to several corpora.
WHAT IS A LEXICON?
A lexicon is a structured, machine-usable repository of relevant linguistic knowledge about words in a language. A lexicon will contain historical variants (orthographical variants, inflected forms) and link them to a corresponding dictionary form in modern spelling (known as a ‘modern lemma’).
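To make the idea concrete, here is a minimal sketch of how such a repository could be modelled in code. It is an illustration only, not the format the lexica are actually distributed in: the class names `LexiconEntry` and `HistoricalLexicon`, the methods `add` and `lemmas_for`, and the sample Dutch word forms are all assumptions chosen for the example.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class LexiconEntry:
    """One modern lemma together with its attested historical word forms."""
    modern_lemma: str
    variants: set[str] = field(default_factory=set)


class HistoricalLexicon:
    """Hypothetical in-memory lexicon: historical variants -> modern lemma."""

    def __init__(self):
        self._entries = {}              # modern lemma -> LexiconEntry
        self._index = defaultdict(set)  # lower-cased variant -> modern lemmas

    def add(self, modern_lemma, variants):
        """Register spelling variants and inflected forms under a modern lemma."""
        entry = self._entries.setdefault(modern_lemma, LexiconEntry(modern_lemma))
        for variant in variants:
            entry.variants.add(variant)
            self._index[variant.lower()].add(modern_lemma)

    def lemmas_for(self, word_form):
        """Return the modern lemmas linked to a historical word form."""
        return sorted(self._index.get(word_form.lower(), set()))


# Illustrative data only: older Dutch spellings mapped to modern lemmas.
lexicon = HistoricalLexicon()
lexicon.add("mens", ["mensch", "menschen"])
lexicon.add("vis", ["visch", "visschen"])

print(lexicon.lemmas_for("Menschen"))
print(lexicon.lemmas_for("visch"))
```

A lookup keyed on the variant rather than the lemma reflects the typical use case: a word form found in a historical text is resolved to its modern dictionary form. A variant may map to more than one lemma, which is why the index stores a set.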
Historical Lexica
![Bulgarian](https://www.digitisation.eu/wp-content/uploads/2017/05/bulgarian_lexicon.png)
Bulgarian Lexicon
The current lexicon consists of 28,857 lexical entries.
![Czech](https://www.digitisation.eu/wp-content/uploads/2017/05/Czech-1.png)
Czech Lexicon
The period covered by the Historical Lexicon of Czech is 1800 – 1900.
![Dutch](https://www.digitisation.eu/wp-content/uploads/2017/05/dutch.png)
Dutch Lexicon
The period covered by the Historical Lexicon of Dutch is 1600 – 1940.
![German](https://www.digitisation.eu/wp-content/uploads/2017/05/bulgarian_lexicon.png)
German Lexicon
The German lexicon consists of 510 texts from different genres.
Corpora
![IMPACT-es diachronic corpus](https://www.digitisation.eu/wp-content/uploads/2017/05/corpus.png)
IMPACT-es Diachronic Corpus
The IMPACT-es diachronic corpus of historical Spanish comprises over one hundred books. It is accompanied by a complementary lexicon that links more than 10,000 lemmas.
![Slovene Corpora](https://www.digitisation.eu/wp-content/uploads/2017/05/Slovene-Corpora.png)
IMP Slovene Corpora
The reference corpus of historical Slovene, goo300k, contains the text of 1,100 pages sampled from the IMP collection, with hand-validated linguistic annotation.