Recent Czech Web Corpora

Authors

SUCHOMEL Vít

Year of publication 2012
Type Article in Proceedings
Conference 6th Workshop on Recent Advances in Slavonic Natural Language Processing
MU Faculty or unit

Faculty of Informatics

Citation
web https://nlp.fi.muni.cz/raslan/2012/paper11.pdf
Field Linguistics
Keywords web corpora; Czech corpus
Description We introduce the largest Czech text corpus for language research – czTenTen12 with 5.4 billion tokens. A brief comparison with other recent Czech corpora follows.