Corpus Analysis for Lexical Database Construction: A Case of Russian and Czech Wordnets
Authors | |
---|---|
Year of publication | 2004 |
Type | Article in Proceedings |
Conference | Proceedings of the 33rd International Conference on Linguistics |
MU Faculty or unit | |
Citation | |
Web | http://nlp.fi.muni.cz/publications/spbgu2004_anna_smrz |
Field | Informatics |
Keywords | corpus; lexical database; lexico-syntactic patterns; word sketches |
Description | The paper deals with corpus-based methods applied to particular tasks in lexical database construction. Different techniques of corpus analysis are discussed and their applicability to these tasks is assessed. The corpus management system Manatee + Bonito, developed at the Faculty of Informatics, Masaryk University in Brno, Czech Republic, is presented as a tool that supports all of the linguistic studies discussed. We focus mainly on substitution methods and the extraction of lexico-syntactic patterns, which are standard approaches to the creation of lexical databases. We also briefly mention the use of word sketches, a new technique in lexicography that aims to speed up corpus analysis work.