A Bayesian Approach to Query Language Identification

Authors

MATERNA Jiří, HREŠKO Juraj

Year of publication 2011
Type Article in Proceedings
Conference Recent Advances in Slavonic Natural Language Processing
MU Faculty or unit

Faculty of Informatics

Citation
Web https://nlp.fi.muni.cz/raslan/2011/paper10.pdf
Field Informatics
Keywords language identification; query language; information retrieval
Description In this paper we present a Bayesian approach to identifying the language of queries sent to an information retrieval system. The aim is to identify both the language of the query as a whole and the language of each individual word in it. The method is evaluated on a test set of manually labelled queries. On this test set, it outperforms both the Google Language Detect API and an implementation of the n-gram method.
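
The description above does not spell out the model, so the following is only a minimal sketch of what a word-level Bayesian language identifier might look like, not the authors' method. It assumes a naive Bayes model with add-alpha smoothing over whole words; the training strings, the function names (`train`, `word_log_probs`, `identify`), and the uniform priors are all hypothetical and chosen for illustration.

```python
import math
from collections import Counter

# Toy training text per language (hypothetical data, illustration only).
TRAINING = {
    "en": "cheap flights to prague weather in london the best hotel",
    "cs": "levné letenky do prahy počasí v londýně nejlepší hotel",
}


def train(corpora):
    """Return per-language word counts and the vocabulary shared by all languages."""
    counts = {lang: Counter(text.split()) for lang, text in corpora.items()}
    vocab = set().union(*counts.values())
    return counts, vocab


def word_log_probs(word, counts, vocab_size, alpha=1.0):
    """Log P(word | lang) for every language, with add-alpha smoothing.

    One extra vocabulary slot is reserved for unseen words (an assumption).
    """
    result = {}
    for lang, lang_counts in counts.items():
        total = sum(lang_counts.values())
        result[lang] = math.log(
            (lang_counts[word] + alpha) / (total + alpha * (vocab_size + 1))
        )
    return result


def identify(query, counts, vocab, priors=None):
    """Return (language of the whole query, [(word, language), ...])."""
    langs = list(counts)
    # Uniform priors unless the caller supplies others (an assumption).
    priors = priors or {lang: 1.0 / len(langs) for lang in langs}
    query_scores = {lang: math.log(priors[lang]) for lang in langs}
    word_labels = []
    for word in query.lower().split():
        log_probs = word_log_probs(word, counts, len(vocab))
        # Whole-query score: prior times the product of word likelihoods.
        for lang in langs:
            query_scores[lang] += log_probs[lang]
        # Per-word label: the language maximising prior times word likelihood.
        best = max(langs, key=lambda l: math.log(priors[l]) + log_probs[l])
        word_labels.append((word, best))
    return max(query_scores, key=query_scores.get), word_labels
```

Calling `identify("levné letenky london", *train(TRAINING))` labels the query as a whole and each word separately, which mirrors the two goals stated in the description. A realistic system would need far larger word lists and priors estimated from query logs.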