Languages of Mathematics -- Random Walking in the Mathematics of Languages

Authors

SOJKA Petr SOJKA Petr HORÁK Aleš

Year of publication 2009
Type Article in Proceedings
Conference Third Workshop on Recent Advances in Slavonic Natural Languages Processing, RASLAN 2009
MU Faculty or unit

Faculty of Informatics

Field Informatics
Keywords language of mathematics;mathematics of language;random walking;plagiarity;similarity;topicality;conarrativity;DML-CZ;EuDML
Description An essay on mathematics as a sublanguage of natural languages: how it may be represented, stored, searched and handled in (European) Digital Mathematics Library projects such as DML-CZ or EuDML. A framework is proposed for computing similar papers in a digital library. It allows several definitions of similarity: plagiarity, based on common word n-grams; topicality, based on common topics; and conarrativity, based on a shared narrative. The vector of the most similar documents for a given similarity type is suggested to be computed using the algorithm by Page for web page ranking (PageRank), often explained as "random walking".
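
The description names the ingredients but not the details, so the following is only a minimal sketch under stated assumptions, not the paper's implementation. It assumes that plagiarity can be approximated by the Jaccard overlap of word n-gram sets, and that "random walking" can be modelled as a PageRank-style walk with restart at the query document. The function names, the damping value of 0.85 and the iteration count are all hypothetical choices made for illustration.

```python
def word_ngrams(text, n=3):
    """Return the set of word n-grams (as tuples) of a text."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def ngram_similarity(a, b, n=3):
    """Assumed stand-in for 'plagiarity': Jaccard overlap of n-gram sets."""
    ga, gb = word_ngrams(a, n), word_ngrams(b, n)
    if not ga or not gb:
        return 0.0
    return len(ga & gb) / len(ga | gb)


def random_walk_scores(weights, start, damping=0.85, iterations=50):
    """PageRank-style random walk with restart at document `start`.

    weights[i][j] is the similarity between documents i and j. At each step
    the walker follows an edge with probability `damping` (chosen in
    proportion to the edge weight) and otherwise jumps back to `start`.
    """
    size = len(weights)
    restart = [1.0 if i == start else 0.0 for i in range(size)]
    scores = restart[:]
    for _ in range(iterations):
        nxt = [(1.0 - damping) * r for r in restart]
        for i in range(size):
            total = sum(weights[i])
            if total == 0.0:
                # A document with no similar neighbours sends the walker
                # back to the start, so no probability mass is lost.
                nxt[start] += damping * scores[i]
                continue
            for j in range(size):
                nxt[j] += damping * scores[i] * weights[i][j] / total
        scores = nxt
    return scores


def most_similar(docs, query_index, n=3):
    """Rank all other documents by their random-walk score from the query."""
    size = len(docs)
    weights = [
        [ngram_similarity(docs[i], docs[j], n) if i != j else 0.0
         for j in range(size)]
        for i in range(size)
    ]
    scores = random_walk_scores(weights, query_index)
    others = [j for j in range(size) if j != query_index]
    return sorted(others, key=lambda j: scores[j], reverse=True)


docs = [
    "the quick brown fox jumps over the lazy dog",
    "the quick brown fox jumps over a sleeping cat",
    "completely unrelated text about digital mathematics libraries",
]
ranking = most_similar(docs, 0)
```

In this sketch the similarity function is the only part tied to one similarity type. A topicality or conarrativity measure, however it is defined, could be passed through the same weight matrix and ranking step without changing the walk itself.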