The Art of Mathematics Retrieval
Authors | |
---|---|
Year of publication | 2011 |
Type | Article in Proceedings |
Conference | Proceedings of the 2011 ACM Symposium on Document Engineering |
Doi | http://dx.doi.org/10.1145/2034691.2034703 |
Field | Informatics |
Keywords | math indexing and retrieval; mathematical digital libraries; information systems; information retrieval; mathematical content search; document ranking of mathematical papers; math text mining; MIaS; WebMIaS |
Description | The design and architecture of MIaS (Math Indexer and Searcher), a system for mathematics retrieval, are presented, and the design decisions are discussed. We argue for an approach based on Presentation MathML that uses the similarity of math subformulae. The system was implemented as a math-aware search engine based on the state-of-the-art system Apache Lucene. Scalability was tested on more than 400,000 arXiv documents containing 158 million mathematical formulae. Almost three billion MathML subformulae were indexed using a Solr-compatible Lucene. |