Language Resources for Intelligent Processing of Dialogues about Electrical Networks

Horák, Aleš; Svoboda, Lukáš; Kadlec, Vladimír; Cenek, Pavel

Authors	HORÁK Aleš; SVOBODA Lukáš; KADLEC Vladimír; CENEK Pavel
Year of publication	2006
Type	Article in Proceedings
Conference	Proceedings of ElNet 2005
MU Faculty or unit	Faculty of Informatics
Field	Informatics
Keywords	corpora; question answering; disambiguation; electrical networks
Description	The paper describes the design of a natural language dialogue interface for querying large databases of temporal data about electrical power network failures. The first stage of implementing such a dialogue interface is the creation and preparation of several auxiliary resources required for natural language processing of texts in this specific domain. All modern methods for automatic analysis of texts from a domain with specialized terminology rely on a large collection of texts from the field, a so-called text corpus. We describe the process and statistical results of creating a corpus of electrical power network texts consisting of more than 100,000 positions (words and punctuation marks). We also offer some preliminary results of the syntactic analysis of these texts. In the last part of the paper, we present the design of a dialogue system, based on analysis techniques that use the corpus data, which will allow natural language queries (in Czech) over the database of power network failures.
Related projects:	Intelligent methods for increasing the reliability of electrical networks