Automatic Structuring of Written Texts



Authors VEBER Marek, HORÁK Aleš, JULINEK Rostislav, SMRŽ Pavel

Year of publication 1999
Type Article in Proceedings
Conference Proceedings of 2nd International Conference on Text, Speech, and Dialogue (TSD 1999)
MU Faculty or unit Faculty of Informatics

Field Use of computers, robotics and its application
Keywords text structure
Description This paper deals with automatic structuring and sentence boundary labelling in natural language texts. We describe the implemented structure tagging algorithm and the heuristic rules used for automatic or semi-automatic labelling. Within each detected sentence, the algorithm decomposes the text into clauses and then marks the parts of the text that do not form a sentence, i.e. headings, signatures, tables and other structured data. We also pay attention to the processing of matched symbols in the text, especially the analysis of direct speech notation.
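
The kind of heuristic boundary labelling described above can be illustrated with a minimal Python sketch. It is not the algorithm from the paper: the function name `split_sentences`, the abbreviation list and the two rules are assumptions chosen for illustration only. The sketch places a boundary after `.`, `!` or `?` only when the mark is outside any open matched symbol (parentheses, brackets, curly quotes for direct speech), is followed by whitespace and an uppercase letter or opening symbol, and does not end a known abbreviation.

```python
import re

# Hypothetical abbreviation list; a real system would use a much larger lexicon.
ABBREVIATIONS = {"dr.", "mr.", "mrs.", "prof.", "e.g.", "i.e.", "etc."}
# Matched symbol pairs: no boundary is placed while one of them is open.
PAIRS = {"(": ")", "[": "]", "\u201c": "\u201d"}


def split_sentences(text):
    """Return the sentences found in text by simple boundary heuristics."""
    sentences, start, stack = [], 0, []
    for i, ch in enumerate(text):
        if ch in PAIRS:
            stack.append(PAIRS[ch])      # opening symbol: remember its closer
        elif stack and ch == stack[-1]:
            stack.pop()                  # matching closer found
        elif ch in ".!?" and not stack:
            rest = text[i + 1:]
            # Rule 1: a boundary needs whitespace followed by an uppercase
            # letter or an opening symbol, unless the text ends here.
            if rest.strip() and not re.match(r"\s+[A-Z\u201c(\[]", rest):
                continue
            # Rule 2: a period that ends a known abbreviation is not a boundary.
            last_token = text[start:i + 1].split()[-1].lower()
            if ch == "." and last_token in ABBREVIATIONS:
                continue
            sentences.append(text[start:i + 1].strip())
            start = i + 1
    tail = text[start:].strip()
    if tail:
        sentences.append(tail)           # unterminated remainder, e.g. a heading
    return sentences
```

Because punctuation inside an open pair is skipped, quoted direct speech stays attached to its reporting clause. A known limitation of this sketch is that a quotation ending a sentence is merged with the sentence that follows it, since the terminal mark sits inside the quotes.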
