Preprocessing of informal mathematical discourse in context of controlled natural language

Authors

Raúl Ernesto Gutiérrez de Piñerez Reyes, Juan Francisco Díaz Frías

Publication date

2012-10-29

Conference

Proceedings of the 21st ACM International Conference on Information and Knowledge Management, pages 1632-1636

Publisher

ACM

Abstract

Informal Mathematical Discourse (IMD) is characterized by a mixture of natural language and symbolic expressions, as found in textbooks, mathematical publications, and mathematical proofs. We focus on IMD processing at the low level of discourse. In this paper, we propose a preprocessing phase that precedes IMD structure analysis within the context of Controlled Natural Language (CNL). Our contribution lies in IMD processing and the use of machine learning. First, we present a CNL, a pure corpus, and a Mathematical Treebank for processing IMD. Second, we present a preprocessing phase for IMD analysis that includes connective disambiguation and verb treatment. Finally, we obtain satisfactory results when parsing input text with a statistical parsing model. We will carry these results forward to the classification of argumentative informal practices via low-level discourse in IMD processing.
