
Topological fields chunking for German

In this paper we compare three approaches to analysing the basic structure of German sentences, namely the sentence brackets of the topological field framework (Höhle, 1986). The first approach is based on hand-written Finite-State Automata (FSA); the other two are trained on corpus data: one is a Probabilistic Context-Free Grammar (PCFG) approach, the other a classification-based Memory-Based Learning (MBL) approach. All three approaches are evaluated on a manually annotated corpus. We show that the F-score (F_beta=1) for this task is around 94% for all three approaches, which suggests that this analysis is a fruitful first step for parsing and analysing German text.
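
For readers unfamiliar with the evaluation measure, the F_beta=1 value is the harmonic mean of precision and recall over the predicted chunks. The following is a minimal sketch of how such a score is commonly computed. It is not the authors' evaluation script. The span representation `(start, end, label)`, the function names, and the example labels and spans are all assumptions made for illustration, and the numbers are not taken from the paper.

```python
def f_beta(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall (beta=1 weights them equally)."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def chunk_scores(gold, predicted):
    """Score predicted chunks against gold chunks.

    A chunk counts as correct only if its start, end and label all match
    (exact-match scoring, assumed here for illustration).
    """
    gold_set, pred_set = set(gold), set(predicted)
    correct = len(gold_set & pred_set)
    precision = correct / len(pred_set) if pred_set else 0.0
    recall = correct / len(gold_set) if gold_set else 0.0
    return precision, recall, f_beta(precision, recall)


# Hypothetical toy data: (start, end, label) spans over one sentence.
gold = [(0, 1, "VF"), (1, 2, "LK"), (2, 5, "MF"), (5, 6, "RK")]
predicted = [(0, 1, "VF"), (1, 2, "LK"), (2, 4, "MF"), (5, 6, "RK")]

precision, recall, f1 = chunk_scores(gold, predicted)
print(f"precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")
```

In this toy example one of the four predicted spans has the wrong boundary, so it counts both as a false positive and as a missed gold chunk.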


Jorn Veenstra, Frank H. Müller and Tylman Ule, Topological fields chunking for German. In: Dan Roth and Antal van den Bosch (eds.), Proceedings of CoNLL-2002, Taipei, Taiwan, 2002, pp. 56-62.