
Extracting the unextractable: A case study on verb-particles

This paper proposes a series of techniques for extracting English verb-particle constructions from raw text corpora. We first propose three basic methods, based on tagger output, chunker output and a chunk grammar, respectively; the chunk grammar method can optionally be combined with an attachment resolution module that determines the syntactic structure of verb-preposition pairs in ambiguous constructs. We then combine the three methods into a single classifier and add a number of extra lexical and frequency-based features, producing a final F-score of 0.865 over the WSJ.
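As a rough illustration of the tagger-based idea and of the F-score used for evaluation, the following minimal sketch pairs each token tagged as a particle with the nearest preceding verb. It is not the authors' implementation: the use of the Penn Treebank tag RP for particles, the five-token window (max_gap), and the function names extract_verb_particles and f_score are all assumptions made for this example.

```python
from typing import Iterable, List, Set, Tuple

# Assumption: Penn Treebank-style tags, where RP marks a particle
# and the VB* tags mark verbs.
VERB_TAGS = {"VB", "VBD", "VBG", "VBN", "VBP", "VBZ"}
PARTICLE_TAG = "RP"

Pair = Tuple[str, str]


def extract_verb_particles(tagged: List[Tuple[str, str]], max_gap: int = 5) -> List[Pair]:
    """Pair each particle with the nearest verb among the max_gap tokens to its left.

    The window size is a hypothetical parameter, not a value from the paper.
    """
    pairs = []
    for i, (word, tag) in enumerate(tagged):
        if tag != PARTICLE_TAG:
            continue
        # Scan leftwards from the particle, stopping at the window edge.
        for j in range(i - 1, max(i - 1 - max_gap, -1), -1):
            if tagged[j][1] in VERB_TAGS:
                pairs.append((tagged[j][0].lower(), word.lower()))
                break
    return pairs


def f_score(predicted: Iterable[Pair], gold: Iterable[Pair]) -> float:
    """Balanced F-score: harmonic mean of precision and recall over sets of pairs."""
    predicted_set: Set[Pair] = set(predicted)
    gold_set: Set[Pair] = set(gold)
    true_positives = len(predicted_set & gold_set)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted_set)
    recall = true_positives / len(gold_set)
    return 2 * precision * recall / (precision + recall)


# Hand-tagged toy sentence, for illustration only.
sentence = [("She", "PRP"), ("handed", "VBD"), ("the", "DT"), ("paper", "NN"), ("in", "RP")]
print(extract_verb_particles(sentence))
```

A tagger-only heuristic of this kind relies entirely on the tagger separating particles from prepositions, which is one reason to combine it with chunk-based evidence and additional features in a single classifier.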

Timothy Baldwin and Aline Villavicencio, Extracting the unextractable: A case study on verb-particles. In: Dan Roth and Antal van den Bosch (eds.), Proceedings of CoNLL-2002, Taipei, Taiwan, 2002, pp. 98-104.
Last update: September 10, 2002.