Previous abstract | Contents | Next abstract

Inductive Logic Programming for Corpus-Based Acquisition of Semantic Lexicons

In this paper, we propose an Inductive Logic Programming learning method which aims at automatically extracting special Noun-Verb (N-V) pairs from a corpus in order to build up semantic lexicons based on Pustejovsky's Generative Lexicon (GL) principles (Pustejovsky, 1995). In one of the components of this lexical model, called the qualia structure, words are described in terms of semantic roles. For example, the telic role indicates the purpose or function of an item (cut for knife), the agentive role its creation mode (build for house), etc. The qualia structure of a noun is mainly made up of verbal associations, encoding relational information. The Inductive Logic Programming learning method that we have developed enables us to automatically extract from a corpus N-V pairs whose elements are linked by one of the semantic relations defined in the qualia structure in GL, and to distinguish them, in terms of surrounding categorial context from N-V pairs also present in sentences of the corpus but not relevant. This method has been theoretically and empirically validated, on a technical corpus. The N-V pairs that have been extracted will further be used in information retrieval applications for index expansion.


Pascale Sébillot, Pierrette Bouillon and Cécile Fabre, Inductive Logic Programming for Corpus-Based Acquisition of Semantic Lexicons. In: Proceedings of CoNLL-2000 and LLL-2000, Lisbon, Portugal, 2000. [ps] [pdf] [bibtex]
Last update: June 27, 2001. erikt@uia.ua.ac.be