Semantic Lexicon Construction: Learning from Unlabeled Data via Spectral Analysis

This paper considers the task of automatically collecting words with their entity class labels, starting from a small number of labeled examples ("seed" words). We show that spectral analysis is useful for compensating for the paucity of labeled examples by learning from unlabeled data. The proposed method significantly outperforms a number of methods that employ techniques such as EM and co-training. Furthermore, when trained with 300 labeled examples and unlabeled data, it rivals Naive Bayes classifiers trained with 7500 labeled examples.

Rie Kubota Ando, Semantic Lexicon Construction: Learning from Unlabeled Data via Spectral Analysis. In: Proceedings of CoNLL-2004, Boston, MA, USA, 2004, pp. 9-16.

Last update: May 13, 2003. erikt@uia.ua.ac.be