Some statistical learning systems are evaluated using measures of distributional similarity. To handle zero events in the distributions under comparison, smoothing is frequently applied before the similarity measures themselves. Smoothing, however, alters the information in the original distribution and may add noise to the results. Here we investigate the sensitivity of entropy-based similarity measures to noise introduced by uninformative smoothing. Our experiments with two subcategorization acquisition systems show that similarity measures vary in their robustness: some are led astray by smoothing noise, while others remain resilient.
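As a hedged illustration of the setup the abstract describes, the sketch below compares two toy subcategorization-frame count distributions with an entropy-based measure (KL divergence) after add-one smoothing. The function names, the frame labels, and the choice of add-one smoothing and KL divergence are illustrative assumptions, not the paper's actual systems or measures; the point is only why smoothing is needed (zero events) and where it injects probability mass.

```python
import math

def smooth_and_normalize(counts, vocab):
    """Add-one (Laplace) smoothing over a shared vocabulary,
    so that zero-count events receive nonzero probability."""
    total = sum(counts.get(w, 0) for w in vocab) + len(vocab)
    return {w: (counts.get(w, 0) + 1) / total for w in vocab}

def kl_divergence(p, q):
    """Entropy-based measure: KL(p || q) = sum_x p(x) * log(p(x) / q(x)).
    Defined only when q(x) > 0 wherever p(x) > 0 -- the reason
    smoothing is applied before the comparison."""
    return sum(p[w] * math.log(p[w] / q[w]) for w in p if p[w] > 0)

# Two hypothetical subcategorization-frame count distributions.
frames_a = {"NP": 7, "NP_PP": 2, "S": 1}
frames_b = {"NP": 5, "NP_PP": 4}          # "S" is a zero event here

vocab = set(frames_a) | set(frames_b)
p = smooth_and_normalize(frames_a, vocab)
q = smooth_and_normalize(frames_b, vocab)

print(kl_divergence(p, q))  # nonnegative; 0 only if p == q
```

Without smoothing, the zero count for "S" in `frames_b` would make the KL divergence undefined; with it, every comparison goes through, but the smoothed mass is exactly the kind of uninformative perturbation whose effect on different measures the paper evaluates.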