Previous abstract | Contents | Next abstract

Backward machine transliteration by learning phonetic similarity

In many cross-lingual applications we need to convert a transliterated word into its original word. In this paper, we present a similarity-based framework to model the task of backward transliteration, and provide a learning algorithm to automatically acquire phonetic similarities from a corpus. The learning algorithm is based on Widrow-Hoff rule with some modifications. The experiment results show that the learning algorithm converges quickly, and the method using acquired phonetic similarities remarkably outperforms previous methods using pre-defined phonetic similarities or graphic similarities in a corpus of 1574 pairs of English names and transliterated Chinese names. The learning algorithm does not assume any underlying phonological structures or rules, and can be extended to other language pairs once a training corpus and a pronouncing dictionary are available.


Wei-Hao Lin and Hsin-Hsi Chen, Backward machine transliteration by learning phonetic similarity. In: Dan Roth and Antal van den Bosch (eds.), Proceedings of CoNLL-2002, Taipei, Taiwan, 2002, pp. 139-145. [ps] [ps.gz] [pdf] [bibtex]
Last update: September 07, 2002. erikt@uia.ua.ac.be