Machine Learning Approaches to Sentiment Analysis Using the Dutch Netlog Corpus

Sentiment analysis deals with the computational treatment of opinion, sentiment and subjectivity. We constructed and manually annotated a corpus, the Dutch Netlog Corpus, with data extracted from the social networking website Netlog. The corpus was annotated on three levels: ‘valence’, which expresses the opinion of the writer (‘positive’, ‘negative’, ‘both’, ‘neutral’ or ‘n/a’), and two levels describing language performance: ‘performance’ (‘standard’ versus ‘dialect’) and ‘chat’ (‘chat’ versus ‘non-chat’). We tackle sentiment analysis as a text classification task and employ two simple feature sets (the most frequent and the most informative words of the corpus) and three supervised classifiers as implemented in the Natural Language Toolkit (NLTK): the Naïve Bayes, Maximum Entropy and Decision Tree classifiers.
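
To make the classification setup described above more concrete, the following is a minimal sketch of the "most frequent words" feature set combined with a Naïve Bayes classifier. It is an illustration under stated assumptions, not the report's code: the training sentences are invented and are not taken from the Dutch Netlog Corpus, the function names are hypothetical, and the classifier is a small hand-written stand-in using only the Python standard library rather than NLTK's own implementation. The "most informative words" feature set and the Maximum Entropy and Decision Tree classifiers are not shown.

```python
import math
from collections import Counter, defaultdict

# Toy, invented examples for illustration only; NOT from the Dutch Netlog Corpus.
TRAIN = [
    ("super leuk feest gisteren", "positive"),
    ("echt leuk en mooi", "positive"),
    ("wat een mooi cadeau", "positive"),
    ("saai en slecht weer", "negative"),
    ("echt slecht nieuws", "negative"),
    ("wat een saai feest", "negative"),
]


def most_frequent_words(texts, n):
    """Return up to n of the most frequent lowercase tokens across all texts."""
    counts = Counter(tok for text in texts for tok in text.lower().split())
    return [word for word, _ in counts.most_common(n)]


def extract_features(text, vocabulary):
    """Binary bag-of-words: does each vocabulary word occur in the text?"""
    tokens = set(text.lower().split())
    return {word: (word in tokens) for word in vocabulary}


def train_naive_bayes(labeled_features):
    """Count label frequencies and per-label feature-value frequencies."""
    label_counts = Counter(label for _, label in labeled_features)
    value_counts = defaultdict(Counter)  # (label, feature name) -> value counts
    for features, label in labeled_features:
        for name, value in features.items():
            value_counts[(label, name)][value] += 1
    return label_counts, value_counts


def classify(model, features):
    """Return the label with the highest log-probability under the model."""
    label_counts, value_counts = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        score = math.log(count / total)  # log prior
        for name, value in features.items():
            seen = value_counts[(label, name)][value]
            # Add-one smoothing over the two possible values (True / False).
            score += math.log((seen + 1) / (count + 2))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Build the vocabulary, turn each text into a feature dict, and train.
vocabulary = most_frequent_words([text for text, _ in TRAIN], n=50)
model = train_naive_bayes(
    [(extract_features(text, vocabulary), label) for text, label in TRAIN]
)
prediction = classify(model, extract_features("leuk en mooi", vocabulary))
```

The feature dictionaries here have the shape commonly passed to NLTK classifiers (a mapping from feature name to value), so the same `extract_features` output could in principle be handed to NLTK's classifiers in place of the hand-written one; that substitution is an assumption and is not tested here.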

Issue #: 001
Author(s): Sarah Schrauwen
ISSN: 2033-3544
Published: 28/07/2010
Attachment: CTRS-001 (PDF, 2.83 MB)