News
24/05/2012: The data are available for download.
09/05/2012: The results and the list of accepted papers have been added.
03/04/2012: Updated information about the system description paper. New deadline: 16 April 2012.
03/04/2012: The final version of the evaluation script and the results have been sent to participants.
16/03/2012: The test dataset has been distributed to participants.
15/03/2012: A new version of the evaluation script for the scope detection task has been distributed to participants.
13/03/2012: A new version of the evaluation script for the scope detection task has been distributed to participants.
09/03/2012: A new version of the CD-SCO dataset has been distributed to participants.
29/02/2012: A new version of the CD-SCO dataset has been distributed to participants.
22/02/2012: A new version of the CD-SCO dataset has been distributed to participants.
19/02/2012: The schedule has been changed. Check Important Dates.
17/02/2012: Evaluation script for the scope detection task has been released.
12/02/2012: Evaluation script for the focus detection task has been released.
07/02/2012: Registration is still possible.
07/02/2012: The training and development datasets have been released.
20/01/2012: During the competition, LDC will provide an evaluation license so that the tokens corresponding to the PB-FOC dataset can be obtained free of charge. Thus, all necessary annotations will be available to participants.
16/01/2012: First call for participation.

Scope and focus of negation

Negation is a pervasive and intricate linguistic phenomenon present in all languages (Horn 1989). Despite this fact, computational semanticists mostly ignore it; current proposals to represent the meaning of text either dismiss negation or only treat it in a superficial manner. This shared task tackles two key steps in order to obtain the meaning of negated statements: scope and focus detection. Regardless of the semantic representation one favors (predicate calculus, logic forms, binary semantic relations, etc.), these tasks are the basic building blocks to process the meaning of negated statements.

The scope of negation is the part of the meaning that is negated, and the focus is the part of the scope that is most prominently negated (Huddleston and Pullum 2002). In example (1), the scope is enclosed in square brackets and the focus is underlined:

  1. [John had] never [said as much before].

The scope marks all negated concepts. In (1), the statement is strictly true if a saying event did not take place, if John was not the one who said it, if as much was not the quantity said, or if before was not the time.

The focus indicates the concepts that are intended to be negated, and it makes it possible to reveal implicit positive meaning. The implicit positive meaning of (1) is that John had said less before.

This shared task aims at detecting the scope and focus of negation.

Tasks

Two tasks and a pilot task are proposed:

  • Task 1: scope detection.

    For each negation, the negation cue and scope are marked, as well as the negated event, if any. Cues and scopes may be discontinuous. Example (2) shows an annotated sentence, where the scope is enclosed in square brackets, the cue is highlighted in bold, and the negated event is marked between asterisks; a sketch of one possible in-memory representation follows this task list.

    2. [I do]n't [*know* what made me look up], but there was a face looking in at me through the lower pane.

  • Task 2: focus detection.

    Example (3) shows an annotated sentence indicating the focus with an underline and semantic roles, as provided by PropBank, in curly brackets. Detecting the focus is useful for revealing implicit positive meaning (in (3), a decision is expected in June).

    3. {A decision A1} is{n't M-NEG} expected {until June M-TMP}

  • Pilot: detection of both scope and focus. Cancelled!

    The pilot task aims at detecting both scope and focus. The test set will be the same test set as for Task 1.
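
The sketch mentioned under Task 1 follows. It is purely illustrative: the Negation class and the token indexing are assumptions of this example, not part of the task definition, and the data are actually distributed in the column-based format described under Data format below. Token indices make discontinuous cues and scopes straightforward to express:

    from dataclasses import dataclass, field

    @dataclass
    class Negation:
        cue: list[int]                  # token indices of the negation cue
        scope: list[int]                # token indices of the scope; may be discontinuous
        event: list[int] = field(default_factory=list)  # negated event, if any

    # Example (2): "I do n't know what made me look up , but ..."
    # token index:   0  1   2    3    4    5   6    7   8
    neg = Negation(cue=[2], scope=[0, 1, 3, 4, 5, 6, 7, 8], event=[3])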

Tracks

All tasks will have a closed and an open track.

  • For the closed track, systems have to be built strictly with information contained in the given training corpus. This includes the automatic annotations that the organisation will provide for several levels of analysis (PoS tagging, parsing, semantic role labeling).
  • For the open track, systems can be developed using any kind of external tools and resources.

Datasets

Two datasets are provided: one for Task 1 and another for Task 2.
  • [CD-SCO] for Task 1. This dataset includes two stories by Conan Doyle, The Hound of the Baskervilles and The Adventure of Wisteria Lodge, for training and development. All occurrences of negation are annotated (1,056 out of 3,899 sentences), accounting for negation expressed by nouns, pronouns, verbs, adverbs, determiners, conjunctions and prepositions. For each negation, the negation cue and scope are marked, as well as the negated event, if any. Cues and scopes may be discontinuous. The annotation guidelines are published in Morante et al. (2011), Annotation of Negation Cues and their Scope. Guidelines v1.0, CLiPS Technical Report Series, and are available on-line. For testing, another story by Conan Doyle will be provided, [CD-SCO-TEST].

    An example sentence is shown in example (2) above.

    Regarding copyright, the stories by Conan Doyle are in the public domain, so CD-SCO and CD-SCO-TEST will be freely available (original text and annotations).

  • [PB-FOC] for Task 2. In this dataset, the focus of negation is annotated over the 3,993 sentences in the WSJ section of the Penn TreeBank that are marked with MNEG in PropBank. It accounts for verbal, analytical and clausal negation; the role most likely to correspond to the focus was selected as the focus. Unlike [CD-SCO], all sentences in [PB-FOC] contain a negation. 80% of [PB-FOC] will be released as the training/development set and the rest as the test set.

    An example sentence is shown in example (3) above. More information about the dataset can be found in E. Blanco and D. Moldovan (2011) Semantic Representation of Negation Using Focus Detection, in Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies (ACL-HLT 2011), Portland, OR, USA.

    Regarding copyright, PB-FOC is built on top of the Penn TreeBank. During the competition, LDC will provide an evaluation license so that the tokens corresponding to the PB-FOC dataset can be obtained free of charge.

Data format

Following previous shared tasks, all annotations will be provided in the CoNLL-2005 Shared Task format. Very briefly, each line corresponds to a token, each annotation (chunks, named entities, etc.) is provided in a column, and empty lines indicate the end of a sentence.

A sample of data will be provided very soon.
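
Meanwhile, the sketch below illustrates how files in this general format are typically read. It assumes only what is stated above (one token per line with whitespace-separated annotation columns, blank lines ending sentences); the file name and the position of the word-form column in the usage comment are hypothetical:

    def read_conll(path):
        """Yield sentences from a CoNLL-style file: one token per line,
        whitespace-separated annotation columns, blank line ends a sentence."""
        sentence = []
        with open(path, encoding="utf-8") as f:
            for line in f:
                if line.strip():
                    sentence.append(line.split())   # one entry per annotation column
                elif sentence:                      # blank line closes the sentence
                    yield sentence
                    sentence = []
        if sentence:                                # file may lack a final blank line
            yield sentence

    # Hypothetical usage, assuming the word form is in the first column:
    # for tokens in read_conll("cd-sco-train.conll"):
    #     words = [columns[0] for columns in tokens]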

Evaluation

Evaluation will be performed as follows:

Task 1: Scope detection

  • F-measure for predicting negation cues (perfect match).
  • F-measure for predicting both negation cues and scope. Evaluation will be carried out at the scope level.
  • F-measure for predicting negated events (perfect match).
  • Full evaluation: F-measure for negation cues, scope and negated events (perfect match).
  • Sentence evaluation: F-measure per sentence. A sentence is predicted correct if all negation cues, scopes and negated events are predicted exactly correct.

Task 2: Focus detection

  • F-measure for predicting focus of negation (perfect match).

Pilot: Scope and focus detection. Cancelled!

  • Full evaluation and sentence evaluation (same as for Task 1).
  • F-measure for detecting the focus of negation (perfect match).
  • Joint evaluation: F-measure per sentence; a sentence is predicted correct if all negation cues, scopes, negated events and foci are predicted exactly correct.

The evaluation scripts will be provided with the training datasets.
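
Those scripts define the authoritative matching criteria; purely for reference, the sketch below shows the standard precision/recall/F-measure computation over exact-match predictions, with gold and system annotations represented as sets of hashable span descriptions (an assumption of this example):

    def f_measure(gold, predicted):
        """F1 over exact-match items such as cue or scope spans, with gold
        and predicted annotations given as sets of hashable descriptions."""
        true_positives = len(gold & predicted)  # predictions matching gold exactly
        precision = true_positives / len(predicted) if predicted else 0.0
        recall = true_positives / len(gold) if gold else 0.0
        if precision + recall == 0.0:
            return 0.0
        return 2 * precision * recall / (precision + recall)

    # e.g. spans as (sentence_id, first_token, last_token) tuples:
    # f_measure({(7, 2, 2)}, {(7, 2, 2), (9, 0, 1)})  # -> 0.666...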

Submissions

Participants will be allowed to engage in any combination of tasks and to submit a maximum of two runs per track. Submissions should be made by sending an e-mail to the organisers with a zip or tar.gz file. The compressed file should be named according to the following convention: <SurnameName_of_registered_participant>-semst-submission-<run_number><test set name>.<extension>. The files should be organised in a directory structure as in the image below. The test set names for Task 1 are: circle, cardboard. The test set name for Task 2 is: pb.
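
For illustration only (the participant name and run number below are hypothetical): a registered participant named Jane Smith submitting run 1 on the circle test set as a tar.gz file would send SmithJane-semst-submission-1circle.tar.gz.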

Submissions can be sent by e-mail to the organisers until the 27th of March, 2012 (24:00, UTC-11:00).

Submission of papers

Participants are invited to submit a paper describing their system. The paper should include a description of the system and an evaluation of its performance, including error analysis. Papers should describe the methods used in sufficient detail for the work to be reproducible.

The system description papers must be submitted no later than April 16, 2012 (24:00, GMT-7).

The only accepted format for submitted papers is PDF. Papers should be submitted using the START system:

https://www.softconf.com/naaclhlt2012/STARSEM2012/

In "Submission Categories" you should choose "Shared Task" from the "Papers" pull down menu.

Papers should follow the NAACL 2012 guidelines:

http://www.naaclhlt2012.org/conference/conference.php

with the exception that paper reviewing will *not* be blind, so you can include authors' names, affiliations, and references in the submitted paper.

Additionally, the following restrictions apply to system description papers:

  • Paper Title: The title should start with the team ID followed by ":".
  • Page Limit: For each team paper, 6 pages for the first system and 2 extra pages for each additional different system, up to a maximum of 8 pages, plus 2 pages for references.

References and related work

  • E. Blanco and D. Moldovan. 2011. Semantic Representation of Negation Using Focus Detection. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies (ACL-HLT 2011), Portland, OR, USA.
  • I. Councill, R. McDonald, and L. Velikovich. 2010. What’s great and what’s not: learning to classify the scope of negation for improved sentiment analysis. In Proceedings of the Workshop on Negation and Speculation in Natural Language Processing, pages 51–59, Uppsala, Sweden. University of Antwerp.
  • L. R. Horn. 1989. A Natural History of Negation. University of Chicago Press, Chicago.
  • R. D. Huddleston and G. K. Pullum. 2002. The Cambridge Grammar of the English Language. Cambridge University Press, Cambridge.
  • R. Morante and W. Daelemans. 2009. A metalearning approach to processing the scope of negation. In Proceedings of CoNLL 2009, pages 28–36, Boulder, Colorado.
  • R. Morante and W. Daelemans. 2012. ConanDoyle-neg: Annotation of negation in Conan Doyle stories. In Proceedings of LREC. To appear.
  • R. Morante, S. Schrauwen, and W. Daelemans. 2011. Annotation of negation cues and their scope Guidelines v1.0. CLiPS Technical Report 3, CLiPS, Antwerp, Belgium, April.
  • Mats Rooth. 1985. Association with Focus. Ph.D. thesis, University of Massachusetts, Amherst.
  • Mats Rooth. 1992. A Theory of Focus Interpretation. Natural Language Semantics, 1:75-116.

Organisation - contact

Roser Morante, CLiPS-Computational Linguistics, University of Antwerp, Belgium.

Eduardo Blanco, Lymba Corporation, USA.

You can contact us by sending an e-mail to the addresses below. Please include [*sem-st] in the subject line.