MWE 2008

CALL FOR EVALUATION RESOURCES

LREC2008 - Towards a Shared Task for Multiword Expressions (MWE 2008)

endorsed by the ACL Special Interest Group on the Lexicon (SIGLEX)

Date: Sunday, 1 June 2008
Location: Marrakech, Morocco

Workshop web page: http://multiword.sf.net/mwe2008/



In recent years, considerable progress has been made in our understanding of multiword expressions (MWEs), in the development of algorithms for their automatic extraction from corpora, and in the automatic identification of additional properties such as morphosyntactic preferences or the interpretation of semi-compositional expressions.

It remains difficult, however, to compare the results of the many published studies on MWEs and obtain a broader perspective, because algorithms and implemented systems have been evaluated on vastly different gold standards and corpora, in different languages, for different subtypes of MWEs, etc. In order to make the next big step forward, the field of MWE research needs a shared task in which different approaches are applied to the same data sets, allowing completely new insights to be gained. Since there is as yet no clear and universally accepted definition of multiword expressions, the first instalment of this shared task will be of a more exploratory nature than the competitions that have been carried out in other areas of computational linguistics.

The MWE 2008 workshop is primarily intended as a forum for collecting, sharing and exploiting MWE evaluation resources. We solicit contributions of such resources from the MWE community, in particular:

(1) manually annotated data sets (MWE candidates marked as true and false positives, or as different subtypes of MWEs);

(2) data sets of MWEs annotated with additional properties; and

(3) lists of known MWEs, e.g. from machine-readable dictionaries.

In addition, candidate data obtained from corpora with sophisticated proprietary NLP tools may be of interest, helping researchers to apply their statistical MWE identification techniques to a broad range of languages.

The contributed resources will be made available freely for research purposes on multiword.sf.net, and should be accompanied by documentation (e.g. annotation guidelines) on the SourceForge project wiki. Contributors will be invited to submit a short paper (4 pages) describing their resource and summarising previous research carried out on these data.

After collection of the resources, teams participating in the shared task can evaluate their MWE extraction algorithms on multiple data sets and discuss implications for their generalisability and further development. At the workshop, the evaluation results of the different teams will be summarised and compared. A call for papers and participation in the shared task is being distributed separately.

SUBMISSION INFORMATION

If you are interested in contributing an MWE evaluation resource to our initiative, please contact us by e-mail to make further arrangements.

If you have a SourceForge account, you will be able to upload the resource yourself and document it on the project wiki.

IMPORTANT DATES

Resource submission deadline: February 1, 2008
Paper submission deadline: February 29, 2008
Notification of acceptance: March 28, 2008
Camera-ready papers due: April 4, 2008
Workshop date: June 1, 2008

WORKSHOP CHAIRS

Nicole Grégoire, University of Utrecht, The Netherlands

Stefan Evert, University of Osnabrueck, Germany

Brigitte Krenn, Austrian Research Institute for Artificial Intelligence (ÖFAI), Austria

CONTACT

For any inquiries regarding the workshop, please contact Nicole Grégoire (Nicole.Gregoire@let.uu.nl).