REIMEREM 2008

Workshop on Resources and Evaluation for Identity Matching, Entity Resolution and Entity Management

LREC 2008 Workshop

31 May 2008

Call for Papers

Structured repositories of data about people are being created through information extraction from unstructured text as well as from sources that may themselves be structured documents such as passports or customer transactions. Problems arise in managing these structured repositories and integrating information from diverse sources. For example, newly added information must be consistent with existing information, must avoid duplication, and must be associated with an existing entity when that is appropriate. Researchers have addressed these problems in different contexts with goals such as name or record matching, identity resolution, and entity disambiguation. In this workshop, researchers with different perspectives will focus on the development of resources, algorithms and evaluation methodologies to improve the technology for managing structured repositories of identity data.

Evaluation measures for tasks that integrate person information are especially challenging. Whereas it is generally accepted that entity extraction systems can be evaluated using MUC scoring metrics, the case is less clear for "follow-on" technologies. Even a seemingly simple task such as matching person names in a database context is deceptively complex, and although measures like precision and recall have been used to evaluate name and record matching, there are methodological issues to resolve before we can refer to a "standard" evaluation methodology for this task. Moreover, it is much less clear how to effectively evaluate identity matching, resolution, and management systems, or even what it means to perform an effective identity match, particularly in the context of data containing identity attributes of varying quality and in which we have varying degrees of confidence.

We solicit papers that address the following areas:

1. Position papers which:

· Discuss metrics for evaluation of the above-mentioned technologies

· Discuss resources that can be brought to bear on these tasks
