WSCD 2009

WSCD09: Workshop on Web Search Click Data 2009 http://research.microsoft.com/users/nickcr/wscd09/

Held in conjunction with WSDM 2009 http://www.wsdm2008.org/

February 9, 2009 Barcelona, Spain

Organizers

* Nick Craswell, Microsoft
* Rosie Jones, Yahoo! Labs
* Georges Dupret, Yahoo! Labs
* Evelyne Viegas, Microsoft

Workshop Overview

Research on search logs has been hampered by the limited availability of click datasets. This workshop is a forum for new research on Web search usage logs. It has an associated dataset, the Microsoft 2006 RFP dataset, which will be made available to participants (free of charge, but under license). Beyond work on this dataset, the workshop may also serve as a forum for other new developments in the area and for discussing desirable properties of future search log datasets.

Topics of interest include but are not restricted to:

* web mining
* information retrieval
* learning to rank
* desiderata for future click data releases
* mining semantic relationships, for example within and between the query set and document set
* analysis and correction of biases in the data
* clustering/grouping log data by topic, task, geographic location, or time
* generative models for the log events, query text and/or document text
* other tasks which can be improved with the click data

The Dataset

MSN Search query log excerpt

* 15 million queries
* Sampled over one month
* Queries from the US site (mostly English)

Per query attributes included:

1. Session ID
2. Time-stamp
3. Query string
4. Number of results on results page
5. Results page number

Data per query for each result clicked:

1. URL
2. Associated query
3. Position on results page
4. Time-stamp
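The two attribute lists above can be pictured as a pair of record types. The following is only a minimal sketch: the on-disk format of the dataset is not described here, so every class name, field name, and field type below is a hypothetical choice made for illustration, as is the `group_by_session` helper.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Click:
    """One clicked result. Field names are hypothetical, not the dataset's."""
    url: str
    query: str        # the associated query string
    position: int     # position on the results page
    timestamp: str    # time-stamp of the click, kept as raw text


@dataclass
class QueryRecord:
    """One logged query. Field names are hypothetical, not the dataset's."""
    session_id: str
    timestamp: str    # time-stamp of the query, kept as raw text
    query: str
    num_results: int  # number of results on the results page
    page_number: int  # results page number
    clicks: List[Click] = field(default_factory=list)


def group_by_session(records: List[QueryRecord]) -> Dict[str, List[QueryRecord]]:
    """Group query records by session ID, preserving input order."""
    sessions: Dict[str, List[QueryRecord]] = {}
    for record in records:
        sessions.setdefault(record.session_id, []).append(record)
    return sessions
```

Keeping time-stamps as raw text avoids assuming a particular date format; a real loader would parse them once the actual format is known.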

Due to the type of assets under consideration, the principal investigator will be asked to sign a data licensing agreement before accessing the data. The terms of the license will allow publication of results but restrict redistribution of the data and publication of detailed excerpts of the data.

Other click datasets may also be used, but it is desirable to show your findings on the shared dataset where possible.

Maximum Number of Participants: 40

Activities: Presentation and poster sessions.

Proposals

To access the data, write a one-page abstract of your proposed experiments using the data. We will check the proposals, collect the necessary paperwork, and then deliver the data on CD.

Submission details to follow on workshop website: http://research.microsoft.com/users/nickcr/wscd09/

Important Dates

* Proposals: Wednesday, September 3, 2008
* Response to proposals: Wednesday, September 10, 2008
* Paper submission: Friday, December 5, 2008
* Paper notification: Friday, January 2, 2009
* Camera ready: January 12, 2009
* Workshop: February 9, 2009

This CfP was obtained from WikiCFP