MATMT 2008

Background

The aim of the workshop "Mixing Approaches To Machine Translation" is to promote practical hybrid approaches to MT, combining resources and algorithms coming from rule-based, example-based or statistical approaches.

The boundaries between the three principal approaches to MT (rule-based, example-based, statistical) are becoming increasingly blurred:

* Phrase-based SMT models are incorporating morphology, syntax and semantics into their systems.
* Rule-based systems are using parallel corpora to enrich their lexicons and grammars, and to create new methods for disambiguation.
* Previous ASR/ALT projects have shown that an MT system can benefit from a simple combination of different MT approaches in a ROVER architecture.

Data-driven Machine Translation (example-based or statistical) is nowadays the most prevalent trend in Machine Translation research.

Translation results obtained with this approach have now reached a high level of accuracy, especially when the target language is English. However, these data-driven MT systems base their knowledge on aligned bilingual corpora, and the accuracy of their output depends heavily on the quality and size of these corpora. Large and reliable bilingual corpora are unavailable for many language pairs.

Workshop Programme

Structure:

* Invited talks
* Programme papers
* Panel discussion

Keynote speakers:

* Koehn, Philipp (University of Edinburgh, UK)
* Ney, Hermann (Rheinisch-Westfälische Technische Hochschule, Germany)
* Way, Andy (Dublin City University, Ireland)

Workshop topics

We are particularly interested in papers describing research and development in the following areas:

* Comparing different approaches for developing MT
* Methods to compare and integrate translation outputs obtained with different MT approaches
* MT evaluation methods, especially those suitable for languages with rich morphology
* Morphology-, syntax- or semantic-augmented SMT models
* Research using open-source language resources to develop hybrid MT

All contributions will be published in the workshop proceedings.

Paper submission

Papers should be written in English and no longer than 8 pages.

Use the same file template as for the TMI-07 conference.

Papers should be sent via e-mail to i.alegria@ehu.es

Important Dates

* Paper submission deadline: Nov 26, 2007
* Notification of acceptance: Jan 9, 2008
* Camera-ready papers: Jan 20, 2008
* Workshop: Feb 14, 2008

Programme committee

* Iñaki Alegria (University of the Basque Country, Donostia)
* Kutz Arrieta (Vicomtech, Donostia)
* Núria Castell (Technical University of Catalonia, TALP, Barcelona)
* Arantza Diaz de Ilarraza (University of the Basque Country, Donostia)
* David Farwell (Technical University of Catalonia, TALP, Barcelona)
* Mikel Forcada (University of Alacant, Alicante)
* Philipp Koehn (University of Edinburgh, UK)
* Lluis Marquez (Technical University of Catalonia, Barcelona) (Co-chair)
* Hermann Ney (Rheinisch-Westfälische Technische Hochschule, Aachen)
* Kepa Sarasola (University of the Basque Country, Donostia) (Co-chair)

Local organization

IXA Group, University of the Basque Country

* Alegria I., Casillas A., Díaz de Ilarraza A., Igartua J., Labaka G., Lersundi M., and Sarasola K.

Elhuyar Fundazioa

* Gurrutxaga A., Leturia I., and Saralegi X.

About the OpenMT project

The main goal of the OpenMT project is the development of Open Source Machine Translation architectures based on hybrid models and advanced semantic processors. These architectures will be open-source systems combining the three main Machine Translation frameworks, Rule-Based MT (RBMT), Statistical MT (SMT) and Example-Based MT (EBMT), into hybrid systems. The architectures defined and the results of the project will be Open Source, which will allow rapid development and adaptation of new advanced Machine Translation systems for other languages. We will test the functionality of this system with different languages: English, Spanish, Catalan and Basque. Corpora are readily available for English and Spanish, but not for the remaining languages. While the structure of some of these languages is very similar (Catalan and Spanish), others are very different (English and Basque). Basque is an agglutinative language with a very rich morphology, unlike English, Catalan and Spanish.

The main innovative points of the OpenMT project are:

* The design of hybrid systems combining traditional linguistic rules, example-based methods and statistical methods.
* Its status as an Open Source initiative.
* The use of advanced syntactic and semantic processing in MT.

This CfP was obtained from WikiCFP.