Details of Grant
 
EPSRC Reference: EP/D074959/1
Title: Discriminative Phrase-Based Statistical Machine Translation
Principal Investigator: Dr M Osborne
Other Investigators:
Researcher Co-investigator:
Project Partner:
Department: Sch of Informatics
Organisation: University of Edinburgh
Scheme: Standard Research
Starts: 01 February 2007
Ends: 31 January 2010
Value (£): 269,262
EPSRC Research Topic Classifications:
Human Communication
EPSRC Industrial Sector Classifications:
No relevance to Underpinning Sectors
Related Grants:
Panel History:
Panel Date     Panel Name                  Outcome
01 Mar 2006    People and Interactivity    Announced
Summary
Statistical Machine Translation (SMT) has improved greatly over the last decade. A striking property of these systems is that they make minimal use of linguistic knowledge about translation. All knowledge about how to translate sentences is gathered in a data-driven manner from parallel corpora (sentences paired with their translations).

Alongside this observation, and looking ahead, we expect that the volume of parallel corpora available for training will not increase at a substantial rate. This suggests that further progress in SMT will come from better modelling of the data we already have, which means bringing linguistics to the translation problem.

For some languages, linguistic constraints are easily obtained. For other languages, this information is less widely available. We intend to investigate whether an improvement in translation can be obtained even when using impoverished knowledge sources.

To carry out this integration successfully, we need a flexible framework. We shall extend an existing approach (which yields state-of-the-art results) using discriminative machine learning techniques ("maximum entropy"). These approaches will not only allow us to integrate linguistics easily into the translation process, but should also allow us to improve upon the state of the art simply through better modelling. Better modelling brings serious scaling problems, which we have experience in tackling.
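
As a rough illustration of why a maximum entropy (log-linear) model makes it easy to integrate linguistic knowledge, the sketch below scores candidate translations as a weighted sum of arbitrary feature functions and normalises the scores into a probability distribution. It is a minimal sketch, not the project's system: the feature names, the toy candidates, their scores and the weights are all hypothetical, and in practice the weights would be learned discriminatively from data.

```python
import math


def maxent_distribution(candidates, feature_fn, weights):
    """Return p(candidate | source) for each candidate under a log-linear model.

    Each candidate's score is the dot product of its feature values with
    the weight vector; probabilities are obtained with a softmax.
    """
    scores = []
    for cand in candidates:
        feats = feature_fn(cand)
        scores.append(sum(weights.get(name, 0.0) * value
                          for name, value in feats.items()))
    # Subtract the maximum score before exponentiating for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def toy_features(candidate):
    """Hypothetical features: two standard SMT scores plus one linguistic cue."""
    return {
        "translation_logprob": candidate["tm"],
        "language_model_logprob": candidate["lm"],
        # A linguistic feature is just one more entry in the feature map.
        "has_verb": 1.0 if candidate["has_verb"] else 0.0,
    }


# Invented candidates and scores, for illustration only.
candidates = [
    {"text": "he has the book read", "tm": -2.0, "lm": -6.0, "has_verb": True},
    {"text": "he has read the book", "tm": -2.5, "lm": -3.0, "has_verb": True},
    {"text": "he the book", "tm": -1.5, "lm": -5.0, "has_verb": False},
]
# Invented weights; a real system would estimate these from training data.
weights = {
    "translation_logprob": 1.0,
    "language_model_logprob": 1.0,
    "has_verb": 2.0,
}

probs = maxent_distribution(candidates, toy_features, weights)
best = candidates[max(range(len(candidates)), key=probs.__getitem__)]["text"]
```

The point of the design is that a new knowledge source, however impoverished, only requires a new feature and a weight for it; the rest of the model is unchanged.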

The language pairs we shall investigate will include German-English, Arabic-English and Chinese-English.

Finally, we shall compete in international Machine Translation evaluation exercises. This will involve automatic and manual evaluation of our translation quality, and will allow comparison of our approaches with those of other groups.

Final Report Summary
No final report summary is available for this grant.
Further Information:  
Organisation Website: http://www.ed.ac.uk