News & Events

Irish Times Article on Machine Translation

19 March 2010

Word Getting Out About Translation

[full article taken from the Irish Times online edition]

Machine translation is on the cusp of delivering convenient yet reasonably accurate tools to decipher foreign languages, writes John Cradden

IT MIGHT seem like advances in language translation technology will soon cancel out the need for kids to learn foreign languages, but there are no signs that human translators will be out of a job just yet.

The field of machine translation (MT) has been around for years but it has improved significantly over the past decade. Now it is being merged with other technologies to give us the promise of highly convenient yet reasonably accurate tools that will allow us to decipher a range of foreign languages.

At the recent Mobile World Congress in Barcelona, Google chief executive Eric Schmidt demonstrated a new prototype of his firm's visual search application, Google Goggles. It works with the company’s MT technology, Google Translate, to make a smartphone application that can read a foreign language text taken by a camera photo, such as a menu or street sign, and get it translated instantly.

Google has also confirmed that it is working on a mobile speech-to-speech translation application that it expects to become available within a couple of years. Using existing technologies in voice recognition combined with Google Translate, the firm aims to have a system capable of understanding a caller’s voice and translating it into something close to the equivalent in a foreign language. But it won’t be anywhere near perfect. It is widely acknowledged that, despite recent advances, automated MT is still crude compared to human translations.

Until recently, most MT technology worked on a rules-based approach to programming, teaching the computer the linguistic rules of two languages and inputting the necessary dictionaries. This is being overtaken by a statistical approach. This is more like educated guesswork, aided by feeding in source-language data along with their human-generated translations in the target language. As well as huge amounts of data, it also demands lots of computing power, so it is easy to see where Google has spotted an opportunity to muscle into areas previously led by Microsoft, IBM and Babelfish.

But while the search engine giant has been busy capturing many of the headlines in the area of MT, research teams involved in the Centre for Next Generation Localisation (CNGL) have also been working on speech-to-speech translation technology and other MT-related areas.

The CNGL, which is funded by Science Foundation Ireland, includes academics and researchers from Dublin City University, Trinity College Dublin, UCD and the University of Limerick.

“It’s doable already, especially in limited domains,” says Prof Andy Way, who leads the MT research group at DCU’s school of computing. “Surprisingly, given that speech contains many errors, false starts, hesitations and so on, machine translation quality doesn’t degrade much when confronted with speech input.”

The one variable that is proving a challenge, he adds, is the massive range of speakers’ voices and accents. Way’s team is also working on smarter ways to translate using the data-driven approach, and merging MT with the translation memory software that many translators now use to improve translation quality and output. “In sum, there’s no reason to be fearful of Google,” says Way.

English-Irish is among more than 50 language pairs that Google Translate can work with, but one Wicklow-based firm has managed to get a headstart by creating its own statistical MT technology for this language pair.

"Machine translation tools are not widely available for this pairing, and there were even fewer when we started out a few years ago," says English-Irish translation agency Traslán chief executive Donncha Ó Cróinín.

"So we developed our own, based on our own skills and experience in providing language tools and resources for Irish."

However, advances in machine translation along with greater availability are causing mixed feelings among professional translators.

"On the one hand, translators continue to be sceptical about the quality of machine translation, but on the other hand they are fearful that they will lose their jobs as machine translation quality improves and becomes more pervasive," says Way.

But he adds that translators will “never be out of a job” because MT quality is not good enough to replace humans. Nonetheless, it can be a useful tool when doing human or computer-assisted translations, as well as for getting the “gist” of something, he says.

“Human translators will soon have to use MT, as there’s a huge bottleneck in the amount of translation that needs doing, but we don’t have enough human translators to satisfy this demand,” says Way. “The economic situation that we all find ourselves in is also putting on huge pressure to cut translation costs.”

In what is clearly a sign of the times, DCU’s School of Applied Language and Intercultural Studies is launching a postgraduate MSc in translation technology. But senior lecturer Dr Dorothy Kenny worries that attention given to developments in MT may put off many language students wishing to become translators.

“If there is a perception out there that machines can translate adequately, the danger is that people will think there is no point in humans learning how to translate,” she says. “This would be very serious for the industry given that there is already a shortage of well-qualified translators in certain markets, despite the fact that pay and conditions are good.”

3rd Workshop on Example-Based Machine Translation

July 2009

Reflecting Traslán's position at the forefront of MT development and research, our Senior Software Developer, Dr. Declan Groves, has been invited to sit on the programme committee for the 3rd Workshop on Example-Based Machine Translation, taking place on 12-13 November 2009.

The workshop, hosted by the Centre for Next Generation Localisation at DCU, follows two successful EBMT workshops: in 2001 at MT Summit VIII in Santiago de Compostela, Spain, and in 2005 at MT Summit X in Phuket, Thailand. This year's workshop, entitled "Going open-source to revive EBMT", focuses on the open-sourcing of existing EBMT software to strengthen EBMT research and usage.

More information on the workshop, including submission dates, can be viewed here.

Machine Translation Summit and Collaboration with Microsoft Ireland

June 2009

Traslán have recently carried out collaborative work with Dag Schmidtke of Microsoft Ireland, involving research into Machine Translation and the use of post-editing. A paper based on this work has been accepted for inclusion in the Machine Translation Summit XII Conference, which takes place in Ottawa, Canada, from 26 to 30 August.

The joint Traslán-Microsoft work, entitled "Identification and Analysis of Post-Editing Patterns for MT", discusses research into the types of changes made by post-editors when correcting machine-translated output and how these edits can be automatically identified. This work could prove invaluable in reducing the amount of post-editing effort required to perfect raw MT output, thus helping to reduce translation costs. The abstract for the paper is given below:

Identification and Analysis of Post-Editing Patterns for MT

Abstract

For this work we have carried out a number of analysis experiments comparing raw MT output produced by Microsoft's Treelet MT engine (Quirk et al., 2005) with its human post-edited counterpart, for English-German and English-French. Through these experiments we identify a number of interesting post-editing patterns, both textual (string-based) and constituent-based. In this paper we discuss our analysis methodologies, present some of our results and provide information on how this type of analysis can be of benefit to translation systems and post-editors, with a view to improving initial MT output and consequently post-editor productivity. In addition, we also discuss the MT and post-editing workflow at Microsoft and results from MT post-editing pilots for a number of different language pairs.

European Association for Machine Translation Website

Sept 2008

Traslán have just completed a full bespoke re-design and re-structuring of the website for the European Association for Machine Translation (EAMT).

The newly launched website presents a clean, modern and professional design, incorporating new bespoke graphics and images, including an updated logo for the EAMT organisation. Traslán developed the new website making full use of XHTML, PHP & CSS technologies to ensure consistency throughout and the provision of accessible information. The easily-maintainable site contains many new features including integrated RSS feeds and state-of-the-art spam prevention.

The new website can be viewed here.

AMTA 2008 Keynote & Paper

Sept 2008

Traslán's Senior Software Developer, Dr. Declan Groves, has been invited to give a keynote speech at the Eighth Conference of the Association for Machine Translation in the Americas (AMTA 2008). The conference takes place from 21 to 25 October in Waikiki, Hawaii, with experts in translation software from around the world participating.

The conference includes educational events for people who are interested in how companies and government organizations use translation software in practical, real world settings. There are courses, workshops, and presentations from researchers, from developers, and from users of translation software in business and in government.

For his keynote address, entitled "Localization at Traslán: Bringing Humans Into the Loop", Dr. Groves will discuss the use of cutting-edge Machine Translation software within the company. The abstract and link to the accompanying paper are given below:

Localization at Traslán: Bringing Humans Into the Loop

Abstract:

Traslán makes full use of MT during our translation workflow, where the raw output from our MT system is passed on to human translators who perform post-editing (if necessary) to arrive at the final translation.

Within Traslán we have found that using MT has enabled us to increase the speed, accuracy and consistency of translation – elements which allow us to process larger amounts of translation with quicker turnaround times, which in turn has resulted in overall savings of approx. 20% so far.

One of the main challenges in using MT within a commercial setting is getting human translators to adopt and make full use of the technology. Within Traslán we overcome this obstacle by working closely and intensively with our translators, getting them involved directly in the development process. Doing so enables translators in turn to train new users of the system and to effectively communicate to other translators the benefits of integrating MT into the translation pipeline.


[Localization at Traslán: Bringing Humans Into the Loop (full paper)]


Sponsorship of Éire Óg GAA Club

July 2008

Traslán are proud to announce their sponsorship of Éire Óg GAA club.

Éire Óg are a local GAA club based in Co. Wicklow, Ireland, and serve the Greystones and Delgany area, with Hurling and Gaelic Football teams competing in both adult and juvenile leagues.

For more information please visit Éire Óg's official website.

Inaugural Meeting of the CNGL

May 2008

Traslán attended the Inaugural Meeting of the Centre for Next Generation Localisation (CNGL), which took place on Friday May 30th 2008 at the Helix at Dublin City University. The meeting included the signing of an Intellectual Property Framework agreement facilitating EUR 14M in industry contributions to the Centre's research.

CNGL is a Centre for Science, Engineering and Technology (CSET) established with funding of EUR 16.8M by Science Foundation Ireland (SFI). The centre brings together thirteen different partners spanning international industry, including IBM, Symantec, Microsoft and Dai Nippon Printing, local SMEs including Traslán, and Irish universities. The industry contribution will bring the total value of the centre to over EUR 30M over 5 years.

The Inaugural CNGL Convention was formally opened by Prof. Ferdinand von Prondzynski, President of Dublin City University, and was addressed by Prof. Fionn Murtagh, Director of the Information, Communications & Emergent Technologies Directorate at Science Foundation Ireland. A keynote address was given by Jaap van der Meer from The Netherlands, a Language Industry pioneer and Director of the Translation Automation User Society (TAUS).

[DCU - "Signing of CNGL Intellectual Property Agreement enables €14M Industry Contribution to Localisation Research"]
[CNGL website]