George Foster
Phone: 819-934-9373
Fax: 819-934-2607
Email: George.Foster@cnrc-nrc.gc.ca
Roland Kuhn
Phone: 819-934-9603
Fax: 819-934-2607
Email: Roland.Kuhn@cnrc-nrc.gc.ca
Michel Mellinger
Phone: 819-934-9176
Fax: 819-934-2607
Email: Michel.Mellinger@cnrc-nrc.gc.ca
The aim of the PORTAGE project is to develop technology that allows a computer to translate from one language to another, and to apply that technology to improve the productivity of human translators. The project started in September 2004. By 2006, the technology was mature enough for us to begin testing it with users from interested organizations, both governmental and commercial. Though many of the results remain confidential at the request of the organizations involved, they are extremely encouraging. We are currently implementing several commercialization scenarios, both for the PORTAGE system itself and for the diverse applications that can be derived from this technology.
As explained in more detail in the technical overview, there are two main approaches to machine translation (MT): an older approach in which experts write a set of translation rules for the computer based on their knowledge of how to translate from one language to another, and a newer approach in which the computer itself learns such rules from a huge bilingual corpus. The PORTAGE technology is based on the second, newer approach, often called "statistical machine translation" (SMT). Provided that a bilingual corpus is available for the two languages involved, namely the language one wishes to translate from (the source language) and the language one wishes to translate into (the target language), the statistical approach enables one to build a translator between them much more quickly and economically than the older approach does. Thus, although our research has focused so far on English, French, Arabic, and Chinese, the PORTAGE technology is applicable to any language pair for which there is interest and for which a bilingual corpus is available from which the system can 'learn' how to translate. In fact, the PORTAGE system has now also been tested on several additional European languages: Spanish, German, and Finnish.
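To make the idea of 'learning' from a bilingual corpus concrete, here is a minimal Python sketch of one classic statistical technique, IBM Model 1 word-translation estimation by expectation-maximization. It is an illustration only and is not a description of how PORTAGE itself works: the three-sentence English-French corpus, the function names `train_word_translation` and `best_translation`, and the iteration count are all assumptions made for this example.

```python
from collections import defaultdict


def train_word_translation(pairs, iterations=20):
    """Estimate t(target_word | source_word) with IBM Model 1 EM.

    `pairs` is a list of (source_tokens, target_tokens) sentence pairs.
    Simplification for this sketch: no NULL source word is modelled.
    """
    tgt_vocab = {w for _, tgt in pairs for w in tgt}
    uniform = 1.0 / len(tgt_vocab)
    # t[(target_word, source_word)], initialised uniformly.
    t = defaultdict(lambda: uniform)

    for _ in range(iterations):
        count = defaultdict(float)  # expected co-translation counts
        total = defaultdict(float)  # expected counts per source word
        for src, tgt in pairs:
            for tw in tgt:
                # E-step: share each target word among the source words
                # of the same sentence, in proportion to the current t.
                norm = sum(t[(tw, sw)] for sw in src)
                for sw in src:
                    frac = t[(tw, sw)] / norm
                    count[(tw, sw)] += frac
                    total[sw] += frac
        # M-step: renormalise the expected counts into probabilities.
        t = defaultdict(
            float,
            {(tw, sw): c / total[sw] for (tw, sw), c in count.items()},
        )
    return t


def best_translation(t, src_word):
    """Return the target word with the highest t(target | src_word)."""
    candidates = {tw: p for (tw, sw), p in t.items() if sw == src_word}
    return max(candidates, key=candidates.get)


# Hypothetical toy corpus: (English tokens, French tokens).
corpus = [
    ("the house".split(), "la maison".split()),
    ("the flower".split(), "la fleur".split()),
    ("a flower".split(), "une fleur".split()),
]

model = train_word_translation(corpus)
for word in ["the", "house", "flower", "a"]:
    print(word, "->", best_translation(model, word))
```

No translation rule is written by hand here: the word pairings emerge only from which words co-occur across the sentence pairs. Real systems use far larger corpora and translate phrases rather than single words.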
To ensure that PORTAGE is competitive with the world's best translation systems, we participate in international evaluations of MT performance; our participation in such competitions includes the shared translation tasks associated with the following:
As the PORTAGE system has evolved, we have steadily improved our standing in these evaluations, so that PORTAGE now ranks among the top-performing systems. For instance, our Chinese-to-English system was one of the top three systems in the 2009 NIST evaluation. The PORTAGE technology's international visibility was also heightened by our participation (October 2005 to June 2009) in the multimillion-dollar GALE (Global Autonomous Language Exploitation) project sponsored by the US Government's Defense Advanced Research Projects Agency (DARPA). The goal of the GALE project is to make foreign-language (Arabic and Chinese) speech and text accessible to monolingual English speakers, particularly in military settings. We took part as a member of the Nightingale consortium led by SRI International (California), one of the three consortia participating in the project; our role was to supply MT technology for translation from Arabic and Chinese into English. See the Nightingale consortium announcement for more details.
In addition, the PORTAGE system was used as the baseline system in the European project SMART (Statistical Multilingual Analysis for Retrieval and Translation) in which our group participated.
The PORTAGE system has now reached the point where it is positioned for supporting real-life tasks in different settings, including:
We expect PORTAGE to have an impact on several sectors, including translation and terminology, second-language education, and e-business.
In terms of technology transfer, we welcome discussion with potential industrial partners interested in any of the possible application areas for which PORTAGE is suited, such as those listed above. Further details are provided in the next section.
For additional information about the PORTAGE system, please consult the technical overview of this project.
To ensure that PORTAGE's state-of-the-art statistical machine translation (SMT) software has an impact in real-life applications, we are engaged in several commercialization scenarios.
To foster Canadian R&D in SMT and, in particular, the training of highly qualified personnel in SMT, we have been licensing PORTAGE to Canadian universities since 2006 for R&D and education purposes. This gives interested R&D teams a fully operational, state-of-the-art system that serves as an effective baseline, so that they can move ahead rapidly in their work. We extended this mode of collaboration to European machine learning laboratories during our participation in the European project SMART noted above.
For the private sector, we offer a one-year evaluation licence to Canadian companies so that they may engage in SMT activities and determine the applications and markets they wish to target with the help of PORTAGE. This one-year evaluation licence may be renewed if warranted, and it will lead to the negotiation of a commercial licence once a company has established a business plan for commercializing PORTAGE as part of its service or product offerings.
Should a company or organization, whether a potential end-user of PORTAGE or a technology provider interested in PORTAGE, first wish to explore PORTAGE applications, we are also open to R&D collaborations that demonstrate the applicability of PORTAGE. Collaborations are tailored to the specific needs identified.