Michel Mellinger
Phone: 819-934-9176
Fax: 819-934-2607
Email: Michel.Mellinger@cnrc-nrc.gc.ca
The TerminoWeb project focuses on developing technology that will allow, as a medium-term objective, the automatic construction of specialized ontologies (i.e., ontologies for specific domains), and in this way converges with the study of terminology. The project began in 2004, and a number of its aspects are being actively explored; results will be presented as they are produced (consult this page regularly for updates). TerminoWeb now processes both English and French. Other languages could be added thanks to the modular design of the system, and we are now preparing to explore bilingual English-French terminology.
The different aspects of the project are as follows:
The project is based on information extraction technology that uses linguistic patterns, and it also develops new technologies. For example, for the construction of specialized corpora, we are developing a module that post-processes the results obtained from search engines and orders the retrieved texts according to criteria of text structure (flowing text) and knowledge density (knowledge-rich contexts). This technology is unique and supports information retrieval for the purpose of ontology extraction. For term extraction, we are conducting R&D on integrating relational criteria between terms in order to determine their status (term of interest or not) in a given domain. This is also an original approach, as the criteria commonly used at this time are essentially linguistic and statistical criteria applied to individual terms only (one term at a time).
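As a rough illustration of ordering retrieved texts by text structure and knowledge density, here is a minimal Python sketch. It is not the TerminoWeb implementation: the pattern list, the function names (`knowledge_density`, `flowing_text_ratio`, `rank_texts`), the 8-word threshold, and the equal weighting are all assumptions made for this example.

```python
import re

# Hypothetical knowledge patterns, chosen for illustration only;
# they are not the patterns used by TerminoWeb.
KNOWLEDGE_PATTERNS = [
    r"\bis an?\b",
    r"\bis defined as\b",
    r"\bsuch as\b",
    r"\bis a kind of\b",
    r"\bconsists of\b",
]


def knowledge_density(text):
    """Fraction of sentences containing at least one knowledge pattern."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    if not sentences:
        return 0.0
    hits = sum(
        1
        for s in sentences
        if any(re.search(p, s, re.IGNORECASE) for p in KNOWLEDGE_PATTERNS)
    )
    return hits / len(sentences)


def flowing_text_ratio(text, min_words=8):
    """Fraction of non-empty lines long enough to look like running prose.

    The min_words threshold is an assumed heuristic for separating prose
    from menus, link lists, and similar page fragments.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        return 0.0
    long_lines = sum(1 for ln in lines if len(ln.split()) >= min_words)
    return long_lines / len(lines)


def rank_texts(texts, weight_density=0.5):
    """Sort texts by a weighted mix of the two scores, best first."""

    def score(t):
        return (
            weight_density * knowledge_density(t)
            + (1 - weight_density) * flowing_text_ratio(t)
        )

    return sorted(texts, key=score, reverse=True)
```

For instance, a short definitional passage about ligaments would be ranked ahead of a navigation menu made of one- or two-word lines, because the menu scores zero on both criteria. A real system would need far richer patterns and proper sentence segmentation for each language.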
The TerminoWeb project will lead to several software applications, including:
The project therefore aims to have a significant impact in the fields of terminology, translation, and language learning.
The publication NRC-48765 provides further details of the project, explains the various functions of TerminoWeb, illustrates its capabilities with results from experiments, and identifies areas of future R&D work considered at that time.
We welcome discussions of technology transfer options with private-sector companies interested in any of the application areas listed above.
Version 2.0 of TerminoWeb is now available online, and we continue to appreciate comments from users. Please see the TerminoWeb Web site should you wish to use TerminoWeb. We evolve the TerminoWeb technology according to the feedback received from users, while pursuing the areas of R&D relevant to the development of tools derived from TerminoWeb.