Smarter search tool hits the market
December 7, 2007 - Ottawa, Ontario
Information searches just got smarter with a new NRC technology that wades through oceans of digital information to find just the facts you need. "Factor" is a search tool that detects the nature of words and how they relate to each other, returning a strategically narrowed-down list of meaningful results.
"Factor finds the facts you want rather than a list of documents," says Dr. Joel Martin, leader of the group that developed the technology at the NRC Institute for Information Technology (NRC-IIT) in Ottawa.
Unlike existing search tools, Factor can identify relationships between facts (e.g., mergers, hirings, financial statements) and entities (e.g., people, geographic locations, currencies). For example, typing the word "merger" into a mainstream search engine would generate thousands of hits based on keyword matching and require a great deal of time to sift through. Factor streamlines the search process by getting right to the knowledge that the user ultimately needs.
"Instead of returning a list of documents, Factor will tell you the names of the companies that have merged," says Dr. Martin. The technology can quickly zero in on questions such as "What investments have American companies made in European companies over the last three years?" or "What finance companies are hiring new staff?"
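The difference between returning documents and returning facts can be illustrated with a minimal sketch. It is not how Factor works internally, which this article does not describe; it assumes a single hand-written pattern ("X merged with Y"), and the company names, sentences, and the function `extract_mergers` are all hypothetical.

```python
import re

# Hypothetical sentences, invented for illustration only.
SENTENCES = [
    "Acme Corp merged with Globex Inc last year.",
    "Initech announced record profits.",
    "Umbrella Ltd merged with Hooli Inc in March.",
]

# Assumption: a company name is a run of capitalized words, and a merger
# is always phrased "<company> merged with <company>".
COMPANY = r"[A-Z]\w+(?: [A-Z]\w+)*"
MERGER_PATTERN = re.compile(rf"(?P<a>{COMPANY}) merged with (?P<b>{COMPANY})")

def extract_mergers(sentences):
    """Return (company, company) pairs instead of the matching sentences."""
    facts = []
    for sentence in sentences:
        for match in MERGER_PATTERN.finditer(sentence):
            facts.append((match.group("a"), match.group("b")))
    return facts

if __name__ == "__main__":
    for a, b in extract_mergers(SENTENCES):
        print(f"{a} <-> {b}")
```

A keyword search for "merged" would return two whole sentences here; the sketch returns the two pairs of company names. A real system would need far more than one pattern to handle the variety of ways a merger can be reported.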
Factor could be used in any situation that requires extracting meaningful information from vast amounts of data or documents. "For example, e-publishing companies, which manage huge amounts of content, need automated search tools that can find the information their clients want," says Randall Milburn, business development officer with NRC-IIT.
The technology has already found a home in the private sector with Nstein Technologies Inc. The Montréal-based company has licensed Factor to magnify its text mining and text analytics technology (see sidebar) for the e-publishing and other sectors. NRC and Nstein have signed a ten-year technology license agreement and a three-year collaborative research agreement to continue developing Factor for Nstein's markets.
Mining for knowledge
Creating knowledge from vast amounts of data is a two-step process:
- Text mining is the process of extracting nuggets of raw information from unstructured data, similar to extracting raw ore from a mine. For example, text mining could be used to extract the names of all people referenced in a series of documents.
- Text analytics is the process of turning those nuggets of information into knowledge, such as trends, relationships, and events. Text analytics could tell you that the name "Stephen Harper" refers to the Prime Minister of Canada and leader of the federal Conservative party.
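The two steps above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not Nstein's or NRC's implementation: the sample documents are invented, the name pattern (two consecutive capitalized words) is deliberately naive, and `KNOWN_ROLES`, `mine_names`, and `analyze_names` are hypothetical names introduced here.

```python
import re
from collections import Counter

# Hypothetical sample documents, invented for illustration only.
DOCUMENTS = [
    "Stephen Harper spoke in Ottawa on Friday.",
    "Jane Doe asked Stephen Harper about research funding.",
]

# Hypothetical lookup table standing in for real background knowledge.
KNOWN_ROLES = {"Stephen Harper": "Prime Minister of Canada"}

# Naive assumption: a person's name is two consecutive capitalized words.
NAME_PATTERN = re.compile(r"\b[A-Z][a-z]+ [A-Z][a-z]+\b")

def mine_names(documents):
    """Step 1 (text mining): pull raw candidate names out of the text."""
    names = []
    for doc in documents:
        names.extend(NAME_PATTERN.findall(doc))
    return names

def analyze_names(names):
    """Step 2 (text analytics): count mentions and attach any known role."""
    counts = Counter(names)
    return {
        name: {"mentions": n, "role": KNOWN_ROLES.get(name, "unknown")}
        for name, n in counts.items()
    }

if __name__ == "__main__":
    print(analyze_names(mine_names(DOCUMENTS)))
```

Keeping the two steps as separate functions mirrors the distinction drawn above: the first yields raw nuggets, and the second turns them into something closer to knowledge.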
In a market-driven approach to technology transfer, NRC focused on finding the company in the text analytics sector that would get the most value from its technology.
NRC approached Nstein because of its strong position in this sector and because it had the skills and resources to bring the product to market. Factor will give Nstein a competitive edge in text analytics for content management and information retrieval. "It gives them a much more powerful capability to find and process relevant information faster," says Milburn.
NRC's research partnership with Nstein will ensure that the product continues to meet the company's market needs. "By working with Nstein, we're making sure that real needs and real problems are driving the technology development," says Milburn. "That's the best way to guarantee that NRC is providing something of value, because it's coming from a real market requirement."
In addition to working with Nstein, NRC continues to develop the technology for other applications such as drug and health research. "The volume of information in the health science domain is growing exponentially," says Dr. Martin. In the field of biology alone, approximately 40,000 to 50,000 new articles are published every month. Researchers are accustomed to keeping up with articles in their own particular sub-fields. "But what if there's a fact that's relevant to your research in one of the other 40,000 articles published last month?" says Dr. Martin. "How are you ever going to find it?"
Factor makes it possible to search vast numbers of articles for relevant facts – for example, which medications that are used to treat diabetes have been shown to cause an increase in blood pressure. "That would save researchers a lot of time," says Dr. Martin. "But just as important, it's going to tell them things they never would have found out otherwise, because it's simply not possible for a human to go through that much information."