ARCHIVED - New International Collaboration in Bioinformatics

February 4, 2004 - Ottawa, Ontario

High-throughput screening facility.

Researchers from several NRC institutes will soon be joining forces with a group of Spanish scientists in a new research project that merges computer sciences and biotechnology, a field known as bioinformatics. Funding for the project was recently announced by NRC and the Ministry of Science and Technology (MCYT) of Spain and was made available as the result of a broader collaborative research agreement between NRC and Spain.

Bioinformatics Defined

Bioinformatics, also referred to as computational biology, involves the collection, organization and analysis of biological data.

Bioinformatics is increasingly used as a tool to help researchers study gene and protein functions. As an example, a typical biochip can contain 20,000 genes. A common experiment involves studying changes in these genes over time in response to an external stimulus, such as a drug. Each repeated measurement generates more data about these genes, resulting in an extremely large data set that requires bioinformatics tools for processing and analysis. Effective analysis of this data can lead to faster drug discovery and to improved patient diagnoses and treatments.
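As a rough illustration of how repeated measurements multiply the volume of data, the following minimal Python sketch counts the readings in a time-course experiment and expresses one gene's readings relative to its first measurement. Only the figure of 20,000 genes comes from the example above; the function names, the 12 time points, the 3 replicates and the sample readings are assumptions chosen purely for illustration, and the sketch does not describe the project's actual tools.

```python
import math


def measurement_count(genes: int, time_points: int, replicates: int = 1) -> int:
    """Number of individual readings produced by a time-course experiment."""
    return genes * time_points * replicates


def log2_fold_changes(series: list[float]) -> list[float]:
    """Express each reading for one gene relative to its first (baseline) reading.

    A value of 1.0 means the reading doubled; -1.0 means it halved.
    Assumes every reading is strictly positive.
    """
    baseline = series[0]
    return [math.log2(value / baseline) for value in series]


# 20,000 genes (from the article); 12 time points and 3 replicates are assumed.
total_readings = measurement_count(genes=20_000, time_points=12, replicates=3)
print(total_readings)

# Hypothetical readings for a single gene before and after a stimulus.
print(log2_fold_changes([100.0, 200.0, 50.0]))
```

Even this simplified count reaches hundreds of thousands of values for a single experiment, before any raw instrument output is taken into account.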

Bioinformatics is a rapidly growing R&D area, the result of increasing amounts of data generated by research into genomics and proteomics. "Advances that have happened in the last five to seven years have helped biologists produce data that is beyond the immediate analysis capabilities of the labs," explained Dr. A. Fazel Famili, a researcher with the NRC Institute for Information Technology (NRC-IIT) and one of the principal investigators for the project.

A typical genomics application using DNA microarray technology generates in the range of 10-50 gigabytes of data, enough to fill the hard drive of a high-end personal computer in a matter of days. Proteomics applications raise the total even higher; data generated by proteomics applications are measured in terabytes, quantities usually reserved for telecommunications providers. The sheer size of these data sets is evidence of the staggering complexity of biological systems and the effort required to understand the complex interactions taking place.

Dr. Alfonso Valencia of Spain's National Centre for Biotechnology also plays a key role as a principal investigator and leads a team with a wealth of experience in bioinformatics and analytical techniques. Other members of the project team include researchers from Spain's National Centre for Oncology, the NRC Biotechnology Research Institute, the NRC Institute for Biological Sciences and the Children's Hospital of Eastern Ontario. This broad partnership is critical for fields such as bioinformatics, which involve the merger of different disciplines. "It is extremely important to understand the problem and what you are looking for. You have to define a data mining strategy that works with the application and this absolutely requires the involvement of domain experts, such as biologists and biochemists," Famili noted.

DNA microarray.

The team will be working towards several goals over the next three years. Researchers will focus on improving the understanding of source data from DNA sequencing, microarrays and other applications, which will allow them to store and analyze this material more effectively. The ultimate goal of the project is to take this newly expanded knowledge base and create what is known as a decision support system: in essence, an artificial intelligence tool that can be used to merge and manipulate multiple streams of biomedical data. For example, in addition to data generated by microarrays and the like, the system will also be designed to use clinical information and other experimental data generated by partners at the Children's Hospital of Eastern Ontario. "We have begun looking beyond just data mining of genomics and proteomics information. The question has now become: what do you do with this huge amount of knowledge that is produced in these applications? We need to make sure that it's structured in a way that it will be properly used and that it will be continuously enhanced with new information from other sources, such as clinical data," Famili said.

Potential applications of such a system could include recommendations for diagnosis or treatment options. According to Famili, the team is focusing its efforts on creating a tool that will be successful with particular diseases, such as leukemia, hepatitis B or hepatitis C, and on the particular decision support tasks that need to be performed.

Funding for this project was announced along with projects in the areas of marine biosciences and microelectronics. These collaborations began with a series of joint workshops conducted over the last two years by NRC and the Consejo Superior de Investigaciones Científicas. The workshops resulted in an agreement between NRC and MCYT of Spain, in which both organizations pledged to conduct collaborative research for a three-year period.
