Government of Canada

This is a past project. It is no longer active.

Video has become widely accepted as one of the most valuable sources of information, particularly in the context of human-oriented applications such as:

  • security surveillance,
  • immersive and collaborative environments,
  • multi-media games,
  • computer-human interactions,
  • video-conferencing, and
  • video annotation and coding.

In these applications, video is analyzed for information about the faces it contains. The most important tasks underlying "face in video" analysis are:

  • face detection,
  • face tracking,
  • face memorization,
  • face classification, and
  • face recognition.

These tasks are commonly approached independently, so the information available in the video is used only partially and inefficiently. This is, incidentally, a problem inherent to data mining and knowledge discovery in general.

Face-in-video problems differ from other knowledge mining problems, however, in that many of their sub-problems (for example, face detection, tracking, memorization, classification and recognition) have naturally occurring solutions provided by the human brain. Yet while biological vision solves these problems with remarkable efficiency, computational models of vision still bear little resemblance to their biological counterparts.

The Face Recognition in Video project aims to bridge the gap between computational models of vision and their biological counterparts. Researchers have established a framework and developed a set of tools designed to use all the relevant information available from the video stream.

Using results from recent studies on the neuronal mechanisms of biological vision, particularly visual attention, researchers represent video information as a set of channels corresponding to the motion, colour, intensity and orientation components of the video. Each channel works independently, but all are governed by the same top-down goal. By combining blink detection with skin colour detection and nose tracking, researchers have built intelligent perceptual user interfaces and efficient face detection systems.
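The channel decomposition described above can be illustrated with a minimal sketch. It is not the project's implementation: the function names, the red-green opponency measure, the frame-differencing motion cue and the simple gradient used for orientation are all assumptions chosen for brevity. Each function computes one channel independently from raw RGB frames.

```python
from typing import Dict, List, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B), each 0..255
Frame = List[List[Pixel]]      # rows of pixels
Map = List[List[float]]        # one value per pixel


def intensity(frame: Frame) -> Map:
    """Intensity channel: mean of R, G and B (an assumed, simple definition)."""
    return [[(r + g + b) / 3.0 for (r, g, b) in row] for row in frame]


def red_green(frame: Frame) -> Map:
    """Colour channel: red minus green, a crude opponency measure."""
    return [[float(r - g) for (r, g, b) in row] for row in frame]


def motion(prev: Frame, curr: Frame) -> Map:
    """Motion channel: absolute intensity change between two frames."""
    a, b = intensity(prev), intensity(curr)
    return [[abs(y - x) for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def orientation(frame: Frame) -> Map:
    """Orientation channel: horizontal gradient strength (responds to vertical edges)."""
    out: Map = []
    for row in intensity(frame):
        w = len(row)
        out.append([abs(row[min(x + 1, w - 1)] - row[max(x - 1, 0)]) / 2.0
                    for x in range(w)])
    return out


def channels(prev: Frame, curr: Frame) -> Dict[str, Map]:
    """Compute the four channels independently for the current frame."""
    return {
        "motion": motion(prev, curr),
        "colour": red_green(curr),
        "intensity": intensity(curr),
        "orientation": orientation(curr),
    }
```

A top-down goal such as "find a face" would then weight and combine these maps; that combination step is left out here because the text does not specify it.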

Another thrust of the project is to develop techniques for efficient face representation and retrieval. Researchers use a group method of data handling (GMDH) network to discover the minimum set of features capable of discriminating a face, and have recourse to auto-associative neural networks to memorize and retrieve the face. Please visit the Perceptual Vision web site for demonstrations.
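To make the idea of auto-associative memorization and retrieval concrete, here is a minimal sketch of a Hopfield-style network trained with the Hebbian outer-product rule. The learning rule, the bipolar (+1/-1) encoding and the function names `train` and `recall` are assumptions made for illustration; the text does not say which auto-associative model or learning rule the project used. Each stored pattern stands in for a face feature vector.

```python
from typing import List

Pattern = List[int]  # bipolar vector, entries are +1 or -1


def train(patterns: List[Pattern]) -> List[List[float]]:
    """Build a weight matrix with the Hebbian outer-product rule."""
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no self-connections
                    w[i][j] += p[i] * p[j] / n
    return w


def recall(w: List[List[float]], cue: Pattern, max_steps: int = 10) -> Pattern:
    """Update all units synchronously until the state stops changing."""
    state = list(cue)
    for _ in range(max_steps):
        nxt = []
        for i, row in enumerate(w):
            s = sum(wij * sj for wij, sj in zip(row, state))
            # Keep the current value when the input is exactly zero.
            nxt.append(state[i] if s == 0 else (1 if s > 0 else -1))
        if nxt == state:
            break
        state = nxt
    return state


# Two stored patterns (hypothetical stand-ins for face feature vectors).
a = [1, 1, 1, 1, -1, -1, -1, -1]
b = [1, -1, 1, -1, 1, -1, 1, -1]
weights = train([a, b])

# A cue with one corrupted element is completed to the stored pattern.
noisy_a = [-1, 1, 1, 1, -1, -1, -1, -1]
restored = recall(weights, noisy_a)
```

In this small example the two stored patterns are orthogonal, which is why a single corrupted element is repaired in one update; capacity and robustness for real face data depend on the model and learning rule actually chosen.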

Combining these techniques with sound localization techniques based on audio beamforming will lead to a versatile and powerful system capable of detecting, memorizing and recognizing persons in various environments.
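As a rough illustration of sound localization with a microphone pair, the sketch below estimates the time difference of arrival by scanning candidate delays and converts it to a bearing. This is the simplest relative of delay-based beamforming, not the project's method: the two-microphone geometry, the far-field assumption, the approximate speed of sound and the function names are all assumptions.

```python
import math
from typing import Sequence

SPEED_OF_SOUND = 343.0  # m/s, approximate value in air at room temperature


def best_lag(left: Sequence[float], right: Sequence[float], max_lag: int) -> int:
    """Return the lag in samples by which `right` trails `left`.

    Every candidate lag in [-max_lag, max_lag] is tried, and the one
    that maximizes the cross-correlation of the two signals is kept.
    """
    n = len(left)
    best, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                score += left[i] * right[j]
        if score > best_score:
            best, best_score = lag, score
    return best


def bearing_degrees(lag: int, sample_rate: float, mic_spacing: float) -> float:
    """Convert a lag to an arrival angle from broadside (far-field assumption)."""
    x = SPEED_OF_SOUND * lag / (sample_rate * mic_spacing)
    x = max(-1.0, min(1.0, x))  # clamp so asin stays defined under noise
    return math.degrees(math.asin(x))


# Synthetic example: a click reaches the right microphone 3 samples later.
left = [0.0] * 32
right = [0.0] * 32
left[10] = 1.0
right[13] = 1.0
lag = best_lag(left, right, max_lag=5)
angle = bearing_degrees(lag, sample_rate=48000.0, mic_spacing=0.2)
```

A bearing obtained this way could, in principle, steer the visual channels toward the speaker, which is the kind of audio-visual combination the paragraph above anticipates.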