DagSemProc.06031.5.pdf
- Filesize: 1.84 MB
- 46 pages
An important requirement for the expression of cognitive structures is the ability to form mental objects by rapidly binding together constituent parts. In this sense, one may conceive of the brain's data structure as having the form of graphs whose nodes are labeled with elementary features. These provide a versatile data format with the additional ability to render the structure of any mental object. Because of the multitude of possible object variations, the graphs are required to be dynamic. Upon presentation of an image, a so-called model graph should rapidly emerge, driven by the image features, by binding together memorized subgraphs derived from earlier learning examples. In this model, the richness and flexibility of the mind is made possible by a combinatorial game of immense complexity. Consequently, the emergence of model graphs is a laborious task which, in computer vision, has most often been disregarded in favor of employing model graphs tailored to specific object categories such as faces in frontal pose. Recognition or categorization of arbitrary objects, however, demands dynamic graphs. In this work we propose a form of graph dynamics that proceeds in two steps. In the first step, component classifiers, which decide whether a feature is present in an image, are learned from training images. For processing arbitrary objects, features are small localized grid graphs, so-called parquet graphs, whose nodes are attributed with Gabor amplitudes. Through combination of these classifiers into a linear discriminant that conforms to Linsker's infomax principle, a weighted majority voting scheme is implemented. It allows for preselection of salient learning examples, so-called model candidates, and likewise for preselection of the categories to which the object in the presented image presumably belongs.
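The preselection step described above can be illustrated with a minimal sketch of weighted majority voting. Everything in it is an assumption made for illustration: the function name `weighted_vote`, the dictionary layout, and the hand-picked weights are hypothetical, and the weights are not derived from Linsker's infomax principle as they are in the paper. Each detected feature casts a weighted vote for every learning example that contains it, and the votes are then summed per example and per category.

```python
from collections import defaultdict


def weighted_vote(detected, feature_index, weights, labels, top_k=2):
    """Preselect model candidates and categories by weighted voting.

    detected:      set of feature ids whose classifiers fired on the image
    feature_index: feature id -> learning examples containing that feature
    weights:       feature id -> vote weight (hand-picked here, an assumption)
    labels:        learning example id -> category name
    """
    example_scores = defaultdict(float)
    for feature in detected:
        for example in feature_index.get(feature, ()):
            example_scores[example] += weights.get(feature, 0.0)

    # Category scores are the summed votes of their learning examples.
    category_scores = defaultdict(float)
    for example, score in example_scores.items():
        category_scores[labels[example]] += score

    # Sort by descending score; ties are broken by name for determinism.
    candidates = sorted(example_scores, key=lambda e: (-example_scores[e], e))
    categories = sorted(category_scores, key=lambda c: (-category_scores[c], c))
    return candidates[:top_k], categories, dict(example_scores)


if __name__ == "__main__":
    feature_index = {"f1": ["cup_a", "cup_b"], "f2": ["cup_a"], "f3": ["car_a"]}
    weights = {"f1": 1.0, "f2": 2.0, "f3": 0.5}
    labels = {"cup_a": "cup", "cup_b": "cup", "car_a": "car"}
    print(weighted_vote({"f1", "f2", "f3"}, feature_index, weights, labels))
```

In this toy setup the two highest-scoring learning examples become the model candidates that a later verification step would inspect.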
Each model candidate is verified in a second step using a variant of elastic graph matching, a standard correspondence-based technique for face and object recognition. To further differentiate between model candidates with similar features, it is required that the features be in a similar spatial arrangement for the model to be selected. Model graphs are constructed dynamically by assembling model features into larger graphs according to their spatial arrangement. From the viewpoint of pattern recognition, the presented technique is a combination of a discriminative (feature-based) and a generative (correspondence-based) classifier, while the majority voting scheme implemented in the feature-based part is an extension of existing multiple feature subset methods. We report the results of experiments on standard databases for object recognition and categorization. The method achieved high recognition rates on identity, object category, pose, and illumination type. Unlike many other models, the presented technique can also cope with varying background, multiple objects, and partial occlusion.
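The spatial-arrangement requirement in the verification step can be illustrated with a deliberately simplified check. This is not elastic graph matching: it is a stand-in that assumes a rigid translation between model and image, and the function name `spatial_consistency`, the pixel tolerance, and the input layout are all hypothetical. It estimates a common shift as the median displacement of the matched features and reports the fraction of matches that agree with that shift.

```python
from statistics import median


def spatial_consistency(matches, tolerance=2.0):
    """Fraction of matched features that share a common translation.

    matches:   list of ((model_x, model_y), (image_x, image_y)) pairs
    tolerance: allowed deviation from the median shift, in pixels (assumed)
    """
    if not matches:
        return 0.0

    # Displacement of every feature from its model position to its image position.
    dxs = [ix - mx for (mx, _), (ix, _) in matches]
    dys = [iy - my for (_, my), (_, iy) in matches]

    # The median is robust against a minority of misplaced features.
    shift_x, shift_y = median(dxs), median(dys)

    consistent = sum(
        1
        for dx, dy in zip(dxs, dys)
        if abs(dx - shift_x) <= tolerance and abs(dy - shift_y) <= tolerance
    )
    return consistent / len(matches)


if __name__ == "__main__":
    matches = [((0, 0), (10, 5)), ((4, 0), (14, 5)),
               ((0, 4), (10, 9)), ((4, 4), (30, 30))]
    print(spatial_consistency(matches))
```

A model candidate whose features appear in the image but in a scrambled layout would score low here, which is the intuition behind rejecting candidates that share features with the image but not their arrangement.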