Hi Hoover,
Could you please broadcast this announcement of
the 2nd NECI Vision Workshop at the NEC Research Institute,
Princeton, New Jersey, June 3-14? Thanks.
---- Zili Liu
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Dear Colleagues:
We would like to announce and invite you to attend the Second NECI Vision Workshop, from June 3-14, at the NEC Research Institute, Princeton.
About 30 invited vision researchers from computer vision, psychology, and neuroscience will visit during this period to present and discuss their work. They are:
Tao Alter (MIT), Jonas August (McGill), Bart Anderson (MIT), Ronen Basri (Weizmann Institute), Peter Belhumeur (Yale), Irving Biederman (U. Southern California), Pat Cavanagh (Harvard), Sven Dickinson (Rutgers), Davi Geiger (NYU), Alan Gilchrist (Rutgers), Mel Goodale (U. of Western Ontario), Keith Humphrey (U. of Western Ontario), Glyn Humphreys (U. of Birmingham), Phil Kellman (UCLA), David Kriegman (Yale), Ilona Kovacs (Rutgers), David Lowe (U. of British Columbia), Pascal Mamassian (NYU), Thomas Papathomas (Rutgers), V.S. Ramachandran (UC San Diego), Ron Rensink (Nissan), Ruth Rosenholtz (NASA Ames), Nava Rubin (Harvard), Guillermo Sapiro (HP), Kaleem Siddiqi (McGill), Eric Saund (Xerox PARC), Pawan Sinha (MIT), Rudiger von der Heydt (Johns Hopkins), Joachim Weickert (Utrecht U. Hospital), Laurie Wilcox (Univ. Montreal).
The vision group at NECI features a mix of psychologists, biophysicists, and computational vision researchers, with interests in object recognition, motion and stereo, shape from shading, perceptual organization, perceptual learning, and fly vision. Most of us are interested in both biological and computational vision. The group consists of: Bill Bialek, Ingemar Cox, Rob de Ruyter, James Elder, David Jacobs, Michael Langer, Zili Liu, John Oliensis, Steve Omohundro, Sebastien Roy, Karvel Thornber, David Waltz, and Lance Williams.
For information about the workshop schedule, the titles, the abstracts, the location of NECI, the First NECI Vision Workshop in 1995, the vision research at NECI, and the NEC Research Institute in general, please visit our web page at:
http://www.neci.nj.nec.com/homepages/dwj/workshop.html
We are looking forward to meeting you at the workshop.

****************************************************************************
                       Schedule, Titles, and Abstracts
****************************************************************************

6/3  Mon   10:30  Gilchrist     "The Perception of Self-Luminous Surfaces"
            4:00  Belhumeur     "What is the Set of Images of an Object Under All Possible Lighting Conditions?"
            4:00  Kriegman      "Eigenfaces vs. Fisherfaces: Recognition Using Class Specific Linear Projection"

6/4  Tue   10:30  Humphrey      "Studies of orientation anisotropies in the human perception of depth from shading"
            4:00  Kellman

6/5  Wed   10:30  Goodale       "Dissociations between perception and action in normal human vision"
            4:00  Humphreys     "Spatial representation and attention in the brain: Fractionation of space within and between perceptual objects"

6/6  Thu   10:30  Ramachandran  "Perceptual correlates of neural plasticity in the adult human brain"
            4:00  Anderson

6/7  Fri   10:30  Cavanagh      "Role of 2D view matching in recognition"
            2:00  Elder         "Are images one-dimensional?"
            4:00  Geiger        "Visual Organization and Illusory Surfaces (and junction detection)"

6/8  Sat   10:00  Liu           "Perceptual Grouping: Beyond Good Contour Continuation"
           11:00  Biederman     "Why faces and objects are coded differently"
            2:00  Siddiqi       "Shape, Shocks and Wiggles"
            3:00  Rubin         "Abrupt learning in illusory contour perception"

6/9  Sun   10:00  Lowe          "From image groupings to 3-D object interpretations: A learning approach"
           11:00  Williams      "Local Parallel Computation of Stochastic Completion Fields"
            2:00  Alter/Basri   "Extracting Salient Curves from Images: An Analysis of the Saliency Network"
            3:00  Weickert      "Model-Driven Image Processing Using Anisotropic Diffusion Scale-Spaces"

6/10 Mon   10:30  von der Heydt "Contour and surface representations in primate visual cortex"
            4:00  Rensink       "How Much of a Scene is Seen?"

6/11 Tue   10:30  Saund
            4:00  Kovacs        "When the brain changes its mind: Interocular grouping during binocular rivalry"

6/12 Wed  *10:00* Rosenholtz    "Affine Structure and Photometry"
            4:00  August        "Fragment Grouping via the Principle of Perceptual Occlusion"

6/13 Thu   10:30  Mamassian     "Illumination and Viewpoint from Above"
            4:00  Papathomas    "An Efficient Technique for Assessing the Sensitivities of First- and Second-Order Motion: Theory, Experiments, and Applications"
ABSTRACTS
************************************************************************

Jonas August, McGill University
Fragment Grouping via the Principle of Perceptual Occlusion
Bounding contours of physical objects are often fragmented by other occluding objects. Long-distance perceptual grouping seeks to join fragments belonging to the same object. Approaches to grouping based on invariants assume objects are in restricted classes, while those based on minimal energy continuations assume a shape for the missing contours and require this shape to drive the grouping process. We propose the more general principle that those fragments should be grouped whose fragmentation could have arisen from a generic occluder. The gap skeleton is introduced as a representation of this virtual occluder, and an algorithm for computing it is given.
Joint work with Kaleem Siddiqi and Steven W. Zucker.

****************************************************************************
Ronen Basri, The Weizmann Institute, Israel
Extracting Salient Curves from Images: An Analysis of the Saliency Network
The Saliency Network proposed by Shashua and Ullman is a well-known approach to extracting salient curves from images while performing gap completion. This talk analyzes the Saliency Network, which is attractive for several reasons. First, the network generally prefers long and smooth curves over short or wiggly ones. While computing saliencies, the network also fills in gaps with smooth completions and tolerates noise. Finally, the network is locally connected, and its size is proportional to the size of the image.
Nevertheless, our analysis reveals certain weaknesses with the method. In particular, we show cases in which the most salient element does not lie on the perceptually most salient curve. Furthermore, in some cases the saliency measure changes its preferences when curves are scaled uniformly. Also, we show that for certain fragmented curves the measure prefers large gaps over a few small gaps of the same total size. In addition, we analyze the time complexity required by the method. We show that the number of steps required for convergence in serial implementations is quadratic in the size of the network, and in parallel implementations is linear in the size of the network. We discuss problems due to coarse sampling of the range of possible orientations. We show that with proper sampling the complexity of the network becomes cubic in the size of the network. Finally, we consider the possibility of using the Saliency Network for grouping. We show that the Saliency Network recovers the most salient curve efficiently, but it has problems with identifying any salient curve other than the most salient one.

Joint work with T. D. Alter.
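To make the recurrence under analysis concrete, here is a rough, hedged sketch of the saliency update E_i <- sigma_i + rho_i * max_j f_ij E_j over a hand-made chain of elements; the element layout, coupling values, and attenuation constants below are illustrative stand-ins, not values from the paper.

import numpy as np

def saliency_network(sigma, rho, coupling, neighbours, n_iter=50):
    """sigma[i]: local saliency (1 on an actual edge element, 0 in a gap).
    rho[i]: attenuation in (0, 1]; values below 1 penalize gap elements.
    coupling[i][k]: f_ij for the k-th neighbour of element i (decays with curvature).
    neighbours[i][k]: index j of that neighbour."""
    E = sigma.astype(float)
    for _ in range(n_iter):
        E_new = np.empty_like(E)
        for i, nbrs in enumerate(neighbours):
            best = max((coupling[i][k] * E[j] for k, j in enumerate(nbrs)), default=0.0)
            E_new[i] = sigma[i] + rho[i] * best   # E_i <- sigma_i + rho_i * max_j f_ij E_j
        E = E_new
    return E

# Tiny example: a six-element chain with a one-element gap in the middle.
sigma = np.array([1, 1, 0, 1, 1, 1])
rho = np.where(sigma > 0, 1.0, 0.7)             # attenuate saliency carried across the gap
neighbours = [[1], [2], [3], [4], [5], []]       # saliency flows from right to left
coupling = [[0.95], [0.95], [0.95], [0.95], [0.95], []]
print(saliency_network(sigma, rho, coupling, neighbours))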
*****************************************************************************
Peter N. Belhumeur, Dept. of Electrical Engineering, Yale University
What is the Set of Images of an Object Under All Possible Lighting Conditions?
The appearance of a particular object depends on both the viewpoint from which it is observed and the light sources by which it is illuminated. If the appearance of two objects is never identical for any pose or lighting conditions, then -- in theory -- the objects can always be distinguished or recognized. The question arises: What is the set of images of an object under all lighting conditions and pose? In this talk, we consider only the set of images of an object under variation in illumination (including multiple light sources and attached shadows). We prove that the set of n-pixel images of an object with a Lambertian reflectance function, seen under all possible illumination conditions, forms a convex cone in R^n and that the dimension of this illumination cone equals the number of distinct surface normals. Furthermore, we show that the cone for a particular object can be constructed from three properly chosen images. If the cones corresponding to two different objects have no images in common, then due to their convexity the cones are also linearly separable. This fact immediately suggests certain approaches to object recognition, some of which we will briefly discuss. Throughout the talk, we present results demonstrating the empirical validity of the illumination cone representation.
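As a hedged numerical illustration of the superposition property behind the cone result (this is not the three-image construction mentioned above, and the synthetic surface and light sources are made up), an image of a Lambertian object under a non-negative combination of light sources agrees with the same combination of the single-source images on pixels lit by both sources.

import numpy as np

rng = np.random.default_rng(0)
n_pixels = 100

# B: per-pixel surface normal scaled by albedo (n_pixels x 3), a toy "object".
normals = rng.normal(size=(n_pixels, 3))
normals /= np.linalg.norm(normals, axis=1, keepdims=True)
albedo = rng.uniform(0.2, 1.0, size=(n_pixels, 1))
B = albedo * normals

def image(s):
    """Lambertian image under a single distant source s (attached shadows via max)."""
    return np.maximum(B @ s, 0.0)

s1, s2 = np.array([0.0, 0.0, 1.0]), np.array([0.5, 0.3, 0.8])

# Non-negative combinations of single-source images stay in the image set
# (the cone property); here we check pixels illuminated by both sources.
lit_by_both = (B @ s1 > 0) & (B @ s2 > 0)
combo = image(2.0 * s1 + 3.0 * s2)
sum_of_images = 2.0 * image(s1) + 3.0 * image(s2)
print(np.allclose(combo[lit_by_both], sum_of_images[lit_by_both]))  # True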
*****************************************************************************
Patrick Cavanagh, Dept. of Psychology, Harvard University

Role of 2D view matching in recognition
Many theories of perception have considered a holistic mode of recognition as an alternative or complement to a part-based mode of recognition. However, the seemingly magical trick of recognizing the whole before its parts are identified has been relatively ignored compared to part-oriented models (Marr, 1981; Biederman, 1985). In this talk, I will explore whether some type of direct recognition serves as a first step for object recognition, a step that, rather than identifying the object, simply selects a first guess that then guides the further analysis of the image. The experiments use specialized sparse images to minimize the likelihood that parts can be identified before the object. We find that the recognition of these two-tone images is speeded when they are preceded by a brief presentation of their outline, even when the outlines are unrecognizable on their own. We attribute this facilitation to an initial step in the recognition process, one in which a first guess for the object can be obtained directly from the image without first identifying the parts or structure of the object. We have used two-tone images as tests because they are unique in being impervious to any analysis by part-based or structural techniques. Our results favor a viewpoint-specific process for this initial match but suggest that this is only a first step and one which can operate even on the contours of the image. The initial match must be followed by an analysis to verify whether it is sufficiently supported and then, if it is rejected, it appears that the results of the first match are not available to awareness. Our results suggest that this initial matching process occurs in the first 180 msec of processing.

******************************************************************************
Irv Biederman, University of Southern California
Why Faces and Objects are Coded Differently?
Patches of cortical tissue that are tuned to faces differ from those tuned to objects. Why? The spatial filtering that characterizes the earliest cortical representation of shape is presumed to be common to both classes of stimuli. Individual faces can be represented by a two-layer network in which the output of early filters is mapped directly onto units in a 2D coordinate space. The direct mapping is useful because it may allow coarse coding of the small metric differences that distinguish faces. For objects, the image variation at a given location in a 2D coordinate space may be too great to yield sufficient predictability directly from the output of spatial kernels. Instead, intermediate layers coding qualitative differences in parts and relations may be required. A series of experiments documents that whereas face recognition is strongly dependent on the original spatial filter values, object recognition evidences strong invariance to these values, even when distinguishing among objects that are as similar as faces.

*****************************************************************************
James H. Elder, NEC Research Institute
Are images one-dimensional?
Psychophysical masking experiments suggest that our perception of surface brightness may be the result of a filling-in process, in which the luminance signal is encoded only at image contours and is then neurally diffused to fill in the intervening 2-D space. In this talk I will present a computational model for this filling-in process which can be used to evaluate the perceptual content of a contour representation.
I will first introduce a scale-space method for edge detection which computes a contour code consisting of estimates of position, brightness, contrast and blur at each edge point in an image. I will then show how this code can be inverted by a diffusion-based filling-in algorithm which reconstructs an estimate of the original image. The results show that while filling-in of brightness alone leads to significant artifacts, parallel filling-in of both brightness and blur produces perceptually accurate reconstructions. These results suggest that a contour code captures the information needed for higher-level visual inference.
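As a rough, hedged illustration of the brightness filling-in step alone (the contrast and blur channels of the model are omitted, and the test image, edge mask, and iteration count below are made up), brightness values clamped at edge pixels can be diffused into the intervening space by iterative averaging.

import numpy as np

H = W = 64
truth = np.zeros((H, W))
truth[:, W // 2:] = 1.0                      # a step edge: dark left, bright right

edges = np.zeros((H, W), dtype=bool)         # pretend the edge detector fired
edges[:, W // 2 - 1:W // 2 + 1] = True       # on both sides of the step

recon = np.full((H, W), truth.mean())        # start from a flat guess
recon[edges] = truth[edges]                  # brightness is known only at edge pixels

for _ in range(2000):                        # simple Jacobi diffusion iterations
    padded = np.pad(recon, 1, mode='edge')
    recon = 0.25 * (padded[:-2, 1:-1] + padded[2:, 1:-1] +
                    padded[1:-1, :-2] + padded[1:-1, 2:])
    recon[edges] = truth[edges]              # re-clamp the edge brightness values

print('mean abs error:', float(np.abs(recon - truth).mean()))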
******************************************************************************
Davi Geiger, The Courant Institute, New York University

Visual Organization and Illusory Surfaces (and junction detection)
A common factor in all illusory contour figures is the perception of a surface occluding part of a background. These surfaces are not constrained to be at constant depth and they can cross other surfaces. We address the problem of how the image organizations that yield illusory contours arise and what the shape of these contours is. Our approach is to iteratively find the most salient surface by (i) detecting junctions/occlusions; (ii) assigning a set of hypotheses about the local salient surface configuration; (iii) applying a model to diffuse these surface hypotheses; and (iv) efficiently selecting the best image organization (set of hypotheses) based on a coherence measure that groups the hypotheses. A mathematical/computational model for each of the four items will be the core of the presentation.
We note that the illusory contours arise from the surface boundaries and the amodal completions emerge at the overlapping surfaces. We also show that with multiple views available (stereo or motion) the selection of a visual organization is more easily done. The model reproduces various qualitative and quantitative aspects of illusory contour perception.

****************************************************************************
Alan Gilchrist, Rutgers University
The Perception of Self-Luminous Surfaces
The appearance of self-luminous surfaces in complex images has a pop-out-like quality. Yet there remains no coherent account of how self-luminosity is detected, either in human vision or in machine vision. We report a series of experiments on the luminosity threshold under various conditions, simple and complex. The luminance of a target surface in a display is increased to the point that it just begins to be perceived as self-luminous. The luminosity threshold behaves like a value of surface lightness, showing both constancy despite changes in the illumination level and constancy despite changes in the surrounding reflectance. But how do these findings square with earlier findings that the brightness of luminous regions depends on absolute luminance, not on relative luminance as surface lightness does? Further work will be reported that seems to resolve this apparent paradox.
Joint work with Fred Bonato, St. Peter's College.

*******************************************************************************
Keith Humphrey, University of Western Ontario, Canada
Studies of orientation anisotropies in the human perception of depth from shading
There is a strong and reliable orientation dependency in the human perception of 'shape from shading'. Displays composed of elements with vertically oriented shading gradients of opposite polarity produce a strong and stable percept of concave and convex elements. If such a display is rotated 90 degrees so that the shading gradient is horizontal, the depth percept is reduced and much more ambiguous. Psychophysical experiments using visual search tasks have shown that the extraction of shape from shading occurs 'preattentively' if the shading gradients are vertically oriented - the target 'pops out' of the field of distractors. The task is performed rapidly in a spatially 'parallel' manner taken to be indicative of low-level visual processing. If the shading gradients are horizontally oriented, however, a target with one luminance polarity does not pop out of a field of distractors with the opposite polarity. The search rate becomes dependent on the number of distractors. Such search is generally taken to be indicative of serial search. In various psychophysical tasks, we assessed the orientation tuning of this ability across orientations of the shading gradient and noted some new aspects of the orientation dependency in shape from shading.
We will also discuss research conducted with a neurological patient, who has deficits in various aspects of form perception. The patient could perform a discrimination task dependent on the perception of 'shape from shading' when the task involved vertically oriented shading gradients. She could not make such discriminations for horizontally oriented gradients or for shapes in which edges were depicted as lines or as luminance discontinuities.
Finally, we will present the results of a 3-D fMRI study at 1.5T. We found significantly less activation in area V1 and neighbouring low-level visual areas of cortex when subjects viewed displays composed of elements with vertically oriented shading gradients that led to a stable depth percept, than when they viewed displays composed of elements with horizontally oriented shading gradients that led to weak and ambiguous depth percepts. There was no reliable difference in activation for the control stimuli that lacked depth structure and ambiguity. The study demonstrates that the difference in strength and stability of the perception of shape from shading when a shading gradient is rotated 90 degrees is accompanied by a corresponding change in the level of activation in early visual areas of the human brain. The reduced activation with stable visual displays may reflect top-down modulation of early visual processing or an intrinsic bias in neural networks in V1 and neighbouring areas.

******************************************************************************
Glyn W. Humphreys, School of Psychology, University of Birmingham
Spatial representation and attention in the brain: Fractionation of space within and between perceptual objects
Our understanding of how space is coded in the brain has recently been informed by studies of patients with forms of spatial neglect, in which they fail to respond to stimuli presented in particular spatial locations. In this talk I will present new data demonstrating a double dissociation between patients showing neglect of space within perceptual objects and neglect of space between separate perceptual objects; the findings suggest that there are independent representations of space within and between perceptual objects. These results will be related to a computational model of viewpoint-invariant object recognition, in which information from different retinal positions is mapped onto a single attentional window. Lesions of the mapping routines selectively produce neglect within and between perceptual objects.

*****************************************************************************
Ilona Kovacs, Laboratory of Vision Research, Rutgers University
WHEN THE BRAIN CHANGES ITS MIND: Interocular grouping during binocular rivalry.
When the brain simultaneously receives two or more equally consistent sensory events, those will compete for `actuality' in our experience. This rivalry happens constantly when we focus on certain aspects of the environment, not minding others. Artificially induced binocular rivalry is one of the rare instances when transitions between perceptually dominant phases of competing events are so clearly observed that one can even measure the length of the transition phases. Binocular rivalry is produced by providing dissimilar stimuli to the two eyes that cannot be fused into a single percept, giving rise to alternating percepts. The prevalent view of binocular rivalry holds that it results from reciprocal inhibition among monocular neurons. However, there is recent evidence that binocularly driven cells in V4 and MT reflect perceptual alternations in their firing pattern (Leopold and Logothetis, Nature 1996, 379:549-552), suggesting that there is more to binocular rivalry than mere eye-competition.
With T. V. Papathomas and A. Feher we have developed a paradigm to study interocular interactions during rivalry. Conventional rivalry-inducing stimuli are dissimilar image pairs that are each coherent in their global structure (such as gratings of orthogonal orientations, or blobs of opposite colors). We replace these by COMPLEMENTARY PATCHWORKS of intermingled rivalrous images. Can the brain unscramble the pieces of the patchwork arriving from different eyes to obtain a global percept? Percepts with conventional, globally coherent image-pairs (e.g., a monkey face vs. a jungle scene) are compared to those obtained with the patchwork image pairs (spatially complementing pieces of the monkey and the jungle in each image). We find that interocular grouping of image components occurs across extended images (15 x 15 deg) in all tested conditions (color-defined, orientation-defined, natural, and moving natural images), implying that binocular rivalry goes beyond interocular suppression and follows more complex rules of perceptual organization. We suggest that interocular grouping is mediated by extremely interactive feedforward-feedback connections involving a large portion of the cortical architecture.
Supported by the J. S. McDonnell Foundation 9560.

*****************************************************************************
David Kriegman, Yale University
Eigenfaces vs. Fisherfaces: Recognition Using Class Specific Linear Projection
We develop a face recognition algorithm which is insensitive to gross variation in lighting direction and facial expression. Taking a pattern classification approach, we consider each pixel in an image as a coordinate in a high-dimensional space. We take advantage of the observation that the images of a particular face under varying illumination direction lie in a 3-D linear subspace of the high dimensional feature space -- if the face is a Lambertian surface without self-shadowing. However, since faces are not truly Lambertian surfaces and do indeed produce self-shadowing, images will deviate from this linear subspace. Rather than explicitly modeling this deviation, we project the image into a subspace in a manner which discounts those regions of the face with large deviation. Our projection method is based on Fisher's Linear Discriminant and produces well separated classes in a low-dimensional subspace even under severe variation in lighting and facial expressions. The Eigenface technique, another method based on linearly projecting the image space to a low dimensional subspace, has similar computational requirements. Yet, extensive experimental results demonstrate that the proposed ``Fisherface'' method has error rates that are significantly lower than those of the Eigenface technique when tested on the same database.
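The class-specific projection at the heart of the method is Fisher's Linear Discriminant. The following minimal sketch computes such a projection on random stand-in data; it only illustrates the projection step, and a real Fisherface pipeline would first reduce dimensionality (e.g., by PCA) so that the within-class scatter matrix is invertible.

import numpy as np

rng = np.random.default_rng(1)
n_classes, per_class, dim = 3, 10, 20             # toy sizes, not a real face set
X = np.vstack([rng.normal(loc=3.0 * c, size=(per_class, dim)) for c in range(n_classes)])
y = np.repeat(np.arange(n_classes), per_class)

mean_all = X.mean(axis=0)
S_w = np.zeros((dim, dim))                        # within-class scatter
S_b = np.zeros((dim, dim))                        # between-class scatter
for c in range(n_classes):
    Xc = X[y == c]
    mc = Xc.mean(axis=0)
    S_w += (Xc - mc).T @ (Xc - mc)
    S_b += per_class * np.outer(mc - mean_all, mc - mean_all)

# Solve the generalized eigenproblem S_b w = lambda S_w w and keep the top
# (n_classes - 1) directions as the discriminant projection.
eigvals, eigvecs = np.linalg.eig(np.linalg.solve(S_w, S_b))
order = np.argsort(eigvals.real)[::-1][:n_classes - 1]
W = eigvecs[:, order].real

projected = X @ W                                 # class-separated low-dimensional features
print(projected.shape)                            # (30, 2)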
******************************************************************************
Zili Liu, NEC Research Institute

Perceptual Grouping: Beyond Good Contour Continuation
We can all see, in the Kanizsa triangle illusion, a bright triangle sitting on top of three black disks. Presumably, such illusory figures are perceptually compelling because the figures are simple and regular, and the illusory contours are collinear or curvilinear. When a figure is more complex, or when a (simple) figure is completed behind an occluder, our percept of this figural shape is less compelling. Good contour continuation, in this case, serves more as a measure of the goodness of the grouping than as a percept of the contours.
The focus of the talk is on this, perhaps more general, class of phenomena of figural completion behind an occluder. We compare models based on several variations of good continuation to one based on the convexity of regions, and find that the region-based comparison better explains our psychophysical data. We further argue that precise shape reconstruction behind an occluder, which contour-based models imply, might be neither necessary nor possible in perceptual completion in general.
(Joint work with David Jacobs and Ronen Basri.)

****************************************************************************
David G. Lowe, Computer Science Dept., University of British Columbia
From image groupings to 3-D object interpretations: A learning approach
Object recognition requires a method for indexing from image features and their groupings to 3-D object interpretations. A learning approach will be described that forms large numbers of local feature vectors from sample images and uses these to assign probabilistic 3-D object interpretations to features from new images. To support this approach, a new learning method called Variable-kernel Similarity Metric (VSM) learning will be described. This is a learning method that is based on interpolation between the nearest neighbors in a training set, augmented with optimization of the similarity metric. This approach avoids the problems of combinatorics and boundary conditions faced by voting methods, Hough transforms, or hashing, and proves to be very efficient in practice. A wide range of image features and groupings can be used, and the most discriminating will automatically be selected for recognition.
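One hedged reading of the interpolation step is sketched below: predictions are kernel-weighted averages over the nearest training neighbours under a weighted distance, with the kernel width tied to the local neighbourhood. The data and the fixed feature weights are illustrative; in VSM learning the metric weights would themselves be optimized on the training set.

import numpy as np

rng = np.random.default_rng(2)
X_train = rng.uniform(-1, 1, size=(200, 4))               # toy feature vectors
y_train = np.sin(3 * X_train[:, 0]) + 0.1 * rng.normal(size=200)

feature_weights = np.array([1.0, 0.3, 0.3, 0.3])           # a stand-in "learned" metric

def predict(x, k=8):
    d2 = ((feature_weights * (X_train - x)) ** 2).sum(axis=1)   # weighted squared distances
    nearest = np.argsort(d2)[:k]
    bandwidth = np.sqrt(d2[nearest].max()) + 1e-9                # kernel width follows the neighbourhood
    w = np.exp(-d2[nearest] / (2 * bandwidth ** 2))              # Gaussian kernel weights
    return float((w * y_train[nearest]).sum() / w.sum())

x_new = np.array([0.2, 0.0, 0.0, 0.0])
print(predict(x_new), np.sin(3 * 0.2))                      # prediction vs. noiseless target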
****************************************************************************
Thomas V. Papathomas, Laboratory of Vision Research & Department of Biomedical Engineering, Rutgers University, Piscataway, New Jersey

An efficient technique for assessing the sensitivities of first- and second-order motion: Theory, experiments, and applications
I will present a recent technique, developed jointly with C. Chubb and A. Gorea, for the purpose of obtaining texture patches with a mean luminance that is equal to a reference background luminance. It is based on the idea of separate first- and second-order motion pathways, and its results are predicted by a computational model, developed jointly with A. Rosenthal. This equiluminance technique provides much more accurate estimates of the equiluminant setting than conventional techniques (such as flicker or minimum-motion), because the equiluminant setting in our stimuli is the point at which a sharp transition of motion direction occurs. In addition, it turns out that this technique also offers a bonus: it provides a good measure of the relative sensitivities of the first- and second-order motion systems. In particular, when implemented by a staircase procedure, the technique provides both the equiluminant setting and a measure of the relative strengths with a very small number of trials. This makes it particularly well suited to applications where observers get easily fatigued (e.g., infants or stroke patients), or when the experimental time is limited (e.g., fMRI), or when data are needed for large numbers of observers (e.g., in developmental studies).

*******************************************************************************
Ron Rensink, Nissan, Cambridge, Mass.
How Much of a Scene is Seen?
When looking at a dynamic scene, our impression as observers is that we can simultaneously see all the changes that are taking place. It will be shown that this impression is an illusion, and that humans instead have a severely limited ability to detect change. It will be argued that attention is needed to perceive changes in a scene, and that the limited ability to detect change is a direct consequence of the limited capacity of the attentional mechanisms involved.
Investigations were carried out via a "flicker" technique, in which an original image A was repeatedly alternated with a modified image A', with brief blank fields interposed between successive displays. The resulting flicker created a global transient that swamped the local motion signals caused by the image change, preventing attention from being drawn to the location of the change. Under these conditions, a dramatic effect was found: even when the change was substantial and made repeatedly, subjects generally had great difficulty seeing it, sometimes requiring over 50 s to see a large change that was obvious once noticed.
These results show that the detection of change in a scene does not involve any kind of dense spatiotopic representation. Rather, it is suggested that scenes are represented via a relatively sparse set of structures, and that attention enters a small subset of these into a more durable buffer to allow comparisons to be made. Although only a few items can be held in this buffer, they can be rapidly swapped in and out. Thus, given that attention is normally attracted to the more dynamic and interesting parts of a scene, the result will be an impression of a richly-detailed environment, with accurate descriptions of those aspects most important to us.

*************************************************************************
Ruth Rosenholtz, NASA Ames
Affine Structure and Photometry
In a typical motion sequence, objects move relative to both the observer and light sources. The motion relative to the observer is a cue to the structure of the scene, and motion relative to the light source causes changing patterns of shading on the surface, which provide information about surface structure, albedos, and light sources. Under certain conditions, photometric information can be stratified into affine, unitary, and metric "structure," much like the stratification of structure from motion.
However, motion and photometric cues have typically not been combined in computer vision, and in fact structure from motion algorithms tend to assume that the shading pattern does not change, and photometry algorithms tend to assume that the object does not move. And with good reason: motion makes it hard to find the corresponding points necessary for photometry, and vice versa. We bypass this problem for the time being, so as to investigate the usefulness of combining the various levels of structure from motion (affine, "affine in depth," or metric) with structure from photometry (affine, unitary, or metric).

*****************************************************************************
Nava Rubin, Harvard University
Abrupt Learning in Illusory Contour Perception
When a 'camouflaged' figure such as the Dalmatian Dog is viewed by naive observers, the transition into the "meaningful" organization occurs abruptly; this is commonly taken to indicate a cognitive event, similar to "insight" phenomena in problem-solving. In contrast, gradual improvement is often observed when learning a motor skill or a perceptual task. The finding that learning in these tasks is commonly specific to relatively low-level stimulus attributes (the orientation of the stimulus, the hand used, etc.) seems consistent with such an incremental form of learning; together, these findings were taken to indicate that the synaptic modifications involved occur in early cortical areas.
I will describe an experimental procedure in which observers undergo an abrupt transition in the perception of illusory contours, which dramatically improves their performance in a psychophysical task. Surprisingly, the improved performance is found not to generalize to a new retinal size. Thus, characteristics of both the "high-level" and the "low-level" types of learning described above can be juxtaposed within a single task. This suggests that both types of learning may involve interactions between early cortical areas and high-level processes.
This work is in collaboration with: Anne Grossetete, Ken Nakayama and Robert Shapley.

****************************************************************************
Kaleem Siddiqi, Center for Intelligent Machines, McGill University
Shape, Shocks and Wiggles
We are attempting to develop a theory for the qualitative description of shapes roughly comparable to "entry level" categorical descriptions. The theory is based on the mathematics of curve evolution, and in particular identifies the singularities (or shocks) created during this evolution with the generic parts, protrusions, and bends comprising shapes.
In this talk we focus on the theoretical and practical difficulties of computing a shock-based representation. First, we develop subpixel local detectors for finding and classifying shocks. Second, we show that shock patterns are not arbitrary but obey the rules of a grammar, and in addition satisfy specific topological and geometric constraints. Shock hypotheses that violate the grammar or are topologically or geometrically invalid are pruned to enforce global consistency. Survivors are organized into a hierarchical graph of shock groups, leading to the aforementioned parts, protrusions and bends. The representation reflects both qualitative and quantitative aspects of shape, and is suited to recognition. Examples illustrate its stability with rotations, scale changes, occlusion and movement of parts, even at very low resolutions.
We have recently evaluated this model in the context of psychophysical experiments conducted by Christina Burbeck and Stephen Pizer, where subjects were asked to estimate the local centers of stimuli consisting of rectangles with ``wiggles'' (sides modulated by sinusoids). Their results suggest that: 1) for a fixed edge modulation frequency, the perceived modulation of the center of an object decreases with increasing object width, and 2) for a fixed object width, the perceived modulation of the center of an object decreases with increasing edge modulation frequency. The degree to which the shock-based descriptions of the Burbeck-Pizer stimuli predict the psychophysical data they collected is uncanny.
Joint work with Ben Kimia and Steve Zucker.

***************************************************************************
Joachim Weickert, Imaging Center Utrecht, Utrecht University Hospital, The Netherlands
Model-Driven Image Processing Using Anisotropic Diffusion Scale-Spaces
Scale-spaces constitute useful bottom-up tools in computer vision, which reveal certain similarities with the human visual system. Usually, scale-spaces are completely uncommitted. On the other hand, it is well known that expectation and a priori information may influence early vision processes.
The goal of the presentation is to study how one can integrate knowledge into the scale-space evolution. To this end, we investigate a family of nonlinear anisotropic diffusion processes. They use a diffusion tensor which is adapted to the differential structure of the underlying image.
This approach combines ideas of image enhancement and smoothing scale-space transformations: while the filters may act locally in an image-enhancing way, their global smoothing behaviour can be interpreted in a deterministic, stochastic, information-theoretic, and Fourier-based sense. Well-posedness properties ensure stability under changing environments. The presented filters turn out to be quite flexible, as they may reveal different behaviour in different directions (anisotropy), and they can be steered by different combinations of Gaussian derivatives. Moreover, they can be generalized to vector-valued images arising e.g. in the feature space or as colour images.
Examples are presented which show the use of anisotropic diffusion scale-spaces for creating segmentation-like results and for solving the problem of gap completion in line-like structures. Application fields range from computer-aided quality control to medical imaging.
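For orientation, here is a deliberately simplified nonlinear-diffusion sketch that uses a scalar, edge-stopping diffusivity (Perona-Malik style); the filters discussed in the talk instead use a full diffusion tensor adapted to the local image structure, which additionally allows smoothing along, but not across, line-like structures. The test image, contrast parameter, and step size below are illustrative.

import numpy as np

def diffuse(u, n_steps=50, lam=0.25, tau=0.2):
    """Explicit nonlinear diffusion with a scalar diffusivity that falls off at
    strong gradients, so noise is smoothed while the step edge is preserved."""
    def g(d):                                   # edge-stopping diffusivity
        return np.exp(-(d / lam) ** 2)
    u = u.astype(float)
    for _ in range(n_steps):
        # differences to the four neighbours (periodic boundaries, for simplicity)
        dn = np.roll(u, 1, axis=0) - u
        ds = np.roll(u, -1, axis=0) - u
        de = np.roll(u, -1, axis=1) - u
        dw = np.roll(u, 1, axis=1) - u
        u = u + tau * (g(dn) * dn + g(ds) * ds + g(de) * de + g(dw) * dw)
    return u

noisy = np.zeros((64, 64))
noisy[:, 32:] = 1.0                             # a vertical step edge
noisy += 0.1 * np.random.default_rng(3).normal(size=noisy.shape)
smoothed = diffuse(noisy)
print(float(np.abs(smoothed[:, :20]).mean()))   # residual noise left of the edge shrinks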
***************************************************************************
Lance Williams, NECI, Princeton, NJ

Local Parallel Computation of Stochastic Completion Fields
We describe a local parallel method for computing the stochastic completion field introduced in an earlier paper (Williams and Jacobs '95). The stochastic completion field represents the likelihood that a completion (i.e., illusory contour) joining two contour fragments passes through any given position and orientation in the image plane. It is based upon the assumption that the prior probability distribution of completion shape can be modeled as a random walk in a lattice of discrete positions and orientations. The local parallel method can be interpreted as a stable finite difference scheme for solving the underlying Fokker-Planck equation identified by Mumford (Mumford '94).

The resulting algorithm is significantly faster than the previously employed method, which relied on convolution with large-kernel filters computed by Monte Carlo simulation. The complexity of the new method is O(n^3 m) while that of the previous algorithm was O(n^4 m^2) (for an n x n image with m discrete orientations). Perhaps most significantly, the use of a local method allows us to model the probability distribution of completion shape using stochastic processes which are neither homogeneous nor isotropic. For example, it is possible to modulate particle decay rate by a directional function of local image brightnesses (i.e., anisotropic decay) so that illusory contours respect local image brightness structure. We demonstrate the effect of the anisotropic decay function using Kanizsa square figures with checkered backgrounds of different phases (Ramachandran et al., '94). The results of the computer simulations are consistent with observed human psychophysics.

Finally, we note that the new method is more plausible as a neural model since 1) unlike the previous method, it can be computed in a sparse, locally connected network; and 2) the network dynamics are consistent with psychophysical measurements of the time course of illusory contour formation (Rubin et al., '95).

Joint work with David Jacobs, NECI.

******************************************************************************
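To make the lattice propagation in the last abstract concrete, the toy sketch below runs the directional random walk that underlies stochastic completion fields: each orientation plane is advected one pixel along its (rounded) direction, orientation diffuses slightly, and mass decays at every step. Grid sizes and constants are illustrative, and the completion field proper would be the product of such a source field with a matching sink field.

import numpy as np

n, m = 64, 36                                   # image size, number of orientations
thetas = 2 * np.pi * np.arange(m) / m
P = np.zeros((m, n, n))
P[0, n // 2, 10] = 1.0                          # one oriented "edge element" pointing right

sigma2, decay, n_steps = 0.1, 0.05, 40
source = np.zeros_like(P)
for _ in range(n_steps):
    # advect: shift each orientation plane one pixel along its (rounded) direction
    for k, th in enumerate(thetas):
        P[k] = np.roll(P[k], (int(round(np.sin(th))), int(round(np.cos(th)))), axis=(0, 1))
    # diffuse in orientation (wrap-around) and apply particle decay
    P = (1 - sigma2) * P + 0.5 * sigma2 * (np.roll(P, 1, axis=0) + np.roll(P, -1, axis=0))
    P *= (1 - decay)
    source += P                                 # accumulate visiting probabilities

print(source.sum(axis=0).round(3)[n // 2 - 2:n // 2 + 3, 8:20])  # mass spreads to the right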