### ICNNB2005 Plenary Speech: Visual Perceptual Learning

Zhongzhi Shi Qingyong Li

Hong Hu Zheng Zheng

shizz@ics.ict.ac.cn

Institute of Computing Technology,

Chinese Academy of Sciences, Beijing 100080, China

Zhongzhi Shi VPL-ICNNB05

Outline

- Introduction
- Classification-oriented sparse coding model
- Selective attention model based on response saliency
- Tolerance relation based granular computing model
- Conclusions


Visual Pathways


Dorsal pathways analyze motion and spatial relationships between the body and visual stimuli.

Ventral pathways analyze form with specific regions identifying colors, faces, letters and other stimuli.


Visual Information Processing


Visual perceptual learning

Goal

- Probe into the visual system.
  - Visual perceptual learning should be considered an active process that embeds particular abstraction, reformulation and approximation within the abstraction framework.
- Model the visual information processing mechanism.
  - Neural representation, attention mechanism
- Guide computer vision research.
  - Feature extraction
  - Feature binding
  - Object recognition


Gibson’s ecological theory

- Perception is entirely data driven: “non-constructivist” or direct perception.
- The optic array (the patterns of light reaching the retina) gives a texture gradient, which leads to depth perception.
- Visual perceptions have their own ‘affordances’.
- Affordances are salient perceptual characteristics that suggest the use of an object, e.g., an umbrella.
- How could a three-dimensional image be derived from affordances?

Gestalt theory of perception

- The whole visual percept is more than the sum of its parts.
- Visual illusions.
- A visual percept can be interpreted in more than one way; therefore we must have a representation of visual information in our mind.

Empiricism (Richard Gregory)

- Sensory input is chaotic, unstable, and distorted. It must be interpreted.
- The perceiver generates predictions about the nature of sensory input.
- Visual perception is:
  - indirect
  - constructive
  - based on hypothesis testing
  - a cognitively mediated process

Marr’s theory of vision

- An image-processing theory of visual recognition; it is data driven.
- It starts with input to the perceptual system in the form of the retinal image.
- Marr then describes four different stages of visual information processing.

Marr’s theory of vision: the four stages

1. Grey level description
2. Primal sketch
3. 2.5-dimensional sketch
4. 3-dimensional model sketch

Three levels of analysis

Marrian framework for understanding complex information processing systems (Marr, 1982)

- Computational theory
  - Goals of computation, appropriateness of the goal, general strategies
- Representation/Algorithm
  - How to represent the input and the output
  - Algorithms for transforming from one representation to another
- Implementation
  - How can the representation and algorithm be realized physically (architecture, hardware)?


Visual Perception

- Efficient coding hypothesis: the goal of visual perception is to produce an efficient representation of the incoming signal (Attneave 1954).
- How to establish a precise quantitative relationship between environmental statistics and neural processing?


Related work

- Biological approach: examine the statistical properties of neural responses under natural stimulation conditions.
  - Sparse coding and decorrelation in primary visual cortex during natural vision. Science, 287:1273-1276, Feb 2000.
  - Retinal ganglion cells act largely as independent encoders. Nature, 411:698-701, June 2001.
- Computational approach: use the statistical properties of natural images to constrain or derive a model for early sensory processing.
  - Sparse coding model: Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature, 381:607-609, 1996.
  - Independent component analysis model: The 'independent components' of natural scenes are edge filters. Vision Research, 37(23):3327-3338, 1997.


What is sparse coding?

- An image is represented by a small number of ‘active’ neurons, $a_i$, out of a large set. Which neurons are active varies from one image to another.
- The distribution of activity on any given unit should be peaked around zero with heavy tails. Such a distribution has low entropy, as opposed to a Gaussian distribution.

Why sparse coding?

- It allows for increased storage capacity in associative memories;
- It makes the structure in natural signals explicit;
- It represents complex data in a way that is easier to read out at subsequent levels of processing;
- It saves energy.


Sparse coding model (Field)

Olshausen pointed out that a perceptual system is exposed to a series of small image patches, drawn from one or more large images, much like the classical receptive field (CRF) of a neuron. Imagine that each image patch, represented by the vector I (pixels numbered row-wise), has been formed by a linear combination of N basis functions. The basis functions form the columns of a fixed matrix, A. The weighting of this linear combination is given by a vector, s. Each component of this vector has its own associated basis function and represents the response value of a neuron in the visual system. The linear synthesis model is therefore given by:

$$I = A s = \sum_{i=1}^{N} s_i A_i$$

where $A_i$ is the i-th column of A (linear superposition model with basis functions).


Sparse coding model (Field)

Olshausen and Field applied two criteria to find the optimal basis functions and coefficients:

Sparseness cost function

Minimize the cost function
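The two criteria can be illustrated with a small numerical sketch. The slide's own equations are not preserved in this transcript, so the sketch assumes the commonly cited Olshausen and Field form of the cost, E = ||I − A s||² + λ Σ_i S(s_i/σ) with S(x) = log(1 + x²). The function names, the weights `lam` and `sigma`, and the toy data are illustrative assumptions.

```python
import math

def reconstruct(A, s):
    """Linear synthesis A s, with A stored as a list of rows."""
    return [sum(a_kj * s_j for a_kj, s_j in zip(row, s)) for row in A]

def sparse_coding_cost(I, A, s, lam=0.1, sigma=1.0):
    """Assumed cost: ||I - A s||^2 + lam * sum_i log(1 + (s_i / sigma)^2)."""
    residual = [i_k - r_k for i_k, r_k in zip(I, reconstruct(A, s))]
    reconstruction_error = sum(r * r for r in residual)
    sparseness_penalty = sum(math.log(1.0 + (s_i / sigma) ** 2) for s_i in s)
    return reconstruction_error + lam * sparseness_penalty

# Toy example: a 2-pixel patch and 3 basis functions (the columns of A).
A = [[1.0, 0.0, 1.0],
     [0.0, 1.0, 1.0]]
I = [1.0, 1.0]
dense = [1.0, 1.0, 0.0]    # two active coefficients
sparse = [0.0, 0.0, 1.0]   # one active coefficient

print(sparse_coding_cost(I, A, dense), sparse_coding_cost(I, A, sparse))
```

Both codes reconstruct the patch exactly, so the reconstruction term is zero for each and the cost differs only through the sparseness term, which favours the code with a single active coefficient.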


Outline

- Introduction
- Classification-oriented sparse coding model
- Selective attention model based on response saliency
- Tolerance relation based granular computing model
- Conclusions


Classification-oriented Sparse Coding Model for Pattern Classification

- The sparse coding model only states how information should be represented efficiently.
- What information should be represented is more important for a visual perception task.
- Computations in the early visual cortex are rather interactive and plastic, subject to influence from perceptual inference, task requirements and behavioral experience.


Computations in the early visual cortex (Lee)

- A feedforward, one-layer network is limited.
- Various levels in cognitive and sensory systems have to work together interactively.
- A multi-layer model integrates the unsupervised sparse coding principle with supervised feedback.


Classification-oriented Sparse Coding Model for Pattern Classification

Goal of COSC model

- Sparseness of the coefficients
- Discriminability for the pattern classification task
- Combination of unsupervised and supervised learning


Classification-oriented Sparse Coding Model for Pattern Classification

- Training pattern sets and coefficients
- Distance between two coefficient vectors
- Within-class distance measures
- Between-class distance: measures the distance between a coefficient vector and the center of a class that excludes the vector.


Classification-oriented Sparse Coding Model for Pattern Classification

- Fisher-like discriminant distance
- Cost function
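The distance measures named on these slides can be sketched as follows. The exact formulas are not preserved in the transcript, so this is only one plausible reading: Euclidean distance between coefficient vectors, within-class distance as the mean distance to the vector's own class center, between-class distance as the mean distance to the centers of the classes that exclude the vector, and a Fisher-like ratio of the two. All function names and the toy data are hypothetical.

```python
import math

def euclidean(u, v):
    """Distance between two coefficient vectors (assumed Euclidean)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def center(vectors):
    """Component-wise mean of a list of coefficient vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def within_class_distance(classes):
    """Mean distance from each vector to the center of its own class."""
    dists = [euclidean(s, center(members))
             for members in classes.values() for s in members]
    return sum(dists) / len(dists)

def between_class_distance(classes):
    """Mean distance from each vector to the centers of the other classes."""
    dists = []
    for label, members in classes.items():
        for s in members:
            for other, other_members in classes.items():
                if other != label:
                    dists.append(euclidean(s, center(other_members)))
    return sum(dists) / len(dists)

def fisher_like_ratio(classes):
    """Small when classes are compact and far apart."""
    return within_class_distance(classes) / between_class_distance(classes)

# Two toy classes of 2-dimensional coefficient vectors.
classes = {"a": [[0.0, 0.0], [0.0, 2.0]],
           "b": [[10.0, 0.0], [10.0, 2.0]]}
print(within_class_distance(classes), between_class_distance(classes))
```

A discriminative term of this kind could be added to the sparse coding cost so that minimizing it pulls same-class coefficient vectors together and pushes different classes apart; how the slides weight that term is not recoverable from the transcript.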


Classification-oriented Sparse Coding Model for Pattern Classification

Learning process by optimization

- Objective function
- Two nested stages

Inner stage: minimize E with respect to s with A fixed, using the conjugate gradient method.

Outer stage: minimize E with respect to A, using the gradient descent method.
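The nested structure can be sketched on a toy problem. This is a minimal illustration under stated assumptions, not the COSC implementation: the cost used here is the plain sparse coding cost E = Σ ||I − A s||² + λ Σ log(1 + s_i²) without the discriminative term, and plain gradient descent is swapped in for the conjugate gradient method in the inner stage. Step sizes, iteration counts, function names and data are all illustrative.

```python
import math

def residual(I, A, s):
    """I - A s, with A stored as a list of rows."""
    return [i_k - sum(a * s_j for a, s_j in zip(row, s)) for i_k, row in zip(I, A)]

def cost(patches, A, codes, lam):
    """Sum over patches of ||I - A s||^2 + lam * sum_i log(1 + s_i^2)."""
    total = 0.0
    for I, s in zip(patches, codes):
        total += sum(r * r for r in residual(I, A, s))
        total += lam * sum(math.log(1.0 + s_i * s_i) for s_i in s)
    return total

def inner_stage(I, A, s, lam, steps=50, lr=0.1):
    """Minimise E over the coefficients s with A fixed (gradient descent)."""
    s = list(s)
    for _ in range(steps):
        r = residual(I, A, s)
        for j in range(len(s)):
            grad = -2.0 * sum(A[k][j] * r[k] for k in range(len(I)))
            grad += lam * 2.0 * s[j] / (1.0 + s[j] * s[j])
            s[j] -= lr * grad
    return s

def outer_stage(patches, A, codes, lr=0.02):
    """One gradient-descent step on the basis A with the codes fixed."""
    grad = [[0.0] * len(A[0]) for _ in A]
    for I, s in zip(patches, codes):
        r = residual(I, A, s)
        for k in range(len(A)):
            for j in range(len(s)):
                grad[k][j] += -2.0 * r[k] * s[j]
    return [[A[k][j] - lr * grad[k][j] for j in range(len(A[0]))]
            for k in range(len(A))]

def learn(patches, A, lam=0.1, epochs=20):
    """Alternate the inner stage (codes) and the outer stage (basis)."""
    codes = [[0.0] * len(A[0]) for _ in patches]
    for _ in range(epochs):
        codes = [inner_stage(I, A, s, lam) for I, s in zip(patches, codes)]
        A = outer_stage(patches, A, codes)
    codes = [inner_stage(I, A, s, lam) for I, s in zip(patches, codes)]
    return A, codes

patches = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]]
A0 = [[0.8, 0.6], [0.6, -0.8]]          # initial basis (columns are basis functions)
zero_codes = [[0.0, 0.0] for _ in patches]
baseline = cost(patches, A0, zero_codes, 0.1)
A, codes = learn(patches, A0)
final = cost(patches, A, codes, 0.1)
print(baseline, final)
```

The codes are warm-started from the previous epoch so that each stage continues to descend on the same cost. No rescaling of the basis columns is applied in this toy version.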


Experiment result

The set of 144 basis functions learned by the COSC. All have been normalized to fill the grey scale, but with zero always represented by the same grey level.


Experiment result- sparseness performance


Experiment result

Reconstruction error comparison

Classification performance comparison


Summary

- The COSC model can code class-specific features.
- The COSC coefficients notably improve classification accuracy without significantly degrading reconstruction error or sparseness.
- The COSC model is an interactive and plastic model supervised by the visual perception task.

Task-oriented Sparse Coding Model for Pattern Classification. Lecture Notes in Computer Science, Vol. 3610/2005, pp. 903-914.

Learning Sparse and Discriminative Structures in Natural Images for Visual Classification. Submitted to Network: Computation in Neural Systems.


Outline

- Introduction
- Classification-oriented sparse coding model
- Selective attention model based on response saliency
- Tolerance relation based granular computing model
- Conclusions


Attention-guided visual sparse coding model

- The number of coefficients with large values produced by the sparse coding model is still large relative to the computational capacity of neurons, even though the kurtosis of every response coefficient is high.
- A typical scene within the neuron’s classic receptive field (CRF) contains many different patterns which compete for neural representation because of the limited processing capacity of neurons in the visual system.
- The visual attention mechanism is an active strategy in the brain's information processing.


Attention-guided visual sparse coding model

General model

- The first attention module transforms the image into a ‘retinal image’ by nonuniformly sampling the input visual stimuli.
- The second attention module performs the selective attention based on response saliency.

The diagram of the model.


Nonuniform sampling model

- The density of photoreceptors in the retina is greatest in the central area (fovea) and decreases to the retinal periphery
- The resolution of the image representation in the visual cortex is highest for the part of the image projected onto the fovea and decreases rapidly with distance from the fovea center.

Vision sampling model


Nonuniform sampling model

Recursive computation of the Gaussian-like convolution


Nonuniform sampling model

The input image patch is represented as follows: within the central circle the pixels are fully sampled, as in the original image; the first ring surrounding the central circle has lower resolution; and the outermost (third) region has the lowest resolution.
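The three-region representation can be sketched as follows. This is a simplified stand-in, not the recursive Gaussian-like convolution named on the preceding slide: resolution is lowered by block averaging, with block sizes 1, 2 and 4 chosen arbitrarily for the central circle, the first ring and the outer region. The radii `r1` and `r2` and the function names are illustrative assumptions.

```python
import math

def block_mean(img, r, c, size):
    """Mean of the size x size block (aligned to a grid) that contains pixel (r, c)."""
    r0, c0 = (r // size) * size, (c // size) * size
    values = [img[i][j]
              for i in range(r0, min(r0 + size, len(img)))
              for j in range(c0, min(c0 + size, len(img[0])))]
    return sum(values) / len(values)

def foveate(img, r1, r2):
    """Full resolution within radius r1, 2x2 averaging up to r2, 4x4 beyond."""
    h, w = len(img), len(img[0])
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0     # fixation point: patch centre
    out = []
    for r in range(h):
        row = []
        for c in range(w):
            eccentricity = math.hypot(r - cy, c - cx)
            if eccentricity <= r1:
                size = 1
            elif eccentricity <= r2:
                size = 2
            else:
                size = 4
            row.append(block_mean(img, r, c, size))
        out.append(row)
    return out

# 8x8 toy patch whose grey value encodes the pixel index.
img = [[r * 8 + c for c in range(8)] for r in range(8)]
retinal = foveate(img, r1=1.5, r2=3.0)
print(retinal[3][3], retinal[2][2], retinal[0][0])
```

Pixels near the centre keep their original value, while pixels further out are replaced by progressively coarser local averages.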


Selective attention model

- Definition: Response saliency is the extent of a neuron's response relative to a group of neurons that respond to the same stimulus.
- The purpose of response saliency is to represent the conspicuity of every neuron at the same perception level for a stimulus and to guide the selection of attended neurons based on its value.
- A neuron response with a large response saliency value is chosen for further processing; a neuron with a small value is omitted.


Selective attention model

Every such pattern is selective for location, orientation and frequency

- Center of the excitatory subregion as the location selectivity
- Angle (in degrees) between the x-axis and the major axis of the ellipse as orientation
- Area of the excitatory subregion as frequency


Selective attention model

- Discrepancy between Ai and S

- Response saliency (RS) value


Selective attention model

Selection strategies :

- Threshold selection mechanism (TSM)

TSM is a threshold filtering algorithm

- Proportion selection mechanism (PSM)

PSM is a bottleneck filtering algorithm
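The two selection strategies can be sketched as simple filters over the response coefficients. The response saliency formula is not preserved in this transcript, so the saliency values are taken as given inputs. The function names, the convention of zeroing unselected coefficients, and the example numbers are illustrative assumptions.

```python
def threshold_selection(coeffs, saliency, threshold):
    """TSM sketch: keep coefficients whose response saliency reaches the threshold."""
    return [c if rs >= threshold else 0.0 for c, rs in zip(coeffs, saliency)]

def proportion_selection(coeffs, saliency, proportion):
    """PSM sketch: keep a fixed fraction of coefficients, most salient first."""
    k = int(proportion * len(coeffs))
    ranked = sorted(range(len(coeffs)), key=lambda i: saliency[i], reverse=True)
    keep = set(ranked[:k])
    return [c if i in keep else 0.0 for i, c in enumerate(coeffs)]

coeffs = [0.9, -0.1, 0.5, 0.05, -0.7]      # sparse coding responses
saliency = [0.8, 0.1, 0.4, 0.05, 0.6]      # assumed response saliency values
by_threshold = threshold_selection(coeffs, saliency, threshold=0.5)
by_proportion = proportion_selection(coeffs, saliency, proportion=0.4)
print(by_threshold, by_proportion)
```

With TSM the number of surviving coefficients depends on the stimulus, whereas PSM passes a fixed share of them, which is why it behaves like a bottleneck.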


Simulation results

Histogram of the coefficients in the model for an input image patch. (a) The original response coefficients produced by sparse coding. (b) The response saliency values. (c) The response coefficients after visual attention, keeping the top 40% of response coefficients ranked by response saliency in descending order.


Simulation results

The input image patch and the reconstructed images. The first column is the original image; the second column is the image reconstructed from the full set of coefficients produced by sparse coding; the third column is the image reconstructed from the coefficients selected by this model.


Simulation results

Reconstruction errors of the sparse coding model (SC) and attention-guided sparse coding model (AGSC)


Summary

- This model includes a nonuniform sampling module and a saliency-based, data-driven module within the framework of the efficient coding hypothesis.
- This model markedly reduces the number of activated coefficients for an input stimulus while retaining the essential visual information.
- This model designs and implements an active and efficient mechanism that adapts to limited computational capacity and improves the efficiency of sparse coding.

A model of Attention-guided Visual Sparse Coding. In Proc. IEEE International Conference on Cognitive Informatics, pp 98-104. California, USA, 2005.


Outline

- Introduction
- Classification-oriented sparse coding model
- Selective attention model based on response saliency
- Tolerance relation based granular space model
- Conclusions


What is Granular Computing?

- “There are three basic concepts that underlie human cognition: granulation, organization and causation.
- Informally, granulation involves decomposition of whole into parts;
- Organization involves integration of parts into whole;
- Causation involves association of causes with effects.
- Granulation of an object A leads to a collection of granules of A, with a granule being a clump of points (objects) drawn together by indistinguishability, similarity, proximity or functionality” (Zadeh 1997)


What is Granular Computing?

- An umbrella term to cover any theories, methodologies, techniques, and tools that make use of granules in problem solving.
- A subset of the universe is called a granule in granular computing.
- Basic ingredients of granular computing are subsets, classes, and clusters of a universe.


Motivation of Tolerance Relation Based Granular Computing Model

In 1962, Zeeman proposed that cognitive activities can be viewed as a kind of tolerance space in a function space. Tolerance spaces, which are constructed from tolerance relations based on distance functions, were used by Zeeman for the stability analysis of dynamic systems. Here, tolerance spaces based on distance functions are developed for the modeling and analysis of information granulation.


Motivation of Tolerance Relation Based Granular Computing Model

- The entities on the data layer processed by granular computing usually belong to one of two types: symbolized data or continuous real-valued data.
- Most of the models and methods of granular computing treat symbolized data and continuous real-valued data separately.


Motivation of Tolerance Relation Based Granular Computing Model

- Symbolized features and real-valued features can be generated from each other by feature extraction, feature reduction, classification, discretization, etc.
- We therefore try to construct a uniform granular computing model for studying important problems in pattern recognition and machine learning, such as feature extraction, feature reduction, discretization and classification.


Tolerance Relation Based Granular Space Model

Suppose the triplet (OS, TR, NTC) describes a tolerance relation based granular space TG, where

- OS denotes an object set system;
- TR denotes a tolerance relation system;
- NTC denotes a nested tolerance covering system.


Object Set System

- $OS_0$, called an original object vector, is a vector in $R^n$, where $R$ is the set of real numbers.
- $OS_1$, called a subset object of hierarchy 1, is a set of original object vectors.
- Generally speaking, $OS_{k+1}$, a subset object of hierarchy $k+1$, is a set of hierarchy-$k$ subset objects $OS_k$.


Tolerance Relation System

- A tolerance relation system is a (parameterized) relation structure composed of a set of tolerance relations. It includes the relations and coefficients on which the granular spaces are based.


Tolerance Relation System

- A tolerance relation $sn$, $sn \subseteq X \times X$, is a reflexive and symmetric binary relation, where $X$ is the original space of object vectors and $X \subseteq R^n$.
- A simple tolerance proposition $sp(\alpha, \beta \mid dis, d)$ is defined as

$$sp(\alpha, \beta \mid dis, d) = \big(dis(\alpha, \beta \mid \omega) \le d\big).$$

- A compound tolerance proposition $P(\alpha, \beta \mid D)$.
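A short sketch makes the defining properties concrete. It assumes a weighted L1 distance for dis (one possible choice of distance function), and the function names and sample vectors are illustrative. It shows that a relation induced by "distance at most d" is reflexive and symmetric but need not be transitive, which is what distinguishes a tolerance relation from an equivalence relation.

```python
def weighted_l1(alpha, beta, omega):
    """Assumed distance: sum_i omega_i * |alpha_i - beta_i|."""
    return sum(w * abs(a - b) for w, a, b in zip(omega, alpha, beta))

def simple_tolerance(alpha, beta, omega, d):
    """sp(alpha, beta | dis, d): true when dis(alpha, beta | omega) <= d."""
    return weighted_l1(alpha, beta, omega) <= d

a, b, c = (0.0,), (1.0,), (2.0,)
omega, d = (1.0,), 1.0

reflexive = all(simple_tolerance(x, x, omega, d) for x in (a, b, c))
symmetric = all(simple_tolerance(x, y, omega, d) == simple_tolerance(y, x, omega, d)
                for x in (a, b, c) for y in (a, b, c))
# a ~ b and b ~ c hold, yet a ~ c fails: the relation is not transitive.
chain = (simple_tolerance(a, b, omega, d),
         simple_tolerance(b, c, omega, d),
         simple_tolerance(a, c, omega, d))
print(reflexive, symmetric, chain)
```

Because transitivity can fail, the classes induced by a tolerance relation may overlap, so they form a covering of the space rather than a partition.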


Tolerance Relation System

- The tolerance relation $sn(P, \omega, DIS, D)$ induced by $P(\alpha, \beta \mid D)$:

$$(\alpha, \beta) \in sn(P, \omega, DIS, D) \iff P(\alpha, \beta \mid D),$$

where $DIS = \{dis_1, dis_2, \ldots, dis_k\}$.

- The proposition $P$, weight vector $\omega$, distance function vector $DIS$ and radius vector $D$ are the four important elements of a tolerance relation. A tolerance relation system is composed of a set of tolerance relations, and many regions of a space can be described by a tolerance relation system.


The Nested Tolerance Covering System

- The nested tolerance covering system is a (parameterized) granule structure that describes granules at different levels and the granulation process, based on the object set system and tolerance relation system above.


The Nested Tolerance Covering System

The Nested Tolerance Covering On OS1

- Granules
- Tolerance covering
- Nested tolerance covering
- The granulation process on OS1


The Nested Tolerance Covering System

The Adjoint Nested Tolerance Covering System On level k granules

- Adjoint subset object, which can be viewed as the intension of a granule.
- Two ways to generate the adjoint subset object.


The Nested Tolerance Covering System

- The tolerance relation based granular space is so versatile that it includes all classification processes using distance functions and most of the multi-scale feature extraction processes in pattern recognition.


TG Based Image Texture-Shape Recognition

- Multiple-scale texture-shape recognition approach tries to cluster textures and shapes. In a multi-resolution approach, texture and shape should be analyzed at different levels. Usually texture can be viewed as a random field or fractals.


TG Based Image Texture-Shape Recognition

- In this part, we construct an image granulation model with the tolerance relation based granular space. Using this model, we extract features from an image set and use the extracted features to recognize new samples.


The Construction of Object Set System

- Here OS0=(x, y, r, g, b) can be viewed as a pixel, where
- (x, y) is the pixel’s position in OS1 and
- (r, g, b) is its color value (RGB).
- OS1 can be viewed as an image and OS2 can be viewed as a set of key frames in a video stream.


The Construction of Tolerance Relation System

We have two methods to define the tolerance relation system.

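a. $P(\alpha, \beta \mid D) = (dis_{a1}(\alpha, \beta \mid \omega) \le d) \wedge (dis_{a2}(\alpha, \beta \mid \omega) \le d)$, and

$$dis_{a1}(\alpha, \beta \mid \omega) = |\alpha_0 - \beta_0|, \qquad dis_{a2}(\alpha, \beta \mid \omega) = |\alpha_1 - \beta_1|.$$

b. Suppose $F: m_{ki} \to G_k(\eta_k \mid \omega_k)$ is a bijective function; the distance between $m_{ki}$ and $G_k(\eta_k \mid \omega_k)$ can be computed by

$$dis_b(m_{ki}, G_k(\eta_k \mid \omega_k) \mid \omega) = [\ldots],$$

where $dis(\alpha, \beta \mid \omega) = \sum_i \omega_i |\alpha_i - \beta_i|$.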
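Method (a) can be sketched on a toy image. The sketch assumes that components 0 and 1 of an object vector are the pixel position (x, y), following the OS0 = (x, y, r, g, b) layout of the object set system, so the compound proposition holds for pixels inside a square window of half-width d. The function names, the 4x4 image and the choice d = 1 are illustrative.

```python
def proposition_a(alpha, beta, d):
    """Compound proposition (a): |alpha_0 - beta_0| <= d and |alpha_1 - beta_1| <= d."""
    return abs(alpha[0] - beta[0]) <= d and abs(alpha[1] - beta[1]) <= d

def granule(seed, pixels, d):
    """All pixels that are tolerant to the seed pixel."""
    return {p for p in pixels if proposition_a(seed, p, d)}

# 4x4 toy image; each object vector is (x, y, r, g, b).
pixels = [(x, y, 10 * x, 10 * x, 10 * x) for x in range(4) for y in range(4)]
seed_a = (1, 1, 10, 10, 10)
seed_b = (2, 2, 20, 20, 20)

g_a = granule(seed_a, pixels, d=1)
g_b = granule(seed_b, pixels, d=1)
covering = set().union(*(granule(p, pixels, 1) for p in pixels))
print(len(g_a), len(g_b), len(g_a & g_b), covering == set(pixels))
```

The granules around neighbouring seeds share pixels, and the granules of all seeds together cover the whole image, which matches the idea of a tolerance covering rather than a partition.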


The Construction of Nested Tolerance Covering System

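- Because $G_{k+1}(\eta_{k+1} \mid \omega_{k+1})$ is nested in a level-$k$ granule $G_k(\eta_k \mid \omega_k)$, an adjoint vector $S_k(\eta_k \mid \omega_k)$ assigned to the granule $G_k(\eta_k \mid \omega_k)$ is called the spectrum based on the module set

$$M_{k+1} = \{m_{(k+1)0}, m_{(k+1)1}, \ldots, m_{(k+1)n}\},$$

where $S_k(\eta_k \mid \omega_k) = \{vg_{k0}, \ldots, vg_{k(n-1)}\}$ and

$$vg_{ki} = \big|\{a \mid a \text{ is a nested level-}(k+1) \text{ granule in } G_k(\eta_k \mid \omega_k) \text{ and } dis_b(a, m_{(k+1)i} \mid \omega) \le r\}\big|$$

or

$$vg_{ki} = \big|\{a \mid a \text{ is a nested level-}(k+1) \text{ granule in } G_k(\eta_k \mid \omega_k) \text{ and } dis_b(B(a), m_{(k+1)i} \mid \omega) \le r\}\big|,$$

where $|\cdot|$ is the number of elements of a set and $B(\cdot)$ is the B-transform.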
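The spectrum is essentially a histogram: for each module it counts the nested granules that lie within radius r of that module. A minimal sketch follows, assuming nested granules and modules are flattened to equal-length vectors and that the weighted L1 distance stands in for dis_b, whose exact form is not preserved in this transcript. The function names and the toy data are illustrative.

```python
def weighted_l1(alpha, beta, omega):
    """Stand-in for dis_b: sum_i omega_i * |alpha_i - beta_i|."""
    return sum(w * abs(a - b) for w, a, b in zip(omega, alpha, beta))

def spectrum(sub_granules, modules, omega, r):
    """Entry i counts the nested granules within radius r of module i."""
    return [sum(1 for a in sub_granules if weighted_l1(a, m, omega) <= r)
            for m in modules]

# Nested granules of one parent granule, flattened to 2-component vectors.
sub_granules = [(0, 0), (0, 1), (1, 1), (1, 0), (1, 1)]
modules = [(0, 0), (1, 1)]
omega = (1, 1)

exact = spectrum(sub_granules, modules, omega, r=0)
loose = spectrum(sub_granules, modules, omega, r=1)
print(exact, loose)
```

With r = 0 only exact matches are counted. A larger radius lets near matches contribute, which makes the spectrum less sensitive to small perturbations of the nested granules.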


The Construction of Nested Tolerance Covering System

- $B(\cdot)$ is the B-transform. To eliminate noise, the pixel $I(x, y) = (x, y, r, g, b)$ in $G_k(\eta_k \mid \omega_k)$ can be transformed to $I'(x, y)$, which is a binary value:

$$I'(x, y) = [\ldots]$$


The Construction of Image Classes

- Algorithm. Class Construction Algorithm (CCA)

Input: An image set $OS_2 = \{P_0, \ldots, P_{n-1}\}$

Output: Class code of every image $P_i$.

Procedure:

Step 1: Compute the spectrum $S_0(P_i)$ based on the module set $M_k = \{m_{k0}, m_{k1}, \ldots, m_{kn}\}$ for each image.

Step 2: For each seed image $E_j$,

$$CLASS(E_j) = \{P_i \mid dis_m(S_0(E_j), S_0(P_i)) \le r_j\}.$$

Here,

$$dis_m(S_k(E_j), S_k(P_i)) = \sum_p \omega_p \big(|S_{kp}(E_j)|^2 - |S_{kp}(P_i)|^2\big)^2.$$
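Step 2 of the algorithm can be sketched once the spectra are available. The sketch assumes the spectra have already been computed (Step 1), reads the garbled comparison as "distance at most the seed's radius", and reads the distance as a weighted sum over spectrum components. Seeds are identified by their index in the spectra list. The function names, radii and toy spectra are illustrative.

```python
def dis_m(spec_e, spec_p, omega):
    """sum_p omega_p * (|S_p(E)|^2 - |S_p(P)|^2)^2 over spectrum components."""
    return sum(w * (abs(e) ** 2 - abs(p) ** 2) ** 2
               for w, e, p in zip(omega, spec_e, spec_p))

def class_construction(spectra, seeds, radii, omega):
    """For each seed index j, collect the image indices within radius r_j."""
    return {j: [i for i, spec in enumerate(spectra)
                if dis_m(spectra[j], spec, omega) <= radii[j]]
            for j in seeds}

# Toy spectra for four images; images 0 and 2 act as seed images.
spectra = [[3, 0], [3, 1], [0, 3], [1, 3]]
omega = [1, 1]
classes = class_construction(spectra, seeds=[0, 2], radii={0: 2.0, 2: 2.0}, omega=omega)
print(classes)
```

Because each class is defined by a radius around its seed, an image can fall into several classes or into none. Any rule for resolving such cases is not specified on the slide and is left out here.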


Experiments

- In the following experiments, 8 texture class groups (TCG0~TCG7) are used, and every image class group is created by random affine transformations of seed images. Every seed image creates a texture class that contains 10 affine-transformed texture images. TCGi is a subset of TCGi+1.


Some seed images


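Results of Experiments

The classification gain for 8 texture class groups

| Method | TCG0 | TCG1 | TCG2 | TCG3 | TCG4 | TCG5 | TCG6 | TCG7 |
|--------|-------|-------|-------|-------|-------|-------|-------|-------|
| CCA | 69.0% | 68.0% | 77.7% | 73.2% | 69.8% | 70.3% | 76.1% | 71.9% |
| GFBM | 78.0% | 68.5% | 74.7% | 69.3% | 69.6% | 68.4% | 73.8% | 66.9% |
| GARBT | 77.0% | 76.0% | 85.3% | 79.3% | 75.3% | 76.3% | 80.0% | 78.1% |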

GFBM is the algorithm developed in [17], which uses the cosine part of the Gabor filter.

GARBT is the algorithm developed in [18], which uses the Gabor wavelet transformation.


Results of Experiments

[Figure: classification gain (%, vertical axis from 65 to 90) of CCA, GFBM and GARBT over the 8 texture class groups (horizontal axis)]


Summary

- A tolerance relation based granular space TG, described as (OS, TR, NTC), is modeled and constructed.
- Illustrations of how to use the tolerance relation based granular space to represent and solve problems are presented.
- With an application of TG, we show that the tolerance relation based granular space is a good choice for some problems in the image processing field. After improving the model with flexible granule sizes and more levels in the granular space, we expect to obtain more effective and efficient results.


Conclusions

- We have put forward a novel sparse coding model, the classification-oriented sparse coding (COSC) model, for learning sparse and discriminable structures in natural images, which is valuable for a higher-level perception task: visual classification.
- We have extended the sparse coding principle by combining it with the visual attention mechanism.
- We have proposed a tolerance relation based granular computing model.

