## Learning in Neural and Belief Networks


Contents

- How the Brain Works
- Neural Networks
- Perceptrons

Introduction

- Two viewpoints in this chapter:
- Computational viewpoint: representing functions using networks
- Biological viewpoint: a mathematical model of the brain
- Neuron: a simple computing element
- Neural network: a collection of interconnected neurons

How the Brain Works

- Cell body (soma): provides the support functions and structure of the cell
- Axon: a branching fiber that carries signals away from the neuron
- Synapse: converts an electrical signal into a chemical signal
- Dendrites: branching fibers that receive signals from other nerve cells
- Action potential: the electrical pulse a neuron sends along its axon
- Synapses can be:
- excitatory: increase the receiving neuron's potential
- inhibitory: decrease the receiving neuron's potential
- Synaptic connections exhibit plasticity: their strength changes with use

A collection of simple cells can lead to thought, action, and consciousness.

Comparing brains with digital computers

- Brains and computers perform quite different tasks and have different properties
- Speed (switching speed):
- a computer's circuits are about a million times faster than neurons
- but the brain's billions of neurons operate in parallel, so its overall processing power is far greater
- Brain:
- performs complex tasks (e.g., perception) well
- more fault-tolerant: degrades gracefully as cells fail
- can be trained using an inductive learning algorithm

Neural Networks

- A neural network consists of nodes (units) and links
- Each link has a numeric weight
- Learning: updating the weights
- Each unit has two computational components:
- linear component: the input function
- nonlinear component: the activation function

Simple computing elements

- Total weighted input: in_i = Σ_j W_j,i · a_j
- Output obtained by applying the activation function g: a_i = g(in_i)
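
A minimal sketch of this computation in Python (the function names are illustrative, not from the slides):

```python
import numpy as np

def unit_output(a, W, g):
    """One unit: total weighted input in_i = sum_j W_ji * a_j, then a_i = g(in_i)."""
    in_i = np.dot(W, a)   # linear component: the input function
    return g(in_i)        # nonlinear component: the activation function

step = lambda x: 1 if x >= 0 else 0   # one common choice of g
```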

Threshold

- The threshold t is the input level required to cause the neuron to fire:
- if the total weighted input exceeds the threshold, the output is 1
- otherwise the output is 0
- The threshold can be replaced with an extra input weight: fix a_0 = -1 and set W_0 = t, so the unit compares against zero (see the sketch below)
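
A sketch of the threshold-to-weight trick, assuming AIMA's convention of a fixed extra input a_0 = -1 carrying the weight W_0 = t:

```python
import numpy as np

def fires(a, W, t):
    """Output 1 iff the weighted input reaches the threshold t."""
    return 1 if np.dot(W, a) >= t else 0

def fires_with_bias_weight(a, W, t):
    """Equivalent unit: fold t in as weight W_0 = t on a fixed input a_0 = -1,
    so the comparison is always against zero."""
    a0 = np.concatenate(([-1.0], a))
    W0 = np.concatenate(([t], W))
    return 1 if np.dot(W0, a0) >= 0 else 0
```

This matters for learning: once the threshold is just another weight, the same update rule adjusts it along with the rest.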

Network Structures(I)

- Feed-forward networks
- Unidirectional links, no cycles
- DAG(directed acyclic graph)
- No links between units in the same layer, no links backward to a previous layer, no links that skip a layer.
- Computation proceeds uniformly from input units to output units
- No internal state

Input units / output units / hidden units

- Perceptron: no hidden units
- Multilayer networks: one or more layers of hidden units
- A fixed structure and activation function define a specific parameterized family of functions; learning tunes the parameters (weights)
- Nonlinear regression: with a nonlinear g, learning the weights amounts to nonlinear regression (a forward-pass sketch follows below)
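
A minimal forward-pass sketch for a two-layer feed-forward network (the layer sizes and the sigmoid g are illustrative choices, not from the slides):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward(x, W_hidden, W_output, g=sigmoid):
    """Feed-forward pass: input -> hidden -> output; no cycles, no internal state."""
    hidden = g(W_hidden @ x)     # hidden-unit activations
    return g(W_output @ hidden)  # output-unit activations

rng = np.random.default_rng(0)
x = np.array([0.2, 0.7])                                           # 2 input units
y = forward(x, rng.normal(size=(3, 2)), rng.normal(size=(1, 3)))   # 3 hidden, 1 output
```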

Network Structures(II)

- Recurrent networks
- The brain resembles a recurrent network, since it has backward links
- Recurrent networks have internal state, stored in the activation levels of the units
- They can be unstable, oscillate, or exhibit chaotic behavior
- Computation can take a long time
- Analyzing them requires more advanced mathematical methods

Network Structures(III)

- Examples:
- Hopfield networks
- bidirectional connections with symmetric weights
- serve as associative memory: given a new stimulus, the network settles into the stored pattern that most closely resembles it (see the sketch below)
- Boltzmann machines
- stochastic (probabilistic) activation functions
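
A hedged sketch of a Hopfield-style associative memory, assuming the standard Hebbian outer-product weights and ±1 unit states (details not given in the slides):

```python
import numpy as np

def hopfield_weights(patterns):
    """Symmetric weights from stored patterns (entries in {-1, +1}); zero diagonal."""
    n = patterns.shape[1]
    W = sum(np.outer(p, p) for p in patterns)
    np.fill_diagonal(W, 0)
    return W / n

def recall(W, stimulus, sweeps=10):
    """Repeated threshold updates settle the network into the stored
    pattern that most closely resembles the new stimulus."""
    s = stimulus.copy()
    for _ in range(sweeps):
        for i in np.random.permutation(len(s)):   # asynchronous updates
            s[i] = 1 if W[i] @ s >= 0 else -1
    return s
```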

Optimal Network Structure(I)

- Too small a network: incapable of representing the target function
- Too big a network: does not generalize well
- Overfitting occurs when there are too many parameters
- A feed-forward NN with one hidden layer can approximate any continuous function
- A feed-forward NN with two hidden layers can approximate any function

Optimal Network Structure(II)

- NERF (Network Efficiently Representable Functions): functions that can be approximated with a small number of units
- Searching for a good structure:
- genetic algorithms: evaluate each candidate structure by running the whole NN training protocol (expensive)
- hill-climbing search: modify an existing network structure
- start with a big network: optimal brain damage removes weights from a fully connected model
- start with a small network: the tiling algorithm starts with a single unit and adds subsequent units
- use cross-validation techniques to judge each candidate's generalization

Perceptrons

- Perceptron: a single-layer, feed-forward network
- Each output unit is independent of the others
- Each weight affects only one of the outputs

Each output is computed as O = step_t(Σ_j W_j · I_j), where step_t(x) = 1 if x ≥ t and 0 otherwise.

What perceptrons can represent

- Can represent the Boolean functions AND, OR, and NOT
- Majority function (output 1 iff more than half of the n inputs are 1): set every W_j = 1 and t = n/2; one unit with n weights suffices
- A decision tree would need O(2^n) nodes for the same function
- Can represent only linearly separable functions
- Cannot represent XOR (see the sketch below)
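
A sketch of these claims as threshold units (the helper and names are illustrative; the weights and thresholds follow the slide):

```python
import numpy as np

def threshold_unit(weights, t):
    """Return a unit that outputs 1 iff the weighted input reaches threshold t."""
    w = np.array(weights, dtype=float)
    return lambda x: int(w @ np.array(x, dtype=float) >= t)

AND = threshold_unit([1, 1], t=2)      # fires only when both inputs are 1
OR  = threshold_unit([1, 1], t=1)      # fires when at least one input is 1
NOT = threshold_unit([-1], t=0)        # fires when the single input is 0

# Majority of n inputs: every W_j = 1, t = n/2 -- one unit, n weights.
MAJ3 = threshold_unit([1, 1, 1], t=1.5)

assert AND([1, 1]) == 1 and AND([0, 1]) == 0
assert MAJ3([1, 0, 1]) == 1 and MAJ3([1, 0, 0]) == 0
# No single (W, t) reproduces XOR: its positive examples cannot be
# separated from its negative ones by one line.
```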

Examples of Perceptrons

- A perceptron divides the entire input space in two along the boundary defined by Σ_j W_j · I_j = t
- In Figure 19.9(a): n = 2, so the boundary is a line
- In Figure 19.10(a): n = 3, so the boundary is a plane

Learning linearly separable functions(I)

- Bad news: not many problems fall into this class
- Good news: given enough training examples, there is a perceptron learning algorithm that will fit any linearly separable function
- The neural network learning algorithm is a current-best-hypothesis (CBH) scheme
- Hypothesis: the network defined by the current values of the weights
- Initial network: weights randomly assigned in [-0.5, 0.5]
- Repeat the update phase until convergence
- Each epoch updates all the weights for all the examples

Learning linearly separable functions(II)

- Learning rule (Rosenblatt, 1960): W_j ← W_j + α × I_j × Err
- The error: Err = T − O, where T is the target output and O is the perceptron's output
- α: the learning rate
- If the error is positive, the update increases O
- If the error is negative, the update decreases O
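
A minimal sketch of the whole procedure from the last two slides: random weights in [-0.5, 0.5], epochs over all examples, and the Rosenblatt update (the function name is illustrative; the threshold is folded in as W_0 on a fixed input of -1):

```python
import numpy as np

def train_perceptron(examples, alpha=0.1, epochs=100):
    """Perceptron learning: W_j <- W_j + alpha * I_j * Err, with Err = T - O."""
    n = len(examples[0][0])
    rng = np.random.default_rng(0)
    W = rng.uniform(-0.5, 0.5, size=n + 1)       # W[0] plays the role of the threshold
    for _ in range(epochs):
        for inputs, T in examples:               # one pass over all examples = one epoch
            I = np.concatenate(([-1.0], inputs)) # fixed extra input a_0 = -1
            O = int(W @ I >= 0)
            W += alpha * I * (T - O)             # raises O when Err > 0, lowers it when Err < 0
    return W

# OR is linearly separable, so the learned weights classify it perfectly:
data = [(np.array(x, float), t) for x, t in
        [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]]
W = train_perceptron(data)
```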

Perceptrons (Minsky and Papert, 1969)

- Exposed the limits of linearly separable functions
- Perceptron learning is a gradient descent search through weight space
- The weight space has no local minima
- Differences between NNs and other attribute-based methods such as decision trees:
- inputs are real numbers in some fixed range rather than a discrete set of values
- Dealing with discrete attributes:
- local encoding: a single input unit whose value encodes the discrete attribute values, e.g., None = 0.0, Some = 0.5, Full = 1.0 (the Patrons attribute in WillWait)
- distributed encoding: one input unit for each value of the attribute (see the sketch below)
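
For the Patrons attribute of the WillWait problem, the two encodings from the slide look like this (the one-hot layout for the distributed case is the standard choice):

```python
# Local encoding: a single input unit whose numeric value encodes the attribute.
patrons_local = {"None": 0.0, "Some": 0.5, "Full": 1.0}

# Distributed encoding: one input unit per attribute value (one-hot).
patrons_distributed = {
    "None": [1, 0, 0],
    "Some": [0, 1, 0],
    "Full": [0, 0, 1],
}
```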

Summary(I)

- Neural networks are modeled on the human brain
- Computers win on raw switching speed, but the brain's parallelism still gives it greater overall processing power
- The brain is also more fault-tolerant
- Neural network:
- nodes (units) and links; each link has a numeric weight
- Learning: updating the weights
- Two computational components:
- linear component: input function
- nonlinear component: activation function

Summary(II)

- In this text, we consider only feed-forward networks:
- unidirectional links, no cycles
- a DAG (directed acyclic graph)
- no links between units in the same layer, no links backward to a previous layer, and no links that skip a layer
- computation proceeds uniformly from input units to output units
- no internal state

Summary(III)

- Network size determines representational power
- Overfitting occurs when there are too many parameters
- A feed-forward NN with one hidden layer can approximate any continuous function
- A feed-forward NN with two hidden layers can approximate any function

Summary(IV)

- Perceptron: a single-layer, feed-forward network
- Each output unit is independent of the others
- Each weight affects only one of the outputs
- Can represent only linearly separable functions
- If the problem space is linearly separable, a perceptron performs very well
- In other words, if a problem is easy from an algorithmic perspective, a neural network can also learn it easily
- Back-propagation guarantees only local optimality in (multilayer) neural networks
