Introduction to Neural Networks


Gianluca Pollastri, Head of Lab, School of Computer Science and Informatics and Complex and Adaptive Systems Labs, University College Dublin.



Credits
• Geoffrey Hinton, University of Toronto.
• I borrowed some of his slides from the “Neural Networks” and “Computation in Neural Networks” courses.
• Paolo Frasconi, University of Florence.
• This guy taught me Neural Networks in the first place (*and* I borrowed some of his slides too!).
Recurrent Neural Networks (RNN)
• One of the earliest versions: Jeffrey Elman, 1990, Cognitive Science.
• Problem: feedforward neural nets have no natural way to represent time; usually time is represented with space (successive time steps are laid out side by side as separate inputs).
• Attempt to design networks with memory.
RNNs
• The idea is having discrete time steps, and considering the hidden layer at time t-1 as an input at time t.
• This effectively removes cycles: we can model the network using an FFNN, and model memory explicitly.
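A minimal sketch of the idea above, assuming a single hidden unit with a tanh activation. The function name `rnn_forward` and the weights `w_in`, `w_rec`, `bias` are illustrative choices, not taken from the slides. It shows the hidden state at time t-1 being fed back as an extra input at time t, so the same input can produce different states depending on history.

```python
import math

def rnn_forward(inputs, w_in, w_rec, bias, x0=0.0):
    """Run a one-unit recurrent layer over a sequence of scalar inputs.

    The hidden state from step t-1 is fed back as an extra input at step t.
    """
    x_prev = x0
    states = []
    for i_t in inputs:
        x_t = math.tanh(w_in * i_t + w_rec * x_prev + bias)
        states.append(x_t)
        x_prev = x_t  # this plays the role of the delay element
    return states

# The last two inputs are identical (0.0) in both sequences, yet the final
# state differs because the network remembers the first input.
states_a = rnn_forward([1.0, 0.0, 0.0], w_in=1.0, w_rec=0.8, bias=0.0)
states_b = rnn_forward([-1.0, 0.0, 0.0], w_in=1.0, w_rec=0.8, bias=0.0)
print(states_a[-1] > 0, states_b[-1] < 0)  # True True
```

Each loop iteration is an ordinary feedforward computation; the memory lives only in `x_prev`.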

[Figure: recurrent network with input It, memory Xt and output Ot; d = delay element]

BPTT
• BackPropagation Through Time.
• If Ot is the output at time t, It the input at time t, and Xt the memory (hidden state) at time t, we can model the dependencies as follows:
  Xt = f( Xt-1 , It )
  Ot = g( Xt , It )
BPTT
• We can model both f() and g() with (possibly multilayered) networks.
• We can transform the recurrent network by unrolling it in time.
• Backpropagation works on any DAG. An RNN becomes one once it’s unrolled.
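Written out, the chain rule on the unrolled graph gives a backward recursion for the deltas. This is a sketch under one assumption: the total error is a sum of per-step errors, E = Σ E_t(O_t), which the slides do not state. The symbol θ_f, standing for the weights of f, is also introduced here only for illustration.

```latex
\[
X_t = f(X_{t-1}, I_t), \qquad O_t = g(X_t, I_t), \qquad E = \sum_{t=1}^{T} E_t(O_t)
\]
\[
\delta_t \equiv \frac{\partial E}{\partial X_t}
= \left(\frac{\partial O_t}{\partial X_t}\right)^{\!\top} \frac{\partial E_t}{\partial O_t}
+ \left(\frac{\partial X_{t+1}}{\partial X_t}\right)^{\!\top} \delta_{t+1},
\qquad \delta_{T+1} = 0
\]
\[
\frac{\partial E}{\partial \theta_f}
= \sum_{t=1}^{T} \left(\frac{\partial f(X_{t-1}, I_t)}{\partial \theta_f}\right)^{\!\top} \delta_t
\]
```

The first term of δ_t arrives through the output network g at time t; the second arrives from the future through f. Because f is the same network at every step, its gradient is the sum of the per-step contributions.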

[Figure: the recurrent network (It, Xt, Ot, with delay element d) and its unrolled form over time steps t-2 to t+2, with nodes It-2..It+2, Xt-2..Xt+2 and Ot-2..Ot+2]

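• # I = inputs, O = outputs, Tt = target at time t; T without a subscript is the sequence length
• T := size(I);
• X0 := 0;
• # forward pass: compute the memory at every time step
• for t := 1..T {
•   Xt := f( Xt-1 , It );
• }
• # outputs: accumulate the gradient of g and the deltas it sends to Xt
• for t := 1..T {
•   Ot := g( Xt , It );
•   g.gradient( Ot - Tt );
•   δt := g.deltas( Ot - Tt );
• }
• # backward pass: accumulate the gradient of f and push the deltas back in time
• for t := T..1 {
•   f.gradient( δt );
•   δt-1 += f.deltas( δt );
• }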
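The pseudocode above can be sketched in runnable form for the smallest possible case. Everything specific here is an assumption made for illustration: a single hidden unit, f(x, i) = tanh(w·x + u·i), a linear g(x, i) = v·x + c·i, a squared-error loss, and the names `bptt` and `numeric_grad`. The three loops mirror the three loops of the pseudocode, and a finite-difference check compares the BPTT gradients with numerical ones.

```python
import math

def bptt(inputs, targets, p):
    """Return (loss, grads) for a one-unit RNN with squared error.

    Assumed model (illustrative, not from the slides):
      x_t = tanh(w * x_{t-1} + u * i_t)   # plays the role of f
      o_t = v * x_t + c * i_t             # plays the role of g
    """
    T = len(inputs)
    x = [0.0] * (T + 1)                      # x[0] is the initial memory X0
    for t in range(1, T + 1):                # forward pass through f
        x[t] = math.tanh(p["w"] * x[t - 1] + p["u"] * inputs[t - 1])

    grads = {k: 0.0 for k in p}
    delta = [0.0] * (T + 1)                  # delta[t] = dE/dx_t
    loss = 0.0
    for t in range(1, T + 1):                # outputs, gradient of g, deltas of g
        o_t = p["v"] * x[t] + p["c"] * inputs[t - 1]
        err = o_t - targets[t - 1]
        loss += 0.5 * err * err
        grads["v"] += err * x[t]
        grads["c"] += err * inputs[t - 1]
        delta[t] = err * p["v"]

    for t in range(T, 0, -1):                # backward pass through f
        pre = delta[t] * (1.0 - x[t] ** 2)   # back through tanh
        grads["w"] += pre * x[t - 1]
        grads["u"] += pre * inputs[t - 1]
        delta[t - 1] += pre * p["w"]         # send the delta one step back in time
    return loss, grads

def numeric_grad(inputs, targets, p, key, eps=1e-6):
    """Central finite difference of the loss with respect to one parameter."""
    hi = dict(p); hi[key] += eps
    lo = dict(p); lo[key] -= eps
    return (bptt(inputs, targets, hi)[0] - bptt(inputs, targets, lo)[0]) / (2 * eps)

params = {"w": 0.5, "u": -0.3, "v": 0.8, "c": 0.1}
inputs = [1.0, -0.5, 0.25, 2.0]
targets = [0.0, 0.5, -0.5, 1.0]
loss, grads = bptt(inputs, targets, params)
for key in params:
    print(key, grads[key], numeric_grad(inputs, targets, params, key))
```

The `+=` in the last loop matters: by the time step t-1 is processed, its delta already holds the contribution from its own output, and the contribution from the future is added on top.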

[Figures: successive frames of the unrolled network over time steps t-2 to t+2 (inputs It-2..It+2, memories Xt-2..Xt+2, outputs Ot-2..Ot+2)]

• Neurons
• Multi-Layered Neural Networks:
• Basic learning algorithm
• Expressive power
• Classification
• How can we *actually* train Neural Networks:
• Speeding up training
• Learning just right (not too little, not too much)
• Figuring out you got it right
• Feed-back networks?
• Anecdotes on real feed-back networks (Hopfield Nets, Boltzmann Machines)
• Recurrent Neural Networks
• Bidirectional RNN
• 2D-RNN
• Concluding remarks
BRNN

Ft = φ( Ft-1 , Ut )

Bt = β( Bt+1 , Ut )

Yt = η( Ft , Bt , Ut )

• φ(), β() and η() are realised with NNs
• φ(), β() and η() are independent of t: stationary
Inference in BRNNs
• FORWARD(U) {
•   T ← size(U);
•   F0 ← BT+1 ← 0;
•   for t ← 1..T
•     Ft = φ( Ft-1 , Ut );
•   for t ← T..1
•     Bt = β( Bt+1 , Ut );
•   for t ← 1..T
•     Yt = η( Ft , Bt , Ut );
•   return Y;
• }
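The inference procedure can be sketched as follows, assuming scalar states and passing the three networks in as plain functions. The name `brnn_forward` and the toy transition functions `phi`, `beta`, `eta` are hypothetical stand-ins chosen for illustration; in the slides these are neural networks.

```python
import math

def brnn_forward(U, phi, beta, eta, f0=0.0, b_end=0.0):
    """Compute BRNN outputs Y_1..Y_T for an input sequence U_1..U_T."""
    T = len(U)
    F = [f0] * (T + 2)        # F[0] is the boundary state F_0
    B = [b_end] * (T + 2)     # B[T+1] is the boundary state B_{T+1}
    for t in range(1, T + 1):             # forward chain, left to right
        F[t] = phi(F[t - 1], U[t - 1])
    for t in range(T, 0, -1):             # backward chain, right to left
        B[t] = beta(B[t + 1], U[t - 1])
    # The output at t sees the past (F), the future (B) and the current input.
    return [eta(F[t], B[t], U[t - 1]) for t in range(1, T + 1)]

# Hypothetical scalar transition functions, used only for this demonstration.
def phi(f_prev, u):
    return math.tanh(0.9 * f_prev + u)

def beta(b_next, u):
    return math.tanh(0.9 * b_next + u)

def eta(f, b, u):
    return f + b

# A single impulse in the middle of the sequence influences every output:
# positions before it through the backward chain, positions after it
# through the forward chain.
Y = brnn_forward([0.0, 0.0, 1.0, 0.0, 0.0], phi, beta, eta)
print([round(y, 3) for y in Y])
```

Because the same `phi`, `beta` and `eta` are applied at every position, the sketch is stationary in the sense used above and works for sequences of any length.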

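Learning in BRNNs
• # Ȳt = target output at time t
• T ← size(U);
• F0 ← BT+1 ← 0;
• for t ← 1..T
•   Ft = φ( Ft-1 , Ut );
• for t ← T..1
•   Bt = β( Bt+1 , Ut );
• for t ← 1..T {
•   Yt = η( Ft , Bt , Ut );
•   [δFt, δBt] = η.backprop&gradient( Yt - Ȳt );
• }
• for t ← T..1
•   δFt-1 += φ.backprop&gradient( δFt );
• for t ← 1..T
•   δBt+1 += β.backprop&gradient( δBt );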
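As a sketch of what the backward passes compute, the deltas on the two chains obey mirror-image recursions. This assumes the total error is a sum of per-position errors on the outputs Yt, which the slides do not state explicitly.

```latex
\[
\delta F_t \equiv \frac{\partial E}{\partial F_t}
= \left(\frac{\partial Y_t}{\partial F_t}\right)^{\!\top} \frac{\partial E}{\partial Y_t}
+ \left(\frac{\partial F_{t+1}}{\partial F_t}\right)^{\!\top} \delta F_{t+1},
\qquad \delta F_{T+1} = 0
\]
\[
\delta B_t \equiv \frac{\partial E}{\partial B_t}
= \left(\frac{\partial Y_t}{\partial B_t}\right)^{\!\top} \frac{\partial E}{\partial Y_t}
+ \left(\frac{\partial B_{t-1}}{\partial B_t}\right)^{\!\top} \delta B_{t-1},
\qquad \delta B_0 = 0
\]
```

The forward chain is computed for t = 1..T, so its deltas are propagated for t = T..1; the backward chain is computed for t = T..1, so its deltas are propagated for t = 1..T.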
2D RNNs

Pollastri & Baldi 2002, Bioinformatics

Baldi & Pollastri 2003, JMLR