
# Introduction to Neural Networks

Presentation Transcript

### Introduction to Neural Networks

Gianluca Pollastri, Head of Lab

School of Computer Science and Informatics and Complex and Adaptive Systems Labs

University College Dublin

gianluca.pollastri@ucd.ie

### Credits

• Geoffrey Hinton, University of Toronto.

• I borrowed some of his slides for his “Neural Networks” and “Computation in Neural Networks” courses.

• Paolo Frasconi, University of Florence.

• This guy taught me Neural Networks in the first place (*and* I borrowed some of his slides too!).

### Recurrent Neural Networks

• One of the earliest versions: Jeffrey Elman, 1990, Cognitive Science.

• Problem: it isn’t easy to represent time with feedforward neural nets; time is usually represented with space (e.g. by feeding in a fixed window of past inputs).

• Attempt to design networks with memory.

• The idea is to use discrete time steps and to treat the hidden layer at time t-1 as an additional input at time t.

• This effectively removes cycles: we can model the network using an FFNN, and model memory explicitly.

[Figure: a recurrent network with input It, memory (hidden) layer Xt and output Ot; d = delay element, feeding Xt back as Xt-1.]
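To make the delay element concrete, here is a minimal NumPy sketch of one such network (an Elman-style simple recurrent network). The layer sizes, weight names and the tanh/linear choices are illustrative assumptions, not code from the slides.

```python
# One discrete time step of an Elman-style recurrent network:
# the hidden layer X_{t-1} is fed back, through the delay element d,
# as an extra input at time t.
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hid, n_out = 3, 8, 2                   # assumed sizes

W_xi = rng.normal(size=(n_hid, n_in)) * 0.1    # input  -> hidden
W_xx = rng.normal(size=(n_hid, n_hid)) * 0.1   # delayed hidden -> hidden ("context" units)
W_ox = rng.normal(size=(n_out, n_hid)) * 0.1   # hidden -> output

def step(I_t, X_prev):
    """X_t = f(X_{t-1}, I_t), O_t = g(X_t)."""
    X_t = np.tanh(W_xx @ X_prev + W_xi @ I_t)
    O_t = W_ox @ X_t
    return X_t, O_t

X = np.zeros(n_hid)                            # X_0 = 0: empty memory
for I_t in rng.normal(size=(5, n_in)):         # a 5-step input sequence
    X, O = step(I_t, X)                        # carrying X over is the delay element
```

The only difference from a plain feedforward step is that X is carried over between calls, which is exactly what the delay element provides; unrolled over the sequence, the whole computation is cycle-free.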

### Backpropagation Through Time (BPTT)

• If Ot is the output at time t, It the input at time t, and Xt the memory (hidden) state at time t, we can model the dependencies as follows:

Xt = f( Xt-1 , It )

Ot = g( Xt , It )

• We can model both f() and g() with (possibly multilayered) networks.

• We can transform the recurrent network by unrolling it in time.

• Backpropagation works on any DAG. An RNN becomes one once it’s unrolled.

[Figure: the same recurrent network (It, Xt, Ot, delay element d) unrolled in time into copies for t-2 .. t+2.]

    # I = inputs, O = outputs, T = targets
    T := size(O);                     # T is also used as the sequence length
    X0 := 0;
    # forward pass: unroll the memory
    for t := 1..T
        Xt := f( Xt-1 , It );
    # outputs, output errors and g's weight gradients
    for t := 1..T {
        Ot := g( Xt , It );
        g.gradient( Ot - Tt );
        δt := g.deltas( Ot - Tt );    # error injected into Xt by the output at t
    }
    # backpropagation through time: push each δt one step back
    for t := T..1
        δt-1 += f.deltas( δt );
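The pseudocode above can be turned directly into running code. Below is a small NumPy sketch, assuming a tanh layer for f, a linear layer for g and a squared-error loss against targets Tgt; all names, sizes and initialisations are my own illustrative choices, not the author's implementation.

```python
# Backpropagation Through Time for a simple recurrent network.
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hid, n_out, T = 3, 6, 2, 8

Wxi = rng.normal(size=(n_hid, n_in)) * 0.1    # f: input   -> memory
Wxx = rng.normal(size=(n_hid, n_hid)) * 0.1   # f: X_{t-1} -> memory
Woi = rng.normal(size=(n_out, n_in)) * 0.1    # g: input   -> output
Wox = rng.normal(size=(n_out, n_hid)) * 0.1   # g: memory  -> output

I   = rng.normal(size=(T + 1, n_in))          # inputs  I_1..I_T (index 0 unused)
Tgt = rng.normal(size=(T + 1, n_out))         # targets T_1..T_T

X = np.zeros((T + 1, n_hid))                  # X_0 = 0
O = np.zeros((T + 1, n_out))
delta = np.zeros((T + 1, n_hid))              # error arriving at each X_t

# forward pass: X_t = f(X_{t-1}, I_t), O_t = g(X_t, I_t)
for t in range(1, T + 1):
    X[t] = np.tanh(Wxx @ X[t - 1] + Wxi @ I[t])
    O[t] = Wox @ X[t] + Woi @ I[t]

gWxi, gWxx = np.zeros_like(Wxi), np.zeros_like(Wxx)
gWoi, gWox = np.zeros_like(Woi), np.zeros_like(Wox)

# output errors, g's weight gradients, and the deltas pushed back into X_t
for t in range(1, T + 1):
    e = O[t] - Tgt[t]                         # dLoss/dO_t for squared error
    gWox += np.outer(e, X[t]); gWoi += np.outer(e, I[t])
    delta[t] += Wox.T @ e                     # δ_t = g.deltas(O_t - T_t)

# backpropagation through time: δ_{t-1} += f.deltas(δ_t)
for t in range(T, 0, -1):
    da = delta[t] * (1.0 - X[t] ** 2)         # back through the tanh
    gWxx += np.outer(da, X[t - 1]); gWxi += np.outer(da, I[t])
    delta[t - 1] += Wxx.T @ da                # same shared weights at every step

# a gradient step would then be, e.g., Wxx -= lr * gWxx, and so on.
```

The last loop is the "through time" part: because the unrolled network is a DAG that reuses the same weights at every step, the gradients for f simply accumulate over all time steps.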

[Figures: repeated animation frames of the unrolled network (inputs It-2 .. It+2, memories Xt-2 .. Xt+2, outputs Ot-2 .. Ot+2).]

• Neurons

• Multi-Layered Neural Networks:

  • Basic learning algorithm

  • Expressive power

  • Classification

• How can we *actually* train Neural Networks:

  • Speeding up training

  • Learning just right (not too little, not too much)

  • Figuring out you got it right

• Feed-back networks?

• Anecdotes on real feed-back networks (Hopfield Nets, Boltzmann Machines)

• Recurrent Neural Networks

• Bidirectional RNN

• 2D-RNN

• Concluding remarks

Ft = ( Ft-1 , Ut )

Bt = ( Bt+1 , Ut )

Yt = ( Ft , Bt , Ut )

• () () ed () are realised with NN

• (), () and () are independent from t: stationary

Ft = ( Ft-1 , Ut )

Bt = ( Bt+1 , Ut )

Yt = ( Ft , Bt , Ut )

• () () ed () are realised with NN

• (), () and () are independent from t: stationary

Ft = ( Ft-1 , Ut )

Bt = ( Bt+1 , Ut )

Yt = ( Ft , Bt , Ut )

• () () ed () are realised with NN

• (), () and () are independent from t: stationary

Ft = ( Ft-1 , Ut )

Bt = ( Bt+1 , Ut )

Yt = ( Ft , Bt , Ut )

• () () ed () are realised with NN

• (), () and () are independent from t: stationary

    FORWARD(U) {
        T ← size(U);
        F0 ← BT+1 ← 0;
        for t ← 1..T
            Ft = φ( Ft-1 , Ut );      # forward (left-to-right) chain
        for t ← T..1
            Bt = β( Bt+1 , Ut );      # backward (right-to-left) chain
        for t ← 1..T
            Yt = η( Ft , Bt , Ut );   # outputs
        return Y;
    }
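As a concrete (hypothetical) reading of FORWARD(U), here is a NumPy sketch with single-layer φ, β and η; the sizes, weight names and the tanh nonlinearity are assumptions for illustration only.

```python
# BRNN forward pass: F_t = phi(F_{t-1}, U_t), B_t = beta(B_{t+1}, U_t),
# Y_t = eta(F_t, B_t, U_t), with F_0 = B_{T+1} = 0.
import numpy as np

rng = np.random.default_rng(0)
n_in, n_f, n_b, n_out, T = 4, 5, 5, 3, 10      # assumed sizes
U = rng.normal(size=(T, n_in))                 # input sequence U_1..U_T

# Stationary weights: the same phi, beta, eta are applied at every position t.
Wf_f, Wf_u = rng.normal(size=(n_f, n_f)), rng.normal(size=(n_f, n_in))
Wb_b, Wb_u = rng.normal(size=(n_b, n_b)), rng.normal(size=(n_b, n_in))
Wy_f, Wy_b, Wy_u = (rng.normal(size=(n_out, n_f)),
                    rng.normal(size=(n_out, n_b)),
                    rng.normal(size=(n_out, n_in)))

def forward(U):
    T = len(U)
    F = np.zeros((T + 1, n_f))                 # F[0] = 0
    B = np.zeros((T + 2, n_b))                 # B[T+1] = 0
    Y = np.zeros((T + 1, n_out))
    for t in range(1, T + 1):                  # forward chain
        F[t] = np.tanh(Wf_f @ F[t - 1] + Wf_u @ U[t - 1])
    for t in range(T, 0, -1):                  # backward chain
        B[t] = np.tanh(Wb_b @ B[t + 1] + Wb_u @ U[t - 1])
    for t in range(1, T + 1):                  # outputs
        Y[t] = Wy_f @ F[t] + Wy_b @ B[t] + Wy_u @ U[t - 1]
    return F, B, Y[1:]

F, B, Y = forward(U)
print(Y.shape)                                 # (10, 3): one output per position
```

Each Yt therefore depends on the whole input sequence: on U1..Ut through Ft and on Ut..UT through Bt.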

### Learning in BRNNs

    T ← size(U);
    F0 ← BT+1 ← 0;
    for t ← 1..T
        Ft = φ( Ft-1 , Ut );
    for t ← T..1
        Bt = β( Bt+1 , Ut );
    for t ← 1..T {
        Yt = η( Ft , Bt , Ut );
        [δFt, δBt] = η.backprop&gradient( Yt - Tt );   # Tt = target at position t
    }
    for t ← T..1
        δFt-1 += φ.backprop&gradient( δFt );           # errors flow backwards along the forward chain
    for t ← 1..T
        δBt+1 += β.backprop&gradient( δBt );           # errors flow forwards along the backward chain
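To keep the gradient code short, the sketch below assumes linear φ, β and η and a squared-error loss against targets D (all names and sizes are illustrative, not the author's code). With those assumptions the delta propagation mirrors the pseudocode exactly: the output error at each position is split into δF and δB, δF flows backwards in t along the forward chain, and δB flows forwards in t along the backward chain.

```python
# Gradients in a (linear) BRNN, following the pseudocode above.
import numpy as np

rng = np.random.default_rng(0)
n_in, n_f, n_b, n_out, T = 4, 5, 5, 3, 10

U = rng.normal(size=(T + 2, n_in))             # U_1..U_T (indices 0 and T+1 unused,
D = rng.normal(size=(T + 2, n_out))            #  kept to match the 1-based slides)

Wf_f, Wf_u = rng.normal(size=(n_f, n_f)) * .1, rng.normal(size=(n_f, n_in)) * .1
Wb_b, Wb_u = rng.normal(size=(n_b, n_b)) * .1, rng.normal(size=(n_b, n_in)) * .1
Wy_f, Wy_b, Wy_u = (rng.normal(size=(n_out, n_f)) * .1,
                    rng.normal(size=(n_out, n_b)) * .1,
                    rng.normal(size=(n_out, n_in)) * .1)

F, B, Y = np.zeros((T + 2, n_f)), np.zeros((T + 2, n_b)), np.zeros((T + 2, n_out))
dF, dB = np.zeros_like(F), np.zeros_like(B)
gWf_f, gWf_u = np.zeros_like(Wf_f), np.zeros_like(Wf_u)
gWb_b, gWb_u = np.zeros_like(Wb_b), np.zeros_like(Wb_u)
gWy_f, gWy_b, gWy_u = np.zeros_like(Wy_f), np.zeros_like(Wy_b), np.zeros_like(Wy_u)

# forward, as in FORWARD(U); F[0] and B[T+1] stay zero
for t in range(1, T + 1):
    F[t] = Wf_f @ F[t - 1] + Wf_u @ U[t]
for t in range(T, 0, -1):
    B[t] = Wb_b @ B[t + 1] + Wb_u @ U[t]
for t in range(1, T + 1):
    Y[t] = Wy_f @ F[t] + Wy_b @ B[t] + Wy_u @ U[t]

# eta.backprop&gradient: output errors, eta's gradients, deltas for F and B
for t in range(1, T + 1):
    e = Y[t] - D[t]
    gWy_f += np.outer(e, F[t]); gWy_b += np.outer(e, B[t]); gWy_u += np.outer(e, U[t])
    dF[t] += Wy_f.T @ e
    dB[t] += Wy_b.T @ e

# phi.backprop&gradient: deltas flow backwards along the forward chain
for t in range(T, 0, -1):
    gWf_f += np.outer(dF[t], F[t - 1]); gWf_u += np.outer(dF[t], U[t])
    dF[t - 1] += Wf_f.T @ dF[t]

# beta.backprop&gradient: deltas flow forwards along the backward chain
for t in range(1, T + 1):
    gWb_b += np.outer(dB[t], B[t + 1]); gWb_u += np.outer(dB[t], U[t])
    dB[t + 1] += Wb_b.T @ dB[t]
```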

• Neurons

• Multi-Layered Neural Networks:

  • Basic learning algorithm

  • Expressive power

  • Classification

• How can we *actually* train Neural Networks:

  • Speeding up training

  • Learning just right (not too little, not too much)

  • Figuring out you got it right

• Feed-back networks?

• Anecdotes on real feed-back networks (Hopfield Nets, Boltzmann Machines)

• Recurrent Neural Networks

• Bidirectional RNN

• 2D-RNN

• Concluding remarks

Pollastri & Baldi 2002, Bioinformatics

Baldi & Pollastri 2003, JMLR