
Chapter 12


Presentation Transcript


  1. Chapter 12 (Page 554 – 564) Ping Perez CS 147 Summer 2001

  2. Topics • Alternative Parallel Architectures • Dataflow • Systolic arrays • Neural networks

  3. What is dataflow computing? To understand how dataflow computers work, it is first necessary to understand dataflow graphs. As a computer program is compiled, it is converted into its equivalent dataflow graph, which shows the data dependencies between statements and is used by the dataflow computer to generate the structures it needs to execute the program.

  4. A code segment and its dataflow graph:
     A ← B + C
     D ← E + F
     G ← A + H
     I ← D + G
     J ← I + K
     [Figure: the corresponding dataflow graph, with external inputs B, C, E, F, H, and K feeding five + vertices]

  5. Dataflow Graph. As shown in the figure, each vertex of the graph corresponds to the operation performed by one of the instructions. The directed edges going into a vertex correspond to the operands of the function performed by that vertex, and the directed edge leaving the vertex represents the result generated by the function. [Figure: the code segment (A ← B + C, D ← E + F, G ← A + H, I ← D + G, J ← I + K) alongside its dataflow graph]
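As a rough illustration of how such a graph can be derived (a minimal Python sketch of my own, not part of the original slides), each statement becomes a vertex and an edge is added from whichever statement, or external input, produces each operand:

    # Build a simple dataflow graph from statements of the form dest <- src1 + src2.
    # Each statement becomes a "+" vertex; an edge runs from the producer of each
    # operand (an earlier statement or an external input) to that vertex.
    statements = [
        ("A", "B", "C"),   # A <- B + C
        ("D", "E", "F"),   # D <- E + F
        ("G", "A", "H"),   # G <- A + H
        ("I", "D", "G"),   # I <- D + G
        ("J", "I", "K"),   # J <- I + K
    ]

    producer = {}          # value name -> label of the statement that produces it
    edges = []             # (source, destination) pairs

    for n, (dest, op1, op2) in enumerate(statements, start=1):
        for op in (op1, op2):
            edges.append((producer.get(op, f"input {op}"), f"stmt {n}"))
        producer[dest] = f"stmt {n}"

    for src, dst in edges:
        print(f"{src} -> {dst}")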

  6. Single Assignment Rule. This code segment has four violations of the single assignment rule, starting with statement 2. The value stored by this statement, B, was used as an operand in statement 1, so it must be renamed. We can rename it B1, and change all references to it later in this code. Similarly, values C and D, set by statements 3 and 4, are also used as operands in prior statements and must be renamed.
     1. A ← B + C
     2. B ← A + D
     3. C ← A + B
     4. D ← C + B
     5. A ← A + C

  7. Single Assignment Rule (cont'd). Finally, statement 5 stores its result in A, the same variable used to store the result of statement 1, so we must also change this variable's name. Note that statements 2, 3, and 5 all use A as an operand: this is not a violation of the single assignment rule, since an operand can be used many times.
     1. A ← B + C
     2. B ← A + D
     3. C ← A + B
     4. D ← C + B
     5. A ← A + C

  8. The revised code segment and its dataflow graph:
     A ← B + C
     B1 ← A + D
     C1 ← A + B1
     D1 ← C1 + B1
     A1 ← A + C1
     [Figure: the corresponding dataflow graph, with external inputs B, C, and D feeding five + vertices]
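Renaming can be mechanized. The sketch below is my own illustration (not from the text): it walks the original statements, tracks every name already in use, and renames a destination whenever that name has already appeared as an input or an earlier result.

    # Rewrite the statements so that each destination variable is assigned only once.
    # A destination whose name has already appeared gets a numeric suffix (B -> B1),
    # and later uses of that variable refer to its newest name.
    statements = [
        ("A", "B", "C"),   # 1. A <- B + C
        ("B", "A", "D"),   # 2. B <- A + D
        ("C", "A", "B"),   # 3. C <- A + B
        ("D", "C", "B"),   # 4. D <- C + B
        ("A", "A", "C"),   # 5. A <- A + C
    ]

    seen = set()           # every variable name that has already appeared
    current = {}           # base name -> current (possibly renamed) name
    counter = {}           # base name -> rename suffix counter

    for dest, op1, op2 in statements:
        # Operands always refer to the most recent definition of the variable.
        src1 = current.get(op1, op1)
        src2 = current.get(op2, op2)
        seen.update([op1, op2])
        # If dest already appeared (as an input or an earlier result), rename it.
        if dest in seen:
            counter[dest] = counter.get(dest, 0) + 1
            new_name = f"{dest}{counter[dest]}"
        else:
            new_name = dest
        seen.add(dest)
        current[dest] = new_name
        print(f"{new_name} <- {src1} + {src2}")

Running this reproduces the revised code segment of the slide (A, B1, C1, D1, A1).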

  9. Single Assignment Rule. The dataflow graph describes the dependencies between statements and how data will flow between statements. An edge, however, does not show when data flows from one statement to another. The data that traverses an edge is called a token. When a token is available, it is represented as a dot on the edge.

  10. Dataflow Graph. A vertex is ready to fire, or execute its instruction, when all of its input edges have tokens, that is, when all of the instruction's operands are available. [Figure: the dataflow graph for the revised code segment, with inputs B, C, and D feeding five + vertices]
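A compact way to see the firing rule in action is a scan-and-fire loop. The sketch below is my own illustration, using the vertices of the revised code segment and the arbitrary sample values B = 2, C = 3, D = 4:

    # The firing rule: a vertex may execute once tokens are present on all of its
    # input edges.  Here each "+" vertex waits for its two named operands.
    vertices = {
        "A":  ("B", "C"),
        "B1": ("A", "D"),
        "C1": ("A", "B1"),
        "D1": ("C1", "B1"),
        "A1": ("A", "C1"),
    }

    tokens = {"B": 2, "C": 3, "D": 4}   # initial tokens: the external inputs
    pending = dict(vertices)

    while pending:
        # Scan for vertices whose operand tokens have all arrived.
        ready = [v for v, (x, y) in pending.items() if x in tokens and y in tokens]
        if not ready:
            break                        # nothing can fire; the graph is stalled
        for v in ready:
            x, y = pending.pop(v)
            tokens[v] = tokens[x] + tokens[y]      # every vertex here is "+"
            print(f"fired {v} = {x} + {y} -> {tokens[v]}")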

  11. I-Structures. Within the computer system, dataflow vertices are usually stored as I-structures. Each I-structure includes the operation to be performed, its operands, and a list of destinations for its result.

  12. An I-structure and the dataflow graph with I-structures. Each I-structure holds its operation, two operand slots (empty parentheses mark an operand that has not yet arrived), and the destinations of its result, written as instruction/operand-position pairs. A sample I-structure: + 2 ( ) {2/1}. For the revised code segment, with B = 2, C = 3, and D = 4, the I-structures are:
     1. + 2 3 {2/1, 3/1, 5/1}
     2. + ( ) 4 {3/2, 4/2}
     3. + ( ) ( ) {4/1, 5/2}
     4. + ( ) ( ) -
     5. + ( ) ( ) -

  13. The architectures of dataflow systems: 1. Static architectures 2. Dynamic architectures

  14. Static dataflow computer organization. This figure shows the organization of the static dataflow computer. The I-store unit has two sections: the memory section, which stores the I-structures of the dataflow program, and the update/ready section. [Figure: I-store unit (memory section and update/ready section), firing queue, and processors]
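A minimal sketch of how these pieces work together follows. It is my own illustration, not code from the text: it reuses the I-structures above (B = 2, C = 3, D = 4) with the instruction/operand-position destination notation, models the memory section as a table of I-structures, and fires whatever sits in the firing queue, forwarding each result to its destinations.

    from dataclasses import dataclass, field

    # An I-structure: the operation, two operand slots (None = not yet arrived),
    # and the destinations of the result as (instruction, operand position) pairs.
    @dataclass
    class IStructure:
        op: str
        operands: list
        dests: list = field(default_factory=list)

        def ready(self):
            return all(v is not None for v in self.operands)

    # Memory section: I-structures for the revised code segment (B=2, C=3, D=4).
    memory = {
        1: IStructure("+", [2, 3],       [(2, 1), (3, 1), (5, 1)]),
        2: IStructure("+", [None, 4],    [(3, 2), (4, 2)]),
        3: IStructure("+", [None, None], [(4, 1), (5, 2)]),
        4: IStructure("+", [None, None], []),
        5: IStructure("+", [None, None], []),
    }

    # Firing queue: instructions whose operands are all present.
    firing_queue = [i for i, s in memory.items() if s.ready()]

    while firing_queue:
        n = firing_queue.pop(0)
        s = memory[n]
        result = s.operands[0] + s.operands[1]      # every vertex here is "+"
        print(f"instruction {n} fired, result = {result}")
        # Update/ready logic: deliver the result and queue newly ready instructions.
        for dest, pos in s.dests:
            memory[dest].operands[pos - 1] = result
            if memory[dest].ready() and dest not in firing_queue:
                firing_queue.append(dest)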

  15. What is a systolic array? A systolic array incorporates several processing elements into a regular structure, such as a linear array or mesh. Each processing element performs a single, fixed function and communicates only with its neighboring processing elements.

  16. A 2 × 2 systolic array to multiply two matrices. [Figure: four processing elements, labeled 1,1; 1,2; 2,1; and 2,2, each with inputs U (up) and L (left) and outputs R (right) and D (down)]

  17. The first clock cycle. During the first clock cycle we input A1,1 to input L and B1,1 to input U of processing element 1,1. This processing element calculates A1,1B1,1 and adds it to its running total; the running totals of the other processing elements remain 0. [Figure: PE 1,1 total = A1,1B1,1; all other totals = 0]

  18. Clock Cycle 2. During the second clock cycle, we input A1,2 to L and B2,1 to U of processing element 1,1. This processing element multiplies them and adds the product to its running total, which becomes A1,1B1,1 + A1,2B2,1, the final value of C1,1. [Figure: PE 1,1 total = A1,1B1,1 + A1,2B2,1; PE 1,2 total = A1,1B1,2; PE 2,1 total = A2,1B1,1; PE 2,2 total = 0]

  19. Clock Cycle 3. Clock cycle 3 continues the matrix multiplication. Since C1,1 has already been calculated, we input 0 to the inputs of processing element 1,1 so its running total is not changed. The final values of C1,2 and C2,1 are calculated during this clock cycle, and the first part of C2,2 is generated. [Figure: PE 1,1 total = A1,1B1,1 + A1,2B2,1; PE 1,2 total = A1,1B1,2 + A1,2B2,2; PE 2,1 total = A2,1B1,1 + A2,2B2,1; PE 2,2 total = A2,1B1,2]

  20. Clock Cycle 4. The final value of C2,2 is calculated during clock cycle 4, as shown in the figure. At this point, the multiplication of the two matrices is complete. [Figure: PE 1,1 total = A1,1B1,1 + A1,2B2,1; PE 1,2 total = A1,1B1,2 + A1,2B2,2; PE 2,1 total = A2,1B1,1 + A2,2B2,1; PE 2,2 total = A2,1B1,2 + A2,2B2,2]
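The four-cycle walkthrough above can be reproduced with a short simulation. This sketch is my own; the skewed input schedule and the processing-element behavior (multiply L by U, accumulate, pass L right and U down on the next cycle) are assumptions consistent with the figures, not code from the text.

    # Simulate a 2 x 2 systolic array computing C = A * B over four clock cycles.
    A = [[1, 2], [3, 4]]
    B = [[5, 6], [7, 8]]
    N = 2

    totals = [[0] * N for _ in range(N)]  # running total of each processing element
    right = [[0] * N for _ in range(N)]   # value each PE forwards rightward next cycle
    down = [[0] * N for _ in range(N)]    # value each PE forwards downward next cycle

    for t in range(3 * N - 2):            # 4 cycles for a 2 x 2 array
        # Determine this cycle's L and U inputs for every PE.
        l_in = [[0] * N for _ in range(N)]
        u_in = [[0] * N for _ in range(N)]
        for i in range(N):
            for j in range(N):
                if j == 0:
                    k = t - i                  # row i of A enters, skewed by i cycles
                    l_in[i][j] = A[i][k] if 0 <= k < N else 0
                else:
                    l_in[i][j] = right[i][j - 1]
                if i == 0:
                    k = t - j                  # column j of B enters, skewed by j cycles
                    u_in[i][j] = B[k][j] if 0 <= k < N else 0
                else:
                    u_in[i][j] = down[i - 1][j]

        # Every PE multiplies its inputs, accumulates, and latches values to forward.
        for i in range(N):
            for j in range(N):
                totals[i][j] += l_in[i][j] * u_in[i][j]
        right = l_in
        down = u_in

    print(totals)   # [[19, 22], [43, 50]] for these inputs, which equals A * B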

  21. Neural Networks • Neural networks are different from any other computing structure. • They incorporate thousands or millions of simple processing elements called neurons. • These neurons have far less processing power than a CPU.

  22. What is the difference between a computer and a neural network? Unlike traditional computers, which are programmed, neural networks are trained. Training consists of defining system input data and the desired system outputs for that input data.

  23. How to generate the output? (1) System outputs are generated as a function of the outputs of individual neurons. Each neuron's output, in turn, is a function of the outputs of the neurons to which it is connected. The output of each of these neurons is multiplied by its weighting factor, and all of the weighted values are added together.

  24. How to generate the output? (2) This sum is compared to the threshold value for that neuron. If the weighted sum is greater than or equal to the threshold value, the neuron's output is 1; otherwise its output is 0.

  25. How to generate the output? Neuron N has four inputs: input 1 (weight 0.1, value 1), input 2 (weight 0.2, value 1), input 3 (weight 0.3, value 0), and input 4 (weight 0.4, value 1). The weighted sum is 1*0.1 + 1*0.2 + 0*0.3 + 1*0.4 = 0.7. Since this weighted value, 0.7, is greater than N's threshold value of 0.65, neuron N outputs a logical value of 1.
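The threshold rule from slides 23 and 24, applied to these numbers, fits in a few lines of Python (my own sketch, reproducing the slide's values):

    # Threshold neuron: output 1 if the weighted sum of the inputs reaches the
    # threshold, 0 otherwise.
    def neuron_output(values, weights, threshold):
        weighted_sum = sum(v * w for v, w in zip(values, weights))
        return 1 if weighted_sum >= threshold else 0

    # The example from the slide: inputs 1, 1, 0, 1 with weights 0.1 .. 0.4 and
    # threshold 0.65 give a weighted sum of 0.7, so the neuron outputs 1.
    print(neuron_output([1, 1, 0, 1], [0.1, 0.2, 0.3, 0.4], 0.65))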

  26. Where are neural networks used? A neural network is not appropriate for general-purpose computing; you won't find a neural network running Windows on a personal computer. Instead, neural networks have found applications in tasks that do not run well on conventional architectures. They are also being used in control systems and artificial intelligence applications.

  27. The End
