
Introduction to Parallel Processing with Multi-core Part I



Presentation Transcript


  1. Introduction to Parallel Processing with Multi-core Part I Jie Liu, Ph.D. Professor Department of Computer Science Western Oregon University USA liuj@wou.edu

  2. Now the question – Why parallel? • Three things are for sure: taxes, death, and parallelism • How long would it take a single person to build I-5? • Answer  • What we want is to solve very computationally intensive problems, such as modeling a protein interacting with the water surrounding it. Such a problem can take a very, very long time. • The protein simulation problem would take a Cray X/MP 31,688 years to simulate 1 second of interaction (in 1990). Even if today's supercomputers are 100 times faster than the Cray X/MP, we would still need more than 300 years! • The only solution  parallel processing

  3. Why parallel (2) • Moore's Law • The logic density of silicon-based ICs (Integrated Circuits) closely followed a curve that doubles every year (until 1970, then every 18 months) • Why is density related to a processor's speed? Because, during the process of "computing," electrons need to carry signals from one end of a circuit to the other. • For a 2GHz computer, a signal can travel at most about 15 cm per clock cycle (.5 nanosecond), even at the speed of light • That is, the speed of light places a physical limit on how fast a single-processor computer can run
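The distance bound above is easy to check: a signal can cover at most c × (cycle time) per clock cycle. A quick sanity-check sketch in Python:

```python
# Upper bound on how far a signal can travel in one clock cycle,
# using the speed of light as the limit.
SPEED_OF_LIGHT = 3.0e8  # meters per second (approximate)

def max_signal_distance(clock_hz):
    """Maximum signal travel distance per clock cycle, in meters."""
    cycle_seconds = 1.0 / clock_hz
    return SPEED_OF_LIGHT * cycle_seconds

# A 2 GHz clock has a 0.5 ns cycle, so a signal covers at most ~0.15 m.
print(max_signal_distance(2.0e9))  # → 0.15
```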

  4. Why parallel (3) • There are problems that require much more computational power than today's fastest single-CPU computers can provide. • The speed of light limits how fast a single-CPU computer can run • If we want to solve computationally intensive problems in a reasonable amount of time, we have to resort to parallel computers!

  5. Some Definitions • Parallel processing • Information processing that emphasizes the concurrent manipulation of data belonging to many processes solving a single problem • Example: having 100 processors sort an array of 1,400,000,000 elements – is parallel processing • Example: printing homework while reading email – is not parallel processing because the processes are not solving a single problem. • A parallel computer is a multi-processor computer capable of parallel processing • Computers with just co-processors for math and image processing are not considered parallel computers (some people disagree with this notion)

  6. Two forms of parallelism • Control Parallelism • Concurrency is achieved by applying different operations to different data elements of a single problem • Pipelining is a special form of control parallelism • An assembly line is an example of a pipeline • Data Parallelism • Concurrency is achieved by applying the same operation to different data elements of a single problem • Taking a class is an example of data parallelism (assuming you are all learning at the same speed) • An army brigade marching in step can be considered data parallelism

  7. Control VS. Data Parallelism • Consider the following statement • if a[i] > b[i] • a[i] = a[i]*b[i] • else • b[i] = a[i]-b[i] • In a control-parallel fashion, some processors execute the statement a[i] = a[i]*b[i] while others execute b[i] = a[i]-b[i] during the same clock cycle • In a data-parallel fashion, especially on a SIMD machine, this if statement is executed in two clock cycles: • During the first clock cycle, all the processors that satisfy the condition a[i] > b[i] execute the statement a[i] = a[i]*b[i]. • During the second clock cycle, the processors that do not satisfy the condition a[i] > b[i] execute the statement b[i] = a[i]-b[i]
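The two-cycle SIMD execution can be simulated sequentially: compute the condition mask everywhere first, then run each branch with the non-matching lanes masked off. A minimal sketch with made-up sample data:

```python
# Simulating how a SIMD machine serializes the two branches of an
# if-statement: every lane evaluates the condition, then each branch
# runs in its own cycle with non-matching lanes masked off.
a = [5, 2, 9, 1]
b = [3, 4, 6, 8]

# All lanes evaluate the condition a[i] > b[i] simultaneously.
mask = [x > y for x, y in zip(a, b)]

# Cycle 1: lanes where the condition holds execute a[i] = a[i]*b[i].
for i in range(len(a)):
    if mask[i]:
        a[i] = a[i] * b[i]

# Cycle 2: the remaining lanes execute b[i] = a[i]-b[i].
for i in range(len(a)):
    if not mask[i]:
        b[i] = a[i] - b[i]

print(a)  # → [15, 2, 54, 1]
print(b)  # → [3, -2, 6, -7]
```

Note that the mask must be saved before Cycle 1 runs: a lane's branch is decided by the *original* values of a[i] and b[i], not the updated ones.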

  8. Speedup – Take I • Speedup is a measurement of how effective a parallel algorithm is • It is defined as the ratio between the time needed for the most efficient sequential algorithm to perform a computation and the time needed to perform the same computation on a parallel computer with a parallel algorithm. That is, Speedup = T(sequential) / T(parallel) • Example: we developed a parallel bubble sort that sorts n elements in O(log n) time using n processors. The speedup is O(n log n) / O(log n) = O(n), because the most efficient sequential sorting algorithms have a complexity of O(n log n)

  9. Brain Exercise • Six equally skilled students need to make 210 special cookies; each cookie requires the following tasks (time units in parentheses) • Break the dough into small pieces of equal size (1) • Hand-roll the small dough pieces into balls (1) • Press the balls into flat "cookies" (1) • Roll the "cookies" into wrappers (1) • Place a suitable amount of filling onto the wrappers (1) • Fold the wrappers to enclose the filling completely (1) • How would you do this in a pipeline fashion? • How would you do this in a control-parallel fashion, other than a pipeline? • How would you do this in a data-parallel fashion?

  10. Approach #1 [diagram not captured in transcript; labels show dough pieces D1 ~ D6 and D7 ~ D12]

  11. Approach #2 [diagram not captured in transcript; labels show dough pieces D1 through D7]

  12. Analysis • Sequential cost: (1+1+1+1+1+1)*210 = 1260 time units • Maximum speedup for Approach #1 • Maximum speedup for Approach #2 • Other questions to consider • If I have 1260 students, can I get the task done in 1 time unit? • What if step 3 takes 3 time units and step 6 takes 2 time units? • What is the effect of adding more "skilled" students to each approach?
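For the pipeline case, the numbers can be checked with a short calculation (a sketch, assuming Approach #1 is a 6-stage pipeline with one student per stage and every step taking one time unit):

```python
# Pipeline timing for 6 one-unit stages and 210 cookies:
# the first cookie finishes after 6 time units, and one more
# cookie finishes every unit after that.
stages = 6
cookies = 210

t_sequential = stages * cookies        # 1260 time units, one worker
t_pipeline = stages + (cookies - 1)    # 215 time units, 6 workers
max_speedup = t_sequential / t_pipeline

print(t_sequential, t_pipeline, round(max_speedup, 2))  # → 1260 215 5.86
```

By contrast, a pure data-parallel split (each of the 6 students makes 35 whole cookies) takes 35 × 6 = 210 time units, for a speedup of exactly 6.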

  13. Grand challenges • A list of problems that are very computationally intensive but can benefit humanity greatly; heavily funded by the US government • The following lists just the categories of such problems

  14. Parallel Computers & Companies

  15. One of the Fastest Computers • Per http://abcnews.go.com/Technology/WireStory?id=5028546&page=2 • By: IBM and Los Alamos National Laboratory • Name: Roadrunner (named after New Mexico's state bird) • Twice as fast as IBM's Blue Gene, which is three times faster than the next fastest computer in the world • Cost: $100,000,000 – very cheap • Speed: 1,000,000,000,000,000 floating-point operations per second (one petaflop) • Usage: primarily nuclear weapons work, including simulating nuclear explosions • Related to gaming: in some ways, it's "a very souped-up Sony PlayStation 3." • Some facts: • The interconnecting system occupies 6,000 square feet with 57 miles of fiber optics and weighs 500,000 pounds. Although made from commercial parts, the computer consists of 6,948 dual-core chips and 12,960 Cell engines, and it has 80 terabytes of memory housed in 288 connected refrigerator-sized racks. • Two years ago, the fastest computer in the world could perform 100,000,000,000,000 floating-point operations per second – 100 teraflops

  16. Parallel Computers and Programming – the trend • Hardware • Supercomputers – multiprocessors/multicomputers – the fastest computers at the time • Beowulf – a cluster of off-the-shelf computers linked by a switch • Other distributed systems such as NOW (Network of Workstations) • Multi-core – many cores (each a CPU itself) within a CPU; soon will go over 60+ cores per CPU • Programming • MPI for message-passing architectures • Vendor-specific add-ons to well-known programming languages • New languages such as Microsoft's F# • Multi-core programming (add-ons to well-known programming languages) • Intel's Threading Building Blocks (TBB) • Microsoft's Task Parallel Library – supports Parallel.For, PLINQ, etc.; need to keep an eye on this one • Third parties such as Jibu – may merge with MS

  17. Multi-Core Programming • Sequential  • Parallel 
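The slide's own code examples are not reproduced in the transcript. As a minimal stand-in sketch of the sequential-versus-parallel contrast, here is the same loop written both ways using Python's standard library (the lecture's tools would instead be TBB's parallel_for or .NET's Parallel.For):

```python
from concurrent.futures import ThreadPoolExecutor

def square(x):
    return x * x

data = list(range(8))

# Sequential version: one core processes one element at a time.
sequential = [square(x) for x in data]

# Parallel version: the runtime distributes loop iterations across
# worker threads, analogous to a parallel-for construct.
with ThreadPoolExecutor() as pool:
    parallel = list(pool.map(square, data))

print(sequential == parallel)  # → True
```

The results are identical because the iterations are independent; that independence is exactly what makes a loop safe to parallelize.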

  18. Why Study Parallel Processing/Programming • To make your code run more efficiently • To utilize existing resources (the other cores) • … … • A good coding class for CS students • To learn something new • To improve your skill set • To improve your problem-solving skills • To exercise your brain • To review many Computer Science subject areas • To relax a constraint our professors embedded in our thinking process in our early years of study (what is the PC in a CPU?)

  19. PRAM (Parallel Random Access Machine) • A theoretical parallel computer • Consists of a control unit, global memory, and an unbounded set of processors, each with its own memory. • In addition, • Each processor has a unique id • At each step, an active processor can read/write memory (global or private), perform the same instruction as all other active processors, stay idle, or activate another processor • How many steps does it take to activate n processors?
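Since each active processor may activate one more processor per step, the number of active processors can double every step, so activating n processors takes ⌈log₂ n⌉ steps. A sketch of the counting argument (illustrative, not PRAM syntax):

```python
# Count the steps needed to go from 1 active processor to at least n,
# when every active processor activates one more processor per step
# (so the active count doubles each step).
def steps_to_activate(n):
    active, steps = 1, 0
    while active < n:
        active *= 2   # each active processor wakes one more
        steps += 1
    return steps      # equals ceil(log2(n)) for n >= 1

print(steps_to_activate(1024))  # → 10
print(steps_to_activate(1000))  # → 10
```

So even on an idealized PRAM, "turning on" the machine is a logarithmic-time operation, not a free one.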

  20. PRAM

  21. Important Terms • Massively Parallel Computer • Roadrunner • petaflop • Supercomputers • Beowulf • NOW • MPI • Multi-core • PRAM • computationally intensive problem • Moore's Law • Parallel processing • parallel computer • Control Parallelism • Data Parallelism • Speedup • Grand challenges
