
CSCI 4125 Programming for Performance

CSCI 4125 Programming for Performance. Andrew Rau-Chaplin arc@cs.dal.ca www.cs.dal.ca/~arc. Course Objectives. Explore techniques for designing, implementing and evaluating efficient programs for Sequential computers, Shared-Memory Multiprocessors, and Distributed Memory Multicomputers



Presentation Transcript


  1. CSCI 4125 Programming for Performance Andrew Rau-Chaplin arc@cs.dal.ca www.cs.dal.ca/~arc

  2. Course Objectives • Explore techniques for designing, implementing and evaluating efficient programs for • Sequential computers, • Shared-Memory Multiprocessors, and • Distributed Memory Multicomputers • Make it go fast!

  3. Performance-oriented development cycle • Techniques and tools for a performance-oriented development cycle • Algorithm design • Implementation • Benchmarking/evaluation • Performance tuning

  4. Quantifying performance • Themes include: • evaluation of performance • design of test data sets • issues of stability/reliability • scalability • common performance enhancing techniques • parallel algorithm design techniques • identification and elimination of dependencies

  5. Skills Development • how to design experiments/benchmarks • how to use statistics in performance evaluation • how to instrument code to obtain reliable timings • how to use compiler switches • how to use a profiler and performance tuning tools • how to use a debugger/tracing tools • how to plot performance results
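As an illustration of the "instrument code to obtain reliable timings" and "use statistics" skills listed above, here is a minimal sketch in C. It is not taken from the course materials. It assumes a POSIX system (for `clock_gettime` with `CLOCK_MONOTONIC`), and the names `now_seconds`, `median_time`, and `sum_workload` are hypothetical helpers invented for this example. The idea is to discard one warm-up run, time several repetitions, and report the median, which is less sensitive to outliers than a single measurement.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Monotonic wall-clock time in seconds (POSIX clock_gettime). */
static double now_seconds(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* Comparison callback for qsort: ascending order of doubles. */
static int cmp_double(const void *a, const void *b) {
    double x = *(const double *)a;
    double y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Hypothetical helper: run work() once untimed as a warm-up, then
 * `runs` timed repetitions, and return the median elapsed time in
 * seconds. Returns -1.0 if memory allocation fails. */
static double median_time(void (*work)(void), int runs) {
    assert(work != NULL && runs > 0);
    double *t = malloc((size_t)runs * sizeof *t);
    if (t == NULL) {
        return -1.0;
    }
    work(); /* warm-up: populate caches, not included in the timings */
    for (int i = 0; i < runs; i++) {
        double start = now_seconds();
        work();
        t[i] = now_seconds() - start;
    }
    qsort(t, (size_t)runs, sizeof *t, cmp_double);
    double median = (runs % 2 == 1)
        ? t[runs / 2]
        : 0.5 * (t[runs / 2 - 1] + t[runs / 2]);
    free(t);
    return median;
}

/* Example workload (an assumption for this sketch). The result is
 * stored in a volatile variable so the compiler cannot discard the
 * loop as dead code. */
static volatile double sink;

static void sum_workload(void) {
    double s = 0.0;
    for (int i = 0; i < 1000000; i++) {
        s += (double)i * 0.5;
    }
    sink = s;
}
```

A caller would write something like `double t = median_time(sum_workload, 11);` and print `t`. Whether the median, the minimum, or a mean with a confidence interval is the right summary depends on the experiment, so treat this as one reasonable starting point rather than a prescribed method.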

  6. Topics • Introduction to Parallelism • Parallel Programming • Parallel Architectures • Parallel Algorithms • Parallel Applications • Other Parallel Architectures & Algorithms

  7. Official Outline • This course explores the design, implementation, and evaluation of computer programs for applications in which performance is a central issue. • In the sequential and multi-core settings, it explores topics such as profiling, cache effects, I/O performance, floating-point issues, multi-threading, and performance tuning techniques. • It introduces techniques for the design, implementation and evaluation of programs for Multicore processors, Shared-Memory Multiprocessors (SMPs) and Distributed Memory Multicomputers (Clusters).

  8. Resources • Course web page: • www.cs.dal.ca/~arc/teaching/CSc4125 • All notes, readings, assignments • Parallel Machines • Your laptop! • CGM6 & CGM7 • Hugh

  9. Readings • Sorry, no textbook! • Readings will be assigned

  10. Books • Introduction to High Performance Computing for Scientists and Engineers by Georg Hager and Gerhard Wellein • An Introduction to Parallel Programming by Peter Pacheco, Morgan Kaufmann • Structured Parallel Programming by Michael McCool, Arch D. Robison, and James Reinders • Parallel Programming in C with MPI and OpenMP by Quinn • Parallel Programming with Intel Parallel Studio XE by S. Blair-Chappell and A. Stokes • Using OpenMP: Portable Shared Memory Parallel Programming by Barbara Chapman, Gabriele Jost and Ruud van der Pas • Parallel Programming in OpenMP by Rohit Chandra, Dave Kohr, Jeff McDonald, Morgan Kaufmann

  11. Prerequisites • Knowledge of C • CSCI 3120: Operating Systems • Good to have: • CSCI 3110: Analysis of Algorithms

  12. Course Evaluation • Assignments 50% • Midterm 25% • Final Project 20% • Participation 5% • See course web page for assignment copies and due dates

  13. Assignments • Selected from: • Sequential Optimization • OpenMP • Cilk • Threading Building Blocks • MPI • Hadoop • CUDA/OpenCL • Best 4 out of 5 count towards the final grade!

  14. “Midterm” • About two-thirds of the way through the course… • Tests conceptual knowledge gained from classes and readings • If you have not done the readings, you will not pass the midterm

  15. Final Project • Select your own topic • Either: • Optimize an existing codebase, or • Design and implement an efficient new code • Components: literature/code review, some research or programming work, final paper, presentation • Main deliverable: conference-style paper plus a short in-class talk

  16. Questions • Why are you taking this course? • Which performance-oriented technologies are you interested in? • How will you know if the course has been a success for you?
