
Cluster Computing in the Classroom: Topics, Guidelines, and Experiences

Amy Apon, Department of Computer Science & Computer Engineering, University of Arkansas

Presentation Transcript


  1. Cluster Computing in the Classroom: Topics, Guidelines, and Experiences • Amy Apon • Department of Computer Science & Computer Engineering • University of Arkansas

  2. Clusters and Data Engineering • A cluster is a set of whole computers connected via a network and used as an integrated resource to run a single application • Increases throughput for massive data processing • Inexpensive: uses commodity computers with lots of disks and disk space

  3. Teaching Challenges • Prerequisites are difficult to establish • One course does not fit all! We propose: • Cluster teaching material organized as modules • Accessible to a variety of situations

  4. Outline • Overview of target audience for the proposed teaching materials • Description of course modules • Problem areas • Conclusions • Acknowledgements and references

  5. Courseware developed with • Dr. Amy Apon • Dr. Jens Mache • Dr. Hai Jin • Dr. Rajkumar Buyya

  6. Who our students are • Juniors, seniors, graduate students • With a variety of preparation • Operating Systems? • Maybe haven’t seen threads • Computer Networks? • Maybe haven’t seen sockets • Computer Architecture? • Maybe don’t understand how cache works

  7. Course Units • Needed because of the diversity of institutions and student preparation • Matched to the Computing Curricula 2001 to avoid overlap with existing courses • Basic Units (have overlap with ACM Core) • Core Units (essential to cluster computing) • Extended Units (more advanced, optional)

  8. Course Units Can Be Combined • We propose sample courses with an emphasis in one of • Architecture • Programming • Algorithms and Applications

  9. Five Basic Units • Programming Fundamentals (PF2, PF5) • Algorithms and problem-solving • Event-driven programming (3 hours total) • Architecture and Organization 4 (AR4) • Memory system organization (1 hour) • Architecture and Organization 7 (AR7) • Multiprocessing architectures (1 hour) • Operating Systems 3 (OS3) • Concurrency (1 hour) • Net-Centric Computing 2 (NC2) • Communication and networking (2 hours)

  10. Ten Core Units • Algorithms and Complexity 4 (AL4) • Distributed algorithms (1 hour) • Algorithms and Complexity 11 (AL11) • Parallel algorithms (3 to 7 hours) • Architecture and Organization 7 (AR7) • Multiprocessing and alternative architectures (2 hours) • Architecture and Organization 9 (AR9) • Architectures for networks & distributed systems (1-4 hours) • Operating Systems 11 (OS11) • System performance evaluation (1-2 hours)

  11. Ten Core Units, continued • Net-Centric Computing 2 (NC2) • Communication and networking (1 hour) • Net-Centric Computing 6 (NC6) • Network management (1-2 hours) • Social and Professional Issues 9 (SP9) • Economic issues in computing (2 hours) • Software Engineering 2 (SE2) • Using APIs: Basic MPI or PVM, basic PVFS (2 hours) • Computational Science 4 (CN4) • High-performance computing (6 or more hours)

  12. Many Choices for Extended Units! • Software Engineering (SE3), Software tools and environments • Debugging tools • Operating Systems (OS8) • Parallel file systems • Algorithms (AL11) • Advanced parallel algorithms • Architecture and Organization (AR9) • Architecture for networks and distributed systems • Graphics and Visualization (GV9) • Intelligent Systems 4 (IS4), Advanced search • Information Management (IM8, IM9, IM10, IM11) • Distributed databases, physical database design, data mining, and information storage and retrieval on clusters • Computational Science (CN1, CN3)

  13. Cluster Architecture Emphasis • Requirements similar to those for a course in advanced computer architecture • Suited for advanced undergraduates and graduate students who have completed • Computer organization • Computer networks • Operating systems • Programming

  14. Cluster Architecture Topics

  15. Programming Emphasis • Suited for undergraduates with exposure to • Data structures and algorithms • Computer organization • Can use general access computer lab/LAN (if performance is not an issue) • Can use generally available programming environments

  16. Cluster Programming Topics • Shared memory programming • Leading to a discussion of NUMA • Sockets • Leading to discussion about network overhead, low-latency protocols • Parallel programming using MPI • Middleware: Java RMI, CORBA
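The sockets topic on this slide leads into a discussion of network overhead. As a rough illustration of how a class might measure that overhead, here is a minimal sketch of a ping-pong latency test. It is not taken from the courseware: the function names are hypothetical, and it uses a local `socket.socketpair()` with an echo thread as a stand-in for two cluster nodes, so the numbers it reports reflect only local kernel and interpreter overhead.

```python
import socket
import threading
import time


def _recv_exact(sock, n):
    """Read exactly n bytes; a single recv() may return fewer."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf.extend(chunk)
    return bytes(buf)


def _echo(sock, size, rounds):
    """Stand-in for the remote node: send every message straight back."""
    for _ in range(rounds):
        sock.sendall(_recv_exact(sock, size))


def mean_round_trip(size=64, rounds=100):
    """Return the mean round-trip time in seconds for `size`-byte messages."""
    local, remote = socket.socketpair()  # assumption: local pair, not a real network link
    worker = threading.Thread(target=_echo, args=(remote, size, rounds))
    worker.start()
    payload = b"x" * size
    start = time.perf_counter()
    for _ in range(rounds):
        local.sendall(payload)
        reply = _recv_exact(local, size)
        if reply != payload:
            raise ValueError("echoed data does not match")
    elapsed = time.perf_counter() - start
    worker.join()
    local.close()
    remote.close()
    return elapsed / rounds


if __name__ == "__main__":
    for size in (16, 1024, 65536):
        print(size, "bytes:", mean_round_trip(size), "s per round trip")
```

Running the same loop between two machines over TCP, and comparing small and large messages, is one way to separate per-message latency from bandwidth-limited transfer time.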

  17. Algorithms and Applications • Suited for • Advanced undergraduates with a strong algorithms and programming background • Graduate students • Can be • Parallel algorithms • With a focus on topics from a particular domain

  18. Algorithms and Applications Topics • Application overview • Compression, data mining, image rendering, genetic algorithms, … • Techniques of algorithm design • Partitioning, divide and conquer, communication and synchronization, … • Modeling and visualization • Performance tuning
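The design techniques listed on this slide (partitioning, divide and conquer, synchronization) can be shown together in a small example. The following is a sketch under stated assumptions rather than material from the course: the names `block_ranges` and `partitioned_sort` are hypothetical, and a thread pool stands in for cluster nodes. In CPython the threads mostly illustrate structure, not speedup, because of the global interpreter lock.

```python
from concurrent.futures import ThreadPoolExecutor
import heapq


def block_ranges(n, parts):
    """Partition indices 0..n-1 into `parts` contiguous, nearly equal blocks."""
    base, extra = divmod(n, parts)
    ranges, start = [], 0
    for rank in range(parts):
        # The first `extra` blocks take one additional element each.
        stop = start + base + (1 if rank < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


def partitioned_sort(data, parts=4):
    """Divide: sort each block independently. Conquer: merge the sorted blocks."""
    chunks = [data[lo:hi] for lo, hi in block_ranges(len(data), parts)]
    with ThreadPoolExecutor(max_workers=parts) as pool:
        # Leaving the `with` block waits for all workers (the synchronization point).
        sorted_chunks = list(pool.map(sorted, chunks))
    return list(heapq.merge(*sorted_chunks))


if __name__ == "__main__":
    print(block_ranges(10, 3))
    print(partitioned_sort([5, 3, 9, 1, 4, 8, 2], parts=3))
```

On a real cluster the merge step is where communication cost appears, since the sorted blocks have to be gathered from the nodes that produced them.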

  19. Classroom Favorites • Build your own cluster • Using old lab machines, install PVM or MPI • Parallel matrix multiply, sort • Implement these using MPI, evaluate the performance using data of varying size, present results graphically • Term programming project • Can have students select their own!
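The matrix multiply assignment on this slide asks students to implement the algorithm with MPI and measure it on data of varying size. The sketch below shows only the general shape of such an exercise and makes several assumptions: it uses Python threads in place of MPI ranks, the function names are hypothetical, and the test matrices are arbitrary. It splits the rows of the first matrix into blocks, multiplies each block independently, and times the whole operation for several sizes.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def multiply_rows(a_rows, b):
    """Multiply a block of rows of A by the full matrix B."""
    cols = list(zip(*b))  # columns of B, computed once per block
    return [[sum(x * y for x, y in zip(row, col)) for col in cols]
            for row in a_rows]


def parallel_matmul(a, b, workers=4):
    """Row-block decomposition: each worker computes a band of the result."""
    if not a:
        return []
    chunk = -(-len(a) // workers)  # ceiling division: rows per worker
    blocks = [a[i:i + chunk] for i in range(0, len(a), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = list(pool.map(lambda rows: multiply_rows(rows, b), blocks))
    # Results come back in block order, so concatenation restores row order.
    return [row for part in parts for row in part]


def time_sizes(sizes, workers=4):
    """Return (n, seconds) pairs for n x n multiplications."""
    results = []
    for n in sizes:
        a = [[(i + j) % 7 for j in range(n)] for i in range(n)]
        b = [[(i * j) % 5 for j in range(n)] for i in range(n)]
        start = time.perf_counter()
        parallel_matmul(a, b, workers)
        results.append((n, time.perf_counter() - start))
    return results


if __name__ == "__main__":
    for n, seconds in time_sizes([20, 40, 80]):
        print(n, seconds)
```

The `(n, seconds)` pairs are the kind of data students could plot for the graphical presentation of results. In an MPI version each rank would hold only its own row block, and B would typically be broadcast to every rank.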

  20. Problem Areas • Cluster setup and administration • Cluster usage (especially for performance experiments) • Security

  21. Conclusions • Cluster computing is a low-cost approach to massive data processing • Cluster computing can be taught at the undergraduate level • Modules help to organize the material so that it is appropriate for your institution • Modules can be mixed and matched

  22. References and Acknowledgements • Cluster Computing in the Classroom: Topics, Guidelines, and Experiences, by Amy Apon, Rajkumar Buyya, Hai Jin, Jens Mache, First International Workshop on Cluster Computing Education, Cluster.Edu 2001 • See http://citeseer.nj.nec.com/395286.html
