High Performance Computing Topics

Fall 2004, a series of lectures

Scope

This lecture series is given at a time when High Performance Computing is moving to the foreground of people’s attention. At UF the particular event is the installation this summer of a large Dell cluster that is phase 1 of a Grid for Research at the University of Florida (GRUF). The intended audience is interested faculty, researchers, post-doctoral associates, graduate students and undergraduates.

To accommodate as much as possible the time constraints of such an audience, the course is structured as a series of stand-alone lectures. Making each lecture independent of all others will allow more people to attend the parts that benefit them most. Thus faculty can attend some or all lectures and learn the terminology, basic concepts and principles to plan the use of High Performance Computing in their research and teaching projects.

The second component of the course comes in the form of homework assignments. These will be substantial, and all post-docs, graduate students and undergraduates are expected to work through them in detail. Participants can come and see the instructor individually or in small groups to ask questions, discuss problems, or explore certain topics as far as they wish.

  • instructor: Erik Deumens, deumens at qtp.ufl.edu
  • format: Every lecture will address a well-defined topic so that it is useful on its own. It is thus not required to attend all lectures to learn something, but attending all of them is highly recommended to get the most out of the course. This format will allow faculty to attend and at least become familiar with the concepts. The homework problems should be worked to get full benefit from the lecture series.
  • schedule: A total of 16 lectures, one lecture per week starting the week of August 30 through the week of December 6, on Wednesdays, 6th period (12:50 pm – 1:40 pm)
  • location: NPB 1002 (New Physics Building)
  • registrar information: The lecture series is open to everyone. However, students wishing to take the class for credit can register for the one-credit course PHY 5905, Section 7621, “Topics in High Performance Computing”.
  • requirements: Basic familiarity with computers and with one programming language such as Python, Java, C++, C, or Fortran.
  • notes: Notes should be taken during the lectures; reference material will be provided on the web and linked from this page.

Synopsis

This lecture series teaches the practical details to

  • understand architecture and design of modern high performance computers, clusters and grids;
  • manage computations on them;
  • create reliable and maintainable software that effectively uses them, exploiting
    1. multiple processors on shared memory systems with OpenMP and POSIX Threads,
    2. multiple nodes in clusters and grids using the MPI message passing standard.

Syllabus

  1. Lecture 1 (Aug 30): Hardware Architecture of HPC systems
    Audience: everyone
    Hardware architecture of high performance computing:

    • nodes: processors, cache, RAM, disks, RAID
    • networks: Ethernet, Myrinet, Infiniband, FibreChannel, iSCSI, SAN
  2. Lecture 2 (Sep 14): Software Architecture of HPC systems
    Audience: everyone
    Software architecture of high performance computing:

    • nodes: operating systems (Linux, AIX, Solaris, Windows), interpreters (Python, Perl, Java), compilers (C/C++/C#, Fortran, Ada), libraries (BLAS, LAPACK), OpenMP, POSIX Threads, interprocess communication (IPC)
    • networks: sockets, remote procedure call (RPC), network file system, CORBA
    • clusters: nodes, communication, storage, message passing interface (MPI), parallel file system, workload management systems
    • grids: clusters, middleware (Globus)
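    Sockets, listed under networks above, are the foundation on which RPC, MPI and most other communication layers are built. The sketch below previews the concept in Python (used here only for illustration; `socket.socketpair()` gives a connected pair of endpoints in one process, so no network setup is needed for the demo):

```python
# Minimal sketch of the socket concept: two endpoints exchange bytes.
# socket.socketpair() returns an already-connected pair, standing in
# for the client/server sockets a real networked program would create.
import socket

def socket_demo():
    a, b = socket.socketpair()
    a.sendall(b"hello from node A")   # write bytes into the channel
    msg = b.recv(1024)                # blocking read on the other end
    a.close()
    b.close()
    return msg.decode()

if __name__ == "__main__":
    print(socket_demo())   # hello from node A
```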

    Access the system, start a job, monitor job evolution, manage job data.
    User model:

    • nodes: user id’s (authentication, authorization), interactive use, batch use (PBS, LoadLeveler)
    • clusters: head node, worker nodes, interactive use, batch use, data access (input, scratch, output)

    Work model:

    • grids: authentication, scheduling, data access

    homework 1: Configure a cluster purchase for your chosen problem.

  3. Lecture 3 (Sep 21): Classification of HPC Work
    Audience: everyone
    Finding the best way to use the system for your HPC work:

    • long running or large memory serial computation: need a single, powerful node
    • parameter space parallelism: many independent serial jobs
    • shared memory parallelism: multiple processors in one OS image access shared data
    • loosely coupled distributed memory parallelism: multiple processors in different OS images work on distributed data and share data with low intensity communication
    • intense communication distributed memory parallelism: multiple processors in different OS images work on distributed data and share data with frequent and high bandwidth communication
    • massive parallelism: a very large number of processors cannot work on shared memory and must work on distributed data; even a little communication can cause problems because of the number of tasks involved
    • need to read data: the problem of data mining
    • need to write data: the problem of output for visualization
    • need to write temporary data: the problem of large scratch space requirements, such as in electronic structure calculations
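    Of the categories above, parameter space parallelism is the easiest to exploit: the same serial code runs many times with different inputs and no communication between tasks. A minimal sketch in Python (the function `simulate` is a made-up stand-in for a real serial computation):

```python
# Parameter-space parallelism: run the same serial computation for many
# independent inputs. Each task needs no communication with the others.
from multiprocessing import Pool

def simulate(param):
    # stand-in for a real serial computation on one parameter value
    return param * param

if __name__ == "__main__":
    with Pool(4) as pool:                       # 4 worker processes
        results = pool.map(simulate, range(8))  # one task per parameter
    print(results)  # [0, 1, 4, 9, 16, 25, 36, 49]
```

On a cluster the same pattern appears as many independent batch jobs submitted to the workload management system.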

    homework 2: Classify a list of HPC jobs

  4. Lecture 4 (Sep 28): Programming is software engineering
    Audience: everyone
    The life of a program (not a programmer)

    • problem analysis and solution: the program implements an algorithm that provides a solution to some formulated problem
    • program design and prototyping: every program has a prototype; if you write the complete program in one session, you have simply declared your first prototype to be the finished product.
    • writing and maintaining source code: use source code control software and editors to make this task systematic and consistent
    • managing complexity: use software engineering techniques such as data hiding, modules, software components, object oriented design and programming to keep control over the complexity of the project.
    • testing and validation: build your software from the start with the notion of verifiable tasks and tests that must be completed and run the test suite continuously.
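    The last bullet, building with tests from the start, can be as simple as pairing every function with assertions that are re-run after each change. A small sketch with Python’s standard unittest module (the function `midpoint` is a hypothetical example, not taken from the lectures):

```python
# Build software with verifiable tests from the start: every function
# gets a test, and the whole suite is re-run after every change.
import unittest

def midpoint(a, b):
    # written as a + (b - a) // 2 to avoid overflow in languages
    # with fixed-width integers
    return a + (b - a) // 2

class TestMidpoint(unittest.TestCase):
    def test_even_span(self):
        self.assertEqual(midpoint(0, 10), 5)

    def test_same_point(self):
        self.assertEqual(midpoint(7, 7), 7)

if __name__ == "__main__":
    unittest.main()
```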
  5. Lecture 5 (Oct 5): Programmer tools
    Audience: programmers

    • source: where to store (size, permissions, backup), revision control (CVS software)
    • compiling and linking: where to store (size, permissions, backup), version control (Makefile), compilers (GNU, vendor, flags, standard compliance, finding libraries)
    • running and debugging: where to store test data (size, permissions, backup), automation for consistency (scripts), interactive debuggers

    homework 3: Work through all the steps with a provided program.

  6. Lecture 6 (Oct 12): Algorithms
    Audience: everyone

    • definition: what is an algorithm
    • types of algorithms: quality of an algorithm, an algorithm that is too clever, complexity theory
    • implementation of algorithms:
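    The quality of an algorithm is usually made concrete through complexity: two correct algorithms for the same problem can differ enormously in cost. A small illustration (a membership test on sorted data, linear O(n) versus binary O(log n) search; the example is ours, not from the lecture):

```python
# Two correct algorithms for the same problem: is x in a sorted list?
# linear_search inspects every element, O(n); binary_search halves the
# range each step, O(log n) -- the difference complexity theory measures.
def linear_search(xs, x):
    for v in xs:
        if v == x:
            return True
    return False

def binary_search(xs, x):
    lo, hi = 0, len(xs)
    while lo < hi:
        mid = (lo + hi) // 2
        if xs[mid] < x:
            lo = mid + 1
        elif xs[mid] > x:
            hi = mid
        else:
            return True
    return False

if __name__ == "__main__":
    data = list(range(0, 1000, 2))   # sorted even numbers
    print(linear_search(data, 998))  # True
    print(binary_search(data, 999))  # False
```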
  7. Lecture 7 (Oct 19): MPI programming part 1
    Audience: programmers
    Basic message passing

    • setup: MPI_Init, MPI_Finalize
    • messages: MPI_Send, MPI_Recv
    • synchronization: MPI_Barrier
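    MPI itself is a C/Fortran library, but the send/receive pattern can be previewed with Python’s standard multiprocessing module. The sketch below is purely an analogy: a Pipe plays the role of the channel between two ranks, and the names are ours, not MPI’s:

```python
# Analogy for MPI_Send/MPI_Recv: two processes, one pipe.
# The worker ("rank 1") computes a value and sends it;
# the parent ("rank 0") receives it and cleans up.
from multiprocessing import Process, Pipe

def worker(conn):
    conn.send(sum(range(100)))   # like MPI_Send of one integer
    conn.close()

def exchange():
    parent, child = Pipe()
    p = Process(target=worker, args=(child,))
    p.start()
    result = parent.recv()       # like MPI_Recv: blocks until data arrives
    p.join()                     # like MPI_Finalize: wait for a clean exit
    return result

if __name__ == "__main__":
    print(exchange())  # 4950
```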

    homework 4: Debug and run a given example program that implements a matrix multiply with minimal MPI.

  8. Lecture 8 (Oct 26): MPI programming part 2
    Audience: programmers
    Advanced message passing

    • communicators: groups and collective operations
    • asynchronous communication: MPI_Isend, MPI_Irecv, MPI_Wait
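    The point of asynchronous communication is overlap: start a transfer, keep computing, and wait only when the result is needed. The sketch below mimics that pattern with Python’s standard concurrent.futures module (an analogy only; `submit` plays the role of MPI_Isend/MPI_Irecv and `result` that of MPI_Wait, and `slow_transfer` is a made-up stand-in for a message in flight):

```python
# Analogy for MPI_Isend/MPI_Irecv/MPI_Wait: start an operation, keep
# computing, then block for completion only when the data is needed.
from concurrent.futures import ThreadPoolExecutor
import time

def slow_transfer():
    time.sleep(0.1)          # stands in for a message in flight
    return list(range(5))

def overlap():
    with ThreadPoolExecutor() as pool:
        future = pool.submit(slow_transfer)  # like MPI_Irecv: returns at once
        local = sum(range(1000))             # overlap: compute while waiting
        data = future.result()               # like MPI_Wait: block until done
    return local, data

if __name__ == "__main__":
    print(overlap())
```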

    homework 5: Change the program of homework 4 to measure computation and communication times.

  9. Lecture 9 (Nov 2): OpenMP programming
    Audience: programmers
    Concepts and details for using OpenMP directives

    • directives:
    • scope of variables:
    • locks:
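    The core OpenMP idiom is the parallel loop with a reduction: the iteration range is split among workers, each keeps a private partial result, and the partials are combined at the end. A rough Python analogy using the standard library (OpenMP itself is compiler directives in C/Fortran; this only mirrors the work-sharing structure):

```python
# Analogy for an OpenMP "parallel for" with reduction(+:total): the
# iteration range is split into chunks, each worker sums its chunk in
# a private variable, and the partial sums are combined at the end.
from concurrent.futures import ThreadPoolExecutor

def chunk_sum(lo, hi):
    s = 0                      # private to this worker
    for i in range(lo, hi):
        s += i * i
    return s

def parallel_sum(n, workers=4):
    bounds = [(k * n // workers, (k + 1) * n // workers)
              for k in range(workers)]
    with ThreadPoolExecutor(workers) as pool:
        parts = pool.map(lambda b: chunk_sum(*b), bounds)
    return sum(parts)          # the reduction step

if __name__ == "__main__":
    print(parallel_sum(100))   # same as sum(i*i for i in range(100))
```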

    homework 6: Compile and run an OpenMP Fortran 90 program.

  10. Lecture 10 (Nov 9): Threads programming
    Audience: programmers
    Concepts and details for using POSIX threads

    • threads: creation and termination
    • synchronization: by initialization, with locks and mutexes
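    The thread lifecycle and mutual exclusion above can be previewed with Python’s standard threading module (an analogy only; the real lecture material uses the POSIX C API, and the comments note which pthread call each step mirrors):

```python
# Analogy for pthread_create/pthread_join and a mutex: several threads
# increment a shared counter; the Lock prevents lost updates.
import threading

counter = 0
lock = threading.Lock()

def work(n):
    global counter
    for _ in range(n):
        with lock:             # like pthread_mutex_lock/unlock
            counter += 1

def run_threads(threads=4, n=1000):
    global counter
    counter = 0
    ts = [threading.Thread(target=work, args=(n,)) for _ in range(threads)]
    for t in ts:
        t.start()              # like pthread_create
    for t in ts:
        t.join()               # like pthread_join
    return counter

if __name__ == "__main__":
    print(run_threads())  # 4000
```

Without the lock, the read-modify-write of `counter` can interleave between threads and updates are lost, which is exactly the race condition mutexes exist to prevent.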

    homework 7: Compile and run a C program using POSIX threads.

  11. Lecture 11 (Nov 16): The GRUF Dell cluster
    Audience: everyone
    By this time the cluster should be partially available for restricted use by selected users. The QTP clusters will also be discussed as examples.

    • hardware: nodes, network, storage
    • system software: operating system, compilers, libraries, grid middleware
    • user software: applications and access methods
    • policies and practices: rules for access and use
  12. Lecture 12 (Nov 23): Managing clusters and grids
    Audience: system managers

    • functionality: operating system tools, reliability, available software
    • security: allow, monitor and control access
    • performance: monitoring to find and eliminate bottlenecks
  13. Lecture 13 (Nov 30): Software engineering
    Audience: programmers
    Case study of a simple program for Monte Carlo calculations, comparing “writing a program” to “software engineering”.
  14. Lecture 14 (Dec 7): The future of HPC
    Audience: everyone
    Analysis of the evolution of computing and trends in technology.

    • hardware: electronics, processors, quantum computing
    • systems: clusters, grids, autonomous computing
    • software: user interfaces and computing services, languages, tools

Reference material

General High Performance Computing: A good place to start.

  1. High Performance Computing (2nd Edition), Kevin Dowd and Charles Severance, O’Reilly and Associates 1998.
  2. Using MPI: Portable parallel programming with the message-passing interface, William Gropp, Ewing Lusk, Anthony Skjellum, MIT Press, 1997.
  3. Parallel Programming in OpenMP, Rohit Chandra, Leonardo Dagum, Dave Kohr, Dror Maydan, Jeff McDonald, Ramesh Menon, Academic Press, 2001.
Clusters and grids: Recently numerous books on grids and clusters have appeared. Any of these provides good background or reference material.

  1. The GRID: Blueprint for a New Computing Infrastructure, edited by Ian Foster and Carl Kesselman, Morgan Kaufmann, 2003.
  2. GRID Computing: Making the Global Infrastructure a Reality, edited by Fran Berman, Geoffrey C. Fox, Anthony J. G. Hey, John Wiley, 2003.
General programmer references: Basic reference material for software engineers.

  1. Efficient C++: Performance Programming Techniques, Dov Bulka, David Mayhew, Addison-Wesley, 2000.
  2. MPI – The Complete Reference, Volume 1: The MPI Core (2nd edition), Marc Snir, Steve Otto, Steven Huss-Lederman, David Walker, Jack Dongarra, MIT Press, 1998.
  3. MPI – The Complete Reference, Volume 2: The MPI Extensions, William Gropp, Steven Huss-Lederman, Andrew Lumsdaine, Ewing Lusk, Bill Nitzberg, William Saphir, Marc Snir, MIT Press, 1998.
  4. Programming Python, Mark Lutz, O’Reilly and Associates, 1996.
  5. Multithreaded Programming with Pthreads, Bill Lewis and Daniel J. Berg, Sun Microsystems Press (Prentice Hall), 1998.
  6. Pthreads Programming, Bradford Nichols, Dick Buttlar and Jacqueline Proulx Farrell, O’Reilly and Associates, 1996.
Detailed programmer references: For serious work on complex software.

  1. Fortran 95 Handbook: complete ISO/ANSI reference, J. C. Adams, W. S. Brainerd, J. T. Martin, B. T. Smith, J. L. Wagener, MIT Press, 1997.
  2. The C++ Programming Language (3rd edition), Bjarne Stroustrup, Addison-Wesley, 1997.
  3. Modern C++ Design: Generic Programming and Design Patterns Applied, Andrei Alexandrescu, Addison-Wesley, 2001.
  4. Using the STL: The C++ Standard Template Library, Robert Robson, Springer, 1997.
  5. Parallel Programming Using C++, Gregory V. Wilson and Paul Lu (editors), MIT Press, 1996.
  6. Multithreaded Programming with Windows NT, Thuan Q. Pham and Pankaj K. Garg, Prentice Hall, 1995.
  7. Practical Parallel Programming, Gregory V. Wilson, MIT Press, 1995.
  8. Designing and building parallel programs: Concepts and tools for parallel software engineering, Ian Foster, Addison-Wesley, 1995.
Object oriented design: Advanced references for object programming.

  1. UML and C++: A Practical Guide to Object-Oriented Development, Richard C. Lee and William M. Tepfenhart, Prentice Hall, 2001.
  2. Object-Oriented software construction, Bertrand Meyer, Prentice Hall, 1997.
Software engineering: Advanced references for construction of complex software with teams of developers.

  1. Code Complete, Steve McConnell, Microsoft Press, 1993.
  2. Rapid Development, Steve McConnell, Microsoft Press, 1996.
  3. Extreme Programming Explained, Kent Beck, Addison Wesley Professional 1999.