High Performance Computing Topics Homework

Homework

  1. Homework 1 Lecture 1 and 2: Architecture of HPC systems
    Configure a cluster purchase for your chosen problem. What the “best” hardware and software configuration is depends strongly on the kind of problem(s) you want to solve. Analyse a problem in your research area, or pick one of the following: quantum chemistry using DFT, molecular dynamics on bio-molecules, or finite-element analysis of a bridge, an airplane, or a car.
    Weigh each factor and assess its relative proportion and importance:

    • CPU needs
    • I/O speed
    • storage (I/O capacity)

    Then specify the cluster hardware and software:

    • type of CPU (32- or 64-bit; Intel, PowerPC, or another architecture)
    • number of CPUs and size of RAM and disk per node
    • network switch
    • cluster storage: local disk, global disk, RAID type
    • special software: MPI, MFS, parallel file system, software RAID
  2. Homework 2 Lecture 3: Classification of HPC work
    Find the best mode to execute each of the following tasks.

    1. Find a given (short) gene sequence in a very large set of genes.
    2. Hurricane simulation and path prediction.
    3. Find the best match for a fingerprint (image) in a fingerprint database of several hundred thousand entries.
    4. Find the boundaries of all organs in a database of many MRI scans of human midsections.

    Describe how each task is classified, what its resource requirements are, and how your proposed execution mode addresses each requirement.

  3. Homework 3 Lecture 4 and 5: Programmer tools
    Work through all the steps with a provided program.

    1. Make a work directory
      mkdir $HOME/import
      cd $HOME/import
    2. Download the files for the software project (you may have to hold down the Shift key to force your browser to download the files instead of displaying them): poisson.cpp, Makefile, testpoisson, test.correct
    3. Create a CVS repository:
      mkdir $HOME/cvsrepository
      export CVSROOT=$HOME/cvsrepository
      cvs init
    4. Import the downloaded files into the repository
      cvs import -m "Solve Poisson and Helmholtz equation." poisson Homework3 Serialversion
    5. Check out a working copy of the project
      rm poisson.cpp Makefile testpoisson test.correct
      cvs co poisson
      cd poisson
      cvs log

      shows the status of the source code: file names, dates created, who created them, etc. Look at the files with an editor or with the more or less pagers.
    6. Build the program and verify that it works and gives correct answers
      make
      ./testpoisson

      The program works correctly when the testpoisson script prints “Poisson test passed”.
    7. Submit a brief report to deumens@qtp.ufl.edu.
  4. Homework 4 Lecture 7: MPI programming part 1
    Debug and run a given example program that implements a matrix multiply with minimal MPI.

    1. Continue as in homework 3. First download the program for this homework.
      If you prefer, you can rewrite the program in the language of your choice, such as Fortran 77, Fortran 90, or C 89. You can start from the C++ 98 version supplied below.
      If you do not have access to a system with MPI, send an e-mail to deumens@qtp.ufl.edu to request a class account on the QTP system and run on the SIMU or XENA III cluster.
    2. Download the files for the software project (you may have to hold down the Shift key to force your browser to download the files instead of displaying them): matrixserial.cpp, matrix.cpp, Makefile
    3. You must edit the Makefile to get the correct compilation commands for C++ and the MPI libraries, or for Fortran or C if you choose to rewrite the program.
      Import the new files into your CVS repository as a new module “matrix” as in homework 3. Check the final working version into your CVS repository.
    4. Figure out how to set up the MPICH machine file, the POE host.list file, or the LAM/MPI universe (depending on which MPI you are using) to run your program. For example, with MPICH:
      mpirun -machinefile list -np 4 ./matrix
      Run both the serial and parallel program.
    5. MPI_Bcast Replace the MPI send and receive calls with an MPI broadcast. Edit the file and fix the MPI calls. Use the online documentation or the man page for each call to find the arguments and code them correctly.
      Compile, debug and run the program and make sure the answer is the same.
    6. Square blocks Change the program to use square blocks instead of rows to divide the matrices.
      Compile, debug and run the program and make sure the answer is the same.
    7. Send the final programs, with broadcast and with square blocks, to deumens@qtp.ufl.edu.
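    As a serial sketch of the square-block decomposition in step 6 (the function name and the flat row-major layout are illustrative, not taken from the course files), the block loops can look like this; each (bi, bj, bk) block product is the unit of work one MPI task would own:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Multiply two N x N row-major matrices viewed as a grid of B x B blocks.
// The three outer loops run over block indices; the three inner loops do
// the dense multiply of block (bi,bk) of A with block (bk,bj) of Bm,
// accumulating into block (bi,bj) of C.
std::vector<double> block_multiply(const std::vector<double>& A,
                                   const std::vector<double>& Bm,
                                   std::size_t N, std::size_t B) {
    assert(N % B == 0);                  // assume the block size divides N
    std::vector<double> C(N * N, 0.0);
    const std::size_t nb = N / B;        // number of blocks per dimension
    for (std::size_t bi = 0; bi < nb; ++bi)
        for (std::size_t bj = 0; bj < nb; ++bj)
            for (std::size_t bk = 0; bk < nb; ++bk)
                for (std::size_t i = bi * B; i < (bi + 1) * B; ++i)
                    for (std::size_t j = bj * B; j < (bj + 1) * B; ++j) {
                        double sum = 0.0;
                        for (std::size_t k = bk * B; k < (bk + 1) * B; ++k)
                            sum += A[i * N + k] * Bm[k * N + j];
                        C[i * N + j] += sum;
                    }
    return C;
}
```

    In the parallel version, each task would compute only the (bi, bj) blocks it owns, receiving the A and B blocks it needs from the other tasks.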
  5. Homework 5 Lecture 8: MPI programming part 2
    Time the program created in homework 4 using the method explained in the lecture.

    1. Modify the program matrix.cpp, using code similar to the following, to obtain wall-clock timings for computation and communication:
        double newtime=0.;
        double oldtime=0.;
        double othertime=0.;
        double matrix_workt=0.;
        double matrix_msgt=0.;
        long matrix_count=0;
        newtime=MPI_Wtime();
        oldtime=newtime;
      
        // other code
      
        // inside the loop for doing work
        matrix_count++; // count the iterations so averages can be computed
      
        newtime=MPI_Wtime();
        othertime+=newtime-oldtime;
        oldtime=newtime;
      
        // do work here...
      
        newtime=MPI_Wtime();
        matrix_workt+=newtime-oldtime;
        oldtime=newtime;
      
        // do communication here...
      
        newtime=MPI_Wtime();
        matrix_msgt+=newtime-oldtime;
        oldtime=newtime;
      
        // do more work here...
      
        newtime=MPI_Wtime();
        matrix_workt+=newtime-oldtime;
        oldtime=newtime;
      
        // other code
      
        // Generate the timings report on each task
        //matrix_workt/=matrix_count;
        cout << "Task " << ME << " matrix wrk " << matrix_workt
             << " matrix msg " << matrix_msgt
             << " matrix_count " << matrix_count << endl
             << "Task " << ME << " othertime " << othertime << endl;
      
    2. Compile, run and debug the modified program.
    3. Submit a brief report to deumens@qtp.ufl.edu.
    4. Extra Let each task send its timings to the master task (task 0 by convention); the master can then compute averages and standard deviations and print the timing report in a more organized and more meaningful way.
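    The statistics the master would compute after gathering the timings (for example with an MPI_Gather of each task's matrix_workt into an array on task 0) can be sketched as follows; the gather itself is omitted here and the names are illustrative, not from the course files:

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Given one wall-clock timing per task, as the master task would hold them
// after gathering from all tasks, compute the mean and the sample standard
// deviation for the timing report.
struct TimingStats { double mean; double stddev; };

TimingStats timing_stats(const std::vector<double>& t) {
    double sum = 0.0;
    for (double x : t) sum += x;
    const double mean = sum / t.size();
    double ss = 0.0;                     // sum of squared deviations
    for (double x : t) ss += (x - mean) * (x - mean);
    const double stddev =
        t.size() > 1 ? std::sqrt(ss / (t.size() - 1)) : 0.0;
    return {mean, stddev};
}
```

    A large standard deviation relative to the mean indicates load imbalance between the tasks, which is itself worth reporting.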
  6. Homework 6 Lecture 9: OpenMP programming
    Compile and run an OpenMP Fortran 90 program.

    1. First download the program for this homework.
      If you prefer, you can rewrite the program in the language of your choice, such as Fortran 77, Fortran 90, or C 89. You can start from the C++ 98 version supplied below.
      If you do not have access to a system with OpenMP, send an e-mail to deumens@qtp.ufl.edu to request a class account on the QTP system and run on BUDDY, the SIMU or XENA III cluster.
    2. Download the files for the software project (you may have to hold down the Shift key to force your browser to download the files instead of displaying them): openmp.f90, Makefile
    3. Set the OpenMP environment variable OMP_NUM_THREADS to the number of threads and run the program, for example:
      export OMP_NUM_THREADS=4
    4. Blocks Implement the blocked matrix-multiplication algorithm of homework 4 in this OpenMP program.
    5. Send the modified program, with the blocked algorithm, to deumens@qtp.ufl.edu.
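    One possible shape for the blocked algorithm of step 4, sketched in C++ rather than the supplied Fortran 90 (the names and the flat row-major layout are illustrative): parallelize the outer block-row loop with an OpenMP pragma, so each thread owns distinct block rows of C and no synchronization is needed in the loop body. Compile with -fopenmp; without it the pragma is ignored and the code runs serially with the same result.

```cpp
#include <cstddef>
#include <vector>

// Blocked multiply of two N x N row-major matrices; C must be zeroed by
// the caller. The outer loop over block rows is divided among the OpenMP
// threads; no two threads write the same element of C.
void omp_block_multiply(const std::vector<double>& A,
                        const std::vector<double>& B,
                        std::vector<double>& C,
                        std::size_t N, std::size_t blk) {
    const std::size_t nb = N / blk;      // assume blk divides N
    #pragma omp parallel for
    for (long long b = 0; b < (long long)nb; ++b) {   // block rows of C
        const std::size_t bi = (std::size_t)b;
        for (std::size_t bj = 0; bj < nb; ++bj)       // block columns of C
            for (std::size_t bk = 0; bk < nb; ++bk)   // summed block index
                for (std::size_t i = bi * blk; i < (bi + 1) * blk; ++i)
                    for (std::size_t j = bj * blk; j < (bj + 1) * blk; ++j) {
                        double sum = 0.0;
                        for (std::size_t k = bk * blk; k < (bk + 1) * blk; ++k)
                            sum += A[i * N + k] * B[k * N + j];
                        C[i * N + j] += sum;
                    }
    }
}
```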
  7. Homework 7 Lecture 10: POSIX threads programming
    Compile and run a C program using POSIX threads.

    1. First download the program for this homework.
      If you prefer, you can rewrite the program in the language of your choice, such as Fortran 77, Fortran 90, or C 89. You can start from the C++ 98 version supplied below.
      If you do not have access to a system with POSIX threads, send an e-mail to deumens@qtp.ufl.edu to request a class account on the QTP system and run on BUDDY, the SIMU or XENA III cluster.
    2. Download the files for the software project (you may have to hold down the Shift key to force your browser to download the files instead of displaying them): posix.c, Makefile
    3. Compile the program and run it with the following options:
      Use 4 threads, process-wide scheduling, and perform 25 iterations of 30 counts each
      ./posix -p -n 4 -i 25 -c 30
      Use 5 threads, system-wide scheduling, and perform 25 iterations of 30 counts each
      ./posix -s -n 5 -i 25 -c 30
    4. Blocks Implement the blocked matrix-multiplication algorithm of homework 4 in the work portion of the program.
    5. Send the modified program, with the blocked algorithm, to deumens@qtp.ufl.edu.
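    A possible shape for the work portion in step 4, with the blocked multiply split over POSIX threads by block rows (all names and the row-range split are illustrative, not taken from the supplied posix.c):

```cpp
#include <pthread.h>
#include <cstddef>
#include <vector>

// Each thread computes a contiguous range [first, last) of block rows of
// the N x N row-major product C = A * B, so no two threads write the same
// element of C and no locking is needed.
struct ThreadArg {
    const double* A; const double* B; double* C;
    std::size_t N, blk, first, last;     // block-row range [first, last)
};

void* block_rows_worker(void* p) {
    ThreadArg* a = static_cast<ThreadArg*>(p);
    const std::size_t nb = a->N / a->blk;
    for (std::size_t bi = a->first; bi < a->last; ++bi)
        for (std::size_t bj = 0; bj < nb; ++bj)
            for (std::size_t bk = 0; bk < nb; ++bk)
                for (std::size_t i = bi * a->blk; i < (bi + 1) * a->blk; ++i)
                    for (std::size_t j = bj * a->blk; j < (bj + 1) * a->blk; ++j) {
                        double sum = 0.0;
                        for (std::size_t k = bk * a->blk; k < (bk + 1) * a->blk; ++k)
                            sum += a->A[i * a->N + k] * a->B[k * a->N + j];
                        a->C[i * a->N + j] += sum;
                    }
    return nullptr;
}

// Launch the workers, giving each roughly nb/nthreads block rows, and join.
// C must be zeroed by the caller; assume blk divides N.
void threaded_block_multiply(const std::vector<double>& A,
                             const std::vector<double>& B,
                             std::vector<double>& C,
                             std::size_t N, std::size_t blk,
                             std::size_t nthreads) {
    const std::size_t nb = N / blk;
    std::vector<pthread_t> tid(nthreads);
    std::vector<ThreadArg> args(nthreads);
    for (std::size_t t = 0; t < nthreads; ++t) {
        args[t] = ThreadArg{A.data(), B.data(), C.data(), N, blk,
                            t * nb / nthreads, (t + 1) * nb / nthreads};
        pthread_create(&tid[t], nullptr, block_rows_worker, &args[t]);
    }
    for (std::size_t t = 0; t < nthreads; ++t)
        pthread_join(tid[t], nullptr);
}
```

    In the course program the number of threads would come from the -n option rather than being passed in directly.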