# ALGORITHMS FOR PARALLEL COMPUTING

- Academic year: 2022/2023
- Teacher: WALTER BOSCHERI
- Credits: 6
- Didactic period: Second semester
- SSD: MAT/08

#### Training objectives

- The main objective of this course is to provide students with the basics of parallel computing, for both distributed and shared memory architectures.

The acquired knowledge will include:

- characteristics of modern architecture for parallel computing;

- metrics for evaluating the performance of parallel codes through measurements of computing time (speedup, scaling and efficiency);

- introduction to MPI paradigm for distributed memory architectures: point-to-point and group communications, reduction and synchronization directives;

- introduction to OpenMP for shared memory architectures;

- examples of use of widely used libraries for linear algebra, factorization and eigenvalue problems: BLAS, ATLAS, LAPACK, MKL and their parallel implementation;

- examples of the use of various CPU profilers;

- scientific computing for the solution of partial differential equations.

The main skills are:

- Analyze serial algorithms and identify a suitable workload partitioning methodology to achieve significant speedup;

- Analyze serial algorithms and identify data access patterns to maximize aligned and contiguous memory accesses.

- Analyze code performance when executing on actual architectures.

#### Prerequisites

- Students need to master the following prerequisites, covered in the "Numerical Analysis" course:

- QR factorization, LU factorization and their use for solving linear systems;

- Floating point representation, machine precision, finite arithmetic.

Familiarity with FORTRAN programming is advised.

Bringing a personal notebook is highly recommended.

#### Course programme

- Introduction (10 hours)

- introduction to parallel computing, CPU architecture, computational cost;

- introduction to FORTRAN;

- performance measures (speedup, scaling and efficiency), FORTRAN code to measure serial computational time;

- Example: implementation of the conjugate gradient method for the solution of linear systems.

- Using MPI (12 hours)

- introduction to MPI; scripts for compiling and launching parallel codes; blocking and non-blocking point-to-point communications; definition of deadlock, the send-receive deadlock and its solution using the combined send-receive (MPI_Sendrecv) function; collective communications, reductions and synchronization directives;

- Example: implementation of the conjugate gradient method;

Insights

- OpenMP (8 hours)

Examples of parallelization for scientific computing applied to partial differential equations (18 hours):

- heat conduction equation on Cartesian meshes;

- free surface Navier-Stokes equations on Cartesian meshes;

- parallelization of general unstructured meshes using the Metis library.

#### Didactic methods

- The main topics will be introduced through slide presentations as well as practical examples.

Where possible, parallel codes will be applied on the CINECA supercomputer.

#### Learning assessment procedures

- The achievement of the learning objectives will be assessed through an oral exam composed of two parts:

- a talk on a project developed by the student;

- a discussion of some course topics.

Instructions for the projects will be provided to students during the teaching period.

#### Reference texts

- Lecture notes.

- Textbook: V. Kumar, A. Grama, A. Gupta, G. Karypis, Introduction to Parallel Computing: Design and Analysis of Algorithms, Addison-Wesley, 2003.

- Suggested for further study: D.P. Bertsekas, J.N. Tsitsiklis, Parallel and Distributed Computation: Numerical Methods, Prentice-Hall, 1989.