Introducing Chapel: A Programming Language for Productive Parallel Computing from Laptops to Supercomputers


Brad Chamberlain, Distinguished Technologist
LinuxCon, May 11, 2023

PARALLEL COMPUTING IN A NUTSHELL

Parallel computing: using the processors and memories of multiple compute resources
• in order to run a program…
  – faster than we could otherwise
  – and/or using larger problem sizes

[Figure: four compute nodes (Compute Node 0–3), each containing processor cores and memory]
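As a concrete illustration (not taken from the slides themselves), the sketch below shows the style of program the talk builds toward: a Chapel data-parallel computation whose array is spread across the memories of multiple compute nodes and whose iterations run on their processor cores. The array name, problem size, and 1.3x-era Block distribution syntax are illustrative assumptions, not something the deck prescribes.

use BlockDist;

config const n = 1_000_000;   // problem size; can be overridden on the command line with --n

// Distribute the indices 1..n across all compute nodes ("locales"),
// so the array's memory is spread across the machine.
const D = {1..n} dmapped Block(boundingBox={1..n});
var A: [D] real;

// Data-parallel loop: iterations execute in parallel across the
// cores of every locale that owns part of D.
forall i in D do
  A[i] = 2.0 * i;

writeln("computed ", n, " elements using ", numLocales, " locale(s)");

On a laptop with a single-locale build this simply uses the local cores; compiled for multiple locales, the same code uses the processors and memories of every node it runs on.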

PARALLEL COMPUTING HAS BECOME UBIQUITOUS

Traditional parallel computing:
• supercomputers
• commodity clusters

Today:
• multicore processors
• GPUs
• cloud computing

[Figure: four compute nodes (Compute Node 0–3), each containing processor cores and memory]
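As an illustrative aside (again not part of the slide), Chapel's data-parallel constructs cover both of today's common settings: a forall loop uses the multicore processor it runs on, and an `on here.gpus[0]` block offloads the enclosed computation to a GPU sublocale, assuming a GPU-enabled Chapel 1.30+ build. The names and sizes below are arbitrary choices for the example.

config const n = 1_000;

// Multicore: iterations are divided among the CPU cores of the current node.
var A: [1..n] real;
forall i in 1..n do
  A[i] = i * i;

// GPU: the same data-parallel style, offloaded to the node's first GPU if one exists.
if here.gpus.size > 0 {
  on here.gpus[0] {
    var B: [1..n] real;   // allocated in GPU memory
    foreach i in 1..n do
      B[i] = i * i;
  }
}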

OAK RIDGE NATIONAL LABORATORY'S FRONTIER SUPERCOMPUTER

• 74 HPE Cray EX cabinets
• 9,408 AMD CPUs, 37,632 AMD GPUs
• 700 petabytes of storage capacity, peak write speeds of 5 terabytes per second using the Cray ClusterStor storage system
• HPE Slingshot networking cables providing 100 GB/s network bandwidth

TOP500: Built by HPE, ORNL's Frontier supercomputer is #1 on the TOP500, delivering 1.1 exaflops of performance.

GREEN500: ORNL's TDS and full system are ranked #2 and #6 on the Green500, at 62.68 gigaflops/watt power efficiency for the TDS system and 52.23 gigaflops/watt for the full system.

HPL-MxP: Frontier is #1 on the HPL-MxP list, at 7.9 exaflops on the HPL-MxP benchmark (formerly HPL-AI).

Source: May 30, 2022 TOP500 release; HPL-MxP mixed-precision benchmark (formerly HPL-AI).

HPC BENCHMARKS USING CONVENTIONAL PROGRAMMING APPROACHES

STREAM TRIAD: C + MPI + OPENMP

#include <hpcc.h>
#ifdef _OPENMP
#include <omp.h>
#endif

static int VectorSize;
static double *a, *b, *c;

int HPCC_StarStream(HPCC_Params *params) {
  int myRank, commSize;
  int rv, errCount;
  MPI_Comm comm = MPI_COMM_WORLD;

  MPI_Comm_size( comm, &commSize );
  MPI_Comm_rank( comm, &myRank );

  rv = HPCC_Stream( params, 0 == myRank);
  MPI_Reduce( &rv, &errCount, 1, MPI_INT, MPI_SUM, 0, comm );

  return errCount;
}

int HPCC_Stream(HPCC_Params *params, int doIO) {
  register int j;
  double scalar;

  VectorSize = HPCC_LocalVectorSize( params, 3, sizeof(double), 0 );

  a = HPCC_XMALLOC( double, VectorSize );
  b = HPCC_XMALLOC( double, VectorSize );
  c = HPCC_XMALLOC( double, VectorSize );

  if (!a || !b || !c) {
    if (c) HPCC_free(c);
    if (b) HPCC_free(b);
    if (a) HPCC_free(a);
    if (doIO) {
      fprintf( outFile, "Failed to allocate memory (%d).\n", VectorSize );
      fclose( outFile );
    }
    return 1;
  }

#ifdef _OPENMP
#pragma omp parallel for
#endif
  for (j=0; j<VectorSize; j++)