ACH & TSVV meeting on the ERO2.0 code (ACH-CIEMAT)

Europe/Berlin
Zoom

Mervi Mantsinen
    • 1
      ERO2.0 Performance assessment
      Speakers: Joan Vinyals (BSC), Marta Garcia (BSC)

      Issues due to different lengths (times) of particle trajectories:

      1. OpenMP imbalance (between threads/CPUs within one MPI process)

      2. MPI imbalance (between MPI processes, all threads/CPUs of different MPI processes affected)

      Both types of imbalance become very pronounced for rare, extremely long particle trajectories.
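
      The effect of a single long trajectory can be illustrated with a toy model. This is a minimal sketch, not ERO2.0 code: the chunk contents, the time units, and the round-robin assignment of particles to threads are assumptions made only for illustration.

```python
def static_loads(durations, n_threads):
    """Round-robin (static) assignment: particle i goes to thread i % n_threads."""
    loads = [0.0] * n_threads
    for i, duration in enumerate(durations):
        loads[i % n_threads] += duration
    return loads


def balance_efficiency(loads):
    """Mean busy time divided by the longest busy time (1.0 = perfectly balanced)."""
    return sum(loads) / len(loads) / max(loads)


# Hypothetical chunks of 48 particles, in arbitrary time units.
uniform_chunk = [1.0] * 48              # all trajectories equally short
skewed_chunk = [100.0] + [1.0] * 47     # one rare, extremely long trajectory

eff_uniform = balance_efficiency(static_loads(uniform_chunk, 12))
eff_skewed = balance_efficiency(static_loads(skewed_chunk, 12))
print(f"uniform: {eff_uniform:.2f}, with one long trajectory: {eff_skewed:.2f}")
```

      In this model, 11 of the 12 threads are idle for most of the time while one thread finishes the long trajectory. The same picture applies one level up, with MPI processes in place of threads.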

      Possible solutions (BSC team):

      1. Reduce the cut-off time for long trajectories (a reasonable value would be 0.1 s, compared with the 1.0 s used so far)

      2. Check the imbalance at much higher particle statistics: 500,000 particles in total, maxMpiChunkSize of ~250 particles, 64 nodes, 4 MPI tasks per node, 12 OpenMP threads per MPI task

      3. For MPI imbalance: use the Dynamic Load Balancing (DLB) library to give the job of one MPI task to other, unoccupied MPI tasks

      4. For OpenMP imbalance: advance the communication so that each thread starts new particles without waiting for the other threads
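
      How options 1 and 4 interact can be sketched with a toy model. The following is an illustration under stated assumptions, not the ERO2.0 or DLB implementation: the durations are hypothetical, "no waiting" is modelled as handing each particle to the thread that becomes free first, and the cut-off is modelled as a simple truncation of the trajectory time.

```python
import heapq


def dynamic_makespan(durations, n_threads):
    """Wall time when each particle goes to the thread that becomes free first."""
    free_at = [0.0] * n_threads
    heapq.heapify(free_at)
    for duration in durations:
        start = heapq.heappop(free_at)          # earliest available thread
        heapq.heappush(free_at, start + duration)
    return max(free_at)


def apply_cutoff(durations, cutoff):
    """Truncate every trajectory at the cut-off time."""
    return [min(duration, cutoff) for duration in durations]


# Hypothetical chunk: one long trajectory (100 units) and 47 short ones (1 unit).
chunk = [100.0] + [1.0] * 47
n_threads = 12

t_dynamic = dynamic_makespan(chunk, n_threads)
t_dynamic_cut = dynamic_makespan(apply_cutoff(chunk, 10.0), n_threads)
ideal = sum(chunk) / n_threads                  # perfect balance, no cut-off
print(f"dynamic: {t_dynamic}, dynamic + cut-off: {t_dynamic_cut}, ideal: {ideal}")
```

      In this model, starting new particles without waiting cannot push the wall time below the length of the longest single trajectory, so a shorter cut-off and a less synchronous scheduling complement each other.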

      Issues with serial parts (communication with the master process):

      Under test conditions (40,000 particles, maxMpiChunkSize=50, 32 nodes), serial communication takes ~40% of the total time. This fraction can become even larger once the issue with long trajectories is solved!
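
      The growth of the serial fraction follows from Amdahl-type bookkeeping. A minimal sketch: only the ~40% figure is taken from the measurement above, while the gain factors of the parallel part are hypothetical and the serial time is assumed to stay unchanged.

```python
def max_speedup(serial_fraction):
    """Upper bound on the overall speedup if only the non-serial part gets faster."""
    return 1.0 / serial_fraction


def serial_fraction_after(serial_fraction, parallel_gain):
    """Serial share of the run time after the parallel part is sped up by parallel_gain."""
    parallel_time = (1.0 - serial_fraction) / parallel_gain
    return serial_fraction / (serial_fraction + parallel_time)


measured = 0.40                      # ~40% serial communication (test conditions)
for gain in (1.0, 2.0, 4.0):         # hypothetical gains from better load balance
    share = serial_fraction_after(measured, gain)
    print(f"parallel part {gain:.0f}x faster -> serial share {share:.0%}")
print(f"overall speedup bound: {max_speedup(measured):.1f}x")
```

      With the serial part left as it is, even a perfectly balanced particle loop could not speed up the whole run by more than a factor of 2.5 in this estimate.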

      Particular code parts involved:
      SparseData2D/3D (.h)
      ero2::DensityManager::gatherParticleDensities2D/3D
      ero2::DensityManager::gatherEmissionDensities2D/3D

      To do:
      ERO team: assess the possibility of some parallelization of the serial communication part
      BSC team: compare the serial communication for the cases when no volumetric data has to be collected
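
      One option the ERO team could assess is a pairwise (tree-like) merge of the sparse density contributions instead of a one-by-one accumulation on the master. The sketch below only illustrates the pattern: the dictionary representation {cell index: value} and all function names are hypothetical and do not reflect SparseData2D/3D or the DensityManager interface. In MPI, this pattern would correspond to a collective reduction (e.g. MPI_Reduce with a user-defined operation); whether that fits the existing data structures is not established here.

```python
def merge_sparse(a, b):
    """Sum two sparse density maps stored as {cell_index: value}."""
    merged = dict(a)
    for cell, value in b.items():
        merged[cell] = merged.get(cell, 0.0) + value
    return merged


def gather_sequential(parts):
    """Master merges the contributions one by one; returns (total, serial merge steps)."""
    total = {}
    for part in parts:
        total = merge_sparse(total, part)
    return total, len(parts)


def gather_tree(parts):
    """Neighbouring ranks merge pairwise; returns (total, number of rounds)."""
    level = list(parts)
    rounds = 0
    while len(level) > 1:
        merged = [merge_sparse(level[i], level[i + 1])
                  for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:               # an unpaired rank moves up unchanged
            merged.append(level[-1])
        level = merged
        rounds += 1
    return (level[0] if level else {}), rounds


# Hypothetical contributions of 8 MPI ranks on a 2D grid.
parts = [{(rank % 3, 0): 1.0, (rank, 1): 0.5} for rank in range(8)]
total_seq, steps_seq = gather_sequential(parts)
total_tree, rounds_tree = gather_tree(parts)
print(f"sequential steps on master: {steps_seq}, tree rounds: {rounds_tree}")
```

      Both variants produce the same totals here; the difference is that the merges within one tree round are independent of each other and could run concurrently on different ranks.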

      Further considerations regarding the serial part:
      - Can parallelization via domain decomposition help to parallelize the serial communication?
      - How much will it complicate the post-processing / visualization of results?