Welcome to CCSM


Community Atmosphere Model (CAM) Performance

Scaling

IBM SP

Compaq-Alpha cluster

SGI-Origin 3000

Performance on all machines

Platform                  Tasks  Threads  Wallclock time (sec/simulation-day)
IBM-SP (Hybrid)               2        4    196.6
IBM-SP (Hybrid)               4        4    111.5
IBM-SP (Hybrid)               8        4     64.9
IBM-SP (Hybrid)              16        4     39.8
IBM-SP (Pure MPI)             2        1    594.5
IBM-SP (Pure MPI)             4        1    337.0
IBM-SP (Pure MPI)             8        1    187.8
IBM-SP (Pure MPI)            16        1    108.9
IBM-SP (Pure MPI)            32        1     67.7
IBM-SP (Pure MPI)            64        1     48.9
SGI-O3 (Pure Open-MP)         1       16     85.3
SGI-O3 (Pure Open-MP)         1       32     57.2
SGI-O3 (Pure Open-MP)         1       64     49.9
SGI-O3 (Pure MPI)            16        1     80.8
SGI-O3 (Pure MPI)            32        1     47.9
SGI-O3 (Pure MPI)            64        1     31.0
Compaq-Alpha (Hybrid)         2        4    112.8
Compaq-Alpha (Hybrid)         4        4     64.7
Compaq-Alpha (Hybrid)         8        4     37.4
Compaq-Alpha (Pure MPI)       4        1    181.3
Compaq-Alpha (Pure MPI)       8        1    107.5
Compaq-Alpha (Pure MPI)      16        1     62.3
Compaq-Alpha (Pure MPI)      32        1     38.8
Linux-Lahey (Pure MPI)        2        1    714.4
Linux-PGI (Pure Open-MP)      1        2    559.3
Solaris (Pure MPI)            2        1   1473.6
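One way to read the scaling behaviour out of the table is to compute speedup and parallel efficiency relative to the smallest run of a given configuration. The following is a minimal sketch, not part of CAM or its timing library; the function name `relative_scaling` is made up for illustration, and the timings are the IBM-SP (Pure MPI) rows copied from the table.

```python
# Wall-clock seconds per simulated day for IBM-SP (Pure MPI), keyed by MPI task count.
# Values are copied from the table; nothing here is measured by this script.
ibm_sp_pure_mpi = {2: 594.5, 4: 337.0, 8: 187.8, 16: 108.9, 32: 67.7, 64: 48.9}


def relative_scaling(timings):
    """Return {procs: (speedup, efficiency)} relative to the smallest processor count.

    speedup    = time at the smallest count / time at this count
    efficiency = speedup / (this count / smallest count)
    """
    base_procs = min(timings)
    base_time = timings[base_procs]
    result = {}
    for procs in sorted(timings):
        speedup = base_time / timings[procs]
        efficiency = speedup / (procs / base_procs)
        result[procs] = (speedup, efficiency)
    return result


for procs, (speedup, eff) in relative_scaling(ibm_sp_pure_mpi).items():
    print(f"{procs:3d} tasks: speedup {speedup:5.2f}x, efficiency {eff:6.1%}")
```

Because the baseline is the 2-task run rather than a serial run, the efficiencies are relative figures and should only be compared within the same platform and configuration.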

All tests were performed with CAM2.0.dev12 in August 2002.

The timing metric is the wall-clock time spent in "stepon" during a standard 10-day T42 Eulerian-dynamics simulation, as reported by the standard timing library diagnostic that comes with CAM2. The table expresses it in seconds per simulated day.
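Seconds per simulated day can be converted into a throughput figure such as simulated years per wall-clock day. The sketch below assumes a 365-day model year (an assumption, not stated on this page), and the helper name `simulated_years_per_day` is hypothetical. Since the metric covers only "stepon", the result is an estimate that ignores any time spent outside it.

```python
SECONDS_PER_WALLCLOCK_DAY = 86_400
DAYS_PER_MODEL_YEAR = 365  # assumption: 365-day model calendar


def simulated_years_per_day(seconds_per_sim_day):
    """Convert wall-clock seconds per simulated day to simulated years per wall-clock day."""
    if seconds_per_sim_day <= 0:
        raise ValueError("seconds_per_sim_day must be positive")
    sim_days_per_day = SECONDS_PER_WALLCLOCK_DAY / seconds_per_sim_day
    return sim_days_per_day / DAYS_PER_MODEL_YEAR


# Fastest and slowest entries from the table.
for label, secs in [("SGI-O3 (Pure MPI), 64 tasks", 31.0),
                    ("Solaris (Pure MPI), 2 tasks", 1473.6)]:
    print(f"{label}: {simulated_years_per_day(secs):.2f} simulated years per day")
```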

Tests with one thread were run with Open-MP turned off. Likewise, tests with one task were run with SPMD (MPI) turned off.
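To compare hybrid and pure-MPI runs on an equal footing, the rows can be matched by total processor count. The sketch below assumes that total processors equals tasks times threads, with each thread on its own CPU; that is an assumption for illustration and is not stated on this page. The function names are hypothetical, and the data are the IBM-SP rows from the table.

```python
# (tasks, threads, seconds per simulated day) for the IBM-SP rows of the table.
hybrid = [(2, 4, 196.6), (4, 4, 111.5), (8, 4, 64.9), (16, 4, 39.8)]
pure_mpi = [(2, 1, 594.5), (4, 1, 337.0), (8, 1, 187.8),
            (16, 1, 108.9), (32, 1, 67.7), (64, 1, 48.9)]


def by_processors(rows):
    """Map total processors (tasks * threads, an assumption) to seconds per simulated day."""
    return {tasks * threads: secs for tasks, threads, secs in rows}


def compare(hybrid_rows, mpi_rows):
    """Return {processors: mpi_time / hybrid_time} for counts present in both sets."""
    h, m = by_processors(hybrid_rows), by_processors(mpi_rows)
    return {p: m[p] / h[p] for p in sorted(h.keys() & m.keys())}


for procs, ratio in compare(hybrid, pure_mpi).items():
    print(f"{procs:3d} processors: pure MPI time / hybrid time = {ratio:.2f}")
```

A ratio above 1 means the hybrid run was faster at that processor count; a ratio below 1 means pure MPI was faster.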

  • Compaq-Alpha results were obtained on colt.ornl.gov at Oak Ridge National Lab (OSF1 V5.1 732, f90 X5.4A-1684, 16 nodes with four 667-MHz Alpha EV67 processors and 2 GB RAM per node).
  • IBM-SP results were obtained on blackforest.ucar.edu (AIX 4.3.3.79, F90 7.1.1.2, 293 WinterHawk II nodes with four 64-bit 375-MHz POWER3 CPUs and 2 GB of memory per node).
  • Linux-Lahey results were obtained on apache.cgd.ucar.edu with the Lahey F95 compiler (Lahey lf95 L6.10a, 2-CPU Intel Pentium-III 1133 MHz, 770 MB RAM).
  • Linux-PGI results were obtained on apache.cgd.ucar.edu with the Portland Group F90 compiler (pgf90 3.3-2, 2-CPU Intel Pentium-III 1133 MHz, 770 MB RAM).
  • SGI-O3 results were obtained on chinook.ucar.edu (MIPSpro 7.30, 128 500-MHz R14000 CPUs, 64 GB distributed shared memory).
  • Solaris results were obtained on sanitas.cgd.ucar.edu (Sun WorkShop 6 update 2, Solaris 5.6, 2-CPU Ultra-2, 296-MHz processors, 769 MB RAM).