Community Atmosphere Model (CAM) Performance
Scaling
Scaling plots (figures not reproduced here): IBM SP, Compaq-Alpha cluster, SGI-Origin 3000.

Performance on all machines
| Platform | MPI tasks | Open-MP threads per task | Wall-clock time (sec per simulated day) |
|---|---|---|---|
| IBM-SP (Hybrid) | 2 | 4 | 196.6 |
| IBM-SP (Hybrid) | 4 | 4 | 111.5 |
| IBM-SP (Hybrid) | 8 | 4 | 64.9 |
| IBM-SP (Hybrid) | 16 | 4 | 39.8 |
| IBM-SP (Pure MPI) | 2 | 1 | 594.5 |
| IBM-SP (Pure MPI) | 4 | 1 | 337.0 |
| IBM-SP (Pure MPI) | 8 | 1 | 187.8 |
| IBM-SP (Pure MPI) | 16 | 1 | 108.9 |
| IBM-SP (Pure MPI) | 32 | 1 | 67.7 |
| IBM-SP (Pure MPI) | 64 | 1 | 48.9 |
| SGI-O3 (Pure Open-MP) | 1 | 16 | 85.3 |
| SGI-O3 (Pure Open-MP) | 1 | 32 | 57.2 |
| SGI-O3 (Pure Open-MP) | 1 | 64 | 49.9 |
| SGI-O3 (Pure MPI) | 16 | 1 | 80.8 |
| SGI-O3 (Pure MPI) | 32 | 1 | 47.9 |
| SGI-O3 (Pure MPI) | 64 | 1 | 31.0 |
| Compaq-Alpha (Hybrid) | 2 | 4 | 112.8 |
| Compaq-Alpha (Hybrid) | 4 | 4 | 64.7 |
| Compaq-Alpha (Hybrid) | 8 | 4 | 37.4 |
| Compaq-Alpha (Pure MPI) | 4 | 1 | 181.3 |
| Compaq-Alpha (Pure MPI) | 8 | 1 | 107.5 |
| Compaq-Alpha (Pure MPI) | 16 | 1 | 62.3 |
| Compaq-Alpha (Pure MPI) | 32 | 1 | 38.8 |
| Linux-Lahey (Pure MPI) | 2 | 1 | 714.4 |
| Linux-PGI (Pure Open-MP) | 1 | 2 | 559.3 |
| Solaris (Pure MPI) | 2 | 1 | 1473.6 |
All tests were run with CAM2.0.dev12 in August 2002.
The timing metric is the wall-clock time spent in "stepon" during a standard 10-day T42 Eulerian-dynamics simulation, as reported by the standard timing-library diagnostics distributed with CAM2.
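Seconds per simulated day can be hard to compare with the throughput figures often quoted for climate models. The sketch below converts a few timings from the table into simulated years per wall-clock day. The function name and the 365-day model year are assumptions made for illustration; they are not part of the CAM2 timing library.

```python
SECONDS_PER_DAY = 86_400
DAYS_PER_YEAR = 365  # assumption: a 365-day model year


def simulated_years_per_day(sec_per_sim_day: float) -> float:
    """Convert wall-clock seconds per simulated day into
    simulated years per wall-clock day."""
    sim_days_per_wall_day = SECONDS_PER_DAY / sec_per_sim_day
    return sim_days_per_wall_day / DAYS_PER_YEAR


# A few timings copied from the table (seconds per simulated day).
samples = {
    "IBM-SP hybrid, 16 tasks x 4 threads": 39.8,
    "SGI-O3 pure MPI, 64 tasks": 31.0,
    "Solaris pure MPI, 2 tasks": 1473.6,
}

for label, seconds in samples.items():
    rate = simulated_years_per_day(seconds)
    print(f"{label}: {rate:.2f} simulated years per wall-clock day")
```

This covers only the time measured in "stepon", so whole-job throughput including initialization and I/O would be somewhat lower.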
Tests with one thread were run with Open-MP turned off. Likewise, tests with one task were run with SPMD (MPI) turned off.
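The table can also be read in terms of parallel speedup and efficiency. The sketch below does this for the IBM-SP pure-MPI rows, taking the smallest run (2 tasks) as the baseline because no single-task timing is listed. The function name and the choice of baseline are assumptions made for illustration.

```python
# IBM-SP pure-MPI timings from the table: tasks -> seconds per simulated day.
ibm_sp_pure_mpi = {2: 594.5, 4: 337.0, 8: 187.8, 16: 108.9, 32: 67.7, 64: 48.9}


def scaling(timings):
    """Return (cpus, speedup, efficiency) rows relative to the smallest run.

    speedup    = baseline time / time at this CPU count
    efficiency = speedup / (cpus / baseline cpus), so 1.0 is ideal scaling
    """
    base_cpus = min(timings)
    base_time = timings[base_cpus]
    rows = []
    for cpus in sorted(timings):
        speedup = base_time / timings[cpus]
        efficiency = speedup / (cpus / base_cpus)
        rows.append((cpus, speedup, efficiency))
    return rows


rows = scaling(ibm_sp_pure_mpi)
for cpus, speedup, efficiency in rows:
    print(f"{cpus:3d} tasks: speedup {speedup:5.2f}, efficiency {efficiency:4.0%}")
```

The same function works for any other platform in the table if its rows are entered in the same tasks-to-seconds form.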
- Compaq-Alpha tests were run on colt.ornl.gov at Oak Ridge National Lab (OSF1 V5.1 732, f90 X5.4A-1684; 16 nodes, each with four 667-MHz Alpha EV67 processors and 2 GB of RAM)
- IBM-SP tests were run on blackforest.ucar.edu (AIX 4.3.3.79, F90 7.1.1.2; 293 WinterHawk II nodes, each with four 64-bit 375-MHz POWER3 CPUs and 2 GB of memory)
- Linux-Lahey tests were run on apache.cgd.ucar.edu with the Lahey F95 compiler (Lahey lf95 L6.10a; 2-CPU Intel Pentium-III 1133 MHz, 770 MB of RAM)
- Linux-PGI tests were run on apache.cgd.ucar.edu with the Portland Group F90 compiler (pgf90 3.3-2; 2-CPU Intel Pentium-III 1133 MHz, 770 MB of RAM)
- SGI-O3 tests were run on chinook.ucar.edu (MIPSpro 7.30; 128 500-MHz R14000 CPUs, 64 GB of distributed shared memory)
- Solaris tests were run on sanitas.cgd.ucar.edu (Sun WorkShop 6 update 2, Solaris 5.6; 2-CPU Ultra-2 with 296-MHz processors and 769 MB of RAM)
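Because the IBM-SP and Compaq-Alpha nodes each have four CPUs, a hybrid run with N tasks and 4 threads uses the same number of CPUs as a pure-MPI run with 4N tasks. The sketch below pairs the hybrid and pure-MPI rows of the table by total CPU count (tasks times threads). The data layout and function name are assumptions made for illustration. In these numbers pure MPI is slightly faster at 8 and 16 CPUs, and the hybrid runs are faster at 32 CPUs and above.

```python
# Rows copied from the table: (platform, mode, tasks, threads, sec per simulated day).
runs = [
    ("IBM-SP", "Hybrid", 2, 4, 196.6),
    ("IBM-SP", "Hybrid", 4, 4, 111.5),
    ("IBM-SP", "Hybrid", 8, 4, 64.9),
    ("IBM-SP", "Hybrid", 16, 4, 39.8),
    ("IBM-SP", "Pure MPI", 8, 1, 187.8),
    ("IBM-SP", "Pure MPI", 16, 1, 108.9),
    ("IBM-SP", "Pure MPI", 32, 1, 67.7),
    ("IBM-SP", "Pure MPI", 64, 1, 48.9),
    ("Compaq-Alpha", "Hybrid", 2, 4, 112.8),
    ("Compaq-Alpha", "Hybrid", 4, 4, 64.7),
    ("Compaq-Alpha", "Hybrid", 8, 4, 37.4),
    ("Compaq-Alpha", "Pure MPI", 8, 1, 107.5),
    ("Compaq-Alpha", "Pure MPI", 16, 1, 62.3),
    ("Compaq-Alpha", "Pure MPI", 32, 1, 38.8),
]


def compare_hybrid_to_mpi(runs):
    """Pair hybrid and pure-MPI runs that use the same total CPU count.

    Returns (platform, cpus, hybrid_sec, mpi_sec, ratio) tuples, where
    ratio = mpi_sec / hybrid_sec; a ratio above 1 means hybrid was faster.
    """
    by_key = {}
    for platform, mode, tasks, threads, seconds in runs:
        by_key[(platform, mode, tasks * threads)] = seconds

    results = []
    for (platform, mode, cpus), hybrid_sec in sorted(by_key.items()):
        if mode != "Hybrid":
            continue
        mpi_sec = by_key.get((platform, "Pure MPI", cpus))
        if mpi_sec is not None:
            results.append((platform, cpus, hybrid_sec, mpi_sec, mpi_sec / hybrid_sec))
    return results


comparison = compare_hybrid_to_mpi(runs)
for platform, cpus, hybrid_sec, mpi_sec, ratio in comparison:
    print(f"{platform:12s} {cpus:3d} CPUs: hybrid {hybrid_sec:6.1f} s, "
          f"pure MPI {mpi_sec:6.1f} s, MPI/hybrid = {ratio:.2f}")
```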