Performance measurements are never easy to perform or analyze with precision, as they tend to vary greatly depending on configuration and run-time controls. In most cases, some knowledge of the code base is required to achieve consistent and accurate measurements. This is particularly true of Open MPI, as the range of control is very large and the code is shipped in a generic state to ensure that it works out of the box across a wide range of platforms and environments.
Unfortunately, the various benchmark codes are rather easy to run, which can lead to erroneous results when the user is unfamiliar with, or does not take the time to determine, the optimal configuration and control settings. This section is intended to:
As always, users are reminded that performance benchmarks rarely provide an accurate predictor of actual application performance. They are simply a sometimes useful way of measuring the relative behavior of a specific feature that may or may not be relevant to your application. Accordingly, the OMPI developers don't exert a lot of effort optimizing benchmark performance, preferring instead to focus on providing features of interest to users and researchers, while maintaining good application performance.
Please feel free to contact the community on the mailing lists with questions regarding tuning your cluster. Performance benchmark contributions are welcome.

Developer Trunk (v1.9)