WoTUG - The place for concurrent processes

Paper Details


%T Configurable Collective Communication in LAM-MPI
%A John Markus Bjørndalen, Otto J. Anshus, Tore Aarsen, Brian Vinter
%E James S. Pascoe, Roger J. Loader, Vaidy S. Sunderam
%B Communicating Process Architectures 2002
%X In another paper, we observed that PastSet (our experimental
   tuple space system) was 1.83 times faster on global
   reductions than LAM-MPI. Our hypothesis was that this was
   due to the better resource usage of the PATHS framework (an
   extension to PastSet that supports orchestration and
   configuration), which maps communication and operations
   more closely onto the computing resources and cluster
   topology. This paper reports on an experiment to verify
   this hypothesis and represents ongoing work to add some of
   the configurability of PastSet and PATHS to MPI. We show
   that by adding run-time configurable collective
   communication, we can reduce the latencies without
   recompiling the application source code. For the same
   cluster where we experienced the faster PastSet, we show
   that Allreduce with our configuration mechanism is 1.79
   times faster than the original LAM-MPI Allreduce. We also
   experiment with the configuration mechanism on 3 different
   cluster platforms with 2-, 4-, and 8-way nodes. For the
   cluster of 8-way nodes, we show an improvement by a factor
   of 1.98 for Allreduce.



Copyright for the papers presented in this database normally resides with the authors; please contact them directly for more information. Addresses are normally presented in the full paper.

Pages © WoTUG, or the indicated author. All Rights Reserved.
