Communicating Process Architectures
Communicating Process Architectures 2014,
the 36th. WoTUG conference on concurrent and parallel systems, takes place from
Sunday August 24th to Wednesday August 27th 2014 and is hosted by the
Department of Computer Science, University of Oxford.
Accommodation and evening Fringe sessions will be at
St. Anne's College,
a few minutes walk from the Department.
WoTUG provides a forum for the discussion and promotion of concurrency ideas,
tools and products in computer science.
It organises specialist workshops and annual conferences that address
key concurrency issues at all levels of software and hardware granularity.
WoTUG aims to progress the leading state of the art in:
and to stimulate discussion and ideas on the roles concurrency will play in the future:
theory (programming models, process algebra, semantics, ...);
practice (multicore processors and run-times, clusters, clouds, libraries, languages, verification, model checking, ...);
education (at school, undergraduate and postgraduate levels, ...);
applications (complex systems, modelling, supercomputing, embedded systems, robotics, games, e-commerce, ...);
Of course, neither of the above sets of bullets are exclusive.
for the next generation of scalable computer infrastructure (hard and soft) and application,
where scaling means the ability to ramp up functionality (stay in control as complexity increases)
as well as physical metrics (such as absolute performance and response times);
for system integrity (dependability, security, safety, liveness, ...);
for making things simple.
A database of papers and presentations from WoTUG conferences is here.
The Abstract below has been randomly selected from this database.
Configurable Collective Communication in LAM-MPI
In another paper, we observed that PastSet (our experimental tuple space system) was 1.83 times faster on global reductions than LAM-MPI. Our hypothesis was that this was due to the better resource usage of the PATHS framework (an extension to PastSet that supports orchestration and configuration) due to a mapping of the communication and operations which matched the computing resources and cluster topology better. This paper reports on an experiment to verify this and represents on-going work to add some of the same configurability of PastSet and PATHS to MPI. We show that by adding run-time configurable collective communication, we can reduce the latencies without recompiling the application source code. For the same cluster where we experienced the faster PastSet, we show that Allreduce with our configuration mechanism is 1.79 times faster than the original LAM-MPI Allreduce. We also experiment with the configuration mechanism on 3 different cluster platforms with 2-, 4-, and 8-way nodes. For the cluster of 8-way nodes, we show an improvement by a factor of 1.98 for Allreduce.