From: Jean-Marc Adamo <adamo@cpe.fr>
Newsgroups: comp.parallel.mpi
Subject: Multi-threaded, Object-Oriented MPI-Based Message Passing
    Interface: The ARCH Library
Date: Tue, 06 Oct 1998 12:57:18 +0200
Organization: CPE Lyon
Message-Id: <3619F78E.5E559581@cpe.fr>
Mime-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: 8bit
Cc: adamo@cpe.fr


For Immediate Release
October, 1998



New MPI-based tools for Parallel Software Development:
The ARCH library.



1. Motivation
   ----------
Concurrency is essential in a variety of applications. In some 
parallel/distributed applications, processes naturally decompose 
into sets of concurrent activities. Other applications rely on 
sophisticated distributed algorithms (e.g. dynamic load-balancing, 
distributed completion checking) that are typically implemented 
with multi-threaded programming.

ARCH is a library of tools designed and developed on top of MPI 
for writing multi-threaded code within MPI processes. ARCH 
threads are lightweight processes defined and run inside MPI 
processes as asynchronous event-driven activities.

ARCH is written in C++, which was used as more than a development 
language: the library is meant to convey an object-oriented style 
of program development. It offers eight sets of classes, organized 
around the central notion of threads and dedicated respectively to 
(see further details in section 2):

	- Threading,
	- Synchronous point-to-point communication,
	- Asynchronous point-to-point communication,
	- Distributed data and global access functions,
	- Global pointers and spread arrays,
	- Thread-compatible process group definition,
	- Thread-compatible collective communication functions,
	- Thread-compatible Parallel I/Os.

The library package, as well as application codes in several 
fields of interest (global completion detection, distributed 
process serialization, parallel combinatorial optimization, 
image segmentation), can be found on the ARCH web pages: 
http://www.cpe.fr/~arch.

ARCH has so far been installed and run on diverse platforms, 
including the Cray T3E, the IBM SP2, and PC networks under Linux 
or Windows NT.

Details on the library and application code design can be found
in the book:

"Multi-threaded, Object-Oriented MPI-Based Message Passing 
Interface: The ARCH Library", by Jean-Marc Adamo, 
Kluwer Academic Publishers, 1998.



2. For those who wish further details
   ----------------------------------
Here are details about the ARCH design philosophy.

	- Threading

ARCH offers two thread classes: Thread and S_Thread. Thread objects 
are co-routines. The Thread class has member functions for 
co-routine construction, setup, scheduling, suspension, yielding 
and destruction. S_Thread objects are co-routines with restricted 
behavior, aimed at writing structured concurrent programs. By 
using S_Thread objects exclusively, in conjunction with the 
synchronous point-to-point communication functions (see below), 
concurrent programs can be written in a structured style similar 
to OCCAM's. For illustration, refer to the book mentioned above. 
It presents several applications in which the process logic is 
laid out in a way similar to an electronic board design.
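
As a rough illustration of co-routine scheduling and yielding, here 
is a minimal sketch in plain C++. It does not use the ARCH classes: 
CoTask, RoundRobinScheduler and make_counting_task are hypothetical 
names, and a "yield" is modelled simply as a step function that 
returns before the task has finished.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <string>
#include <utility>

// Hypothetical names throughout; this is NOT the ARCH Thread class.
// A task is resumed by calling step(). Returning false means "yield,
// resume me later"; returning true means "finished".
struct CoTask {
    std::string name;
    std::function<bool()> step;
};

class RoundRobinScheduler {
public:
    void schedule(CoTask task) {
        assert(task.step && "a task needs a step function");
        ready_.push_back(std::move(task));
    }

    // Resumes ready tasks in FIFO order until all have finished.
    // Returns the order in which tasks were resumed, one name per turn.
    std::string run() {
        std::string trace;
        while (!ready_.empty()) {
            CoTask task = std::move(ready_.front());
            ready_.pop_front();
            trace += task.name;
            const bool finished = task.step();
            if (!finished) ready_.push_back(std::move(task));  // yielded
        }
        return trace;
    }

private:
    std::deque<CoTask> ready_;
};

// Builds a task that finishes on its `turns`-th resumption.
CoTask make_counting_task(std::string name, int turns) {
    int remaining = turns;
    return CoTask{std::move(name),
                  [remaining]() mutable { return --remaining <= 0; }};
}

// Three tasks needing 2, 1 and 3 turns respectively.
std::string demo_trace() {
    RoundRobinScheduler scheduler;
    scheduler.schedule(make_counting_task("A", 2));
    scheduler.schedule(make_counting_task("B", 1));
    scheduler.schedule(make_counting_task("C", 3));
    return scheduler.run();
}
```

Calling demo_trace() yields the string "ABCACC": each task gets one 
turn per pass through the ready queue and drops out once finished.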

	- Synchronous point-to-point communication

The synchronous communication mode relies on specific data objects, 
called synchronous channels. The synchronous channels are data 
structures used by the library to monitor synchronous communication. 
A thread in an MPI process can synchronize and communicate via a 
synchronous channel with another thread run by the same or another 
process. Synchronous message passing is based on a rendezvous 
mechanism. The synchronous communication functions generate events 
that are monitored by the thread system, which automatically takes 
care of thread suspension and scheduling.
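
The rendezvous idea can be sketched in a single process as follows. 
This is an assumption-laden toy, not the ARCH channel classes: 
IntSyncChannel and ReadyQueue are hypothetical names, suspended 
threads are modelled as stored continuations, and the channel 
connects exactly one sender to one receiver.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <string>
#include <utility>

// Hypothetical sketch, NOT the ARCH API. The ready queue stands in for
// the thread system's scheduling queue.
using ReadyQueue = std::deque<std::function<void()>>;

class IntSyncChannel {
public:
    explicit IntSyncChannel(ReadyQueue& ready) : ready_(ready) {}

    // The sender is suspended: `resume` is scheduled only once a
    // receiver has arrived and taken the value.
    void send(int value, std::function<void()> resume) {
        assert(!sender_waiting_ && "one sender at a time");
        sender_waiting_ = true;
        value_ = value;
        sender_resume_ = std::move(resume);
        match();
    }

    // The receiver is suspended until a sender has arrived.
    void recv(std::function<void(int)> resume) {
        assert(!receiver_waiting_ && "one receiver at a time");
        receiver_waiting_ = true;
        receiver_resume_ = std::move(resume);
        match();
    }

private:
    // When both parties are present, transfer the value and put both
    // continuations back on the ready queue.
    void match() {
        if (!(sender_waiting_ && receiver_waiting_)) return;
        sender_waiting_ = false;
        receiver_waiting_ = false;
        const int v = value_;
        std::function<void(int)> r = std::move(receiver_resume_);
        ready_.push_back([r, v] { r(v); });
        ready_.push_back(std::move(sender_resume_));
    }

    ReadyQueue& ready_;
    bool sender_waiting_ = false;
    bool receiver_waiting_ = false;
    int value_ = 0;
    std::function<void()> sender_resume_;
    std::function<void(int)> receiver_resume_;
};

std::string demo_rendezvous() {
    ReadyQueue ready;
    IntSyncChannel ch(ready);
    std::string log;
    ch.send(42, [&log] { log += "sender resumed;"; });
    log += "sender waiting;";  // nothing has run yet: no receiver
    ch.recv([&log](int v) { log += "received " + std::to_string(v) + ";"; });
    while (!ready.empty()) {   // stand-in for the scheduler loop
        std::function<void()> next = std::move(ready.front());
        ready.pop_front();
        next();
    }
    return log;
}
```

In this sketch neither side proceeds until the other has arrived, 
which is the defining property of a rendezvous.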

	- Asynchronous point-to-point communication

The asynchronous communication mode relies on asynchronous channels. 
The asynchronous channel classes have been designed according to the 
same philosophy as their synchronous counterparts. They simply 
differ in the semantics of the communication functions they supply. 
The library provides two communication functions, a send and a 
receive. Their semantics are similar to those of the MPI 
non-blocking point-to-point communication functions. However, the 
completion of a call is handled differently in ARCH. The thread is 
not expected to poll to test for completion of a send or receive; 
the thread system takes care of it implicitly and automatically. 
At channel construction, the thread defines what should be done 
upon send and receive completion by passing completion handlers to 
the channel. The completion events are then automatically (and 
asynchronously) caught by the thread system, which silently picks 
the right completion handler and executes it.
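
The handler-at-construction pattern might look like the sketch 
below. It is not the ARCH channel interface: AsyncChannel and its 
member names are hypothetical, both ends of the channel are folded 
into one in-process object for brevity, and progress() stands in 
for the work the thread system would do behind the scenes.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch, NOT the ARCH API.
class AsyncChannel {
public:
    using SendHandler = std::function<void()>;
    using RecvHandler = std::function<void(const std::string&)>;

    // Completion handlers are fixed when the channel is constructed.
    AsyncChannel(SendHandler on_sent, RecvHandler on_received)
        : on_sent_(std::move(on_sent)),
          on_received_(std::move(on_received)) {
        assert(on_sent_ && on_received_ && "both handlers are required");
    }

    // Both calls return immediately; nothing is completed here.
    void send(std::string msg) { in_flight_.push_back(std::move(msg)); }
    void recv() { ++posted_recvs_; }

    // Stand-in for the thread system: matches queued sends with posted
    // receives and runs the handlers. Returns the number of transfers
    // completed during this call.
    int progress() {
        int completed = 0;
        while (posted_recvs_ > 0 && !in_flight_.empty()) {
            std::string msg = std::move(in_flight_.front());
            in_flight_.pop_front();
            --posted_recvs_;
            on_sent_();
            on_received_(msg);
            ++completed;
        }
        return completed;
    }

private:
    SendHandler on_sent_;
    RecvHandler on_received_;
    std::deque<std::string> in_flight_;
    int posted_recvs_ = 0;
};

std::vector<std::string> demo_async() {
    std::vector<std::string> log;
    AsyncChannel ch(
        [&log] { log.push_back("send done"); },
        [&log](const std::string& m) { log.push_back("got " + m); });
    ch.send("hello");
    ch.recv();
    log.push_back("caller continues");  // no polling by the caller
    ch.progress();
    return log;
}
```

The caller never tests for completion itself; the log entries 
"send done" and "got hello" appear only after "caller continues".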

	- The 'Global' distributed data type and global (i.e. possibly
          non-local) access functions

The library supplies a distributed data type, 'Global', dedicated 
to global data access. The 'Global' type is intended for defining 
distributed data structures whose members (called G-members) can be 
accessed remotely via global read/write functions. Every 'Global' 
has an access type, which is either direct (the local pointer to 
the G-member is statically known) or indirect (the local pointer 
is not statically known and must be computed dynamically, which is 
done via a kind of remote procedure call). The type 'Global' 
supplies two kinds of global read and write functions: blocking and 
non-blocking. A thread that calls a blocking function is suspended 
until completion of the call is detected. Meanwhile, other threads 
waiting in the scheduling queues can use the CPU. The completion of 
the call is automatically (and asynchronously) caught by the thread 
system, which silently re-schedules the thread for subsequent 
execution. A thread that calls a non-blocking function must supply 
the call with a completion handler. The call returns immediately, 
so the thread can go on executing. The completion of the call is 
automatically caught by the library, which silently (and 
asynchronously) executes the completion handler.

	- Global pointers and Spread Arrays

A global pointer is a data object that can point to any memory 
location, whether remote or local, within a communication universe. 
A global pointer is essentially dedicated to global reading and 
writing. The global pointer class is built upon the class 'Global'. 
A spread array is a data structure of any dimension that is spread 
over a process group according to a user-specified spreading rule. 
The data items in a spread array can be accessed symbolically via 
their coordinates, regardless of their location within the process 
group. The global pointer and spread array classes have the same 
relationship as regular C++ pointers and arrays. In particular, 
a spread array is also a global pointer. A global pointer can be 
dereferenced, incremented, and so forth.
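
To make the pointer/array analogy concrete, here is a small sketch 
that simulates a one-dimensional spread array inside one process. 
Everything in it is an assumption made for illustration: the names 
SpreadArray and GlobalPtr, the cyclic spreading rule (item i lives 
on process i mod P), and the use of local vectors in place of 
remote memory. It is not the ARCH implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch, NOT the ARCH classes. Each inner vector plays
// the role of one process's local memory.
class SpreadArray {
public:
    SpreadArray(std::size_t n, std::size_t nprocs)
        : n_(n), nprocs_(nprocs), parts_(nprocs) {
        assert(nprocs > 0);
        for (std::size_t i = 0; i < n; ++i) parts_[owner(i)].push_back(0.0);
    }

    // Assumed cyclic spreading rule: global index -> (owner, local index).
    std::size_t owner(std::size_t i) const { return i % nprocs_; }
    std::size_t local_index(std::size_t i) const { return i / nprocs_; }

    // Access by global coordinate, regardless of which part holds it.
    double& at(std::size_t i) {
        assert(i < n_);
        return parts_[owner(i)][local_index(i)];
    }

    std::size_t size() const { return n_; }
    std::size_t local_size(std::size_t rank) const {
        return parts_[rank].size();
    }

private:
    std::size_t n_;
    std::size_t nprocs_;
    std::vector<std::vector<double>> parts_;
};

// A pointer-like handle that can be dereferenced and incremented.
class GlobalPtr {
public:
    GlobalPtr(SpreadArray& array, std::size_t index)
        : array_(&array), index_(index) {}
    double& operator*() const { return array_->at(index_); }
    GlobalPtr& operator++() { ++index_; return *this; }
    std::size_t owner() const { return array_->owner(index_); }

private:
    SpreadArray* array_;
    std::size_t index_;
};

// Fills a 7-item array spread over 3 simulated processes, walking it
// with a global pointer exactly as one would walk a C++ array.
SpreadArray demo_fill() {
    SpreadArray a(7, 3);
    GlobalPtr p(a, 0);
    for (std::size_t i = 0; i < a.size(); ++i, ++p) *p = 10.0 * i;
    return a;
}
```

With this rule, items 0, 3 and 6 land on simulated process 0, so 
successive increments of the pointer hop from one process to the 
next while the calling code stays unchanged.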

	- ARCH process groups

The MPI functions that create communicators and operate on them 
involve collective operations that may be incompatible with ARCH 
threading. The library supplies a set of classes for creating and 
manipulating process groups and communication universes, 
specifically designed to be safe with respect to ARCH threading.

	- ARCH collective functions

ARCH threads are lightweight processes that are defined and run 
inside MPI processes. The MPI collective functions block the whole 
process, hence they are incompatible with ARCH threading. The 
library supplies a set of collective functions specifically 
designed to be safe with respect to ARCH threads. They are similar 
to those in MPI(-2), but they block at the thread level, not at 
the process level: only the calling thread is suspended, and the 
other threads in the process keep running.

	- Parallel I/Os

The parallel I/O classes rely on ROMIO from Argonne (Unix release) 
or WMPI 1.01 from the University of Coimbra, Portugal (Windows NT 
release). They essentially provide a harness that allows the ARCH 
threading system to handle ROMIO I/O operations.


3. For more information 
   --------------------

Visit the ARCH web pages, http://www.cpe.fr/~arch.


-- 
________________________________________________________________________
Jean-Marc Adamo, 			 	| Tel.(33)04 72 44 84 81
Universite Claude-Bernard de Lyon & Ecole	| Fax.(33)04 72 43 15 91
Superieure de Chimie, Physique et Electronique,	|
Laboratoire Image, Signal et Accoustique (LISA),|
Bat. 308, 43 bd du 11 novembre 1918, B.P. 2077,	| adamo@cpe.fr 
La Doua, 69616 Villeurbanne cedex - France 	| adamo@univ-lyon1.fr
________________________________________________________________________

