Newsgroups: comp.parallel.mpi,comp.parallel.pvm
From: Kamran Karimi <kamran@wallybox.cei.net>
Subject: Re: Why explicit message passing??
Organization: World Lynx, Inc.
Date: Mon, 6 Apr 1998 22:44:29 +0000
Mime-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Message-ID: <Pine.LNX.3.96.980406224312.5511B-100000@wallybox.cei.net>

Hi,

 This is an answer to a private email, but I thought it might help clarify
some of the concepts I've been referring to in this thread, such as what
exactly I mean by Transparent Distributed Shared Memory.

 Someone wrote:

>>  Could I ask the people in this newsgroup why they use (or are interested in)
>> explicit message passing systems and continue to use FORTRAN? 
>
>1.0  Why did you include FORTRAN in this question?

 FORTRAN was the first high-level language designed for scientific work. It
did not follow many of the software engineering principles that are the norm
nowadays; it did not even support recursive functions. FORTRAN 90 has made
improvements. I am not an enemy of FORTRAN, but to me, clinging to it
suggests a reluctance to adopt modern practice: one can guess a programmer's
preferences from the fact that he uses this language.

 However, my main concern is about explicit message passing.

>> What are the
>> reasons of using PVM or MPI instead of distributed shared memory or
>> distributed object oriented systems?
>
>2.0  T3E is a distributed memory system.
>        What is the difference between "distributed memory" vs
>        "distributed SHARED memory"?

 Distributed memory exists in a number of stand-alone computers (a
"computer" here can be just a CPU and some memory). You get it when you
connect the computers together with some sort of network. These machines
cannot access each other's memory directly.

 Shared memory is directly accessible by more than one CPU. This is found
in, for example, many dual- or quad-processor PCs.

 Distributed shared memory tries to simulate shared memory by using
distributed memory. Many distributed shared memory systems require the
programmer to declare when his access to the shared memory starts and when
it ends (so that the contents of the shared memory can be kept consistent),
but "Transparent" distributed shared memory makes access to local memory and
remote memory look the same. The programmer does not need to do anything
extra.
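 The non-transparent style, where the programmer brackets each access, can
be sketched roughly as follows. This is a single-process toy in Python, not
the interface of any real DSM system: the class BracketedRegion and its
methods begin_access and end_access are hypothetical names, and the "home"
dictionary merely stands in for the copy that other machines would see.

```python
class BracketedRegion:
    """Toy shared region whose home copy is updated only when access ends."""

    def __init__(self, home):
        self.home = home      # stands in for the copy other nodes can see
        self.local = None     # private working copy while access is open

    def begin_access(self):
        # Declared start of access: fetch a consistent snapshot.
        self.local = dict(self.home)

    def end_access(self):
        # Declared end of access: publish the changes, drop the local copy.
        self.home.update(self.local)
        self.local = None

    def __getitem__(self, key):
        if self.local is None:
            raise RuntimeError("access to the shared region was not declared")
        return self.local[key]

    def __setitem__(self, key, value):
        if self.local is None:
            raise RuntimeError("access to the shared region was not declared")
        self.local[key] = value


home = {"x": 1}
region = BracketedRegion(home)

region.begin_access()
region["x"] = region["x"] + 1
visible_during = home["x"]    # other nodes do not see the update yet
region.end_access()
visible_after = home["x"]     # the update is published on end_access
```

 A transparent system would let the programmer write region["x"] without
the two bracketing calls and would have to work out the consistency points
by itself.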

 Here is an example using matrix multiplication. If you want to multiply
matrix a[10x10] by matrix b[10x10] and put the result in c[10x10], you write
something like this, which is very much like the definition of matrix
multiplication in any high school math book:

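(in Pascal)
{ assumed: a, b, c are ARRAY[1..10, 1..10] OF Real; i, j, k are Integer }
FOR i := 1 TO 10 DO
 FOR j := 1 TO 10 DO
 BEGIN
  c[i,j] := 0;                           { start each element of c at zero }
  FOR k := 1 TO 10 DO
   c[i,j] := c[i,j] + a[i,k] * b[k,j];   { row i of a times column j of b }
 END;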

 If you want to run this algorithm with explicit message passing (PVM or
MPI), then you _have_ to transfer the relevant parts of a and b to the other
machines yourself, let the computation be done, and then bring the results
back into c. This is a distraction for the programmer (remember that he just
wants to execute the algorithm, not get involved in data transfers). With
Transparent distributed shared memory, you don't have to do that. The
_system_ will notice that the needed data are missing, bring them from
wherever they are, and then do the computation. So the above code would
execute in a distributed and a non-distributed environment without any
changes!
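 To make the extra work concrete, here is a rough sketch of the explicit
style in Python. It is only a simulation under stated assumptions: threads
and queues stand in for separate machines and for PVM/MPI send and receive
calls, and the names worker, multiply_with_messages and WORKERS are made up
for this illustration. The point is how much code deals with moving data
rather than with the multiplication itself.

```python
import queue
import threading

N = 10         # matrix size, as in the Pascal example
WORKERS = 2    # hypothetical number of worker "machines"


def worker(inbox, outbox):
    """Stand-in for a remote machine: it sees only what arrives in its inbox."""
    first_row, a_rows, b = inbox.get()            # explicit receive
    c_rows = [[sum(a_row[k] * b[k][j] for k in range(N)) for j in range(N)]
              for a_row in a_rows]
    outbox.put((first_row, c_rows))               # explicit send of results


def multiply_with_messages(a, b):
    """Master side: split a by rows, ship the pieces out, collect c."""
    outbox = queue.Queue()
    threads = []
    chunk = (N + WORKERS - 1) // WORKERS          # rows per worker, rounded up
    for w in range(WORKERS):
        first_row = w * chunk
        inbox = queue.Queue()
        t = threading.Thread(target=worker, args=(inbox, outbox))
        t.start()
        threads.append(t)
        # The programmer decides which rows of a each worker needs;
        # every worker gets all of b.
        inbox.put((first_row, a[first_row:first_row + chunk], b))
    c = [None] * N
    for _ in range(WORKERS):
        first_row, c_rows = outbox.get()          # explicit receive of results
        c[first_row:first_row + len(c_rows)] = c_rows
    for t in threads:
        t.join()
    return c


a = [[i + j for j in range(N)] for i in range(N)]
b = [[i * j for j in range(N)] for i in range(N)]
c = multiply_with_messages(a, b)
```

 Only the list comprehension inside worker is the actual algorithm; the
rest is bookkeeping about which rows go where and how the results come
back.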

 That is why transparent distributed shared memory is so much easier to use.
The main problem is that the system does not know as much about the program
as the programmer does, so it may not do the transfers optimally: it may
transfer more data than is actually needed, or bring data to one machine
that is immediately needed on another. I hope these issues will become less
important as processing speeds increase and network transfer times
decrease. Artificial Intelligence may also help us some day.
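 Both inefficiencies can be illustrated with a small counting model in
Python. The assumptions here are mine and deliberately crude: data moves in
fixed-size pages, each page lives on exactly one node at a time, and ToyDSM,
PAGE_SIZE and run are hypothetical names. Real systems use more refined
protocols, so the numbers only show the shape of the problem.

```python
PAGE_SIZE = 4   # elements per page; a hypothetical, deliberately small value


class ToyDSM:
    """Counts elements moved by a single-owner, page-based shared array."""

    def __init__(self, length):
        self.data = [0] * length
        num_pages = (length + PAGE_SIZE - 1) // PAGE_SIZE
        self.owner = [0] * num_pages    # every page starts on node 0
        self.elements_moved = 0

    def _fault_in(self, node, index):
        page = index // PAGE_SIZE
        if self.owner[page] != node:
            # The whole page moves, even if only one element is wanted.
            self.owner[page] = node
            self.elements_moved += PAGE_SIZE

    def read(self, node, index):
        self._fault_in(node, index)
        return self.data[index]

    def write(self, node, index, value):
        self._fault_in(node, index)
        self.data[index] = value


def run(index_for_node_1):
    """Node 0 updates element 0 and node 1 updates another, in alternation."""
    dsm = ToyDSM(8)
    for step in range(5):
        dsm.write(0, 0, step)
        dsm.write(1, index_for_node_1, step)
    return dsm.elements_moved


same_page = run(1)         # elements 0 and 1 share page 0
different_pages = run(4)   # element 4 lives on page 1
```

 In this model same_page comes out as 36 elements moved and different_pages
as 4, although each run writes only 10 values. A programmer who knows the
access pattern could arrange the data or the messages to avoid the
back-and-forth; the system, lacking that knowledge, cannot.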

>>  Explicit message passing belongs to the stone age of the computer science.
>
>3. Can you distinquish between "message passing" and "explicit Message
>passing"?
>    (I also already understand the difference between "active" and
>"passive" message
>        passing, but could you tell me if that distinction is related to the
>        explict/non-explicit distinction?

 When computers do not have access to each other's memory, data _has_ to be
put in messages and transferred from one computer to another (maybe via
TCP/IP packets). There is no escape from that. The question is, who does
this transfer? In explicit message passing, the application programmer does
it. Non-explicit systems (like transparent DSM) do it behind the scenes,
without the user's intervention, or at least with less intervention.
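 A rough Python sketch of the "behind the scenes" idea follows. It is a
read-only toy, not how any particular DSM is implemented: RemoteStore
stands in for another machine's memory, TransparentMatrix is a hypothetical
wrapper that requests a row the first time it is touched, and there is no
handling of writes or consistency. What it shows is that the multiplication
routine is identical for local and remote data.

```python
class RemoteStore:
    """Stand-in for another machine's memory; counts the messages it serves."""

    def __init__(self, rows):
        self._rows = rows
        self.messages = 0

    def request_row(self, i):
        self.messages += 1
        return list(self._rows[i])    # a copy, as a network transfer would be


class TransparentMatrix:
    """Looks like a list of rows, but fetches a missing row on first access."""

    def __init__(self, store):
        self._store = store
        self._cache = {}

    def __getitem__(self, i):
        if i not in self._cache:      # the needed data are missing: fetch them
            self._cache[i] = self._store.request_row(i)
        return self._cache[i]


def multiply(a, b, n):
    """The same triple loop as the Pascal example, with 0-based indices."""
    c = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                c[i][j] += a[i][k] * b[k][j]
    return c


n = 3
local_a = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
local_b = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]    # identity matrix
store_a = RemoteStore(local_a)
store_b = RemoteStore(local_b)

c_local = multiply(local_a, local_b, n)
c_remote = multiply(TransparentMatrix(store_a), TransparentMatrix(store_b), n)
```

 The messages are still sent (store_a.messages and store_b.messages count
one request per row), but the code in multiply never mentions them.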

 The trend in software has been to take as much work as possible off the
programmer's hands. Transparent Distributed Shared Memory systems follow
this trend much better.

>>  Systems like PVM or MPI make it difficult for many people to enter the
>> distributed programming era. 
>
>4. Shmem is the native message passing tool on the T3E.  
>
>        Would you list shmem in the same category as MPI and PVM?

 I am not familiar with Cray computers, but I think T3Es use many Alphas.
Based on this, I guess shmem is a distributed shared memory library, and so
is different from PVM and MPI. Maybe someone else could provide a better
answer to this.

>Finally...
>
>        "First, let's clarify that the T3E is a distributed memory, shared
>nothing
> parallel computer. "
>
>
>So, from this it seems there can be:
>        a) Distributed shared memory
>        b) Distributed non-shared memory.
>
>        Can one have a [...]
>        c) Shared, non-distributed memory?
>
>        Can you give me a few examples of each?

 a) Distributed Shared Memory: DIPC for Linux. This is Transparent DSM.

 b) Distributed non-shared memory: Systems like PVM or MPI.

 c) Shared, non-distributed memory: Ordinary memory in PCs, which can be
    shared between threads and processes running on a single computer. This
    computer can be a shared memory multi-processor.


-Kamran