[ Editor's note: the tr975.ps paper is available here at this site in
   /parallel/papers/surveys/par-prog-workstation-clusters.ps.Z
 djb1
]

Newsgroups: comp.parallel
From: Zhe Li <li@cs.columbia.edu>
Subject: PVM vs. Linda Summary
Organization: Yale University, Department of Computer Science, New Haven, CT
Date: 31 Jan 1994 11:24:37 -0500

Some weeks ago I posted the above inquiry on the net and got some very good
responses. The edited summary is attached below. Special thanks to the people
who gave pointers on the subject and carried out further discussions.

Cheers!

/Jay Li

------------------ cut here ---------------
From: douglas-craig@CS.YALE.EDU (Craig Douglas)
To: li@ground.cs.columbia.edu
Subject: Re:  PVM vs. Linda

Hi,

Try pub/tr975.ps on casper.cs.yale.edu.  Ease of use might be answered in
the paper, but probably not.  I have recently used an implementation of
MPI, which will become the de facto message passing standard soon.  It was
easy to learn, did everything I wanted, and will exist on every company's
computers shortly.  I believe a public domain library exists on netlib.

I have to admit no fondness for pvm.  I find it a real pain to use.  C-Linda
is expensive, but easy to use.  Glenda, which is a C-Linda knockoff from
Mississippi State, is nice and the price is right (free).  Look at MPI before
deciding between PVM and Linda.

Regards,
Craig Douglas

From: rfinch@water.ca.gov
To: Zhe Li <li@ground.cs.columbia.edu>
In-Reply-To: Zhe Li's message of Tue, 4 Jan 1994 14:22:53 GMT
Subject: PVM vs. C-linda

I'm interested in what you find.  A few years ago I parallelized a
calibration program using ISIS; at that time the supported version of
ISIS was PD, but now it is commercial.  Good software though.  I would
also consider Express.

From: Ron Kerr <R.Kerr@newcastle.ac.uk>
Subject: Re: Comparison of Linda and PVM
In-Reply-To: <199401041209.MAA23497@eata.ncl.ac.uk>

You guys have recently sent similar postings to comp.parallel.pvm. 

We, our university Computing Service, have been using Network Linda off
and on for almost a year. At the same time, our theoretical physicists
have been using PVM. In order to be able to give advice on the relative
merits of these two, last autumn I asked one of my colleagues to produce a
report comparing them. If you wish, I can e-mail you a copy of this. It is
written in rather an informal style and I am currently revising it into a
form better suited for others' eyes. Because of the time available, the
report is probably nothing like as exhaustive as it could be but it may
give you a few pointers. 

Apart from the above report, again for comparison purposes, I have been
converting a Linda program into PVM and have formed a few impressions of
my own. 

As a rather old programmer with a hopefully modern outlook on software
construction, I find Linda much more comfortable to work with. Linda
represents a parallel programming model in which I have so far found it
quite easy to describe parallel computations. The emphasis is fairly high
level concentrating on the computational interaction between concurrent
components in terms of the tasks to be undertaken and how they are
synchronised. 

PVM does not represent a parallel programming model as such but is a set
of fairly low-level tools (subroutines) addressing the mechanics of
passing messages between processes. As such, it forces on the user a much
more mechanistic view of the parallelism rather than the higher level
semantics of the computation. 

In distributing a computation over co-operating processes, Linda is
concerned primarily with the identification of data which needs to be
shared or communicated. The Linda compilation system uses a preprocessor
which uses data identity to deduce the lower level details of data types
etc. and to construct automatically the messages which need to be
communicated between processes. In contrast, PVM depends upon the
programmer specifying not only the identity of data but details of data
representations. Message buffers need to be initialised, packed/unpacked
and transmitted explicitly. This is very error-prone both in the early
stages of development and later when it may be necessary to alter
representations and consequently locate all references to the data
concerned. Moreover, this attention to low-level detail makes PVM very
verbose compared to Linda. 
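
[ Editor's note: the contrast drawn above can be illustrated with a minimal
  sketch in Python. This is neither PVM nor Linda code: the record layout,
  the function names and the matching rule are all hypothetical, chosen only
  to show explicit packing next to matching by tag and type. ]

```python
import struct

# Hypothetical record: (task id, iteration count, residual).
# Explicit message passing: sender and receiver must agree on this layout
# by hand, in the same field order, everywhere the record is used.
LAYOUT = "!iid"  # network byte order: int32, int32, float64


def pack_message(task_id, iterations, residual):
    """Pack the fields into a byte buffer, as a send-side routine would."""
    return struct.pack(LAYOUT, task_id, iterations, residual)


def unpack_message(buffer):
    """Unpack in exactly the same order and with the same types."""
    return struct.unpack(LAYOUT, buffer)


# Tuple style: the receiver names only what it wants, a tag plus the types
# of the remaining fields, and the system finds a tuple that fits.
def matches(template, candidate):
    """True if candidate fits template; a type in the template is a wildcard."""
    if len(template) != len(candidate):
        return False
    for want, got in zip(template, candidate):
        if isinstance(want, type):
            if not isinstance(got, want):
                return False
        elif want != got:
            return False
    return True


message = pack_message(7, 120, 0.5)
print(unpack_message(message))
print(matches(("result", int, float), ("result", 7, 0.5)))
```

If a field is added to the record, the packed version needs LAYOUT and both
routines changed together, while the tuple version needs only the templates
that mention the new field.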

PVM seems to be concerned more with where tasks are performed rather than
with what needs to be done. In some situations this can be rather an
impediment to one's logical view of the computation. On the other hand, the
location of a particular sub-computation may be important and PVM permits
firm control over that. Linda is more concerned with the logical structure
of the computation and forcing a sub-computation on to a particular host
needs an unnatural element of contrivance. 
   
There is probably nothing that one system can do that the other can't.
However, I am getting the impression that it is much easier to convert PVM
into Linda than the other way round. This would indicate that Linda is
inherently a much more expressive vehicle for one's intuitive perception of
computational structure. In simple situations this may not be important
but, since parallel programming is notoriously difficult, Linda's
programming model may be particularly advantageous in more complex
scenarios. 

We use SCA's (Scientific Computing Associates) implementation of Linda on
workstation clusters. This comes with an excellent X-based visual
debugging system. This enables dynamic visualisation of the distributed
computation and it has been our experience that this has been
indispensable in locating synchronisation errors. Program execution can be
controlled at the high-level Linda synchronisation/communication level as
well as at the detailed statement level (through integration with host
debuggers, e.g. dbx). There is also a powerful postmortem analyser. I
don't think PVM is as well-equipped for debugging. The importance of
debugging facilities for parallel programs cannot be over-stressed. 

On the other hand, associated with PVM is HeNCE, a graphical system for
composing parallel programs. I have only seen a simple demo of this so
although this seems an ideal way of specifying the logical parallelism I
have no idea how well it copes with complicated computational topologies. 

I have no hard details of relative performance of PVM and Linda other than
that I think they are comparable. I suspect that comparisons in which a
program in one system is systematically translated into the other may
yield false information since the differences in programming model may
have important implications for synchronisation and the quantity of data
communicated. Last year, Yale produced the report "Parallel Programming
Systems for Workstation Clusters" (Douglas, Mattson & Schultz) which you
might care to read. 

In using these systems on workstation clusters you need to be aware of the
implications parallel computations have on other users, in terms of both
CPU effects and network traffic. Here, PVM and Linda are used on public
workstations and at times this has caused significant interference
problems. PVM, being public domain, can be installed by anyone and used to
run parallel computations on an unlimited number of workstations. (On one
occasion we observed 173 processes executing on 65 workstations spread
over several ethernet segments.) Linda could have similar effects but
since SCA's licence locks us into a fixed number of identified machines,
it is harder for the situation to get totally out of hand. Whether you
use shared or dedicated workstations may be extremely important.

In the context of interference, there is a Linda variant called Piranha
which is designed to address this problem. Piranha processes advance and
retreat according to other concurrent use of the workstation. I am
currently trying to convert a regular Linda program into Piranha and I get
the impression that the Piranha programming model is much less expressive than
Linda. I don't know whether PVM has a handle on the interference problem. 

I hope some of this is useful. Let me know if you want a copy of the
report I mention at the beginning. I would be very interested to see any
other comparative information you acquire. 

Cheers....Ron

------------------------------------------------------------------------
  Ron Kerr, Computing Service, Newcastle University, NE1 7RU, England.
            Tel. +44 91 222 8187       Fax. +44 91 222 8765 
------------------------------------------------------------------------


From: cagan@CERES.SCA.COM
Subject: your net posting regarding Linda
To: li@ground.cs.columbia.edu

Jay--

I'd be happy to send you an information packet on our 
Linda environments, including a couple of items that directly 
address a number of your questions regarding Linda as contrasted 
with message passing.  Please send me your full U.S. mail address, 
and we'll get it out to you this week.  

Regards, 

--Leigh Cagan

-------

From: sherman@sca.com (Andrew H. Sherman)
To: Zhe Li <li@ground.cs.columbia.edu>
Subject: Re: PVM vs. C-linda
Organization: Scientific Computing Associates, Inc.


There are a number of articles on this available from the CS folks at Yale
--- you might contact Nick Carriero (carriero@cs.yale.edu). In addition,
you might check out the article I wrote with Leigh Cagan in the December
issue of IEEE Spectrum. That addresses a number of issues, including most
of the ones you list.

Let me know if you need additional information.

-- 
********************************************************************
Dr. Andrew H. Sherman
Vice President, Technology  
SCIENTIFIC Computing Associates, Inc.
One Century Tower, 265 Church Street, New Haven, CT 06510-7010
Email: sherman@sca.com  * Phone:(203)777-7442 * Fax:(203)776-4074       
********************************************************************

From: Heng Kek <nsrcchk@leonis.nus.sg>
To: li@ground.cs.columbia.edu (Zhe Li)
Subject: Re: PVM vs. C-linda

Hi, I'd be grateful if you could summarise if you get enough
positive responses.  Otherwise, there is one paper which compares
the various parallel programming packages.  The details:

PARALLEL PROGRAMMING SYSTEMS FOR WORKSTATION CLUSTERS
-Craig C. Douglas, Timothy G. Mattson and Martin H. Schultz

ftp'able from casper.na.cs.yale.edu.

Sorry I don't remember the filename, but I'm sure you'll be able to
guess it.   The paper compares the major packages: pvm, c-linda, p4,
posybl, tcgmsg.

From: ZDV153%DJUKFA11@CUVMB.CC.columbia.edu
Organization: Forschungszentrum Juelich GmbH
Subject:      Re: PVM vs. C-linda
To: li@ground.cs.columbia.edu

Hi .........

I wrote a report (warning: German language) about parallel processing
on workstation clusters with Express and Network Linda. It is full of
graphics, so you can understand most of it with little knowledge of German.
I can send you the report (400 KB compressed PostScript), if you want.

You can compare PVM with the host-node programming model of Express
(Express has two models, host-node and Cubix; Cubix provides I/O via a
generic host). The most important differences between message passing
and Linda are:

- Linda is easy to learn (only four commands)
- dynamic load balancing is great with Linda
- Linda has more overhead than message passing
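
[ Editor's note: for readers who have not seen Linda, the four operations
  referred to above are usually given as out, in, rd and eval. The following
  is a toy, single-machine sketch in Python, not SCA's implementation. The
  class name is hypothetical; in_ and eval_ carry a trailing underscore only
  because in is a Python keyword and eval a builtin; and eval_ is simplified
  to "run a function in a thread and out its result". ]

```python
import threading


class TupleSpace:
    """Toy tuple space: a shared list of tuples guarded by a condition."""

    def __init__(self):
        self._tuples = []
        self._cond = threading.Condition()

    def _find(self, template):
        # A type in the template matches any value of that type;
        # anything else must be equal to the corresponding field.
        for t in self._tuples:
            if len(t) == len(template) and all(
                isinstance(got, want) if isinstance(want, type) else want == got
                for want, got in zip(template, t)
            ):
                return t
        return None

    def out(self, *fields):
        """Add a tuple to the space."""
        with self._cond:
            self._tuples.append(tuple(fields))
            self._cond.notify_all()

    def in_(self, *template):
        """Block until a matching tuple exists, then remove and return it."""
        with self._cond:
            while True:
                t = self._find(template)
                if t is not None:
                    self._tuples.remove(t)
                    return t
                self._cond.wait()

    def rd(self, *template):
        """Block until a matching tuple exists and return it without removing it."""
        with self._cond:
            while True:
                t = self._find(template)
                if t is not None:
                    return t
                self._cond.wait()

    def eval_(self, tag, func, *args):
        """Compute func(*args) concurrently and out the result under tag."""
        worker = threading.Thread(target=lambda: self.out(tag, func(*args)))
        worker.start()
        return worker


space = TupleSpace()
space.out("point", 3, 4)
print(space.rd("point", int, int))   # ('point', 3, 4); the tuple stays in the space
space.eval_("square", lambda x: x * x, 9)
print(space.in_("square", int))      # ('square', 81); blocks until the result arrives
```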

Joerg

From: Rusty Lusk <lusk@mcs.anl.gov>
To: Zhe Li <li@ground.cs.columbia.edu>
Subject: Re: PVM vs. C-linda

In article <1994Jan4.142253.1056@hubcap.clemson.edu> you write:
>
>Hi world:
>
>I am in the process of choosing a parallel processing toolkit to perform
>data-intensive computations (e.g., heavy inter-host communication) as part of
>my PhD thesis work. I played around with PVM but have not gained any
>experience with C-linda yet. Could anyone share some hands-on experience

You might also be interested in p4.  I enclose a blurb.

Regards,
Rusty Lusk


				      p4


p4 is a library of macros and subroutines developed at Argonne National
Laboratory for programming a variety of parallel machines in C and Fortran.
Its predecessor was the m4-based "Argonne macros" system described in the
Holt, Rinehart, and Winston book "Portable Programs for Parallel Processors"
by Lusk, Overbeek, et al., from which p4 takes its name.  The current p4
system maintains the same basic computational models described there (monitors
for the shared-memory model, message-passing for the distributed-memory model,
and support for combining the two models) while significantly increasing ease
and flexibility of use.

The current release is version 1.3.  Features include: 

  + library of useful monitors for shared-memory programming
  + portable monitor-building primitives
  + send/receive for shared-memory, distributed memory, and clusters
  + support for heterogeneous computing
  + Emacs info version of the manual for on-line help
  + shared-memory programming even on uniprocessor workstations
  + instrumentation for automatic logging/tracing
  + either automatic or user-controlled buffer-pool management
  + remote startup; no daemons necessary
  + optional secure server for faster startup on networks
  + optional automatic logging of events for upshot tracing
  + asynchronous communication of large messages
  + global operations (broadcast, global sum, max, etc.)
  + both master-slave and SPMD models for message-passing programs

p4 is intended to be portable, simple to install and use, and efficient.  It
can be used to program networks of workstations, distributed-memory parallel
supercomputers like the Intel Paragon, the Thinking Machines CM-5, and the IBM
SP-1, as well as shared-memory multiprocessors like the Kendall Square.  It
has currently been installed on the following list of machines: Sequent
Symmetry (Dynix and PTX), Convex, Encore Multimax, Alliant FX/8, FX/800, and
FX/2800, Cray X-MP, Sun (SunOS and Solaris), NeXT, DEC, Silicon Graphics, HP,
and IBM RS/6000 workstations, Stardent Titan, BBN GP-1000 and TC-2000, Kendall
Square, nCube, Intel iPSC/860, Intel Touchstone Delta, Intel Paragon, Alliant
Campus, Thinking Machines' CM-5, and the IBM SP-1 (TCP/Ethernet, TCP/switch,
EUI, and EUI-H).  It is not difficult to port to new systems.

A useful companion system is the upshot logging and X-based trace examination
facility.  The macros to create logs are included in p4.  Upshot (an X program
for graphically displaying the logs) is available separately.

You can obtain the complete distribution of p4 by anonymous ftp from
info.mcs.anl.gov.  Take the file p4-1.3.tar.Z from the directory pub/p4.  The
distribution contains all source code, installation instructions, a User's
Guide in both ascii text and latexinfo format, and a collection of examples in
both C and Fortran.  A copy of the postscript for the manual is available
separately as p4-manual.ps.Z.  An article on p4 is available in p4-paper.ps.Z.
There are a few features that are not implemented on certain machines.  See
the machine-specific section of the manual for details.

To ask questions about p4, report bugs, contribute examples, etc., send mail
to p4@mcs.anl.gov.  To subscribe to a list to receive announcements about
new releases and bug fixes, send your request to the same place,
p4@mcs.anl.gov.

                                                   Rusty Lusk
                                                   lusk@mcs.anl.gov

From: Joerg <ZDV153%DJUKFA11@CUVMB.CC.columbia.edu>
To: Zhe Li <li@ground.cs.columbia.edu>
Organization: Forschungszentrum Juelich GmbH
Subject:      Re: PVM vs. C-linda
In-Reply-To:  Your message of Wed, 5 Jan 94 10:28:28 EST


I am talking about Linda in general, not about Piranha. But Piranha
demonstrates this feature clearly.

You can implement dynamic load scheduling very well with Linda because
every process can access the data (this means you can store data for
processes that are not yet known and only become known later). You need
no addressing (sometimes only identifiers). Linda works as a third party
in such situations. You can build job pools without additional processes.
(Think about implementing such features with PVM or Express!?!)
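
[ Editor's note: a job pool of the kind described above can be sketched as
  follows. This is a Python toy under stated assumptions, not Linda code:
  the Bag class is a hypothetical stand-in for a tuple space that matches
  on the tag only, threads stand in for processes, and a None job is used
  as an end-of-work marker. ]

```python
import threading
from collections import defaultdict, deque


class Bag:
    """Toy stand-in for a tuple space: items are matched on their tag only."""

    def __init__(self):
        self._items = defaultdict(deque)
        self._cond = threading.Condition()

    def out(self, tag, value):
        with self._cond:
            self._items[tag].append(value)
            self._cond.notify_all()

    def take(self, tag):
        """Block until an item with this tag exists, then remove and return it."""
        with self._cond:
            while not self._items[tag]:
                self._cond.wait()
            return self._items[tag].popleft()


def worker(bag, name):
    # Nobody addresses this worker: it simply takes the next job, so a
    # faster worker ends up doing more of the jobs.
    while True:
        job = bag.take("job")
        if job is None:              # end-of-work marker
            return
        bag.out("result", (name, job, job * job))


def run_pool(jobs, n_workers=3):
    bag = Bag()
    threads = [
        threading.Thread(target=worker, args=(bag, f"w{i}"))
        for i in range(n_workers)
    ]
    for t in threads:
        t.start()
    for job in jobs:
        bag.out("job", job)
    for _ in threads:
        bag.out("job", None)         # one marker per worker
    results = [bag.take("result") for _ in jobs]
    for t in threads:
        t.join()
    return results


results = run_pool(range(1, 9))
print(sorted(square for _, _, square in results))
```

The master never learns which worker ran which job unless the worker says
so in the result; the shared store is the only intermediary.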

But if you know the receiver exactly, Linda is less efficient than
message passing. If you plan to implement software pipelining (overlapping
communication and computation) then you will see strange (?) effects
with Linda. If the tuple is sent directly to the receiver, then there is
no difference, but when a request is necessary, then Linda needs time
for the request (and there is no overlapping). The developers of Linda
have implemented mechanisms which analyze communication patterns at compile
time and at run time, but this means overhead and is no guarantee. If you
have a simple pattern it will work, but if you have a more complicated
pattern, then ???
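
[ Editor's note: the effect described above can be made concrete with a
  back-of-the-envelope model in Python. The two formulas are an editorial
  assumption, not a measurement of any Linda or PVM system, and the numbers
  in the example are made up (think of them as milliseconds). ]

```python
def pipelined_time(steps, compute, transfer):
    """Data is pushed to the receiver while it computes on the previous item."""
    # The first item must arrive before any work starts; after that each
    # step costs whichever is slower, computing or transferring; the last
    # item still has to be computed once nothing is left to transfer.
    return transfer + (steps - 1) * max(compute, transfer) + compute


def request_driven_time(steps, compute, transfer, request):
    """Each item is fetched on demand: request it, wait for it, then compute."""
    # Nothing overlaps, so the three costs simply add up for every step.
    return steps * (request + transfer + compute)


# Hypothetical figures: 100 items, 10 units of computation each,
# 4 units to transfer an item, 1 unit for the request to reach the data.
print(pipelined_time(100, 10, 4))           # 1004
print(request_driven_time(100, 10, 4, 1))   # 1500
```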

Now, my personal opinion: Linda is great, but is not always the best
solution (static data decomposition). Cubix (part of Express) allows
I/O from every Cubix process; PVM is freeware. Have a look at my ping-pong
benchmarks and you will see the effect, but remember the benefits of
Linda's tuple space. (Network Linda 2.4.5 is very old; newer versions
show a higher bandwidth and a lower latency, and the pattern analysis has
been improved, but the general behaviour is the same.)


Joerg

From: "Timothy G. Mattson" <tgm@SSD.intel.com>
To: li@ground.cs.columbia.edu
Subject: PVM vs C-Linda


I have used PVM and Linda heavily.  You may find my
technical report on the topic of some interest.  You
can grab it by anonymous FTP from 

    casper.cs.yale.edu

in the file pub/tr975.ps

This paper only looks at performance.  I have also written
a paper where I discuss more qualitative issues (ease of
use, debugging, etc).  You can grab this by anonymous ftp
from 

   export.ssd.intel.com

in the file

   pub/tmp/mattson/hicss.ps

Basically, what the two papers will tell you is:

   1.  Performance isn't really the way to compare Linda and
       PVM.    If all you care about is performance, use 
       TCGMSG.

   2.  Both Linda and PVM seriously suffer from a lack of global
       communication operations (use TCGMSG or P4 if you need these).

   3.  Linda is so much easier to debug than PVM, that there is
       no comparison.  When I had access to a Linda system,
       I used to write my programs with Linda first even if I 
       ultimately wanted a message-passing code just so I 
       could use the wonderful Linda debugger to develop the
       basic parallel algorithm.
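
[ Editor's note: point 2 above is about operations such as a global sum.
  Without one built in, the programmer assembles it from individual sends
  and receives. The following Python sketch shows one common way, a binary
  tree reduction to process 0. It is not PVM, Linda, TCGMSG or p4 code:
  threads and one queue per "process" stand in for real message passing,
  and the function name is hypothetical. ]

```python
import queue
import threading


def global_sum(values):
    """Sum one value per simulated process using only send/receive pairs."""
    n = len(values)
    if n == 0:
        return 0
    inbox = [queue.Queue() for _ in range(n)]    # one mailbox per process
    totals = [None] * n

    def process(rank):
        total = values[rank]
        step = 1
        while step < n:
            if rank % (2 * step) == 0:
                # This process survives the round and collects from a partner.
                if rank + step < n:
                    # Messages may arrive in any order; for a sum that is
                    # harmless, since only the number of receives matters.
                    total += inbox[rank].get()
            else:
                # Send the partial sum up the tree and drop out.
                inbox[rank - step].put(total)
                break
            step *= 2
        totals[rank] = total

    threads = [threading.Thread(target=process, args=(r,)) for r in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return totals[0]                             # process 0 holds the result


print(global_sum(list(range(10))))   # 45
```

With n processes this takes about log2(n) rounds rather than n - 1
receives at a single master, which is the usual reason for wanting such
operations provided by the library.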

So there you have my two cents' worth.  If you have any problems
grabbing those papers, let me know.

--Tim

------------------------------------------------------------------
Timothy G. Mattson, Ph.D.    Math Library Hacker and Chemist

             Intel Supercomputer Systems Division
             Phone: (503) 531-5627    FAX: (503) 531-5502
             tgm@ssd.intel.com
------------------------------------------------------------------