From: nmm1@cus.cam.ac.uk (Nick Maclaren)
Newsgroups: comp.parallel.mpi
Subject: Re: Beowulf, LAM MPI and MPICH
Date: 5 Jan 1999 13:12:21 GMT
Organization: University of Cambridge, England
Message-Id: <76t33l$80q$1@pegasus.csx.cam.ac.uk>
References: <VlXj2.525$fK.2807@axe.netdoor.com>
    <19990105071059.01167.00007845@ng-cc1.aol.com>
Originator: nmm1@taurus.cus.cam.ac.uk


In article <19990105071059.01167.00007845@ng-cc1.aol.com>, engrbohn@aol.com (Engr Bohn) writes:
|> On 1/3/99 11:20 PM Eastern Standard Time, "Anthony Skjellum"
|> <skjellum@netdoor.com> wrote:
|> 
|> >>The purists' definition is a dedicated cluster
|> >>built from commodity components with an open-source free-license OS that
|> >>maintains a single-system image. 
|> 
|> >It does not appear, according to this definition, that there are
|> >any Beowulf clusters in existence, since Single System Image
|> >is not supported by Linux
|> [...]
|> >That would make
|> >execution transparent of location, which is not the case with
|> >MPI.
|> 
|> I /think/ that what is meant by "single system image" is that casual inspection
|> of use would not reveal that this is a Pile of PCs rather than a traditional
|> supercomputer, such as an SP2.  If, when you invoke mpirun, you specify the
|> number of processors but rely on the system's default machinefile, then the
|> researcher does not have to think of the system as a collection of individual
|> computers.  Some teams have even gone so far as to use scheduling software so
|> researchers can submit their jobs and the scheduling software decides which
|> nodes the application will run on.

An SP2 is most definitely NOT a traditional supercomputer, and uses
the same model that Beowulf does.  That is, it is precisely a pile of
workstations, with enough glue to enable users to run parallel codes
as a single application under Unix.  Beowulf includes a few kludges
to simplify process management across the systems, but otherwise is
just a layer on top of a pile of workstations.

For a much closer approximation to a "single system image" on
distributed memory, look at the Hitachi SR2201 (even more than the
Cray T3E) - as far as USERS see it, it is a single Unix system.
With one or two very minor exceptions, only administrators need to
view it as separate machines, and then rarely.


Regards,
Nick Maclaren,
University of Cambridge Computing Service,
New Museums Site, Pembroke Street, Cambridge CB2 3QG, England.
Email:  nmm1@cam.ac.uk
Tel.:  +44 1223 334761    Fax:  +44 1223 334679

