From: "Anthony Skjellum" <skjellum@netdoor.com>
Newsgroups: comp.parallel.mpi,comp.parallel.pvm
References: <UTkG2.4447$_n2.99958@carnaval.risq.qc.ca>
Subject: Re: Dual- vs. single processor motherboards
Message-Id: <qf_G2.2843$gU1.12654@axe.netdoor.com>
Date: Sun, 14 Mar 1999 20:25:22 -0600
Organization: Internet Doorway, Inc. -- http://www.netdoor.com/
Xref: ukc comp.parallel.mpi:4750 comp.parallel.pvm:8116


Some vendors, such as Dell, are offering the second processor
for as little as 5% above the single-processor price.  At that small an
increment, it definitely pays to get it.

Depending on the MPI implementation, you may not see a benefit from
the second processor, because of polling: an implementation that busy-waits
for messages keeps a CPU occupied while it waits.  This has
been noted repeatedly on the reflector over the last year or so.
Also, because some MPI implementations are not thread-safe, it will be
difficult to combine threads with MPI.
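
As a rough analogy for the polling cost (plain Python threads, not MPI
code; the function names below are made up for illustration), this sketch
compares the CPU time used by a busy-wait with that of a blocking wait
for the same simulated "message":

```python
import threading
import time


def wait_by_polling(flag):
    # Busy-wait: spin on the flag until it is set, using CPU the whole time.
    while not flag.is_set():
        pass


def wait_by_blocking(flag):
    # Blocking wait: sleep until the flag is set, leaving the CPU free.
    flag.wait()


def cpu_seconds_while_waiting(wait_fn, delay=0.2):
    """Return the process CPU time used while waiting `delay` seconds."""
    flag = threading.Event()
    # The timer thread stands in for a remote sender (an assumption of
    # this sketch); it "delivers the message" by setting the flag.
    sender = threading.Timer(delay, flag.set)
    start = time.process_time()
    sender.start()
    wait_fn(flag)
    used = time.process_time() - start
    sender.join()
    return used


polling_cpu = cpu_seconds_while_waiting(wait_by_polling)
blocking_cpu = cpu_seconds_while_waiting(wait_by_blocking)
print(f"polling wait : {polling_cpu:.3f} s of CPU")
print(f"blocking wait: {blocking_cpu:.3f} s of CPU")
```

The polling wait should use CPU time close to the full delay, while the
blocking wait should use almost none.  Exact figures depend on the machine.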

In a short time, thread-safe MPI for Linux will be available from certain
vendors, making this less of an issue.

I would go with the two-processor systems for now, and expect MPI quality to
improve over the next six months to support them, on both NT and Linux.
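
To make the two layouts in the question below concrete, here is a rough
Python sketch (not PVM or MPI code).  It splits the rows of a matrix into
contiguous blocks and counts how many nearest-neighbour links stay on a
board versus cross Ethernet.  The function names and the 1600-row problem
size are assumptions made up for this example.

```python
def block_ranges(n_rows, n_parts):
    """Split n_rows into n_parts contiguous (start, stop) blocks."""
    base, extra = divmod(n_rows, n_parts)
    ranges = []
    start = 0
    for part in range(n_parts):
        # The first `extra` parts take one additional row each.
        size = base + (1 if part < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


def classify_links(n_ranks, ranks_per_board):
    """Count (on_board, ethernet) links between ranks i and i+1."""
    on_board = 0
    ethernet = 0
    for i in range(n_ranks - 1):
        # Consecutive ranks are assumed to be packed onto the same board.
        if i // ranks_per_board == (i + 1) // ranks_per_board:
            on_board += 1
        else:
            ethernet += 1
    return on_board, ethernet


N_ROWS = 1600  # hypothetical problem size

# Scenario (1): 16 parts, two ranks per dual-CPU board.
print(block_ranges(N_ROWS, 16)[:2], classify_links(16, 2))
# Scenario (2): 8 parts, one rank per board.
print(block_ranges(N_ROWS, 8)[:2], classify_links(8, 1))
```

Under these assumptions the 16-part layout has 8 on-board links and 7
Ethernet links, and the 8-part layout has 7 Ethernet links.  In the 16-part
layout every rank other than the two end ranks still has one neighbour
across Ethernet, which matches the concern raised in the question.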

-Tony


Michael Guevara wrote in message ...
>We are planning to build a cluster made up of 16 Pentium II (350 MHz) CPUs.
>We intend to use Red Hat Linux as the operating system and to run mostly
>PVM and some MPI.
>
>At Montreal prices, we can save about Canadian $4K by using 8 dual-CPU
>boards rather than 16 single-CPU boards (overall system price C$20K vs.
>C$24K).  Another advantage of using dual-CPU boards is that, when
>operating in non-cluster mode, one then has access to individual
>workstations that are more powerful.
>
>Does it make any sense to use dual-CPU machines? Will there be any penalty?
>
>For example, is it harder to write PVM or MPI code for a dual-processor
>machine?
>
>Will overall computing speed for a network of 8 dual-processor machines
>be reduced in comparison with a network of 16 single-processor machines?
>
>Our application involves numerical integration of a partial differential
>equation (a nonlinear cable equation of the reaction-diffusion sort).
>On a network of 16 single-CPU machines, one basically ends up partitioning
>a matrix into submatrices, with each of the 16 processors doing almost
>exactly the same amount of computation at each iteration before entering
>the message-passing phase.  At present, less than 10% of the total
>computation time is spent in message-passing (using PVM in a network of
>single-CPU machines), with node i sending info to and receiving info from
>nodes i-1 and i+1 at the end of each numerical time-step.
>
>I can visualize two scenarios when using a dual-processor motherboard:
>(1) partition the problem into 16 parts, with each of the two CPUs on each
>board working independently on different submatrices, so that each CPU
>computes 1/16 of the overall problem;
>(2) partition the problem into 8 parts, with the two CPUs on each
>motherboard somehow working jointly on a submatrix representing 1/8 of
>the problem size.
>
>In the first case a typical CPU (node i) would be communicating with the
>other CPU on its own board (node i-1) and another CPU off-board (node i+1)
>via Fast Ethernet 100 Mbps, full-duplex.  Unfortunately, it's my guess
>that any increased inter-processor communication speed within the
>dual-processor board (with respect to Ethernet) would probably not result
>in savings in overall computation time, since node i would still have to
>wait on node i+1 (which is on another board) via Ethernet.
>
>Two questions:
>(1) are both scenarios above possible?
>(2) if so, which one is better from the point of view of:
>          (a) overall computation speed?
>          (b) ease of PVM/MPI programming?
>
>Thanks for any help in the above.
>
>
>Michael Guevara
>Department of Physiology
>McGill University
>
>
>