From: Josh Guffin <guffin@purdue.edu>
Newsgroups: comp.parallel.mpi
Subject: Re: timing for Wait(all|any)
Date: Fri, 9 Jul 1999 10:14:11 -0500
Organization: Purdue University
Message-Id: <Pine.SOL.3.96.990709101110.22833A-100000@herald.cc.purdue.edu>
References: <Pine.LNX.4.10.9907080046420.3782-100000@krispc6.physik.uni-karlsruhe.de>
    <7m1mj5$eed@cs.vu.nl>
    <Pine.LNX.4.10.9907081113190.4838-100000@krispc6.physik.uni-karlsruhe.de>
    <7m2enl$eq0@cs.vu.nl>
Mime-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
In-Reply-To: <7m2enl$eq0@cs.vu.nl>
Xref: ukc comp.parallel.mpi:5309


On Thu, 8 Jul 1999, Thilo Kielmann wrote:

> Johannes Zellner <johannes@zellner.org> wrote:
> 
> :> : But: running this on a single machine with for example `-np 3'
> :> :      and each process gets about 33% cpu, also the master.
> :> :      what's going on there? (How is the Wait stuff implemented)
> :> 
> :> As you do not write which operating system/MPI implementation you are using,
> :> I strongly doubt that somebody will come up with a hint how Wait is implemented
> :> on your platform.
> 
> : mpich on LINUX. (and probably also on other system in the future, e.g. sgi)
> 
> I vaguely remember that on Linux you may need a patch for TCP/IP
> because otherwise you may have problems with missing events that cause
> your system to wait far too long. 
> 
> Any Linux/Beowulf experts out here who know more details???

The problem is rather well documented.  Read Josip Loncaric's
description of the problem for more information...

The problem supposedly increases response time 20-fold on 2.0.xx
kernels and 10-fold on 2.2.x kernels.

http://www.icase.edu/coral/LinuxTCP.html

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
= Josh Guffin                                 guffin@purdue.edu =
= Purdue U. HEP - TASK E           expert.cc.purdue.edu/~guffin =
=                   #include <std/disclaimer>                   =
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=     

