From: Nick <nevin@shell.gis.net>
Newsgroups: comp.parallel.mpi
Subject: Re: Isend taking forever in mpich linux
Date: Tue, 01 Jun 1999 12:00:28 GMT
Message-Id: <928238428.712.56@news.remarQ.com>
References: <7hac7h$tj8$1@peque.uv.es> <7hba5u$p7e@cs.vu.nl>
    <928191870.576.18@news.remarQ.com> <7j02d8$96u@cs.vu.nl>
User-Agent: tin/pre-1.4-980117 (UNIX) (SunOS/5.7 (sun4m))
Xref: ukc comp.parallel.mpi:5154


> We implemented our own device for our Myrinet cluster. I rerun your test
> and get the following:

> $ for n in 0 5 10 15; do prun a.out 2 $n; done
> nap =  0 and time for isend is 0.003760 s
> nap =  5 and time for isend is 0.000022 s
> nap = 10 and time for isend is 0.000023 s
> nap = 15 and time for isend is 0.000023 s

> Which is roughly as it should be...
> Interesting to see what happens in this case with ch_p4. May this be a problem
> with the Linux sockets???
> Thilo

I don't think it's a socket problem. I believe the behaviour is
device-dependent: it comes down to how much data the device can
buffer. In my test run over sockets, the socket could buffer the
first message but not both the first and the second, which is why I
chose a BUFSIZE of 100000 bytes. If your device can buffer both
messages, or if it cannot buffer even the first (in which case the
first isend will block), that might explain the timings above. You
could try varying BUFSIZE and timing the first isend as well, to get
a handle on what's actually happening in your case. On the other
hand, your device may handle non-blocking sends quite differently
from ch_p4, in which case none of the above applies.
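
To get a rough feel for the buffering effect outside of MPI, here is
a minimal sketch in Python. It is only an analogy, not ch_p4 code and
not my original test: it opens a local socket pair, never reads from
the far end, and counts how many bytes a non-blocking send accepts
before the kernel buffer fills up. The function name
bytes_accepted_without_blocking is made up for illustration, and the
numbers it reports will depend on your OS and socket settings.

```python
import socket


def bytes_accepted_without_blocking(payload_size):
    """Return how many bytes a local socket accepts before it would block.

    Hypothetical helper for illustration only. The peer never reads,
    so whatever is accepted is sitting in kernel buffers.
    """
    sender, receiver = socket.socketpair()
    try:
        sender.setblocking(False)
        payload = memoryview(b"x" * payload_size)
        sent = 0
        while sent < payload_size:
            try:
                # send() may accept only part of the remaining data.
                sent += sender.send(payload[sent:])
            except BlockingIOError:
                # Buffers are full; a blocking send would stall here.
                break
        return sent
    finally:
        sender.close()
        receiver.close()


if __name__ == "__main__":
    for size in (1000, 100000, 64 * 1024 * 1024):
        accepted = bytes_accepted_without_blocking(size)
        print("payload = %9d bytes, accepted = %9d bytes" % (size, accepted))
```

If a payload is accepted in full, a send of that size can return
straight away; if it is not, something has to wait for the receiver
to drain the buffer, which is the kind of stall I suspect is behind
the slow isend.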

-nick

