Newsgroups: comp.parallel.mpi
From: colella@unwanted.nimue.plk.af.mil
Subject: MPI_Request_free
Organization: Air Force Phillips Lab.
Date: 13 May 1998 21:11:59 GMT
Message-ID: <6jd2av$3e0$2@pr1.plk.af.mil>

Hello all,

I'm trying to fix up some message passing in a code that I didn't write.  What
happens is that we seem to be using up switch memory on the IBM SP2 because
we use non-blocking sends and receives with probes instead of a Test or Wait.
A Probe won't reset the request handle to null and deallocate it.  What I'm
trying to do as a temporary fix (since this whole thing apparently needs to
be rewritten, but we'd really like to do some physics in the meantime) is put
an MPI_Request_free immediately after a call to MPI_Isend.  In the book
"MPI:  The Complete Reference" by Snir et al., this is shown in the examples on
pages 59-60.  The MPI standard says that pending communication requests are not 
deallocated until the communication is complete.  From page 59 of the same book:
"If the communication associated with this object is still ongoing, and the 
object is required for its correct completion, then MPI will not deallocate the
object until after its completion."  According to the error messages that my
code gives (on the SP2), there are some messages it's not getting, presumably
because of the MPI_Request_free.  I only added the one line, and when I take it
out, everything runs correctly until I get the memory exhausted error.
I've also made the same change to the same version of the code installed
on a network of Pentiums running Linux with MPICH.  There my code doesn't report
any errors of its own, but well into the run I get some weird error messages about
an shandle: it gives me some cookie info, and then says "Aborting program Bad
address in Rendezvous send (ack)".  On the SP2, at least, the code seems to abort
right about the time it first starts to use a non-blocking send.  On the Pentiums,
it stops in the same cycle (time step) consistently, but it takes about 58 cycles
for it to happen, and by then there have been quite a few non-blocking sends and
receives.
If I take out the MPI_Request_free, the Pentiums run my code to completion with
no problems.

Have I misunderstood the standard?  Am I doing something wrong?  Can anyone give
me some insight into this?  Please don't suggest I stick in Waits or Tests instead
of Probes, as that would require a significant rewrite of the code.  It's meeting season
and we have to have some physics results to present.  I need something quick to get
this memory exhausted problem out of the way.  IBM already knows about it (and I 
doubt that they'll do anything about it in their implementation).


Thanks for any help at all!

Shari Colella
colella@unwanted.nimue.plk.af.mil

To respond by email, remove "unwanted." from the address.