Newsgroups: comp.parallel.mpi
From: andrei <andrei@nd.edu>
Reply-To: andrei@atomic2.phys.nd.edu
Subject: Q:Fault tolerance in LAM
Organization: University of Notre Dame
Date: Sat, 01 Nov 1997 14:54:45 -0500
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Message-ID: <345B8904.9E981936@nd.edu>

Dear netters:

mpirun has an option, -x, which presumably
allows a process to catch a signal (LAM_SIGSHRINK)
when one of the processing nodes dies.

How would one go about
catching this signal and recovering
in FORTRAN?

Also, will MPI_errhandler_set catch this signal?

Thanks,

                            Andrei


