From: peterm@tesla.EECS.Berkeley.EDU (Peter Mardahl)
Newsgroups: comp.parallel.mpi
Subject: Re: Help
Date: 16 Jun 1999 11:54:51 -0700
Organization: UC Berkeley
Message-Id: <7k8rtr$5b6$1@tesla.EECS.Berkeley.EDU>
References: <3766832B.56340DAB@t12.lanl.gov> <3767C26F.38A066B4@ing.unife.it>
Xref: ukc comp.parallel.mpi:5208


In article <3767C26F.38A066B4@ing.unife.it>,
Gaetano Bellanca  <gbellanca@ing.unife.it> wrote:

>Running a simulation, for the same data, I'm having correct results on
>the two architecture for 1,2 and 3 PEs but, running for more than 3,
>only on the PC cluster, I have the message:
>
>p1_20859:  p4_error: interrupt SIGFPE: 8

It looks like one of the processes (p1) crashed with a SIGFPE, which
points to a bug in it.  The compiler on the PC cluster could have
produced the bug, or it could be a bug in your code which simply
doesn't show up on the Cray.

Most likely it is a bug in your program, though, so the first step
is to find out what is causing the SIGFPE.
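
One way to narrow it down, as a rough sketch only: I am assuming your
code is C (or can link a small C file) and that the cluster nodes run
a POSIX system such as Linux.  The names fpe_code_name, fpe_handler and
install_fpe_reporter are made up for this example and are not part of
MPI or p4.  The handler prints the cause the kernel reports in si_code
and then aborts, so you get a core file to open in a debugger.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Map the si_code delivered with SIGFPE to a readable cause. */
const char *fpe_code_name(int code)
{
    switch (code) {
    case FPE_INTDIV: return "integer divide by zero";
    case FPE_INTOVF: return "integer overflow";
    case FPE_FLTDIV: return "floating-point divide by zero";
    case FPE_FLTOVF: return "floating-point overflow";
    case FPE_FLTUND: return "floating-point underflow";
    case FPE_FLTRES: return "floating-point inexact result";
    case FPE_FLTINV: return "invalid floating-point operation";
    case FPE_FLTSUB: return "subscript out of range";
    default:         return "unknown cause";
    }
}

/* write() is async-signal-safe; printf() is not. */
static void put(const char *s)
{
    ssize_t n = write(STDERR_FILENO, s, strlen(s));
    (void)n;
}

/* Hypothetical handler: report the cause, then abort to leave a core. */
static void fpe_handler(int sig, siginfo_t *info, void *context)
{
    (void)sig;
    (void)context;
    put("SIGFPE: ");
    put(fpe_code_name(info->si_code));
    put("\n");
    abort();
}

/* Install the handler; returns 0 on success, -1 on failure. */
int install_fpe_reporter(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = fpe_handler;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGFPE, &sa, NULL);
}
```

Call install_fpe_reporter() once near the top of main().  I would try
it after MPI_Init, on the assumption that the p4 device installs its
own signal handlers at startup (the p4_error line suggests it catches
SIGFPE itself).  If the report says "integer divide by zero", look for
a division by something that depends on the number of PEs.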

PeterM


>rm_l_1_20860:  p4_error: interrupt SIGINT: 2
>P4 procgroup file is insieme.pg.
>p2_6397: (518511.412711) Trying to receive a message when there are no
>connections; Bailing out

