From: Lars Rzymianowicz <larsrzy@ti.uni-mannheim.de>
Newsgroups: comp.parallel.mpi
Subject: Re: Linux MPI_BCAST Problem
Date: Mon, 21 Jun 1999 08:44:39 +0200
Organization: Dept. of Computer Engineering, University of Mannheim, Germany
Message-Id: <376DDF57.7AA1DBE0@ti.uni-mannheim.de>
References: <Pine.GSO.4.10.9906181008180.3461-100000@glsn15.ews.uiuc.edu>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Xref: ukc comp.parallel.mpi:5233


Will Reed wrote:
> I have a large numerical code written in F90 that I'm trying to run in
> linux (RH6 cluster -- 12 proc, ch_p4).  The code works on SGI O2k under
> SGI's MPI implementation.  Unfortunately, I have been unable to run it
> under linux because I get the following runtime error:  (mpirun -np 4 in
> this case)
> 
> p2_2229:  p4_error: interrupt SIGSEGV: 11
> rm_l_2_2230:  p4_error: interrupt SIGINT: 2
> p1_5730:  p4_error: interrupt SIGSEGV: 11
> rm_l_1_5731:  p4_error: interrupt SIGINT: 2
> p3_2232: (5.008455) Trying to receive a message when there are no
> connections; Bailing out
> bm_list_5727:  p4_error: net_recv read:  probable EOF on socket: 1


Hi Will,
a segmentation fault is normally caused by accessing an invalid
address. Have you checked the parameters of the Bcast? If it is the
first process that crashes, you could also start it under a debugger
(mpirun -xxgdb). Strange that it works under SGI...
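
A rough sketch of what I mean by checking the parameters, written in C
rather than F90 and not calling MPI at all. The helper name
check_bcast_args and its elems_allocated argument are made up for
illustration; they are not part of MPI or MPICH. It only tests the
things that typically lead to a bad address: a missing buffer, a count
larger than what was allocated, or a root outside the communicator.

```c
#include <stddef.h>

/* Hypothetical pre-flight check, NOT an MPI routine.
   Returns 0 if the arguments look sane, otherwise a small positive
   code identifying the first problem found. */
int check_bcast_args(const void *buf, int count, size_t elems_allocated,
                     int root, int comm_size)
{
    if (count < 0)                       return 1; /* negative count */
    if (buf == NULL && count > 0)        return 2; /* no buffer to send or receive into */
    if ((size_t)count > elems_allocated) return 3; /* count overruns the buffer */
    if (root < 0 || root >= comm_size)   return 4; /* root is not a valid rank */
    return 0;
}
```

You would call something like this on every rank just before the
Bcast, with that rank's own values. Note that the root must be the
same on all ranks and the count and datatype must match across ranks,
which a purely local check like this cannot see. In Fortran it is also
worth confirming that the trailing ierror argument is present in the
call.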

Lars
-- 
Address:  University of Mannheim; B6, 26; 68159 Mannheim, Germany
Tel:      +(49) 621 292-1619, Fax: -5597
email:    larsrzy@{ti.uni-mannheim.de, atoll-net.de, computer.org}
Homepage: http://mufasa.informatik.uni-mannheim.de/lsra/persons/lars/

