From: Josh Guffin <guffin@purdue.edu>
Newsgroups: comp.parallel.mpi
Subject: Strange occurrence
Date: Wed, 14 Jul 1999 08:52:35 -0500
Organization: Purdue University
Message-Id: <Pine.SOL.3.96.990714084704.9955A-100000@herald.cc.purdue.edu>
Mime-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Xref: ukc comp.parallel.mpi:5322


(I'm using MPICH over a network of RH Linux 6.0 boxen)

I am attempting to run a simple program in which each process writes
its rank (ID number) to a file called data.out on the machine where it
runs.

Two of my machines are down, so I'm using the -machinefile option to
specify the machines that are still up.  For some reason, the binary
is apparently not being executed on the last machine in the list.

Here is what my machinefile looks like:

meco01
meco02
meco03
meco04
meco05
meco07
meco09

I'm using rdist to send the source to each machine and compile it
there.  The code is:

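	program main
	include "mpif.h"

	integer myid, ierr

	call MPI_INIT(ierr)
	call MPI_COMM_RANK(MPI_COMM_WORLD,
     *  myid,ierr)
c	Unit 5 is conventionally preconnected to standard input,
c	so a different unit number is used for the output file.
	open(unit=10,file="data.out")
	write(10,101)myid
 101	format(1x," my id number is = ",i10)
	close(10)

	call MPI_FINALIZE(ierr)

	stop
	end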

It is compiled with mpif77 on each machine.  I then run it with the
command: mpirun -np 7 -machinefile machines a.out

The last machine in the list (meco09) never gets a data.out file, so
the program is apparently not even being run there.  Any ideas as to
why this is?
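
A variant that might help narrow this down is sketched below.  It is
only a sketch and has not been run on this cluster.  It assumes that
MPI_COMM_SIZE and MPI_GET_PROCESSOR_NAME are available in this MPICH
build (both are part of MPI-1).  The program name "whereami", the unit
number 10 and the per-rank file names data.NNNN.out are illustrative
choices, not part of the original program.  Each rank records its host
name in its own file, so two ranks that land on the same machine can
no longer overwrite each other's data.out.

```fortran
      program whereami
      include "mpif.h"

      integer myid, nprocs, namelen, ierr
      character*(MPI_MAX_PROCESSOR_NAME) host
      character*32 fname

      call MPI_INIT(ierr)
      call MPI_COMM_RANK(MPI_COMM_WORLD, myid, ierr)
      call MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr)
      call MPI_GET_PROCESSOR_NAME(host, namelen, ierr)

c     Build a per-rank file name (hypothetical naming scheme),
c     e.g. data.0003.out for rank 3.
      write(fname, 100) myid
 100  format('data.', i4.4, '.out')

c     Record the rank, the total process count and the host name.
      open(unit=10, file=fname, status='unknown')
      write(10, 101) myid, nprocs, host(1:namelen)
 101  format(1x, 'rank ', i4, ' of ', i4, ' on ', a)
      close(10)

      call MPI_FINALIZE(ierr)
      end
```

Built with mpif77 and started with the same mpirun command as above,
this should leave one file per rank.  Comparing the host names in
those files against the machinefile would show whether some machine
is running two ranks while the last one runs none.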

Thanks, 

Josh

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
= Josh Guffin                                 guffin@purdue.edu =
= Purdue U. HEP - TASK E           expert.cc.purdue.edu/~guffin =
=                   #include <std/disclaimer>                   =
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=     

