Newsgroups: comp.parallel.mpi
From: jjv5@psu.edu
Subject: Re: MPI jobs on Origin200 array - ??
Organization: Penn State University, Center for Academic Computing
Date: Wed, 05 Aug 1998 10:46:29 -0500
Message-ID: <jjv5-0508981046300001@mica1ppp17.cac.psu.edu>


Thanks for your list of possible problems. I think my trouble may be
coming from where the executable is:

In article <35C81811.BB829458@minster.york.ac.uk>, Daniel Kustrin
<dan@minster.york.ac.uk> wrote:

> how are o200s connected? properly or via ether?

   ethernet, regular TCP/IP networking

> did you install array services? is it done properly?
   
   yes, 4-processor jobs run fine on each machine

> do you have localhost in your etc/hosts.equiv?
   
   yes

> is prog.exe executable? 

   yes, it runs fine as a 4-processor job on either machine

> is it in your path on both machines?

   very good question...

> is it on the same file system or is it copied onto another or is it
> linked?

   well...... it's at the same path on each machine, but these are
separate copies, i.e. it's /usr/people/jjv5/prog.exe on both machines.
I wonder about this....
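
   One way to rule out a mismatch between the two copies would be to
checksum the file on each machine and compare the results. A minimal
sketch, assuming Python is available on both hosts; the function name
file_digest is made up for illustration and is not part of MPT or
Array Services:

```python
import hashlib


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of the file at `path`.

    Reads the file in chunks so a large executable is not loaded
    into memory all at once.
    """
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


if __name__ == "__main__":
    import sys

    # Print one "digest  path" line per file named on the command line.
    for p in sys.argv[1:]:
        print(file_digest(p), p)
```

   Running this against /usr/people/jjv5/prog.exe on each machine and
comparing the two digests by eye would show whether the copies differ.
It says nothing about whether mpirun can actually find the file.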

> linking for some reason always messes up (for me).

   no links

> i assume arrayd is up, right?

   according to aview, ascheck, and everything else I've checked, yes

> have you had a look at /var/adm/SYSLOG for error messages?

   no messages

> you didn't play with LD_LIBRARY64_PATH, LD_LIBRARY_PATH, and
> LD_LIBRARYN32_PATH, i hope.

   negative

> you checked all manpage troubleshooting things?

   Yes, but the MPI man pages aren't very helpful. I've sifted through
everything I could find.

   The real problem is that there are no step-by-step instructions
listed anywhere for starting an MPI job using multiple machines in an
array.

> 
> 
> you did type
> 
> > mpirun hosta 2 prog.exe : hostb 2 prog.exe
> 
> which starts prog.exe with 2 proc on hosta and 2 proc on hostb, right?
> (this is what my manpage claims is the right syntax)

   Yes, but this fails.
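
   For reference, the command form quoted above extends to any number
of hosts by joining "host nprocs program" groups with " : ". A small
sketch that only assembles the string and does not run anything; the
helper name build_mpirun_command is hypothetical, and the syntax is
taken solely from the man page excerpt quoted above:

```python
def build_mpirun_command(placements, program):
    """Build an mpirun command line in the quoted multi-host form.

    `placements` is a list of (hostname, process_count) pairs.
    The result only mirrors the syntax quoted in this thread; it is
    not checked against any particular MPT release.
    """
    if not placements:
        raise ValueError("at least one (host, nprocs) pair is required")
    groups = []
    for host, nprocs in placements:
        if nprocs < 1:
            raise ValueError("process count must be positive")
        groups.append("{} {} {}".format(host, nprocs, program))
    return "mpirun " + " : ".join(groups)


if __name__ == "__main__":
    print(build_mpirun_command([("hosta", 2), ("hostb", 2)], "prog.exe"))
```

   With the two hosts and process counts from the quoted example, this
reproduces the command I typed.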


> 
> which MPT version are you running?

I  mpi                  06/16/98  MPI 3.1.0.1 (MPT 1.2.0.1)
I  arraysvcs            06/16/98  Array Services 3.1

> 
> 
> i am sorry i don't have a canned answer but such is mpt...

   I appreciate your help. Could you tell me very briefly exactly how you
fire off an MPI job on multiple machines under an array of O200s? Where
do you put your executable? Do you have copies on each machine? If it's
SPMD, do you have multiple input files also?


         Thanks a heap,

      Jim

