Newsgroups: comp.parallel.mpi
From: Daniel Kustrin <dan@minster.york.ac.uk>
Subject: Re: MPI jobs on Origin200 array - ??
Organization: ACAG, Dep. of Computer Science, University of York
Date: Wed, 05 Aug 1998 09:30:09 +0100
Message-ID: <35C81811.BB829458@minster.york.ac.uk>

jjv5@psu.edu wrote:
> 
>    Recently I have attempted to connect two Origin200's together under
> Array Services 3.0. This was a simple thing to do and everything appears
> normal, according to ascheck, aview etc. However, I seem to be missing
> something. I cannot run an MPI job across the two machines - that is using
> 4 processors from each of the machines. Has anyone done this? The man
> pages on mpirun state you can simply fire off a job like so:
> 
> mpirun hosta 4, hostb 4  prog.exe
> 
> but this gives the ubiquitous MPI: cannot run executable error. I can run
> MPI jobs successfully on all four processors of either machine however.
> Any ideas?

there are a zillion reasons why you could get the "cannot run" error.

how are the o200s connected? properly or via ethernet?
did you install array services? is it done properly?
do you have localhost in your /etc/hosts.equiv?
is prog.exe executable? 
is it in your path on both machines?
is it on the same file system for both machines, or is it copied onto
the other one, or is it linked?
linking for some reason always messes things up (for me).
i assume arrayd is up, right?
have you had a look at /var/adm/SYSLOG for error messages?
you didn't play with LD_LIBRARY64_PATH, LD_LIBRARY_PATH, and
LD_LIBRARYN32_PATH, i hope.
have you checked everything in the manpage troubleshooting section?
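
a couple of the checks above can be scripted. this is only a rough
sketch in plain sh, to be run on each machine in turn. the function
names check_prog and check_equiv are made up for illustration, and it
only looks at local files: it says nothing about arrayd, the
interconnect or mpt itself.

```shell
#!/bin/sh
# Sketch only: local sanity checks, run separately on each host.
# check_prog and check_equiv are illustrative names, not MPT tools.

# Report whether a program file exists and has its execute bit set.
check_prog() {
    if [ ! -e "$1" ]; then
        echo "missing: $1"
    elif [ ! -x "$1" ]; then
        echo "not executable: $1"
    else
        echo "ok: $1"
    fi
}

# Report whether a hosts.equiv-style file names the given host.
# Only the first whitespace-separated field of each line is compared.
check_equiv() {
    file=$1 host=$2
    if [ -r "$file" ] &&
       awk -v h="$host" '$1 == h { found = 1 } END { exit !found }' "$file"
    then
        echo "listed: $host"
    else
        echo "not listed: $host"
    fi
}

check_prog ./prog.exe
check_equiv /etc/hosts.equiv localhost
```

if either machine reports something other than "ok" or "listed", that
is worth fixing before looking any deeper.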


you did type

    mpirun hosta 2 prog.exe : hostb 2 prog.exe

which starts prog.exe with 2 processes on hosta and 2 on hostb, right?
(this is what my manpage claims is the right syntax)
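
to avoid typos in that colon-separated form, the line can be put
together by a small helper. again just a sketch: mpirun_line is a
made-up name, it only prints the command rather than running it, and
it assumes the "host count prog : host count prog" form above is what
your mpt version's manpage accepts.

```shell
#!/bin/sh
# Sketch: print (do not run) an mpirun line with one
# "host nprocs prog" entry per host, separated by " : ".
# usage: mpirun_line PROG NPROCS HOST [HOST...]
mpirun_line() {
    prog=$1 nprocs=$2
    shift 2
    line="mpirun"
    sep=""
    for host in "$@"; do
        line="$line$sep $host $nprocs $prog"
        sep=" :"
    done
    echo "$line"
}

mpirun_line prog.exe 2 hosta hostb
```

for the original question that would be
"mpirun_line prog.exe 4 hosta hostb".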

which MPT version are you running?


i am sorry i don't have a canned answer but such is mpt...

dan

