Newsgroups: comp.parallel.mpi
From: paul@cee.hw.ac.uk (Paul Bristow)
Subject: Re: [HELP] lam under Alpha/LINUX
Organization: Dept of Computing & Electrical Engineering, Heriot-Watt University, Scotland
Date: Thu, 12 Feb 1998 16:45:33 GMT
Message-ID: <Eo9yJy.937@cee.hw.ac.uk>

J-M Beuken (beuken@pcpm.ucl.ac.be) wrote:
: Hello,

: I have an Alpha/LINUX (RedHat 5.0) cluster and I want to use LAM/MPI.

: I have a lot of problems, not during compilation but during execution of
: recon, lamboot, mpirun, fctl...

: Has anybody already used LAM successfully with Alpha/LINUX?


: For compilation (EGCS 1.0.1 gcc), I put in Config/config

: OS=LINUX
: CPU=alpha

: Error report during execution :

: 1) problems with recon or lamboot
: >lamboot -v Nodes
: local host not present
: >

: in the lam/share/boot/lamnet.c

: /*
:  * Get network interface configuration.
:  */
:         if (ioctl(sock, SIOCGIFCONF, &config)) {
:                 close(sock);
:                 return(LAMERROR);
:         }

: The ioctl returns only the lo interface, not eth0, so lamboot
: doesn't find the local host in the Nodes file, because the IP address
: 127.0.0.1 (lo) differs from the IP address (eth0) of the origin node.

: If this test is bypassed, lamboot can run (I must put the localhost
: first in the list of nodes).

: 2) this is the error message with mpirun :

: mpirun -v myapp
: mpirun (set_stdio): Invalid argument

: The problem is with the 'sendmsg' call (libc.a or libpthread.a) used in
: the source file "/lam/lib/otb/t/etc/sfrd.c":

:         if (sendmsg(stream, &msg, 0) != 1) {
:                 return(-1);
:         }
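
I have not looked at sfrd.c, but the name set_stdio and the comparison with
1 suggest a one-byte message, perhaps one that carries file descriptors to
another process.  That is a guess on my part.  If it is right, a standalone
test of descriptor passing would show whether sendmsg itself misbehaves on
the Alpha.  Below is a minimal sketch using the msg_control/SCM_RIGHTS
interface; the helper names send_fd and recv_fd are mine, and LAM's own code
may build the message quite differently.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Control buffer big enough for one descriptor, suitably aligned. */
union fd_control {
    struct cmsghdr align;
    char buf[CMSG_SPACE(sizeof(int))];
};

/*
 * Send descriptor fd over the Unix-domain socket stream, together with
 * a one-byte payload.  Returns 0 on success, -1 on failure.
 */
int send_fd(int stream, int fd)
{
    struct msghdr msg;
    struct iovec iov;
    struct cmsghdr *cmsg;
    union fd_control control;
    char payload = 0;

    assert(fd >= 0);
    memset(&msg, 0, sizeof(msg));
    memset(&control, 0, sizeof(control));
    iov.iov_base = &payload;
    iov.iov_len = 1;
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = control.buf;
    msg.msg_controllen = sizeof(control.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    /* One payload byte was queued, so success means a return value of 1. */
    return (sendmsg(stream, &msg, 0) == 1) ? 0 : -1;
}

/* Receive one descriptor from stream.  Returns it, or -1 on failure. */
int recv_fd(int stream)
{
    struct msghdr msg;
    struct iovec iov;
    struct cmsghdr *cmsg;
    union fd_control control;
    char payload;
    int fd;

    memset(&msg, 0, sizeof(msg));
    memset(&control, 0, sizeof(control));
    iov.iov_base = &payload;
    iov.iov_len = 1;
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = control.buf;
    msg.msg_controllen = sizeof(control.buf);

    if (recvmsg(stream, &msg, 0) != 1)
        return -1;
    cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS)
        return -1;
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}
```

Run over a socketpair(AF_UNIX, SOCK_STREAM, 0, ...) on both machines: if
send_fd fails with "Invalid argument" on the Alpha but not on the Intel box,
that points at the kernel or libc rather than at LAM.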


: Conclusions: I believe that there are problems with the Alpha/LINUX kernel or
: libraries. (Under Intel/LINUX, there are no problems!)

: Any ideas? Comments? Pointers?

: thanks in advance,

: jmb

: -- 
: J-M Beuken
: Lab PCPM/MAPR/FSA
: University of Louvain-la-Neuve
: Belgium
: http://www.mapr.ucl.ac.be/~beuken

I vaguely remember hearing of a problem with MPI under Redhat 5.0.  I
think it had something to do with Redhat 5.0's version of libc crashing
MPI (or vice versa, if you like).  It was probably mentioned in this
newsgroup, so try trawling the archives.  You say it works fine under
Intel/Linux, but with which Linux release?  Is that Redhat 5.0 too?
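
To compare the C library on the two machines, something like the sketch
below could be compiled on each.  It relies on the __GLIBC__ and
__GLIBC_MINOR__ macros that glibc defines; the function name describe_libc
is made up for the example, and the fallback branch only says that the
library is not glibc, nothing more.

```c
#include <stdio.h>

/*
 * Write a short description of the C library this file was compiled
 * against into buf.  Returns the value of snprintf.
 * (Hypothetical helper for illustration only.)
 */
int describe_libc(char *buf, size_t len)
{
#if defined(__GLIBC__)
    /* glibc identifies itself through these two macros. */
    return snprintf(buf, len, "glibc %d.%d", __GLIBC__, __GLIBC_MINOR__);
#else
    /* An older Linux libc or some other C library. */
    return snprintf(buf, len, "not glibc");
#endif
}
```

If the Intel machine reports a different library from the Alpha one, the
comparison between the two is not like for like.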
 --
Paul C Bristow, 
Dept. Computing & Elec. Eng. Heriot-Watt University, Edinburgh EH14 4AS.
Phone: (+44) 131 451 5111 ext 4179.
e-mail: paul@cee.hw.ac.uk, personal e-mail: pcb@cableinet.co.uk
'veni, vidi, vibi'


