From: Javier Fernandez <javier@casip.ugr.es>
Newsgroups: comp.parallel.mpi
Subject: Help! LAM, XMPI 2.1 traces & lamtrace
Date: Sat, 28 Nov 1998 14:43:40 +0100
Organization: Centro Informatico Cientifico de Andalucia
Message-Id: <365FFE0C.2CDB5C6A@casip.ugr.es>


        Hi!:

        I'm using lam6.1 and xmpi2.1 (with lesstif-0.87.0
on a Sun SparcStation10, SunOS 5.3). The user's guide says
that in the simple case of one world and no spawned processes
(i.e., all of them launched from an application schema with
mpirun), lamtrace -v -mpi can be used to collect all traces.
It works.

        The problem is... what happens when there are spawned
processes? Clicking the "Trace" button does exactly the same
as lamtrace -mpi, i.e., it gets only the 1st world's traces.
And, yes, there are more traces, which can be collected with
lamtrace h or lamtrace n0 (I'm currently using only one computer),
but xmpi does not recognize them: "Bad magic number". I can see
in mpitrace.h that

#define LAM_TRMAGIC     1279348022 /* LAM trace file magic number */

        Err, well, if somebody knows how I could dump the
remaining traces with a correct magic number, or better yet,
how I could dump them all and view them with xmpi, it would be
great.

        My application is really simple: the very same "spmd"
example from pvm, arranged in either spmd or master-slave fashion.
Just in case anybody is interested (thanks!), here is the code.

Application schema file, "spmd.app"____________________________
spmd -c 12

source file "spmd.c"___________________________________________
#include <stdio.h>
#include "mpi.h"

#define TAG 13
#define HOST

static void             master(void);
static void             slave(int myrank, int ntasks);

/*
 * main
 * This program is really MIMD, but is written SPMD for
 * simplicity in launching the application.
 */
int
main(int argc, char *argv[])
{
        int             myrank, ntasks;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD,   /* group of everybody */
                        &myrank);       /* 0 thru N-1 */

        MPI_Comm_size(MPI_COMM_WORLD,
                        &ntasks);       /* #processes in app */

        if (myrank == 0) {
                master();
        } else {
                slave(myrank, ntasks);
        }

        MPI_Finalize();
        return(0);
}

static void
master(void)
{
        int             dst, msg;
        MPI_Status      status;

        dst = 1;
        msg = 7;
        MPI_Send(&msg,          /* message buffer */
                1,              /* one data item */
                MPI_INT,        /* of this type */
                dst,            /* to this rank */
                TAG,            /* a work message */
                MPI_COMM_WORLD);/* always use this */

        MPI_Recv(&msg,          /* message buffer */
                1,              /* one data item */
                MPI_INT,        /* of this type */
                MPI_ANY_SOURCE, /* from anybody */
                MPI_ANY_TAG,    /* any message */
                MPI_COMM_WORLD, /* communicator */
                &status);       /* recv'd msg info */
        printf("Token ring done\n");
#ifdef HOST
        MPI_Send(&msg, 1, MPI_INT,      /* notify the host */
                    0, TAG, MPI_COMM_PARENT);
#endif
}

static void
slave(int myrank, int ntasks)
{
        int             msg, src, dst;          /* send integers */
        MPI_Status      status;

        src =  myrank-1;
        dst = (myrank+1) % ntasks;
        MPI_Recv(&msg, 1, MPI_INT, src, TAG, MPI_COMM_WORLD, &status);
        MPI_Send(&msg, 1, MPI_INT, dst, TAG, MPI_COMM_WORLD);
}
__________________________________________________________________

        Yes, you're right, I should have removed the #define HOST,
but even with it left in, I can still get the traces in xmpi. The
host is used with this other application schema...

Application schema file "spmdhost.app"____________________________
spmdhost -c 1

Source file "spmdhost.c"__________________________________________
#include "mpi.h"

#define TAG 13
#define NPROC 12

int
main(int argc, char *argv[])
{
        int             errs[NPROC];    /* error codes of the mesh processes */
        MPI_Comm        spwn;           /* communicator for the children */
        MPI_Status      status;         /* status object for MPI_Recv */

        MPI_Init(&argc, &argv);
        MPI_Spawn("spmd", 0,                    /* program and arguments */
                   NPROC, MPI_INFO_NULL,        /* copies and info */
                       0, MPI_COMM_WORLD,       /* root rank and its comm */
                   &spwn, errs);                /* comm and errcodes of the mesh */
        MPI_Recv(&errs[0], 1  , MPI_INT,        /* receive from the leader */
                       0 , TAG, spwn,
                         &status);

        MPI_Finalize();
        return(0);
}
___________________________________________________________________

        And only the 1st world's traces can be collected with
either the "Trace" button or the "lamtrace -mpi" command. Any other
use of "lamtrace" produces files that cannot be loaded into xmpi:
"Bad magic number". I would prefer not to dig into mpitrace.h,
or to translate LAM_TRMAGIC from decimal to hex to check whether
it just says "LAM6". Is there any way to see the traces under xmpi?

        Thank you very much in advance.

-javier

