From: dave@steinhardt.hep.upenn.edu (Axe Murderer)
Newsgroups: comp.parallel.mpi
Subject: Linux-2.1.79/MPICH-1.1.0 MPI socket problems
Date: 3 Oct 1998 20:36:23 GMT
Organization: University of Pennsylvania
Message-Id: <6v61s7$th3$1@netnews.upenn.edu>


Hi,
On a 4-node Dell 6100 running Linux 2.1.79 and MPICH 1.1.0, I keep
getting timeouts from a trivially modified cpi program (cpi
calculates pi; it's in the examples/basic directory). The
modification consists of doing the broadcast many times over
instead of once, as in the original program:

        for (j = 0; j <= 100000; j++) {
                MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);
        }


The idea was to stress-test the networking. What happens is that whenever
the master and one slave are on the same machine, the run ends up never
finishing: the socket goes into the FIN_WAIT1 state and never comes out.
I have to kill the program by hand. On the odd occasion that the program
does finish, it spends a long time in the finalize step.

Two questions. First, has anyone experienced this, and are there any
pointers to what it may be?

Second, after the program has been killed but has left its socket around
in FIN_WAIT1, how do I free the socket?
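
For anyone trying to reproduce this, here is a rough sketch of one way to
spot the stuck sockets without netstat. It does not free anything; it only
counts them. It assumes the usual Linux /proc/net/tcp layout, where the
fourth whitespace-separated column ("st") is the TCP state in hex and 04
means FIN_WAIT1. I have not checked that assumption against every kernel
version. The names row_is_fin_wait1, count_fin_wait1 and
TCP_STATE_FIN_WAIT1 are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Assumed state code for FIN_WAIT1 in the "st" column of /proc/net/tcp. */
#define TCP_STATE_FIN_WAIT1 0x04

/* Returns 1 if a single /proc/net/tcp row is in FIN_WAIT1, else 0.
 * The header row and malformed rows return 0. */
int row_is_fin_wait1(const char *row)
{
    unsigned int state;

    assert(row != NULL);
    /* Skip the "sl", local_address and rem_address columns,
     * then read the hex "st" column. */
    if (sscanf(row, "%*s %*s %*s %x", &state) != 1)
        return 0;
    return state == TCP_STATE_FIN_WAIT1;
}

/* Counts FIN_WAIT1 rows in a /proc/net/tcp-style file.
 * Returns -1 if the file cannot be opened. */
int count_fin_wait1(const char *path)
{
    char row[512];
    int count = 0;
    FILE *fp = fopen(path, "r");

    if (fp == NULL)
        return -1;
    while (fgets(row, sizeof row, fp) != NULL)
        count += row_is_fin_wait1(row);
    fclose(fp);
    return count;
}
```

Calling count_fin_wait1("/proc/net/tcp") before and after a run should show
whether the killed job left anything behind.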

Thanks a zillion,
Rahul

--
Rahul Dave (aka T.I.G)  R: 908-214-9083 ||
O: 215-898-2948         F: 215-898-2010 ||
dave@steinhardt.hep.upenn.edu           ||
http://www.physics.upenn.edu/~dave      ||
=========================================|==========
WARNING: Abuse of this email address for unsolicited
commercial advertisements ("spam") is prohibited and
will result in a lawsuit.

