Newsgroups: comp.parallel
From: isclyhb@leonis.nus.sg (Benjamin Lian)
Subject: Summary---BSP references
Organization: National University of Singapore
Date: 7 Feb 1994 12:54:28 -0500

Following is a summary of the pointers I received in
response to my request for more on Valiant's BSP model.
Thanks to all who took the time to email me. Attributions
are as shown. Some email replies were merged so those names
don't appear. Repetitions galore, but because the other
stuff surrounding the references might be of interest.

Thanks once again,

Benjamin Lian
isclyhb@leonis.nus.sg

-------------
From: nbm@epcc.edinburgh.ac.uk (Neil MacDonald)

@article{bBSP:abmfpc,
	TITLE = 	{A Bridging Model for Parallel Computation},
	AUTHOR = 	{Leslie G. Valiant},
	JOURNAL = 	{Communications of the ACM},
	YEAR = 		{1990},

	VOLUME = 	{33},
	NUMBER = 	{8},
	PAGES = 	{103--111},
	MONTH =		{August},
	NOTE = 		{},
	ABSTRACT = 	{The success of the von Neumann model of
sequential computation is attributable to the fact that it is an
efficient bridge between software and hardware: high-level languages
can be efficiently compiled on to this model; yet it can be
efficiently implemented in hardware.  The author argues that an
analagous bridge between software and hardware is required for
parallel computation if that is to become as widely used.  This
article introduces the bulk-synchronous parallel (BSP) model as a
candidate for this role, and gives results quantifyin its efficiency
both in implementing high-level language features and algorithms, as
well as in being implemented in hardware}
}
	
@article{bBSP:asffpc,
	TITLE = 	{A Scheme for Fast Parallel Communication},
	AUTHOR = 	{L.G. Valiant},
	JOURNAL = 	{Siam J. Comput.},
	YEAR = 		{1982},

	VOLUME = 	{11},
	NUMBER = 	{2},
	PAGES = 	{350--361},
	MONTH =		{May},
	NOTE = 		{Cited in \cite{bBSP:abmfpc}.},
	ABSTRACT = 	{Consider $N = 2^{n}$ nodes connected by wires
to make an $n$-dimensional binary cube.  Suppose that initially the
nodes contain one packet each addressed to distinct nodes of the cube.
We show that there is a distributed randomized algorithm that can
route every packet to its destination without two packets passing down
the same wire at any time, and finishes within time $O({\rm log}~n)$
with overwhelming probability for all such routing requests.  Each
packet carries with it $O({\rm log}~n)$ bits of bookkeeping information.  No other communication between the nodes takes place.  

The algorithm offers the only scheme known for realizing arbitrary
permutations in a sparse $N$ node network in $O({\rm log}~n)$ time and
has evident applications in the design of general purpose parallel
computers.}
}

@incollection{bBSP:gppa,
	AUTHOR =	{Valiant, L.G.},
	TITLE =		{General purpose parallel architectures},
	BOOKTITLE =	{Handbook of Theoretical Computer Science},
	PUBLISHER =	{North Holland},
	YEAR =		{1990},
	
	EDITOR =	{van Leeuwen, J.},
	CHAPTER =	{},
	PAGES =		{},
	ADDRESS = 	{Amsterdam},
	MONTH =		{},
	NOTE = 		{},
	ABSTRACT =	{The possibilities for efficient general
purpose parallel computers are examined.  First some network models
are reviewed.  It is then shown how these networks can efficiently
implement some basic message routing functions.  Finally, various high
level models of parallel computation are described and it is shown
that the above routing functions are sufficient to implement them
efficiently.}
}

@incollection{bBSP:oupc,
	TITLE = 	{Optimally universal parallel computers},
	AUTHOR = 	{L. G. Valiant},
	BOOKTITLE =	{Opportunities and Constraints of Parallel Computing},
	PUBLISHER =	{Springer-Verlag},
	YEAR =		{1989},
	
	EDITOR =	{J.L.C.Sanz},
	CHAPTER =	{},
	PAGES =		{155--158},
	ADDRESS = 	{},
	MONTH =		{},
	NOTE = 		{Also appeared Phil. Trans. R. Soc. Lond. Vol A326, pages 373--376, 1988},
	ABSTRACT =	{It is shown that any program written for the idealized shared-memory model of parallel computation can be simulated on a hypercube architecture with only constant factor inefficiency, provided that the original program has a certain amount of parallel slackness.}
}

@incollection{bBSP:bspc,
	AUTHOR =	{Valiant, L.G.},
	TITLE =		{Bulk-synchronous parallel computers},
	BOOKTITLE =	{Parallel Processing and Artificial Intellligence},
	PUBLISHER =	{Wiley},
	YEAR =		{1989},
	
	EDITOR =	{M. Reeve and S.E. Zenith},
	CHAPTER =	{},
	PAGES =		{15--22},
	ADDRESS = 	{},
	MONTH =		{},
	NOTE = 		{Cited in \cite{bBSP:abmfpc}},
	ABSTRACT =	{}
}

From: Ville Lepp{nen <villep@utu.fi>

@INPROCEEDINGS{Valiant-90a,
        AUTHOR="L.G. Valiant",
        TITLE="{General Purpose Parallel Architectures}",
        BOOKTITLE="Algorithms and Complexity, Handbook of Theoretical
         Computer Science",
        VOLUME="A",
        YEAR=1990,
        PAGES={934--971}}

@inproceedings{SWAT-GV-92,
        author="A.V. Gerbessiotis and L.G. Valiant",
        title="{Direct Bulk-Synchronous Parallel Algorithms}",
        booktitle="Algorithm Theory -- SWAT'92, Lecture Notes in Computer
          Science 621, Helsinki, Finland",
        editor="O. Nurmi and E. Ukkonen",
        year=1992,
        month="July",
        pages={1 -- 18}}


 In case you are interested of XPRAM-style work-optimal PRAM
 implementations on mesh structures, then I can "recommend"
 my Ph.Lic. (something between MSc. and PhD.) thesis (report
 R-93-9) that I just finished. From there you can also find
 a lot of pointers to other BSP-related papers.

@techreport{utu-R-93-9,
        author={V. Lepp{\"a}nen},
        title="{PRAM Computation on Mesh Structures}",
        institution="University of Turku, Computer Science Department",
        number="R-93-9",
        note="Ph.Lic.\ thesis",
        month="November",
        year=1993}

 I will put my thesis available tomorrow via anonymous-ftp from
 cs.utu.fi:/pub/techreports  (that directory will exist tomorrow).

-----------------------
From: claudia@idec04.inf.uni-jena.de (Claudia Leopold)

L.G.Valiant: Why BSP Computers? Proc. of the 1993 Int. Parallel Processing
Symposium IEEE Computer Society Press

W.F.McColl: General purpose parallel computing, in: A.M.Gibbons, P.Spirakis
(eds) Lectures on Parallel Computation. Cambridge University Press, UK, 1993

Furthermore I remember having seen some paper in CONPAR (I believe CONPAR'92),
by other autors, on practical experiments with BSP.

Also,          

D.Bilardi, F.P.Preparata: Horizons of Parallel Computing, in: Future
Tendencies in Computer Science, Control and Applied Mathematice, A.
Bensoussan, J.P.Verjus (eds) (LNCS series)

includes discussion of BSP.

-------------------------
From: Andreas Savva <andreas@cs.titech.ac.jp>

Hermann Hellwagner. 
On the Practical Efficiency of Randomised Shared Memory.
Lecture Notes in CS, Parallel processing CONPAR92-VAPPV, pp429-444-, 1992

"Using the Bulk-Synchronous Parallel Model with Randomised Shared Memory
for Graceful Degradation."
Technical Report of IEICE. FTS93-21(1993-08)
Andreas Savva and Takashi Nanya

(Note: IEICE = The Institute of Electronics, Information 
		and Communication Engineers, Japan)
-------------------------
Newsgroups: comp.parallel
From: dbader@Glue.umd.edu (David Bader)
Subject: Re: BULK SYNCHRONOUS MODEL
Organization: Project Glue, University of Maryland, College Park
Date: Wed, 14 Sep 1994 01:02:54 GMT
Message-ID: <351sli$5sl@analog.eng.umd.edu>

[snip]

Also, here are two references to Dr. JaJa's Block Distributed Memory
(BDM) model:

@techreport{JR94a,
   AUTHOR    = {J. Ja'Ja' and K.W.~Ryu},
   TITLE     = {The Block Distributed Memory Model},
   YEAR      = {1994},
   ADDRESS   = {College Park},
   INSTITUTION = {Computer Science Department, University of Maryland},
   MONTH     = {January},
   NUMBER    = {CS-TR-3207},
   TYPE      = {Technical Report}
}

@inproceedings{JR94b,
   AUTHOR    = {J.F. Ja'Ja' and K.W.~Ryu},
   TITLE     = {The Block Distributed Memory Model 
                for Shared Memory Multiprocessors},
   BOOKTITLE = {Proceedings of the 8th International 
                Parallel Processing Symposium},
   YEAR      = {1994},
   ADDRESS   = {Cancun, Mexico},
   MONTH     = {April},
   NOTE      = {(Extended Abstract)},
   PAGES     = {752-756}
}

David A. Bader 
Electrical Engineering Department
A.V. Williams Building 
University of Maryland
College Park, MD 20742
Office: 301-405-6755   
FAX:    301-314-9658

Internet: dbader@eng.umd.edu
WWW:      http://www.umiacs.umd.edu/research/dbader/
---------------------------
Newsgroups: comp.parallel
From: tariq@prl.ufl.edu (Tariq Rashid)
Subject: BULK Syn. Model
Organization: University of Florida Parallel Reasearch Lab.
Date: Sun, 18 Sep 1994 20:40:19 GMT
Message-ID: <35i8jj$dc@sand.cis.ufl.edu>


Here are the responses I got on my request for "Bulk Synchronous Model".
I would like to thank these people.
---------------------------------------------------------------------------
From: <kilian@ksr.com>

Hi,

	The BSP model has been developed primarily by Les Valiant at Harvard
and is actively being researched by other professors and students there.

I'm sure there are a number of technical reports available.  Try sending a
request to harlow@das.harvard.edu (Carol Harlow, Les Valiant's secretary).
Another person to try would be Allyn Dimock (dimock@das.harvard.edu) a

---------------------------------------------------------------------------
From: <A.D.Ben-Dyke@computer-science.birmingham.ac.uk>

Tariq,

	I  attended  a  workshop on  the BSP  model  last  year -- the
proceedings  are available  from  the  Parallel Processing  Specialist
Group of the British  Computer Society (the title  of the workshop was
"General Purpose Parallel Computing".)   A  wide range  of topics were
discussed, ranging from the highly theoretic to implementation in p4.
@Article{BSP:Model,
  author = 	{Valiant, L.G.},
  title = 	{A Bridging Model for Parallel Computation},
  journal = 	{CACM},
  volume = 	{33},
  number = 	{8},
  month = 	{August},
  year = 	{1990},
  pages = 	{103--111},
  annote =	{Outlines the BSP - bulk-synchronous parallel - an 
		  extension to the PRAM model, enabling async 
		  operation and modelling latency and limited 
		  bandwidth.  The basic thesis of the paper is that
		  some model is needed to bridge the gap between
		  hardware and software, enabling advances in both to
		  be decoupled form the other. If a certain
		  programming methodology is adhered to, only a few
		  extra parameters are necessary  (i.e. only a certain
		  number of messages are sent/received by any one
		  processes or groups of processes).  If there is
		  sufficient slack in the program, automatic
		  compilation will result in near optimal efficiency
		  (using hashing techniques), though direct control by
		  the programmer is possible, and in certain
		  situations, recommended.  Examples show how several
		  well know operations can be performed on the
		  machine, and also discusses two possible techniques
		  for implementing such a system in hardware.},
  got =		{yes}
}

---------------------------------------------------------------------------
From: Margaret Reid-Miller <mrmiller@GS5.SP.CS.CMU.EDU>

I use the bulk synchronous model to implement a parallel list ranking
and list scan algorithm on a CRAY C-90.  See,

@inproceedings{spaa-listscan94,
        author = "Margaret Reid-Miller",
        title = "List Ranking and List Scan on the {C}RAY {C}-90",
        booktitle = "Proceedings Symposium on 
                Parallel Algorithms and Architectures",
	address = "Cape May, NJ",
	month = jun,
	year = "1994",
	pages = "104--113"}

The paper is also available on World Wide Web via the following URL:

http://www.cs.cmu.edu:8001/Web/Group/Scandal/www/papers.html

Also, Marco Zagha at CMU is preparing a PhD thesis that uses the Bulk
Synchronous Model as a way to hide latency for efficient
implementations of irregular computations on vector multiprocessors.
---------------------------------------------------------------------------
From: Sanjay Ranka <ranka@top.cis.syr.edu>

There was a paper in the last issue of JPDC on BSP. This paper
has also all the references to the previous work.

WE are currently investigating implementing communication structures
to implement BSP model on coarse grained distributed memory m/c.
Let me know if you are interested and I would be happy to provide
you the details.
---------------------------------------------------------------------------
From: Rob Bisseling <bisselin@math.ruu.nl>

Dear Tariq,
In reply to your comp.parallel message I advise you to look
at http://www.comlab.ox.ac.uk/oucl/people/bill.mccol.html
and look at his publist, part of which is available.
I wrote a paper with him, which is also avalaible:

R. H. Bisseling and W. F. McColl,
``Scientific Computing on Bulk Synchronous Parallel Architectures,"
Preprint 836, Dept. of Mathematics,  Utrecht University, Dec. 1993,
31pp.

which explains the model and its use foer sparse matrix computations.
The paper is available through the above homepage.
Regards, Rob Bisseling
---------------------------------------------------------------------------
From: "W.Mikanik" <W.Mikanik@ukc.ac.uk>

In relpy to your message form Mon, 12 Sep 1994:

1. I think (but I'm not sure) BSP was proposed by L.G. VALIANT in
   "Handbook of theoretical computer science"
    J. van LEEUVEN (ed.)
    North-Holland, Amsterdam, Elsevier Science Publishers B.V 1990
    Chapter: "General purpose parallel architectures"

2. I've found another article:
   "Bulk synchronuous parallelism", Michael G. NORMAN
    in "WoTUG Newsletters" No. 17, July 1992

Hope this helps.
I'm interested in this topic too, so please send my summary of
information you get or just follow me e-mail letters.
Wojciech Mikanik