References

1
ACM. Resources in Parallel and Concurrent Systems. ACM Press, 1991.

2
G. Adams, D. Agrawal, and H. Siegel. A survey and comparison of fault-tolerant multistage interconnection networks. IEEE Trans. Computs., C-20(6):14--29, 1987.

3
J. Adams, W. Brainerd, J. Martin, B. Smith, and J. Wagener. The Fortran 90 Handbook. McGraw-Hill, 1992.

4
A. Aggarwal and J. S. Vitter. The input/output complexity of sorting and related problems. Commun. ACM, 31(9):1116--1127, 1988.

5
G. Agha. Actors. MIT Press, 1986.

6
G. Agrawal, A. Sussman, and J. Saltz. Compiler and runtime support for structured and block structured applications. In Proc. Supercomputing '93, pages 578--587, 1993.

7
A. Aho, J. Hopcroft, and J. Ullman. The Design and Analysis of Computer Algorithms. Addison-Wesley, 1974.

8
S. Akl. The Design and Analysis of Parallel Algorithms. Prentice-Hall, 1989.

9
S. G. Akl and K. A. Lyons. Parallel Computational Geometry. Prentice-Hall, 1993.

10
E. Albert, J. Lukas, and G. Steele. Data parallel computers and the FORALL statement. J. Parallel and Distributed Computing, 13(2):185--192, 1991.

11
G. S. Almasi and A. Gottlieb. Highly Parallel Computing. Benjamin/Cummings, second edition, 1994.

12
G. Amdahl. Validity of the single-processor approach to achieving large-scale computing capabilities. In Proc. 1967 AFIPS Conf., volume 30, page 483. AFIPS Press, 1967.

13
S. Anderson. Random number generators. SIAM Review, 32(2):221--251, 1990.

14
G. R. Andrews. Concurrent Programming: Principles and Practice. Benjamin/Cummings, 1991.

15
G. R. Andrews and R. A. Olsson. The SR Programming Language: Concurrency in Practice. Benjamin/Cummings, 1993.

16
ANSI X3J3/S8.115. Fortran 90, 1990.

17
S. Arvindam, V. Kumar, and V. Rao. Floorplan optimization on multiprocessors. In Proc. 1989 Intl Conf. on Computer Design, pages 109--113. IEEE Computer Society, 1989.

18
W. C. Athas and C. L. Seitz. Multicomputers: Message-passing concurrent computers. Computer, 21(8):9--24, 1988.

19
J. Auerbach, A. Goldberg, G. Goldszmidt, A. Gopal, M. Kennedy, J. Rao, and J. Russell. Concert/C: A language for distributed programming. In Winter 1994 USENIX Conference. Usenix Association, 1994.

20
A. Averbuch, E. Gabber, B. Gordissky, and Y. Medan. A parallel FFT on an MIMD machine. Parallel Computing, 15:61--74, 1990.

21
D. Bailey. FFTs in external or hierarchical memory. J. Supercomputing, 4:23--35, 1990.

22
J. Bailey. First we reshape our computers, then they reshape us: The broader intellectual impact of parallelism. Daedalus, 121(1):67--86, 1992.

23
H. E. Bal, J. G. Steiner, and A. S. Tanenbaum. Programming languages for distributed computing systems. ACM Computing Surveys, 21(3):261--322, 1989.

24
V. Bala and S. Kipnis. Process groups: A mechanism for the coordination of and communication among processes in the Venus collective communication library. Technical report, IBM T. J. Watson Research Center, 1992.

25
V. Bala, S. Kipnis, L. Rudolph, and M. Snir. Designing efficient, scalable, and portable collective communication libraries. Technical report, IBM T. J. Watson Research Center, 1992. Preprint.

26
P. Banerjee. Parallel Algorithms For VLSI Computer-Aided Design. Prentice-Hall, 1994.

27
U. Banerjee. Dependence Analysis for Supercomputing. Kluwer Academic Publishers, 1988.

28
S. Barnard and H. Simon. Fast multilevel implementation of recursive spectral bisection for partitioning unstructured problems. Concurrency: Practice and Experience, 6(2):101--117, 1994.

29
J. Barton and L. Nackman. Scientific and Engineering C++. Addison-Wesley, 1994.

30
K. Batcher. Sorting networks and their applications. In Proc. 1968 AFIPS Conf., volume 32, page 307. AFIPS Press, 1968.

31
BBN Advanced Computers Inc. TC-2000 Technical Product Summary, 1989.

32
M. Ben-Ari. Principles of Concurrent and Distributed Programming. Prentice-Hall, 1990.

33
M. Berger and S. Bokhari. A partitioning strategy for nonuniform problems on multiprocessors. IEEE Trans. Computs., C-36(5):570--580, 1987.

34
F. Berman and L. Snyder. On mapping parallel algorithms into parallel architectures. J. Parallel and Distributed Computing, 4(5):439--458, 1987.

35
D. Bertsekas and J. Tsitsiklis. Parallel and Distributed Computation: Numerical Methods. Prentice-Hall, 1989.

36
D. P. Bertsekas, C. Ozveren, G. D. Stamoulis, P. Tseng, and J. N. Tsitsiklis. Optimal communication algorithms for hypercubes. J. Parallel and Distributed Computing, 11:263--275, 1991.

37
G. Blelloch. Vector Models for Data-Parallel Computing. MIT Press, 1990.

38
F. Bodin, P. Beckman, D. B. Gannon, S. Narayana, and S. Yang. Distributed pC++: Basic ideas for an object parallel language. In Proc. Supercomputing '91, pages 273--282, 1991.

39
S. Bokhari. On the mapping problem. IEEE Trans. Computs., C-30(3):207--214, 1981.

40
G. Booch. Object-Oriented Design with Applications. Benjamin/Cummings, 1991.

41
R. Bordawekar, J. del Rosario, and A. Choudhary. Design and evaluation of primitives for parallel I/O. In Proc. Supercomputing '93, pages 452--461. ACM, 1993.

42
Z. Bozkus, A. Choudhary, G. Fox, T. Haupt, and S. Ranka. Fortran 90D/HPF compiler for distributed memory MIMD computers: Design, implementation, and performance results. In Proc. Supercomputing '93. IEEE Computer Society, 1993.

43
W. Brainerd, C. Goldberg, and J. Adams. Programmer's Guide to Fortran 90. McGraw-Hill, 1990.

44
R. Butler and E. Lusk. Monitors, messages, and clusters: The p4 parallel programming system. Parallel Computing, 20:547--564, 1994.

45
D. Callahan and K. Kennedy. Compiling programs for distributed-memory multiprocessors. J. Supercomputing, 2:151--169, 1988.

46
G. F. Carey, editor. Parallel Supercomputing: Methods, Algorithms and Applications. Wiley, 1989.

47
N. Carriero and D. Gelernter. Linda in context. Commun. ACM, 32(4):444--458, 1989.

48
N. Carriero and D. Gelernter. How to Write Parallel Programs. MIT Press, 1990.

49
N. Carriero and D. Gelernter. Tuple analysis and partial evaluation strategies in the Linda pre-compiler. In Languages and Compilers for Parallel Computing. MIT Press, 1990.

50
R. Chandra, A. Gupta, and J. Hennessy. COOL: An object-based language for parallel programming. Computer, 27(8):14--26, 1994.

51
K. M. Chandy and I. Foster. A deterministic notation for cooperating processes. IEEE Trans. Parallel and Distributed Syst., 1995. To appear.

52
K. M. Chandy, I. Foster, K. Kennedy, C. Koelbel, and C.-W. Tseng. Integrated support for task and data parallelism. Intl J. Supercomputer Applications, 8(2):80--98, 1994.

53
K. M. Chandy and C. Kesselman. CC++: A declarative concurrent object-oriented programming notation. In Research Directions in Concurrent Object-Oriented Programming. MIT Press, 1993.

54
K. M. Chandy and J. Misra. Parallel Program Design. Addison-Wesley, 1988.

55
K. M. Chandy and S. Taylor. An Introduction to Parallel Programming. Jones and Bartlett, 1992.

56
B. Chapman, P. Mehrotra, and H. Zima. Programming in Vienna Fortran. Scientific Programming, 1(1):31--50, 1992.

57
B. Chapman, P. Mehrotra, and H. Zima. Extending HPF for advanced data-parallel applications. IEEE Parallel and Distributed Technology, 2(3):15--27, 1994.

58
D. Y. Cheng. A survey of parallel programming languages and tools. Technical Report RND-93-005, NASA Ames Research Center, Moffett Field, Calif., 1993.

59
J. Choi, J. Dongarra, and D. Walker. PUMMA: Parallel Universal Matrix Multiplication Algorithms on distributed memory concurrent computers. Concurrency: Practice and Experience, 6, 1994.

60
A. Choudhary. Parallel I/O systems, guest editor's introduction. J. Parallel and Distributed Computing, 17(1--2):1--3, 1993.

61
S. Chowdhury. The greedy load-sharing algorithm. J. Parallel and Distributed Computing, 9(1):93--99, 1990.

62
M. Colvin, C. Janssen, R. Whiteside, and C. Tong. Parallel Direct-SCF for large-scale calculations. Technical report, Center for Computational Engineering, Sandia National Laboratories, Livermore, Cal., 1991.

63
D. Comer. Internetworking with TCP/IP. Prentice-Hall, 1988.

64
S. Cook. The classification of problems which have fast parallel algorithms. In Proc. 1983 Intl Foundation of Computation Theory Conf., volume 158, pages 78--93. Springer-Verlag LNCS, 1983.

65
T. Cormen, C. Leiserson, and R. Rivest. Introduction to Algorithms. MIT Press, 1990.

66
B. Cox and A. Novobilski. Object-Oriented Programming: An Evolutionary Approach. Addison-Wesley, 1991.

67
D. Culler et al. LogP: Towards a realistic model of parallel computation. In Proc. 4th Symp. Principles and Practice of Parallel Programming, pages 1--12. ACM, 1993.

68
G. Cybenko. Dynamic load balancing for distributed memory multiprocessors. J. Parallel and Distributed Computing, 7:279--301, 1989.

69
W. Dally. A VLSI Architecture for Concurrent Data Structures. Kluwer Academic Publishers, 1987.

70
W. Dally and C. L. Seitz. The torus routing chip. J. Distributed Systems, 1(3):187--196, 1986.

71
W. Dally and C. L. Seitz. Deadlock-free message routing in multiprocessor interconnection networks. IEEE Trans. Computs., C-36(5):547--553, 1987.

72
W. J. Dally et al. The message-driven processor. IEEE Micro, 12(2):23--39, 1992.

73
C. R. Das, N. Deo, and S. Prasad. Parallel graph algorithms for hypercube computers. Parallel Computing, 13:143--158, 1990.

74
C. R. Das, N. Deo, and S. Prasad. Two minimum spanning forest algorithms on fixed-size hypercube computers. Parallel Computing, 15:179--187, 1990.

75
A. L. DeCegama. The Technology of Parallel Processing: Parallel Processing Architectures and VLSI Hardware: Volume 1. Prentice-Hall, 1989.

76
J. del Rosario and A. Choudhary. High-performance I/O for parallel computers: Problems and prospects. Computer, 27(3):59--68, 1994.

77
J. W. Demmel, M. T. Heath, and H. A. van der Vorst. Parallel numerical linear algebra. Acta Numerica, 10:111--197, 1993.

78
P. M. Dew, R. A. Earnshaw, and T. R. Heywood. Parallel Processing for Computer Vision and Display. Addison-Wesley, 1989.

79
D. DeWitt and J. Gray. Parallel database systems: The future of high-performance database systems. Commun. ACM, 35(6):85--98, 1992.

80
E. W. Dijkstra. A note on two problems in connexion with graphs. Numerische Mathematik, 1:269--271, 1959.

81
E. W. Dijkstra, W. H. J. Feijen, and A. J. M. van Gasteren. Derivation of a termination detection algorithm for a distributed computation. Information Processing Letters, 16(5):217--219, 1983.

82
J. Dongarra, I. Duff, D. Sorensen, and H. van der Vorst. Solving Linear Systems on Vector and Shared Memory Computers. SIAM, 1991.

83
J. Dongarra, R. Pozo, and D. Walker. ScaLAPACK++: An object-oriented linear algebra library for scalable systems. In Proc. Scalable Parallel Libraries Conf., pages 216--223. IEEE Computer Society, 1993.

84
J. Dongarra, R. van de Geijn, and D. Walker. Scalability issues affecting the design of a dense linear algebra library. J. Parallel and Distributed Computing, 22(3):523--537, 1994.

85
J. Dongarra and D. Walker. Software libraries for linear algebra computations on high performance computers. SIAM Review, 1995. To appear.

86
J. Drake, I. Foster, J. Hack, J. Michalakes, B. Semeraro, B. Toonen, D. Williamson, and P. Worley. PCCM2: A GCM adapted for scalable parallel computers. In Proc. 5th Symp. on Global Change Studies, pages 91--98. American Meteorological Society, 1994.

87
R. Duncan. A survey of parallel computer architectures. Computer, 23(2):5--16, 1990.

88
R. Duncan. Parallel computer architectures. In Advances in Computers, volume 34, pages 113--152. Academic Press, 1992.

89
D. L. Eager, J. Zahorjan, and E. D. Lazowska. Speedup versus efficiency in parallel systems. IEEE Trans. Computs., C-38(3):408--423, 1989.

90
Edinburgh Parallel Computing Centre, University of Edinburgh. CHIMP Concepts, 1991.

91
Edinburgh Parallel Computing Centre, University of Edinburgh. CHIMP Version 1.0 Interface, 1992.

92
M. A. Ellis and B. Stroustrup. The Annotated C++ Reference Manual. Addison-Wesley, 1990.

93
V. Faber, O. Lubeck, and A. White. Superlinear speedup of an efficient parallel algorithm is not possible. Parallel Computing, 3:259--260, 1986.

94
T. Y. Feng. A survey of interconnection networks. Computer, 14(12):12--27, 1981.

95
J. Feo, D. Cann, and R. Oldehoeft. A report on the SISAL language project. J. Parallel and Distributed Computing, 12(10):349--366, 1990.

96
M. Feyereisen and R. Kendall. An efficient implementation of the Direct-SCF algorithm on parallel computer architectures. Theoretica Chimica Acta, 84:289--299, 1993.

97
H. P. Flatt and K. Kennedy. Performance of parallel processors. Parallel Computing, 12(1):1--20, 1989.

98
R. Floyd. Algorithm 97: Shortest path. Commun. ACM, 5(6):345, 1962.

99
S. Fortune and J. Wyllie. Parallelism in random access machines. In Proc. ACM Symp. on Theory of Computing, pages 114--118. ACM, 1978.

100
I. Foster. Task parallelism and high performance languages. IEEE Parallel and Distributed Technology, 2(3):39--48, 1994.

101
I. Foster, B. Avalani, A. Choudhary, and M. Xu. A compilation system that integrates High Performance Fortran and Fortran M. In Proc. 1994 Scalable High-Performance Computing Conf., pages 293--300. IEEE Computer Society, 1994.

102
I. Foster and K. M. Chandy. Fortran M: A language for modular parallel programming. J. Parallel and Distributed Computing, 25(1), 1995.

103
I. Foster, M. Henderson, and R. Stevens. Data systems for parallel climate models. Technical Report ANL/MCS-TM-169, Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, Ill., 1991.

104
I. Foster, C. Kesselman, and S. Taylor. Concurrency: Simple concepts and powerful tools. Computer J., 33(6):501--507, 1990.

105
I. Foster, R. Olson, and S. Tuecke. Productive parallel programming: The PCN approach. Scientific Programming, 1(1):51--66, 1992.

106
I. Foster, R. Olson, and S. Tuecke. Programming in Fortran M. Technical Report ANL-93/26, Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, Ill., 1993.

107
I. Foster and S. Taylor. Strand: New Concepts in Parallel Programming. Prentice-Hall, 1989.

108
I. Foster, J. Tilson, A. Wagner, R. Shepard, R. Harrison, R. Kendall, and R. Littlefield. High performance computational chemistry: (I) Scalable Fock matrix construction algorithms. Preprint, Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, Ill., 1994.

109
I. Foster and B. Toonen. Load-balancing algorithms for climate models. In Proc. 1994 Scalable High-Performance Computing Conf., pages 674--681. IEEE Computer Society, 1994.

110
I. Foster and P. Worley. Parallel algorithms for the spectral transform method. Preprint MCS-P426-0494, Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, Ill., 1994.

111
G. Fox et al. Solving Problems on Concurrent Processors. Prentice-Hall, 1988.

112
G. Fox, S. Hiranandani, K. Kennedy, C. Koelbel, U. Kremer, C. Tseng, and M. Wu. Fortran D language specification. Technical Report TR90-141, Dept. of Computer Science, Rice University, 1990.

113
G. Fox, R. Williams, and P. Messina. Parallel Computing Works! Morgan Kaufmann, 1994.

114
P. Frederickson, R. Hiromoto, T. Jordan, B. Smith, and T. Warnock. Pseudo-random trees in Monte Carlo. Parallel Computing, 1:175--180, 1984.

115
H. J. Fromm, U. Hercksen, U. Herzog, K. H. John, R. Klar, and W. Kleinoder. Experiences with performance measurement and modeling of a processor array. IEEE Trans. Computs., C-32(1):15--31, 1983.

116
K. Gallivan, R. Plemmons, and A. Sameh. Parallel algorithms for dense linear algebra computations. SIAM Review, 32(1):54--135, 1990.

117
N. Gehani and W. Roome. The Concurrent C Programming Language. Silicon Press, 1988.

118
G. A. Geist, M. T. Heath, B. W. Peyton, and P. H. Worley. A user's guide to PICL: A portable instrumented communication library. Technical Report TM-11616, Oak Ridge National Laboratory, 1990.

119
A. Gibbons and W. Rytter. Efficient Parallel Algorithms. Cambridge University Press, 1990.

120
G. A. Gibson. Redundant Disk Arrays: Reliable, Parallel Secondary Storage. MIT Press, 1992.

121
H. Goldstine and J. von Neumann. On the principles of large-scale computing machines. In Collected Works of John von Neumann, Vol. 5. Pergamon, 1963.

122
G. H. Golub and J. M. Ortega. Scientific Computing: An Introduction with Parallel Computing. Academic Press, 1993.

123
A. Gottlieb, R. Grishman, C. P. Kruskal, K. P. McAuliffe, L. Rudolph, and M. Snir. The NYU ultracomputer: Designing a MIMD, shared memory parallel computer. IEEE Trans. Computs., C-32(2):175--189, 1983.

124
S. Graham, P. Kessler, and M. McKusick. gprof: A call graph execution profiler. In Proc. SIGPLAN '82 Symposium on Compiler Construction, pages 120--126. ACM, 1982.

125
A. S. Grimshaw. An introduction to parallel object-oriented programming with Mentat. Technical Report 91 07, University of Virginia, 1991.

126
W. Gropp, E. Lusk, and A. Skjellum. Using MPI: Portable Parallel Programming with the Message Passing Interface. MIT Press, 1995.

127
W. Gropp and B. Smith. Scalable, extensible, and portable numerical libraries. In Proc. Scalable Parallel Libraries Conf., pages 87--93. IEEE Computer Society, 1993.

128
A. Gupta. Parallelism in Production Systems. Morgan Kaufmann, 1987.

129
J. L. Gustafson. Reevaluating Amdahl's law. Commun. ACM, 31(5):532--533, 1988.

130
J. L. Gustafson, G. R. Montry, and R. E. Benner. Development of parallel methods for a 1024-processor hypercube. SIAM J. Sci. and Stat. Computing, 9(4):609--638, 1988.

131
A. Hac. Load balancing in distributed systems: A summary. Performance Evaluation Review, 16(2):17--19, 1989.

132
G. Haring and G. Kotsis, editors. Performance Measurement and Visualization of Parallel Systems. Elsevier Science Publishers, 1993.

133
P. Harrison. Analytic models for multistage interconnection networks. J. Parallel and Distributed Computing, 12(4):357--369, 1991.

134
P. Harrison and N. M. Patel. The representation of multistage interconnection networks in queuing models of parallel systems. J. ACM, 37(4):863--898, 1990.

135
R. Harrison et al. High performance computational chemistry: (II) A scalable SCF code. Preprint, Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, Ill., 1994.

136
P. Hatcher and M. Quinn. Data-Parallel Programming on MIMD Computers. MIT Press, 1991.

137
P. Hatcher, M. Quinn, et al. Data-parallel programming on MIMD computers. IEEE Trans. Parallel and Distributed Syst., 2(3):377--383, 1991.

138
M. Heath. Recent developments and case studies in performance visualization using ParaGraph. In Performance Measurement and Visualization of Parallel Systems, pages 175--200. Elsevier Science Publishers, 1993.

139
M. Heath and J. Etheridge. Visualizing the performance of parallel programs. IEEE Software, 8(5):29--39, 1991.

140
M. Heath, E. Ng, and B. Peyton. Parallel algorithms for sparse linear systems. SIAM Review, 33(3):420--460, 1991.

141
M. Heath, A. Rosenberg, and B. Smith. The physical mapping problem for parallel architectures. J. ACM, 35(3):603--634, 1988.

142
W. Hehre, L. Radom, P. Schleyer, and J. Pople. Ab Initio Molecular Orbital Theory. John Wiley and Sons, 1986.

143
R. Hempel. The ANL/GMD macros (PARMACS) in Fortran for portable parallel programming using the message passing programming model -- users' guide and reference manual. Technical report, GMD, Postfach 1316, D-5205 Sankt Augustin 1, Germany, 1991.

144
R. Hempel, H.-C. Hoppe, and A. Supalov. PARMACS 6.0 library interface specification. Technical report, GMD, Postfach 1316, D-5205 Sankt Augustin 1, Germany, 1992.

145
M. Henderson, B. Nickless, and R. Stevens. A scalable high-performance I/O system. In Proc. 1994 Scalable High-Performance Computing Conf., pages 79--86. IEEE Computer Society, 1994.

146
P. Henderson. Functional Programming. Prentice-Hall, 1980.

147
J. Hennessy and N. Jouppi. Computer technology and architecture: An evolving interaction. Computer, 24(9):18--29, 1991.

148
V. Herrarte and E. Lusk. Studying parallel program behavior with upshot. Technical Report ANL-91/15, Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, Ill., 1991.

149
High Performance Fortran Forum. High Performance Fortran language specification, version 1.0. Technical Report CRPC-TR92225, Center for Research on Parallel Computation, Rice University, Houston, Tex., 1993.

150
W. D. Hillis. The Connection Machine. MIT Press, 1985.

151
W. D. Hillis and G. L. Steele. Data parallel algorithms. Commun. ACM, 29(12):1170--1183, 1986.

152
S. Hiranandani, K. Kennedy, and C. Tseng. Compiling Fortran D for MIMD distributed-memory machines. Commun. ACM, 35(8):66--80, 1992.

153
C. A. R. Hoare. Quicksort. Computer J., 5(1):10--15, 1962.

154
C. A. R. Hoare. Communicating Sequential Processes. Prentice-Hall, 1984.

155
G. Hoffmann and T. Kauranne, editors. Parallel Supercomputing in the Atmospheric Sciences. World Scientific, 1993.

156
K. Hwang. Advanced Computer Architecture: Parallelism, Scalability, Programmability. McGraw-Hill, 1993.

157
J. JáJá. An Introduction to Parallel Algorithms. Addison-Wesley, 1992.

158
J. Jenq and S. Sahni. All pairs shortest paths on a hypercube multiprocessor. In Proc. 1987 Intl Conf. on Parallel Processing, pages 713--716, 1987.

159
S. L. Johnsson. Communication efficient basic linear algebra computations on hypercube architectures. J. Parallel and Distributed Computing, 4(2):133--172, 1987.

160
S. L. Johnsson and C.-T. Ho. Optimum broadcasting and personalized communication in hypercubes. IEEE Trans. Computs., C-38(9):1249--1268, 1989.

161
M. Jones and P. Plassmann. Parallel algorithms for the adaptive refinement and partitioning of unstructured meshes. In Proc. 1994 Scalable High-Performance Computing Conf., pages 478--485. IEEE Computer Society, 1994.

162
R. Kahn. Resource-sharing computer communication networks. Proc. IEEE, 60(11):1397--1407, 1972.

163
M. Kalos. The Basics of Monte Carlo Methods. J. Wiley and Sons, 1985.

164
L. N. Kanal and V. Kumar. Search in Artificial Intelligence. Springer-Verlag, 1988.

165
A. Karp and R. Babb. A comparison of twelve parallel Fortran dialects. IEEE Software, 5(5):52--67, 1988.

166
A. H. Karp. Programming for parallelism. Computer, 20(9):43--57, 1987.

167
A. H. Karp and H. P. Flatt. Measuring parallel processor performance. Commun. ACM, 33(5):539--543, 1990.

168
R. Katz, G. Gibson, and D. Patterson. Disk system architectures for high performance computing. Proc. IEEE, 77(12):1842--1858, 1989.

169
W. J. Kaufmann and L. L. Smarr. Supercomputing and the Transformation of Science. Scientific American Library, 1993.

170
B. Kernighan and D. Ritchie. The C Programming Language. Prentice-Hall, second edition, 1988.

171
J. Kerrigan. Migrating to Fortran 90. O'Reilly and Associates, 1992.

172
C. Kesselman. Integrating Performance Analysis with Performance Improvement in Parallel Programs. PhD thesis, UCLA, 1991.

173
L. Kleinrock. On the modeling and analysis of computer networks. Proc. IEEE, 81(8):1179--1191, 1993.

174
D. Knuth. The Art of Computer Programming: Volume 3, Sorting and Searching. Addison-Wesley, 1973.

175
D. Knuth. The Art of Computer Programming: Volume 2, Seminumerical Algorithms. Addison-Wesley, 1981.

176
C. Koelbel, D. Loveman, R. Schreiber, G. Steele, and M. Zosel. The High Performance Fortran Handbook. MIT Press, 1994.

177
S. Koonin and D. Meredith. Computational Physics. Addison-Wesley, 1990.

178
J. S. Kowalik. Parallel Computation and Computers for Artificial Intelligence. Kluwer Academic Publishers, 1988.

179
V. Kumar, A. Grama, A. Gupta, and G. Karypis. Introduction to Parallel Computing. Benjamin/Cummings, 1993.

180
V. Kumar, A. Grama, and V. Rao. Scalable load balancing techniques for parallel computers. J. Parallel and Distributed Computing, 22(1):60--79, 1994.

181
V. Kumar and V. Rao. Parallel depth-first search, part II: Analysis. Intl J. of Parallel Programming, 16(6):479--499, 1987.

182
V. Kumar and V. Singh. Scalability of parallel algorithms for the all-pairs shortest-path problem. J. Parallel and Distributed Computing, 13(2):124--138, 1991.

183
T. Lai and S. Sahni. Anomalies in parallel branch-and-bound algorithms. Commun. ACM, 27(6):594--602, 1984.

184
S. Lakshmivarahan and S. K. Dhall. Analysis and Design of Parallel Algorithms: Arithmetic and Matrix Problems. McGraw-Hill, 1990.

185
L. Lamport. Time, clocks, and the ordering of events in a distributed system. Commun. ACM, 21(7):558--565, 1978.

186
H. Lawson. Parallel Processing in Industrial Real-time Applications. Prentice-Hall, 1992.

187
F. T. Leighton. Introduction to Parallel Algorithms and Architectures. Morgan Kaufmann, 1992.

188
M. Lemke and D. Quinlan. P++, a parallel C++ array class library for architecture-independent development of structured grid applications. In Proc. Workshop on Languages, Compilers, and Runtime Environments for Distributed Memory Computers. ACM, 1992.

189
E. Levin. Grand challenges in computational science. Commun. ACM, 32(12):1456--1457, 1989.

190
F. C. H. Lin and R. M. Keller. The gradient model load balancing method. IEEE Trans. Software Eng., SE-13(1):32--38, 1987.

191
V. Lo. Heuristic algorithms for task assignment in distributed systems. IEEE Trans. Computs., C-37(11):1384--1397, 1988.

192
C. Van Loan. Computational Frameworks for the Fast Fourier Transform. SIAM, 1992.

193
D. Loveman. High Performance Fortran. IEEE Parallel and Distributed Technology, 1(1):25--42, 1993.

194
E. Lusk, R. Overbeek, et al. Portable Programs for Parallel Processors. Holt, Rinehart, and Winston, 1987.

195
U. Manber. On maintaining dynamic information in a concurrent environment. SIAM J. Computing, 15(4):1130--1142, 1986.

196
O. McBryan. An overview of message passing environments. Parallel Computing, 20(4):417--444, 1994.

197
O. A. McBryan and E. F. Van de Velde. Hypercube algorithms and implementations. SIAM J. Sci. and Stat. Computing, 8(2):227--287, 1987.

198
S. McConnell. Code Complete: A Practical Handbook of Software Construction. Microsoft Press, 1993.

199
C. Mead and L. Conway. Introduction to VLSI Systems. Addison-Wesley, 1980.

200
P. Mehrotra and J. Van Rosendale. Programming distributed memory architectures using Kali. In Advances in Languages and Compilers for Parallel Computing. MIT Press, 1991.

201
J. D. Meindl. Chips for advanced computing. Scientific American, 257(4):78--88, 1987.

202
Message Passing Interface Forum. Document for a standard message-passing interface. Technical report, University of Tennessee, Knoxville, Tenn., 1993.

203
Message Passing Interface Forum. MPI: A message passing interface. In Proc. Supercomputing '93, pages 878--883. IEEE Computer Society, 1993.

204
M. Metcalf and J. Reid. Fortran 90 Explained. Oxford Science Publications, 1990.

205
R. Metcalfe and D. Boggs. Ethernet: Distributed packet switching for local area networks. Commun. ACM, 19(7):711--719, 1976.

206
J. Michalakes. Analysis of workload and load balancing issues in the NCAR community climate model. Technical Report ANL/MCS-TM-144, Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, Ill., 1991.

207
B. Miller et al. IPS-2: The second generation of a parallel program measurement system. IEEE Trans. Parallel and Distributed Syst., 1(2):206--217, 1990.

208
E. Miller and R. Katz. Input/output behavior of supercomputing applications. In Proc. Supercomputing '91, pages 567--576. ACM, 1991.

209
R. Miller and Q. F. Stout. Parallel Algorithms for Regular Architectures. MIT Press, 1992.

210
R. Milner. Calculi for synchrony and asynchrony. Theoretical Computer Science, 25:267--310, 1983.

211
nCUBE Corporation. nCUBE 2 Programmers Guide, r2.0, 1990.

212
nCUBE Corporation. nCUBE 6400 Processor Manual, 1990.

213
D. M. Nicol and J. H. Saltz. An analysis of scatter decomposition. IEEE Trans. Computs., C-39(11):1337--1345, 1990.

214
N. Nilsson. Principles of Artificial Intelligence. Tioga Publishers, 1980.

215
Grand challenges: High performance computing and communications. A Report by the Committee on Physical, Mathematical and Engineering Sciences, NSF/CISE, 1800 G Street NW, Washington, DC 20550, 1991.

216
D. Nussbaum and A. Agarwal. Scalability of parallel machines. Commun. ACM, 34(3):56--61, 1991.

217
R. Paige and C. Kruskal. Parallel algorithms for shortest paths problems. In Proc. 1989 Intl Conf. on Parallel Processing, pages 14--19, 1989.

218
C. Pancake and D. Bergmark. Do parallel languages respond to the needs of scientific programmers? Computer, 23(12):13--23, 1990.

219
Parasoft Corporation. Express Version 1.0: A Communication Environment for Parallel Computers, 1988.

220
D. Parnas. On the criteria to be used in decomposing systems into modules. Commun. ACM, 15(12):1053--1058, 1972.

221
D. Parnas. Designing software for ease of extension and contraction. IEEE Trans. Software Eng., SE-5(2):128--138, 1979.

222
D. Parnas and P. Clements. A rational design process: How and why to fake it. IEEE Trans. Software Eng., SE-12(2):251--257, 1986.

223
D. Parnas, P. Clements, and D. Weiss. The modular structure of complex systems. IEEE Trans. Software Eng., SE-11(3):259--266, 1985.

224
J. Patel. Analysis of multiprocessors with private cache memories. IEEE Trans. Computs., C-31(4):296--304, 1982.

225
J. Pearl. Heuristics: Intelligent Search Strategies for Computer Problem Solving. Addison-Wesley, 1984.

226
G. F. Pfister, W. C. Brantley, D. A. George, S. L. Harvey, W. J. Kleinfelder, K. P. McAuliffe, E. A. Melton, V. A. Norton, and J. Weiss. The IBM research parallel processor prototype (RP3): Introduction and architecture. In Proc. 1985 Intl Conf. on Parallel Processing, pages 764--771, 1985.

227
P. Pierce. The NX/2 operating system. In Proc. 3rd Conf. on Hypercube Concurrent Computers and Applications, pages 384--390. ACM Press, 1988.

228
J. Plank and K. Li. Performance results of ickp: A consistent checkpointer on the iPSC/860. In Proc. 1994 Scalable High-Performance Computing Conf., pages 686--693. IEEE Computer Society, 1994.

229
J. Pool et al. Survey of I/O intensive applications. Technical Report CCSF-38, CCSF, California Institute of Technology, 1994.

230
A. Pothen, H. Simon, and K. Liou. Partitioning sparse matrices with eigenvectors of graphs. SIAM J. Matrix Anal. Appl., 11(3):430--452, 1990.

231
D. Pountain. A Tutorial Introduction to OCCAM Programming. INMOS Corporation, 1986.

232
A research and development strategy for high performance computing. Office of Science and Technology Policy, Executive Office of the President, 1987.

233
The federal high performance computing program. Office of Science and Technology Policy, Executive Office of the President, 1989.

234
M. Quinn. Analysis and implementation of branch-and-bound algorithms on a hypercube multicomputer. IEEE Trans. Computs., C-39(3):384--387, 1990.

235
M. Quinn. Parallel Computing: Theory and Practice. McGraw-Hill, 1994.

236
M. Quinn and N. Deo. Parallel graph algorithms. ACM Computing Surveys, 16(3):319--348, 1984.

237
M. Quinn and N. Deo. An upper bound for the speedup of parallel best-bound branch-and-bound algorithms. BIT, 26(1):35--43, 1986.

238
S. Ranka and S. Sahni. Hypercube Algorithms for Image Processing and Pattern Recognition. Springer-Verlag, 1990.

239
V. Rao and V. Kumar. Parallel depth-first search, part I: Implementation. Intl J. of Parallel Programming, 16(6):501--519, 1987.

240
D. A. Reed. Experimental performance analysis of parallel systems: Techniques and open problems. In Proc. 7th Intl Conf. on Modeling Techniques and Tools for Computer Performance Evaluation, 1994.

241
D. A. Reed, R. A. Aydt, R. J. Noe, P. C. Roth, K. A. Shields, B. W. Schwartz, and L. F. Tavera. Scalable performance analysis: The Pablo performance analysis environment. In Proc. Scalable Parallel Libraries Conf., pages 104--113. IEEE Computer Society, 1993.

242
D. A. Reed and R. M. Fujimoto. Multicomputer Networks: Message-Based Parallel Processing. MIT Press, 1989.

243
A. Reinefeld and V. Schnecke. Work-load balancing in highly parallel depth-first search. In Proc. 1994 Scalable High-Performance Computing Conf., pages 773--780. IEEE Computer Society, 1994.

244
B. Ries, R. Anderson, W. Auld, D. Breazeal, K. Callaghan, E. Richards, and W. Smith. The Paragon performance monitoring environment. In Proc. Supercomputing '93, pages 850--859. IEEE Computer Society, 1993.

245
A. Rogers and K. Pingali. Process decomposition through locality of reference. In Proc. SIGPLAN '89 Conf. on Programming Language Design and Implementation. ACM, 1989.

246
K. Rokusawa, N. Ichiyoshi, T. Chikayama, and H. Nakashima. An efficient termination detection and abortion algorithm for distributed processing systems. In Proc. 1988 Intl. Conf. on Parallel Processing: Vol. I, pages 18--22, 1988.

247
M. Rosing, R. B. Schnabel, and R. P. Weaver. The DINO parallel programming language. Technical Report CU-CS-501-90, Computer Science Department, University of Colorado at Boulder, Boulder, Col., 1990.

248
Y. Saad and M. H. Schultz. Topological properties of hypercubes. IEEE Trans. Computs., C-37:867--872, 1988.

249
Y. Saad and M. H. Schultz. Data communication in hypercubes. J. Parallel and Distributed Computing, 6:115--135, 1989.

250
P. Sadayappan and F. Ercal. Nearest-neighbor mapping of finite element graphs onto processor meshes. IEEE Trans. Computs., C-36(12):1408--1424, 1987.

251
J. Saltz, H. Berryman, and J. Wu. Multiprocessors and runtime compilation. Concurrency: Practice and Experience, 3(6):573--592, 1991.

252
J. Schwartz. Ultracomputers. ACM Trans. Program. Lang. Syst., 2(4):484--521, 1980.

253
C. L. Seitz. Concurrent VLSI architectures. IEEE Trans. Computs., C-33(12):1247--1265, 1984.

254
C. L. Seitz. The cosmic cube. Commun. ACM, 28(1):22--33, 1985.

255
C. L. Seitz. Multicomputers. In C.A.R. Hoare, editor, Developments in Concurrency and Communication. Addison-Wesley, 1991.

256
M. S. Shephard and M. K. Georges. Automatic three-dimensional mesh generation by the finite octree technique. Int. J. Num. Meth. Engng., 32(4):709--749, 1991.

257
J. Shoch, Y. Dalal, and D. Redell. Evolution of the Ethernet local computer network. Computer, 15(8):10--27, 1982.

258
H. Simon. Partitioning of unstructured problems for parallel processing. Computing Systems in Engineering, 2(2/3):135--148, 1991.

259
J. Singh, J. L. Hennessy, and A. Gupta. Scaling parallel programs for multiprocessors: Methodology and examples. Computer, 26(7):42--50, 1993.

260
M. Singhal. Deadlock detection in distributed systems. Computer, 22(11):37--48, 1989.

261
P. Sivilotti and P. Carlin. A tutorial for CC++. Technical Report CS-TR-94-02, Caltech, 1994.

262
A. Skjellum. The Multicomputer Toolbox: Current and future directions. In Proc. Scalable Parallel Libraries Conf., pages 94--103. IEEE Computer Society, 1993.

263
A. Skjellum, editor. Proc. 1993 Scalable Parallel Libraries Conf. IEEE Computer Society, 1993.

264
A. Skjellum, editor. Proc. 1994 Scalable Parallel Libraries Conf. IEEE Computer Society, 1994.

265
A. Skjellum, N. Doss, and P. Bangalore. Writing libraries in MPI. In Proc. Scalable Parallel Libraries Conf., pages 166--173. IEEE Computer Society, 1993.

266
A. Skjellum, S. Smith, N. Doss, A. Leung, and M. Morari. The design and evolution of Zipcode. Parallel Computing, 20:565--596, 1994.

267
J. R. Smith. The Design and Analysis of Parallel Algorithms. Oxford University Press, 1993.

268
L. Snyder. Type architectures, shared memory, and the corollary of modest potential. Ann. Rev. Comput. Sci., 1:289--317, 1986.

269
H. S. Stone. High-Performance Computer Architecture. Addison-Wesley, third edition, 1993.

270
B. Stroustrup. The C++ Programming Language. Addison-Wesley, second edition, 1991.

271
C. Stunkel, D. Shea, D. Grice, P. Hochschild, and M. Tsao. The SP1 high-performance switch. In Proc. 1994 Scalable High-Performance Computing Conf., pages 150--157. IEEE Computer Society, 1994.

272
R. Suaya and G. Birtwistle, editors. VLSI and Parallel Computation. Morgan Kaufmann, 1990.

273
J. Subhlok, J. Stichnoth, D. O'Hallaron, and T. Gross. Exploiting task and data parallelism on a multicomputer. In Proc. 4th ACM SIGPLAN Symp. on Principles and Practice of Parallel Programming. ACM, 1993.

274
X.-H. Sun and L. M. Ni. Scalable problems and memory-bounded speedup. J. Parallel and Distributed Computing, 19(1):27--37, 1993.

275
V. Sunderam. PVM: A framework for parallel distributed computing. Concurrency: Practice and Experience, 2(4):315--339, 1990.

276
Supercomputer Systems Division, Intel Corporation. Paragon XP/S Product Overview, 1991.

277
P. Swarztrauber. Multiprocessor FFTs. Parallel Computing, 5:197--210, 1987.

278
D. Tabak. Advanced Multiprocessors. McGraw-Hill, 1991.

279
A. Tantawi and D. Towsley. Optimal load balancing in distributed computer systems. J. ACM, 32(2):445--465, 1985.

280
R. Taylor and P. Wilson. Process-oriented language meets demands of distributed processing. Electronics, Nov. 30, 1982.

281
Thinking Machines Corporation. The CM-2 Technical Summary, 1990.

282
Thinking Machines Corporation. CM Fortran Reference Manual, version 2.1, 1993.

283
Thinking Machines Corporation. CMSSL for CM Fortran Reference Manual, version 3.0, 1993.

284
A. Thomasian and P. F. Bay. Analytic queuing network models for parallel processing of task systems. IEEE Trans. Computs., C-35(12):1045--1054, 1986.

285
E. Tufte. The Visual Display of Quantitative Information. Graphics Press, 1983.

286
J. Ullman. Computational Aspects of VLSI. Computer Science Press, 1984.

287
Building an advanced climate model: Program plan for the CHAMMP climate modeling program. U.S. Department of Energy, 1990. Available from National Technical Information Service, U.S. Dept of Commerce, 5285 Port Royal Rd, Springfield, VA 22161.

288
L. Valiant. A bridging model for parallel computation. Commun. ACM, 33(8):103--111, 1990.

289
R. A. van de Geijn. Efficient global combine operations. In Proc. 6th Distributed Memory Computing Conf., pages 291--294. IEEE Computer Society, 1991.

290
E. F. van de Velde. Concurrent Scientific Computing. Number 16 in Texts in Applied Mathematics. Springer-Verlag, 1994.

291
Y. Wallach. Parallel Processing and Ada. Prentice-Hall, 1991.

292
W. Washington and C. Parkinson. An Introduction to Three-Dimensional Climate Modeling. University Science Books, 1986.

293
R. Williams. Performance of dynamic load balancing algorithms for unstructured mesh calculations. Concurrency: Practice and Experience, 3(5):457--481, 1991.

294
S. Wimer, I. Koren, and I. Cederbaum. Optimal aspect ratios of building blocks in VLSI. In Proc. 25th ACM/IEEE Design Automation Conf., pages 66--72, 1988.

295
N. Wirth. Program development by stepwise refinement. Commun. ACM, 14(4):221--227, 1971.

296
M. Wolfe. Optimizing Supercompilers for Supercomputers. MIT Press, 1989.

297
P. H. Worley. The effect of time constraints on scaled speedup. SIAM J. Sci. and Stat. Computing, 11(5):838--858, 1990.

298
P. H. Worley. Limits on parallelism in the numerical solution of linear PDEs. SIAM J. Sci. and Stat. Computing, 12(1):1--35, 1991.

299
J. Worlton. Characteristics of high-performance computers. In Supercomputers: Directions in Technology and its Applications, pages 21--50. National Academy Press, 1989.

300
X3J3 Subcommittee. American National Standard Programming Language Fortran (X3.9-1978). American National Standards Institute, 1978.

301
J. Yan, P. Hontalas, S. Listgarten, et al. The Automated Instrumentation and Monitoring System (AIMS) reference manual. NASA Technical Memorandum 108795, NASA Ames Research Center, Moffett Field, Calif., 1993.

302
H. Zima, H.-J. Bast, and M. Gerndt. SUPERB: A tool for semi-automatic MIMD/SIMD parallelization. Parallel Computing, 6:1--18, 1988.

303
H. Zima and B. Chapman. Supercompilers for Parallel and Vector Computers. Addison-Wesley, 1991.


© Copyright 1995 by Ian Foster