Newsgroups: comp.parallel
From: adb@cs.bham.ac.uk (Andy D Ben-Dyke)
Subject: Re: Architecture Classification/Description
Organization: University of Birmingham
Date: Mon, 28 Jun 1993 18:19:19 GMT

This is a summary of all the information I've received on the subject of
computer taxonomy.  Many thanks to everyone on comp.parallel who sent me
references, including (in no particular order) Bryan Hunt, Gerd Bleher,
Wolfgang Karl, Walter B. Ligon III (though I haven't had a chance to follow
up the references - I'll post any further info as I get it),
sd@informatics.rutherford.ac.uk, Michial Gunter, and Hugues Leroy.  Special
thanks must go to Judith D. Schlesinger, who supplied me with an excellent
bibliography and some very interesting conversations.  Thanks again, and
I'm very sorry if I've accidentally missed anyone.

I've included a brief review of some of the literature, with a bibliography
at the end of the review.

Thanks again,

Andy.


Architectural Taxonomy - A Brief Review
=======================================

Andy D. Ben-Dyke
June 28, 1993

1 Introduction
==============

Before diving into the specifics of computer taxonomies, let's consider the
driving forces behind their development:

o Enable us to see clearly what has been achieved to date in the field of
  architecture.

o Quickly estimate the suitability of an architecture for solving a given
  problem.

o Such systems have the potential to reveal configurations that may not
  have occurred to designers.

o Performance models can be built that cover a wide range of systems with
  little, or no, modification.

An ideal taxonomy would have the following attributes:

Hierarchical  Starting from the most abstract level, the system should
  enable the architecture to be refined into levels of further detail.
  Closely related architectures should be grouped together until the
  separate machines have been described in very fine detail.

Universal  Each unique computer system should have a unique "signature"
  assigned to it by the taxonomy.

Extendable  The same naming system should be applicable to future
  machines, with minor changes to the underlying conventions.

Concise  To be of any real use, the generated signatures should be as
  short as possible.

The following section describes existing taxonomies, followed by a short
(and highly subjective :-) conclusion, which contains a few suggested
extensions (please take these with a pinch of salt - they're more personal
ramblings than serious proposals).

2 Existing Taxonomies
=====================

2.1 Flynn
---------

Flynn [Fly72] was arguably the first to get the ball rolling, with his
classification system based upon the number of instruction and data
streams that can be processed simultaneously.  The categories are:

SISD (Single Instruction, Single Data) - the classical definition of a
  uniprocessor.

SIMD (Single Instruction, Multiple Data) - vector/array processor.

MISD (Multiple Instruction, Single Data)

MIMD (Multiple Instruction, Multiple Data) - covers the range of
  multiprocessor systems.

Several criticisms were levelled at the system, particularly that it
contained non-existent models (MISD) and was too coarse for classifying
multiprocessor systems.  This resulted in the introduction of the loosely
coupled and tightly coupled categories for MIMD architectures.
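As an aside, the stream-counting nature of Flynn's scheme is simple enough
to write down directly.  The following Python fragment is a minimal,
purely hypothetical sketch - the example machines and their stream counts
are my own illustrative choices, not taken from [Fly72]:

    # Minimal sketch of Flynn's stream-count classification.
    # The example machines below are illustrative guesses only.

    FLYNN = {
        "SISD": "single instruction, single data - classical uniprocessor",
        "SIMD": "single instruction, multiple data - vector/array processor",
        "MISD": "multiple instruction, single data",
        "MIMD": "multiple instruction, multiple data - multiprocessor systems",
    }

    def flynn(instruction_streams, data_streams):
        """Derive the Flynn signature from the two stream counts alone."""
        i = "S" if instruction_streams == 1 else "M"
        d = "S" if data_streams == 1 else "M"
        return i + "I" + d + "D"

    examples = [("uniprocessor", 1, 1),
                ("array processor (1 control unit, 4096 PEs)", 1, 4096),
                ("64-node multicomputer", 64, 64)]

    for name, i, d in examples:
        sig = flynn(i, d)
        print("%-45s %s  (%s)" % (name, sig, FLYNN[sig]))

Johnson's refinement below can be read as splitting the MIMD leaf of this
function by two further attributes: memory organisation and
communication/synchronisation style.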
Johnson [Joh88] further refined the MIMD model by using the memory system
and the communication and synchronisation methods as criteria, as shown
below:

GMSV (Global Memory, Shared Variables) - the classical shared memory
  computer.

GMMP (Global Memory, Message Passing)

DMSV (Distributed Memory, Shared Variables) - a hybrid between the shared
  memory and message passing systems.

DMMP (Distributed Memory, Message Passing) - the classical message passing
  computer model.

2.2 Feng
--------

Feng used the word length of the processing units (n) and the bit slice
length (m - a product of the number of pipelines and their depth) to
classify systems as follows:

WSBS (Word Serial, Bit Serial) - bit serial processing (n = 1; m = 1)

WPBS (Word Parallel, Bit Serial) - bit slice processing (n = 1; m > 1)

WSBP (Word Serial, Bit Parallel) - word slice processing (n > 1; m = 1)

WPBP (Word Parallel, Bit Parallel) - fully parallel (n > 1; m > 1)

2.3 Feustel and Reddi
---------------------

Feustel and Reddi [RF76] argue that an architecture can be viewed as
composed of three components:

Organisation  Independent functional units, pipeline stages, or array
  based.

Control/Flow of Information  Centralised or decentralised, and control
  initiated or data initiated (e.g. interrupts or data flow computers).

Representation/Interpretation/Transformation of Information  The various
  data formats the machine can support and how they are manipulated within
  the system.

Unfortunately, as far as the author is aware, no concrete notation was
developed.

2.4 Händler
-----------

Händler [Hän82] identified three logical levels of parallelism: the
program level (multiple processors), the instruction level (multiple ALUs)
and the word level (multiple bits).  The Erlangen classification system
therefore uses the triple (K, D, W) to represent a machine, where K is the
number of processors, D is the number of ALUs and W is the word length of
each ALU.  On top of this, pipelining can be included (macro-, instruction-
and arithmetic-pipelining respectively), giving rise to (K*K', D*D', W*W'),
where the multipliers are the pipeline depths at each level.

The system also enables representations to be combined using the following
operators:

+ indicates the existence of more than one structure that operates
  independently in parallel.

* indicates the existence of sequentially ordered structures where all
  data is processed through all structures.

v indicates that a certain system may have multiple configurations.

(A small worked sketch of this notation is given after section 2.5.)

The main drawback of this system is the lack of interconnection
information, so all multiprocessors are lumped together.  However, it is
very compact and does convey an idea of machine "size", even if value
based comparisons are meaningless.

2.5 Skillicorn
--------------

Skillicorn introduced the idea of modelling the interconnection networks
that may exist within a system.  Such networks include the processor to
memory, processor to ALU, and processor to processor subsystems.  The
system is therefore represented by the following:

(no. of instruction processors (IP), no. of instruction memories (IM), the
IP to IM network, no. of ALUs (DP), the DP to data memory network, the IP
to DP network, the DP to DP network)

This system is very flexible and, depending on the interconnection
notation, is capable of representing most current systems.  However, it is
a little unwieldy, and is probably best used in combination with Flynn's
system (so only the departures from the base class need to be specified).
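As promised in section 2.4, here is a small Python sketch of the Erlangen
triple and its '+'/'*' combining operators.  It is purely illustrative:
the machine parameters below are invented, and the encoding is my own
reading of the description above rather than anything taken from [Hän82].

    # Sketch of the Erlangen triple (K*K', D*D', W*W') and the '+'
    # (independent parallel structures) and '*' (sequentially ordered,
    # pipelined structures) combining operators from section 2.4.
    # All machine parameters below are invented for illustration only.

    class Erlangen:
        def __init__(self, k, d, w, k_pipe=1, d_pipe=1, w_pipe=1):
            # k: processors, d: ALUs per processor, w: ALU word length;
            # the *_pipe arguments are the pipeline depths at each level.
            self.k, self.d, self.w = k, d, w
            self.kp, self.dp, self.wp = k_pipe, d_pipe, w_pipe

        def __str__(self):
            def field(n, depth):
                return "%d*%d" % (n, depth) if depth > 1 else "%d" % n
            return "(%s, %s, %s)" % (field(self.k, self.kp),
                                     field(self.d, self.dp),
                                     field(self.w, self.wp))

    class Combined:
        """A '+', '*' or 'v' combination of two machine descriptions."""
        def __init__(self, op, left, right):
            self.op, self.left, self.right = op, left, right

        def __str__(self):
            return "%s %s %s" % (self.left, self.op, self.right)

    # A hypothetical scalar host: one processor with a single 32-bit ALU.
    host = Erlangen(k=1, d=1, w=32)

    # A hypothetical attached array unit: 64 ALUs of 16 bits under one
    # control processor, each ALU with a 4-stage arithmetic pipeline.
    array = Erlangen(k=1, d=64, w=16, w_pipe=4)

    # The host streams data through the array unit, so combine with '*':
    print(Combined("*", host, array))   # -> (1, 1, 32) * (1, 64, 16*4)

A Skillicorn-style description would, in the same spirit, simply collect
the seven quantities listed in section 2.5 into a similar record, with the
switching networks named explicitly.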
3 Conclusion
============

Of the systems discussed, Skillicorn's comes closest to the ideal, with a
hybrid of Flynn's and Skillicorn's being the most usable.  One small point
to make, though, is that none of the models covers "intelligent"
interconnection networks capable of broadcasting (except in the SIMD case)
or of combining multiple values headed for the same destination.  For
example, the CM-200 [OR93] can apply arithmetic operations to multiple
values, while the latest Connection Machine is capable of lock stepping
the individual processing elements using an instruction broadcast bus.
Also, the notation for the exact interconnection networks is very loose.
For example, Skillicorn only uses simple one to one and all to all models,
while hierarchical communications systems, such as the ALLCACHE system
found in the KSR-1, are difficult to specify (this detailed information is
needed for dealing with issues such as data or control locality and
communication time modelling).

4 Acknowledgements
==================

Many thanks to everyone on comp.parallel who sent me references, including
(in no particular order) Bryan Hunt, Gerd Bleher, Wolfgang Karl, Walter B.
Ligon III (though I haven't had a chance to follow up the references -
I'll post any further info as I get it), sd@informatics.rutherford.ac.uk,
Michial Gunter, and Hugues Leroy.  Special thanks must go to Judith D.
Schlesinger, who supplied me with an excellent bibliography and some very
interesting conversations.  Thanks again, and I'm very sorry if I've
accidentally missed anyone.

References
----------

[Fly72] M. J. Flynn.  Some computer organizations and their effectiveness.
  IEEE Trans. Computers, C-21(9):948-960, September 1972.

[Hän82] Wolfgang Händler.  Innovative computer architecture - how to
  increase parallelism but not complexity.  In David J. Evans, editor,
  Parallel Processing Systems - An Advanced Course, pages 1-42.  Cambridge
  University Press, 1982.

[Joh88] Eric E. Johnson.  Completing an MIMD multiprocessor taxonomy.
  Computer Architecture News, 16(3):44-47, June 1988.

[OR93] Joel Oren and Gowri Ramanathan.  Survey of commercial parallel
  machines, 1993.

[RF76] S. S. Reddi and E. A. Feustel.  A conceptual framework for computer
  architecture.  Computing Surveys, 8(2):277-300, June 1976.

Bibliography
============

@book{BeN71,
  author = "Bell, C. G. and Newell, A.",
  title = "Computer Structures: Readings and Examples",
  publisher = "McGraw-Hill",
  year = 1971
}

@techreport{Bre73,
  author = "Bredt, T. H.",
  title = "A Survey of Models for Parallel Computing",
  institution = "Stanford University Computer Systems Laboratory",
  month = "December",
  year = 1973,
  number = "CSL-TR-70-8"
}

@inproceedings{Fen72,
  author = "Feng, T.-Y.",
  title = "Some Characteristics of Associative/Parallel Processing",
  booktitle = "Proceedings of the 1972 Sagamore Computer Conference",
  month = "August",
  year = 1972,
  pages = "5--16"
}

@article{Fly66,
  author = "Flynn, M. J.",
  title = "Very High-Speed Computing Systems",
  journal = "Proceedings of the IEEE",
  volume = 54,
  number = 12,
  month = "December",
  year = 1966,
  pages = "1901--1909"
}

@Article{Flynn:Taxonomy,
  author = {M. J. Flynn},
  title = {Some Computer Organizations and Their Effectiveness},
  journal = {IEEE Trans. Computers},
  volume = {C-21},
  number = {9},
  year = {1972},
  month = {September},
  pages = {948--960},
  anote = {This is the paper that first addressed the issues of computer
           taxonomy.  The resulting system is very coarse and even
           contains some redundant classes.
           However, it is one of the few systems that has been widely
           accepted.},
  got = {no},
}

@inproceedings{Fly72,
  author = "Flynn, M. J.",
  title = "Toward More Efficient Computer Organizations",
  booktitle = "Spring Joint Computer Conference",
  year = 1972,
  pages = "1211--1217"
}

@inproceedings{Gil81,
  author = "Giloi, W. K.",
  title = "A Complete Taxonomy of Computer Architecture Based on the
           Abstract Data Type View",
  booktitle = "IFIP Workshop on Taxonomy in Computer Architecture",
  month = "June",
  year = 1981,
  pages = "19--38"
}

@inproceedings{Gol78,
  author = "Goldschlager, L. M.",
  title = "A Unified Approach to Models of Synchronous Parallel Machines",
  booktitle = "The 10th Annual ACM Symposium on Theory of Computing",
  month = "May",
  year = 1978,
  organization = "ACM",
  pages = "89--94"
}

@inproceedings{Han77,
  author = "H\"{a}ndler, W.",
  title = "The Impact of Classification Schemes on Computer Architecture",
  booktitle = "1977 International Conference on Parallel Processing",
  month = "August",
  year = 1977,
  pages = "7--15"
}

@inproceedings{Han81,
  author = "H\"{a}ndler, W.",
  title = "Standards, Classification, and Taxonomy: Experiences with ECS",
  booktitle = "IFIP Workshop on Taxonomy in Computer Architecture",
  month = "June",
  year = 1981,
  pages = "39--75"
}

@InCollection{Handler:Taxonomy,
  author = {Wolfgang H\"{a}ndler},
  title = {Innovative Computer Architecture - How to Increase Parallelism
           but not Complexity},
  booktitle = {Parallel Processing Systems - An Advanced Course},
  publisher = {Cambridge University Press},
  editor = {David J. Evans},
  year = {1982},
  pages = {1--42},
  anote = {Describes the Erlangen classification system, based on a tuple
           (K*K', D*D', W*W') where K is the number of processing
           elements, D is the number of ALUs and W is the word length of
           the ALUs.  X' indicates the level of pipeline parallelism in
           each separate system (K => macropipelining, D => instruction
           pipelining, W => arithmetic pipelining).  Operators +, * and v
           are defined (+ => parallel processing, * => pipelining and
           v => multiple configurations of the base system).  The paper
           then details a home-brewed architecture before discussing some
           very broad issues (with a vague relevance to parallelism).},
  got = {yes},
}

@techreport{Hoa81,
  author = "Hoare, C. A. R.",
  title = "A Model for Communicating Sequential Processes",
  institution = "Oxford University Programming Research Group",
  year = 1981,
  number = "PRG-22"
}

@inbook{HoJ88,
  author = "Hockney, R. W. and Jesshope, C. R.",
  title = "Parallel Computers 2",
  chapter = "1.2",
  pages = "53--81",
  publisher = "Adam Hilger",
  year = 1988,
  note = "Review of other taxonomies; introduction of ASN"
}

@Book{Hwang:Briggs,
  author = {Kai Hwang and Faye A. Briggs},
  title = {Computer Architecture and Parallel Processing},
  year = {1985},
  publisher = {McGraw-Hill International},
  anote = {The standard reference book on parallel architectures.},
  got = {yes},
}

@Article{MIMD:Taxonomy,
  author = {Eric E. Johnson},
  title = {Completing an {MIMD} Multiprocessor Taxonomy},
  journal = {Computer Architecture News},
  year = {1988},
  volume = {16},
  number = {3},
  month = {June},
  pages = {44--47},
  anote = {Extends Flynn's classification of MIMD style multiprocessors by
           classifying the memory hierarchy into global and distributed,
           as well as partitioning communication/synchronisation into
           shared variable or message passing based.
           Therefore, GMSV = classic tightly coupled system,
           DMMP = transputer, DMSV = hybrid.},
  got = {yes},
}

@incollection{Kuc80,
  author = "Kuck, D. J.",
  title = "High Speed Machines and Their Compilers",
  booktitle = "Parallel Processing Systems---An Advanced Course",
  editor = "Evans, D. J.",
  pages = "193--214",
  publisher = "Cambridge University Press",
  year = 1980
}

@article{Kun80,
  author = "Kung, H. T.",
  title = "The Structure of Parallel Algorithms",
  journal = "Advances in Computers",
  volume = 19,
  year = 1980
}

@incollection{LaL90,
  author = "Lamport, L. and Lynch, N.",
  title = "Distributed Computing: Models and Methods",
  booktitle = "Handbook of Theoretical Computer Science",
  editor = "Van Leeuwen, J.",
  volume = "B",
  chapter = 18,
  pages = "1157--1199",
  publisher = "Elsevier Science Publishers, B.V.",
  year = 1990
}

@inproceedings{LiS90,
  author = "Lin, C. and Snyder, L.",
  title = "A Comparison of Programming Models for Shared Memory
           Multiprocessors",
  booktitle = "1990 International Conference on Parallel Processing",
  month = "August",
  year = 1990,
  volume = "II",
  pages = "163--170"
}

@inproceedings{Sny88,
  author = "Snyder, L.",
  title = "A Taxonomy of Synchronous Parallel Machines",
  booktitle = "1988 International Conference on Parallel Processing",
  year = 1988,
  pages = "281--285"
}

@inproceedings{Sny85,
  author = "Snyder, L.",
  title = "An Inquiry into the Benefits of Multigauge Parallel
           Computation",
  booktitle = "1985 International Conference on Parallel Processing",
  year = 1985
}

@article{MPB91,
  author = {Martin, B. E. and Pedersen, C. H. and Bedford-Roberts, J.},
  title = {An Object-Based Taxonomy for Distributed Computing Systems},
  journal = {Computer},
  volume = {24},
  number = {8},
  month = {August},
  year = {1991},
  pages = {17--27},
  anote = {Describes how distributed computing systems can be classified
           using an object-oriented approach.  The system is first broken
           down into three main objects: threads, properties and
           separation.  These are then discussed individually, with the
           subcomponents being as follows: threads (creation, control and
           max no.), object properties (replicated, protected,
           persistent), and separation (identification, comms, failure,
           migration).}
}

@article{MaM87,
  author = "Martin, J. L. and Mueller-Wichards, D.",
  title = "Supercomputer Performance Evaluation: Status and Directions",
  journal = "The Journal of Supercomputing",
  volume = 1,
  year = 1987,
  pages = "87--104"
}

@article{Mil73,
  author = "Miller, R. E.",
  title = "A Comparison of Some Theoretical Models of Parallel
           Computation",
  journal = "IEEE Transactions on Computers",
  volume = "C-22",
  number = 8,
  month = "August",
  year = 1973,
  pages = "710--717"
}

@Article{Taxonomy:PAWS,
  author = {Pease, D. and others},
  title = {{PAWS}: A Performance Evaluation Tool for Parallel Computing
           Systems},
  journal = {Computer},
  volume = {24},
  number = {1},
  month = {January},
  year = {1991},
  pages = {18--29},
  anote = {PAWS (parallel assessment window system) is a system for
           performing machine evaluation and comparisons.  It consists of
           two front end stages: the application characterisation and the
           machine characterisation.  The application tool determines,
           using data analysis, the order of execution and the granularity
           of the application, before converting from the specification
           language (APL) to IF1 (intermediate form 1 - originally
           developed as the target language for SISAL).  The machine is
           then modelled using the number and flexibility of the
           functional units, the number of processors, the memory
           bandwidth and hierarchy, and the type of interconnection
           network as the basis for comparison.  Then each processor is
           described in greater depth, almost to the sub-component
           level.},
  got = {In library},
}

@article{Pra76,
  author = "Pratt, V. R.
            and Stockmeyer, L. J.",
  title = "A Characterization of the Power of Vector Machines",
  journal = "Journal of Computer and System Sciences",
  volume = 12,
  year = 1976,
  pages = "198--221"
}

@Article{Reddi:Taxonomy,
  author = {S. S. Reddi and E. A. Feustel},
  title = {A Conceptual Framework for Computer Architecture},
  journal = {Computing Surveys},
  year = {1976},
  volume = {8},
  number = {2},
  month = {June},
  pages = {277--300},
  anote = {Architectures can be viewed as composed of three components:
           physical organisation (modular, pipelined, array + memory
           organisation), control and flow of information (control/data
           driven, centralised/decentralised) and the representation/
           interpretation/transformation of information.  Several examples
           are used to illustrate these points, but no formal
           classification is developed, leaving you wondering why they
           bothered.},
  got = {In library, classmark QA76.A1C68},
}

@techreport{SSS83,
  author = "Siegel, L. J. and others",
  title = "Distributed Computing for Signal Processing: Modeling of
           Asynchronous Parallel Computation",
  institution = "Purdue University School of Electrical Engineering",
  year = 1983,
  note = "Chapter 2: Modeling Architectures: Classification Schemes"
}

@article{Sho73,
  author = "Shore, J. E.",
  title = "Second Thoughts on Parallel Processing",
  journal = "Computers and Electrical Engineering",
  volume = 1,
  year = 1973,
  pages = "95--109",
  publisher = "Pergamon Press"
}

@Article{Skill:Taxonomy,
  author = {D. B. Skillicorn},
  title = {A Taxonomy for Computer Architectures},
  journal = {Computer},
  volume = {21},
  number = {11},
  month = {November},
  year = {1988},
  pages = {46--57},
  anote = {Gives a brief summary of existing taxonomies, then discusses
           the uses of such classifications (understanding what has been
           achieved, revealing unobvious potential configurations, and
           building general-purpose performance models).  Then describes
           the use of a hierarchical system, with model of computation,
           abstract machine and machine implementation being the proposed
           levels needed.  Then describes how the majority of computers
           can be modelled using the following parameters: no. of CPUs,
           no. of ALUs, CPU->ALU SN, CPU->Mem SN, ALU->ALU memory SN,
           ALU->ALU SN and the no. of CPU memories.  The SNs (switching
           networks) are characterised by the no. of elements to no. of
           elements, e.g. 1-1, n-n, nxn.  A list of approximately 20
           classes (many unnamed) is presented, covering a very wide range
           of current machine models.  Says that further refinements can
           be made by describing the pipelining of the various components
           in the system (i.e. CPU and ALU) along with their associated
           depths (similar to H\"{a}ndler's system).  Finishes by
           describing how non-von Neumann architectures can be modelled
           (ALICE and Flagship?).},
  got = {yes},
}

@Article{Skill:Indep,
  author = {David B. Skillicorn},
  title = {Architecture-Independent Parallel Computation},
  journal = {Computer},
  year = {1990},
  volume = {23},
  number = {12},
  pages = {38--50},
  anote = {The paper reviews some of the standard parallel architectures
           and then goes on to propose a new model for realising AIPC -
           the {B}ird-{M}eertens formalism.  This method is briefly
           described and then compared with other existing techniques.},
  got = {yes},
}

@article{Val90,
  author = "Valiant, L. G.",
  title = "A Bridging Model for Parallel Computation",
  journal = "CACM",
  volume = 33,
  number = 8,
  month = "August",
  year = 1990,
  pages = "103--111"
}