WoTUG - The place for concurrent processes

Refer Proceedings details


%T A Tool for Proving Deadlock Freedom
%A Jeremy M. R. Martin, S. A. Jassim
%E André W. P. Bakkers
%B Proceedings of WoTUG\-20: Parallel Programming and Java
%X We describe a tool, programmed in Java, for the formal
   verification of the absence of deadlock and livelock in
   networks of CSP processes. The innovative techniques used
   scale well to very large networks, unlike the exhaustive
   state checking method employed by existing tools.


%T Scriptic: Parallel Programming in Extended Java
%A André van Delft
%E André W. P. Bakkers
%B Proceedings of WoTUG\-20: Parallel Programming and Java
%X Scriptic is an expression based extension to the Java
   programming language, targeted at user interfaces,
   simulations and parallel computing on shared memory systems.
   The extras are mainly founded on the theory of Process
   Algebra: constructs for non\-deterministic choice,
   parallelism and communica\-tion. By default, these parallel
   constructs have interleaving seman\-tics, rather than
   multi\-threading or forking. Specific Java code fragments
   may run in their own threads or handle events from the
   windowing system. This makes interactive applications such
   as arcade games execute as fast as corresponding plain Java
   versions. GUI components such as buttons and menu items are
   enabled and disabled when applicable, without additional
   programming. This paper covers an example application in
   Scriptic, an overview of the language constructs, the
   implementation, originality, previous work and current work.


%T Higher\-Order Concurrency in Java
%A Erik D. Demaine
%E André W. P. Bakkers
%B Proceedings of WoTUG\-20: Parallel Programming and Java
%X In this paper we examine an extension to Hoare\[rs]s
   Communicating Sequential Processes model called
   higher\-order concurrency, proposed by Reppy. In this
   extension, communication algorithms (or events) are
   first\-class objects and can be created and manipulated
   dynamically. In addition, threads are automatically garbage
   collected and channels are first\-class, that is, they can
   be passed over other channels. We describe the design of a
   Java package that implements the main features of
   higher\-order concurrency, with similar ease\-of\-use to
   Reppy\[rs]s Concurrent ML system. Our implementation can be
   easily extended to use a distributed system, which is a
   major limitation with Concurrent ML. We also hope to bring
   the idea of higher\-order concurrency to a wider audience,
   since it is extremely powerful and flexible, but currently
   only well known to the programming\-languages community.


%T Communicating Java Threads
%A Gerald H. Hilderink, Jan F. Broenink, Wiek Vervoort, André W. P. Bakkers
%E André W. P. Bakkers
%B Proceedings of WoTUG\-20: Parallel Programming and Java
%X The incorporation of multithreading in Java may be
   considered as a significant part of the Java language,
   because it provides rudimentary facilities for concurrent
   programming. However, we belief that the use of channels is
   a fundamental concept for concurrent programming. The
   channel approach as described in this paper is a realization
   of a systematic design method for concurrent programming in
   Java based on the CSP paradigm. CSP requires the
   availability of a Channel class and the addition of
   composition constructs for sequential, parallel and
   alternative processes. The Channel class and the constructs
   have been implemented in Java in compliance with the
   definitions in CSP. As a result, implementing communication
   between processes is facilitated, the programmer can avoid
   deadlock more easily, and the programmer is freed from
   synchronization and scheduling constructs. The use of the
   Channel class and the additional constructs is illustrated
   in a simple application.


%T Beyond transputing : fully distributed semantics in Virtuoso\[rs]s Virtual Single Processor programming model and it\[rs]s implementation on of\-the\-shelf parallel DSPs.
%A Eric Verhulst
%E André W. P. Bakkers
%B Proceedings of WoTUG\-20: Parallel Programming and Java
%X Virtuoso Classico /VSP is a fully distributed real\-time
   operating system originally developed on the INMOS
   transputer. Its generic architecture is based on a small but
   very fast nanokernel and a portable preemptive microkernel.
   It was further on ported in single and virtual single
   processor implementations to a wide range of processors. As
   the basis of any real\-time application is a scheduler that
   allows the developer to use the minimum schedule to satisfy
   the real\-time requirements, a number of derived Virtuoso
   tools have been developed with complementary
   functionalities. This paper describes the rationale for
   developing the distributed semantics of Virtuoso\[rs]s
   microkernel and describes some of the implementation issues.
   The analysis is based on the parallel DSP implementations as
   these push the performance limits most for hard real\-time
   applications. The Virtuoso tools are being ported and
   further developed in the DIPSAP\-II and EURICO OMI/Esprit
   projects.


%T Reconfigurable Computing
%A Roger Gook
%E André W. P. Bakkers
%B Proceedings of WoTUG\-20: Parallel Programming and Java
%X The name may be familiar of old to the WoTUG community, but
   it has now been adopted by one of the fastest growing
   sectors of the silicon industry. Reconfigurable Computers
   are computing systems whose hardware architecture can be
   modified by software to suit the application at hand. The
   core component is the FPGA. Remarkable performance gains are
   achieved by placing an algorithm in an FPGA for embedded
   applications, compared with using a microprocessor or DSP.
   This is because an FPGA takes advantage of hardware
   parallelism while reducing the timing overheads needed for
   general\-purpose microprocessor applications. For example
   the time taken by load/store operations and instruction
   decoding can be eliminated. Reconfiguration enables the FPGA
   to provide a problem specific computer for highly optimised
   application performance. Just as high level programming
   languages liberated the first microprocessors programming
   languages will liberate the FPGA. The first of these
   languages to become commercially available is Handel\-C.
   Handel\-C is based on the CSP Model; it was developed by the
   Hardware Compilation Group at the University of Oxford and
   is to be marketed by ESL. The Handel\-C tools enable a
   software engineer to target directly FPGAs in a similar
   fashion to classical microprocessor cross\-compiler
   development tools, without recourse to a Hardware
   Description Language. Thereby allowing the software engineer
   to directly realise the raw real\-time processing capability
   of the FPGA. The skills and expertise gained by the WoTUG,
   provide the group with a competitive advantage to develop
   the innovative algorithms, applications and products in this
   domain.


%T An Open Systems Strategy for Distributed occam Execution
%A Paul Singleton, Barry M. Cook
%E André W. P. Bakkers
%B Proceedings of WoTUG\-20: Parallel Programming and Java
%X We demonstrate the feasibility of distributed execution of
   occam programs within computer networks which support open
   systems standards for inter\-process communication, remote
   execution and program hosting (e.g. hardware\-independent
   programming languages and operating\-system\-independent
   APIs). We first propose a general architecture, and then
   describe a constructive proof of its viability (i.e. a
   prototype). which uses a novel multiplexed virtual channel
   protocol over UNIX sockets. Finally we summarise some of the
   opportunities for further development which this project has
   created.


%T Higher Levels of Process Synchronisation
%A Peter H. Welch, David C. Wood
%E André W. P. Bakkers
%B Proceedings of WoTUG\-20: Parallel Programming and Java
%X Four new synchronisation primitives (SEMAPHOREs, RESOURCEs,
   EVENTs and BUCKETs) were introduced in the KRoC 0.8beta
   release of occam for SPARC (SunOS/Solaris) and Alpha (OSF/1)
   UNIX workstations. This paper reports on the rationale,
   application and implementation of two of these (SEMAPHOREs
   and EVENTs). Details on the other two may be found on the
   web. The new primitives are designed to support
   higher\-level mechanisms of SHARING between parallel
   processes and give us greater powers of expression. They
   will also let greater levels of concurrency be safely
   exploited from future parallel architectures, such as those
   providing (virtual) shared\-memory. They demonstrate that
   occam is neutral in any debate between the merits of
   message\-passing versus shared\-memory parallelism, enabling
   applications to take advantage of whichever paradigm (or
   mixture of paradigms) is the most appropriate. The new
   primitives could be (but are not) implemented in terms of
   traditional channels, but only at the expense of increased
   complexity and computational overhead. The primitives are
   immediately useful even for uni\-processors \-\- for
   example, the cost of a fair ALT can be reduced from O(n) to
   O(1). In fact, all the operations associated with new
   primitives have constant space and time complexities; and
   the constants are very low. The KRoC release provides an
   Abstract Data Type interface to the primitives. However,
   direct use of such mechanisms still allows the user to
   misuse them. They must be used in the ways prescribed below
   else else their semantics become unpredictable. No tool is
   provided to check correct usage at this level. The intention
   is to bind those primitives found to be useful into higher
   level versions of occam. Some of the primitives (e.g.
   SEMAPHOREs) may never themselves be made visible in the
   language, but may be used to implement bindings of
   higher\-level paradigms (such as SHARED channels and
   BLACKBOARDs). The compiler will perform the relevant usage
   checking on all new language bindings, closing the security
   loopholes opened by raw use of the primitives. The paper
   closes by relating this work with the notions of virtual
   transputers, microcoded schedulers, object orientation and
   Java threads.


%T Prefetch Data Management for Parallel Particle Tracing
%A Jonathan Tidmus, Roger Miles, Alan G. Chalmers
%E André W. P. Bakkers
%B Proceedings of WoTUG\-20: Parallel Programming and Java
%X The particle tracing method uses a stochastic approach for
   global illumination computation of three\-dimensional
   environments. As with many graphics techniques the
   computation associated with the image generation is complex.
   Parallel processing offers the potential of solving the
   computational complex particle tracing more rapidly.
   Distributed memory parallel systems are scalable and readily
   available. However, large environmental models are often
   bigger than individual node storage capabilities requiring
   data management to distribute and control the movement of
   environmental data as computation proceeds. Prefetch data
   management attempts to reduce idle time associated with
   remote data fetches by anticipating the latency and
   requesting required data items prior to their actual use
   during computation. This paper demonstrates how attention to
   work division and supply coupled with prefetch data
   management can be utilised to minimise overheads associated
   with a parallel implementation and reduce overall image
   synthesis time.


%T WEAVE: A System for Dynamic Configuration of Virtual Links
%A S. R. Harrison, Chris R. Brown
%E André W. P. Bakkers
%B Proceedings of WoTUG\-20: Parallel Programming and Java
%X This paper describes Weave, a system which has been
   developed to support the use of a DS Link (IEEE 1355) based
   parallel computer architecture. Weave is an extension layer
   on IIPC (a simple transputer and UNIX parallel processing
   environment which provides dynamic process management,
   hardware transparency and message passing.) Although Weave
   is suitable for any T9000 Transputer network, it has been
   specifically designed to support the use of the AIVRU DS
   Link Vision Engine. A brief description is given of this
   machine, which processes live digital video using a mixture
   of hardware modules and software processes, all
   interconnected by DS Links. Weave provides the ability to
   make virtual link connections between processes on demand at
   run\-time. These connections may be disconnected when no
   longer required, and hence the whole hardware architecture
   is dynamically reconfigured automatically to suit the
   requirements of the software application. A small functional
   interface provides processes with the ability to alter their
   own connectivity, and that of other processes. A temporal
   locking mechanism for each virtual link controls when it may
   be disconnected, and when pending connection requests can be
   fulfilled. This locking mechanism is driven by the action of
   communication over the virtual link. The Weave system
   supports transparently, the creation and destruction of
   connections between software processes and the image
   processing hardware modules (and between hardware modules
   directly). Also provided transparently by Weave is support
   for the use of hardware message replicator module(s) that
   multicast virtual link data to any number of DS Link
   recipients.


%T Data\-Strobe Links and Virtual Channel Processors
%A Barry M. Cook
%E André W. P. Bakkers
%B Proceedings of WoTUG\-20: Parallel Programming and Java
%X Data\-strobe links and message transfer protocols as used by
   the SGS\-Thomson T9000 processor are described, as are the
   essential characteristics of the supporting virtual channel
   processor. A method of providing the same functionality
   without the use of a T9000 is suggested and illustrated by a
   T225 processor design using a software virtual channel
   processor and minimal supporting hardware. Finally,
   differences between the international standard, IEEE 1355,
   and the T9000 links from which it was derived are described
   and the implications for virtual channel links highlighted.


%T occam for Multi\-Processor DEC Alphas
%A Peter H. Welch, Michael D. Poole
%E André W. P. Bakkers
%B Proceedings of WoTUG\-20: Parallel Programming and Java
%X A multi\-processor implementation of occam2.1 for
   interconnected DEC Alpha processors has been derived from
   the Kent Retargetable occam Compiler. Each Alpha processor
   is supported over a PCI bus by a T425 transputer, whose
   links complete the communications fabric. This paper reports
   on the software mechanisms for supporting these platforms
   from occam so that they appear just like any transputer
   system \-\- a collection of processing nodes connected by
   channels placed on links. Advantage was taken of a
   proprietary multi\-threading kernel, supplied as part of 3L
   Parallel C/AXP, to support parallel inter\-node
   communication. occam multi\- processing is supported by the
   KRoC kernel running within one of the 3L threads. The
   performance of generated code and networked systems has been
   benchmarked, with particular care being taken to measure the
   interaction overheads between the Alpha and its
   communication fabric. An image analysis program was also
   used in the benchmarking as an example of a real
   multi\-processor application.


%T A tool for optimisation of program execution in dynamic topology systems
%A Tomasz Kalinowski
%E André W. P. Bakkers
%B Proceedings of WoTUG\-20: Parallel Programming and Java
%X In this paper, we present a tool for optimisation of
   execution of parallel programs in distributed memory
   multi\-processor systems with dynamic interconnection
   networks. The programs are described as Directed Acyclic
   Graphs (DAGs). The tool allows to compare simulated
   execution times for different task scheduling heuristics,
   target system topologies and communication models. A list
   scheduling algorithm, which has been applied, accounts for
   dynamic changes of interconnection structure. We demonstrate
   the efficiency of dynamic networks by comparing schedules
   obtained for dynamic and fixed topology systems. We propose
   a method of validating simulation results in a target system
   composed of T9000 transputers. The method relies on
   comparison of simulation results with execution times of
   synthetic OCCAM applications in the target system. The
   comparison indicates that assumptions taken on program
   execution and system model hold in the system under
   investigation.


%T The Design of JET: A Java Library for Embarrassingly Parallel Applications
%A Luis M. Silva, Hernâni Pedroso, João Gabriel Silva
%E André W. P. Bakkers
%B Proceedings of WoTUG\-20: Parallel Programming and Java
%X JET is a parallel virtual machine. It has a dynamic number
   of processors which may run different proprietary operating
   systems. The processors communicate through the slowest
   intercommunication network of the world, do not provide peak
   performance and in the overall the parallel machine can be a
   very unstable computing surface. In other words, JET uses
   the idle CPU cycles of computers that are connected to the
   Internet, being a really inexpensive supercomputer. This
   paper presents the design of a Java parallel library that
   provides support for the execution of embarrassingly
   parallel applications. It inherits the security, robustness
   and portability features of Java and includes support for
   fault\-tolerance, scalability and high\-performance through
   the use of parallelism.


%T Fine\-grained global control constructs for parallel programming environments
%A Marek Tudruj
%E André W. P. Bakkers
%B Proceedings of WoTUG\-20: Parallel Programming and Java
%X Problems of evolved control in fine\-grained parallel
   programs in distributed memory systems are discussed in the
   paper. Global control constructs are proposed which
   logically bind program modules, assign them to worker
   processors and define the involved flow of control.
   Implementation methods are discussed which assume control
   flow processing decoupled from data processing inside
   executive modules. The proposed constructs are extensions of
   the OCCAM 2 language. They can be incorporated into an
   intermediate code generated by a parallel language compiler
   or can be used by a programmer to define control flow
   between fine\-grained program modules assigned to different
   processors. Architectural requirements for efficient
   implementation of the proposed control constructs are
   discussed.


%T Compile\-Time Techniques for Mapping Loop Parallelism
%A R. Sakellariou
%E André W. P. Bakkers
%B Proceedings of WoTUG\-20: Parallel Programming and Java
%X A synchronous extension to the library model for message
   passing (Inmos C, PVM, Parmacs, MPI, etc.) is presented.
   This extension, provides a comfortable expression of nested
   parallelism from inside the message passing model.
   Furthermode of being a valuable tool for the presentation
   and teaching of parallel algorithms, the computation results
   prove that an efficiency similar to or even bettern tahn the
   one obtained designing and implementing algorithms using the
   native language can be achieved.


%T Expanding the Message Passing Library Model with Nested Parallelism
%A C. Rodriguez, F. Sande, C. León, F. Garcia
%E André W. P. Bakkers
%B Proceedings of WoTUG\-20: Parallel Programming and Java
%X A synchronous extension to the library model for message
   passing (Inmos C, PVM, Parmacs, MPI, etc.) is presented.
   This extension, provides a comfortable expression of nested
   parallelism from inside the message passing model.
   Furthermore of being a valuable tool for the presentation
   and teaching of parallel algorithms, the computational
   results prove that an efficiency similar to or even better
   than the one obtained designing and implementing algorithms
   using the native language can be achieved.


%T Dynamic Process Interaction
%A Lajos Schrettner, Innes Jelly
%E André W. P. Bakkers
%B Proceedings of WoTUG\-20: Parallel Programming and Java
%X This paper concerns with the design of a building block for
   parallel and distributed software systems. We start with a
   very common problem of process interaction and successively
   derive a building block that can be used to construct
   systems that are correct by construction. When copies of
   this building block are connected to each other in an
   arbitrary fashion, the resulting system is deadlock/livelock
   free. An application is outlined where this method was used.
   We also would like to stress that the method of successive
   refinements used in this paper seems to be a fruitful
   approach in the design of protocols.


%T The Macram\[`e] 1024 Node Switching Network
%A S. Haas, D. A. Thornley, M. Zhu, R. W. Dobinson, B. Martin
%E André W. P. Bakkers
%B Proceedings of WoTUG\-20: Parallel Programming and Java
%X To date, practical experience in constructing switching
   networks using IEEE 1355 technology has been confined to
   relatively small systems and there are no experimental
   results on how the performance of such systems will scale up
   to several hundred or even several thousand nodes. Some
   theoretical studies have been carried out for large networks
   of up to one thousand nodes for different topologies. We
   present results obtained on a large modular testbed using
   100 Mbits/s point to point DS links. One thousand nodes will
   be interconnected by a switching fabric based on the 32 way
   STC104 packet switch. The system has been designed and
   constructed in a modular way to allow a variety of different
   network topologies to be investigated (Clos, grid, torus,
   etc.). Network throughput and latency are being studied for
   various traffic conditions as a function of the topology and
   network size. Results obtained with the current 656 node
   setup are presented.


%T Java Threads in Light of occam/CSP (Tutorial)
%A Peter H. Welch
%E André W. P. Bakkers
%B Proceedings of WoTUG\-20: Parallel Programming and Java
%X Java provides support for parallel computing through a model
   that is built into the language itself. However, the
   designers of Java chose to be fairly conservative and
   settled for the contepts of threads and monitors. Monitors
   were developed by Tony Hoare in the early 1970s as a
   structued way of using semaphores to control access to
   shared resources. Hoare moved away from this, in the late
   1970s, to develop the theory of Communicating Processes
   (CSP). One reason for this was that the semantics of
   monitors and threads are not WYSIWIG, so that designing
   robust parallel algorithms at this level is seriously hard.
   Fortunately, it is possible to introduce the CSP model into
   Java through sets of classes implemented on top of its
   monitor support. By restricting interaction between active
   Java objects to CSP synchronisation primitives, Jav thread
   semantics become compositional and systems with arbitrary
   levels of complexity become possible. Multi\-threaded Web
   applets and distributed applications become simpler to
   design and implement, race hazards never occured,
   difficulties such as starvation, deadlock and livelock are
   easier to confront and overcome, and performance is no worse
   than that obtained from directly using the raw monitor
   primitives. The advantages of teaching parallelism in Java
   purely through the CSP class libraries will be discussed.
   (These libraries were developed jointly at Kent and Oxford
   Universities in the UK and the University of Twente in the
   Netherlands.)


%T Communicating Java Threads Reference Manual
%A Gerald H. Hilderink
%E André W. P. Bakkers
%B Proceedings of WoTUG\-20: Parallel Programming and Java
%X This document describes a csp package that contains the
   channel and composition classes in Java according to the CSP
   paradigm. The csp classes form a complete package that
   contains the necessary ingredients for concurrent
   programming with channels in Java. The channel and
   composition concepts are derived from CSp and are developed
   according to the object\-oriented paradigm. There is a clear
   pattern of concerns by means of object\-oriented techniques
   using inheritance, delegation, genericity , and
   polymorphism. This document is meant as a reference manual
   and gives background information about the realization of
   the csp classes. The use of the CSP channels in Java is
   illustrated by means of using building blocks.


%T A Multiprocessor OCCAM Development System for UNIX Network Clusters
%A D. G. Patrick, P. R. Green, T. A. York
%E André W. P. Bakkers
%B Proceedings of WoTUG\-20: Parallel Programming and Java
%X This paper describes the OCCNIX multiprocessor environment,
   that enables OCCAM program development and testing, on
   clusters of UNIX workstations. Linked binary level
   interpreters form a virtual Transputer network that uses the
   TCP/IP client\-server model and provides hardware
   independent multiple platform access in a similar way to the
   recently released JAVA. Results show a single processor
   performance that is half that of an actual Transputer and a
   four\-processor speedup of 0.78. The system also has the
   ability to run development tools such as the ISPY network
   debugger.


%T How to Design Deadlock\-Free Networks Using CSP and Verification Tools \-\- A Tutorial Introduction
%A Jeremy M. R. Martin, S. A. Jassim
%E André W. P. Bakkers
%B Proceedings of WoTUG\-20: Parallel Programming and Java
%X The CSP language of C.A.R. Hoare originated as a blackboard
   mathematical notation for specifying and reasoning about
   parallel and distributed systems. More recently
   sophisticated tools have emerged which provide automated
   verification of CSP\-specified systems. This has led to a
   tightening and standardisation of syntax. This paper
   outlines the syntax and semantics of CSp as it is now used
   and then describes how to design CSP networks, which are
   guaranteed to be free of deadlock, throught a succession of
   increasingly complex worked examples, making use of the
   verification tool Deadlock Checker.


If you have any comments on this database, including inaccuracies, requests to remove or add information, or suggestions for improvement, the WoTUG web team are happy to hear of them. We will do our best to resolve problems to everyone's satisfaction.

Copyright for the papers presented in this database normally resides with the authors; please contact them directly for more information. Addresses are normally presented in the full paper.

Pages © WoTUG, or the indicated author. All Rights Reserved.
Comments on these web pages should be addressed to: www at wotug.org

Valid HTML 4.01!