Communicating Process Architectures - 2002
The University of Reading (England)
Sunday 15th. September (evening) through Wednesday 18th. September (lunchtime)
[Note: the categories listed below are not definitive - many
papers fit into several]
How to SPIN (to analyse real-life CPAs)
Theo C. Ruys (University of Twente, The Netherlands)
Abstract: SPIN is a model checker for the verification of software systems.
During the last decade, SPIN has been successfully applied to trace
logical design errors in distributed systems, such as operating
systems, data communications protocols, switching systems,
concurrent algorithms, railway signaling protocols, etc.
SPIN checks the logical consistency of a specification; it reports on
deadlocks, unspecified receptions, flags incompleteness, race
conditions, and unwarranted assumptions about the relative speeds
of processes. SPIN is considered to be one of the most powerful
and advanced model checkers (freely) available today. SPIN is
widely distributed and has a large user base.
Moreover, last year, SPIN received the prestigious ACM System Award.
SPIN uses a high level language called PROMELA (PROcess MEta LAnguage)
to specify systems descriptions.
The tutorial gives a crash-course introduction to PROMELA and
presents an overview of the validation and verification features
of SPIN. The material will be illustrated by several demonstrations
using XSpin, the graphical user interface to SPIN.
The second part of the tutorial discusses guidelines to
construct efficient PROMELA models and shows how to use SPIN in
the most effective way. Topics to be discussed include: SPIN's
optimisation algorithms, directives and options to tune SPIN,
aggressive PROMELA modelling, literate modelling, using SPIN as
a debugger, validation management and rules of thumb.
CV: Theo Ruys is an assistant professor in the Formal Methods and
Tools group at the University of Twente, The Netherlands,
headed by Professor Ed Brinksma. He received his MSc in computer
science in 1995 on the design and implementation of a
retargetable code-generator for TCGS, the in-house compiler
generator used in Twente.
In 2001, he received his PhD under supervision of Ed Brinksma.
His PhD Thesis "Towards Effective Model Checking" investigates
methods and techniques to improve the effectiveness of the
model checking process with the goal to reliably apply
model checking technology `in the large'. The research can be
classified by three themes: methodology, pragmatics and true
concurrency. The pragmatics part is devoted to the model
checker SPIN, where several techniques and methods are discussed
to construct efficient PROMELA models and to use SPIN in the
most effective way.
Semantics of prialt in Handel-C (tm)
Andrew Butterfield (Dublin University, Ireland)
Jim Woodcock (Computing Laboratory, University of Kent, UK)
Abstract: This paper discusses the semantics of the
construct in Handel-C. The language is essentially a static subset of C,
augmented with a parallel construct and channel communication, as found
in CSP. All assignments and channel communication events take one clock
cycle, with all updates synchronised with the clock edge marking the cycle
end. The behaviour of prialt in Handel-C is similar in concept
to that of occam,
and to the p-priority concept of Adrian Lawrence's CSPP . However,
we have to contend with both input and output guards in Handel-C, unlike
the situation in occam,
although prialts with conflicting priority requirements are not
legal in Handel-C. This makes our problem simpler than the more general
case including such conflicts considered by Lawrence. We start with an
informal discussion of the issues that arise when considering the semantics
prialt construct. We define a resolution function
(R) that determines which requests in
a collection of prialts become active. We describe a few properties
that we expect to hold for resolution, and discuss the issue of compositionality.
Acceptances, Behaviours and Infinite Activity in CSPP
Adrian Lawrence (Department of Computer Science, Loughborough University, UK)
Abstract: The denotational semantics presented here
defines an extension of CSP called CSPP.
It includes a full description of infinite behaviour in one simple model
using only finite traces. This is true for both finite and infinite alphabets.
The structure is a complete lattice, and so also a complete partial order,
under refinement. Thus recursion is defined by fixed points in the usual
way. It is also a complete restriction metric space so this provides an
alternative treatment of recursion for contraction mappings.
HCSP: Imperative State and True Concurrency
Adrian Lawrence (Department of Computer Science, Loughborough University,
Abstract: HCSP is
an extension of CSPP which captures
the semantics of hardware compilation. Because it is a superset of CSPP,
it can describe both hardware and software and so is useful for co-design.
The extensions beyond CSPP include:
true concurrency; new hardware constructors; and a simple and natural way
to represent imperative state. Both CSPP
and HCSP were invented to cope with problems
that arose while the author was trying to prove that the hardware that
he had designed correctly implemented channels between a processor
and an FPGA. Standard CSP did not capture priority, yet the circuits in
the FPGA and the occam processes in the transputer both depended on priority
for their correctness. The attempt to extend CSP rigorously to handle such
problems of co-design has led to develoments that seem to have a much wider
significance including a new way of unifying theories for imperative programming.
This paper reports on the current state of HCSP
and focuses on handling imperative state and true concurrency. The acceptance
denotational semantics is described briefly.
A Predicate Transformer Semantics for a Concurrent Language of Refinement
Ana Cavalcanti and Jim Woodcock (Computing Laboratory, University
of Kent, UK)
Abstract: Circus is
a combination of Z and CSP; its chief distinguishing feature is the inclusion
of the ideas of the refinement calculus. Our main objective is the definition
of refinement methods for concurrent programs. The original semantic model
for Circus is Hoare and He's unifying
theories of programming. In this paper, we present an equivalent semantics
based on predicate transformers. With this new model, we provide a more
adequate basis for the formalisation of refinement and verification-condition
generation rules. Furthermore, this new framework makes it possible to
include logical variables and angelic nondeterminism in Circus.
The consistency of the relational and predicate transformer models gives
us confidence in their accuracy.
View-Centric Reasoning for Linda and Tuple Space Computation
Marc L. Smith (Computer Science Department, Colby College, USA)
Rebecca J. Parsons (ThoughtWorks, Inc., Chicago, USA)
Charles E. Hughes (School of EE and CS, University of Central Florida,
Abstract: In contrast to sequential computation, concurrent
computation gives rise to parallel events. Efforts to translate the history
of concurrent computations into sequential event traces result in the potential
uncertainty of the observed order of these events. Loosely coupled distributed
systems complicate this uncertainty even further by introducing the element
of multiple imperfect observers of these parallel events. Properties of
such systems are difficult to reason about, and in some cases, attempts
to prove safety or liveness lead to ambiguities. We present a survey of
challenges of reasoning about properties of concurrent systems. We then
propose a new approach, view-centric reasoning, that avoids the problem
of translating concurrency into a se-quential representation. Finally.
we demonstrate the usefulness of view-centric reasoning as a framework
for disambiguating the meaning of tuple space predicate operations, versions
of which exist commercially in IBM?s T Spaces and Sun?s JavaSpaces.
On the Complexity of Buffer Allocation in Message Passing Systems
Alex Brodsky, Jan Bækgaard Pedersen and Alan Wagner
of Computer Science, University of British Columbia, Canada)
Abstract: In modern cluster systems message passing
functionality is often off-loaded to the network interface card for efficiency
reasons. However, this limits the amount of memory available for message
buffers. Unfortunately, buffer insufficiency can cause an otherwise correct
program to deadlock, or at least slow down. Hence, given a program trace
from an execution in an unrestricted environment, determining the minimum
number of buffers needed for a safe execution is an important problem.
We present three related problems, all concerned with buffer allocation
for safe and efficient execution. We prove intractability results for the
first two problems and present a polynomial time algorithm for the third.
Design and Modelling:
A Graphical Modeling
Language for Specifying Concurrency based on CSP
Gerald Hilderink (Department of Electrical Engineering - Control
Laboratory, University of Twente, The Netherlands)
Abstract: In this paper, we introduce a graphical specification
language for modeling concurrency in software design. The language notations
are derived from CSP and the resulting designs form so-called CSP diagrams.
The notations reflect data-flow aspects, control-flow aspects, and they
reflect CSP algebraic expressions that can be used for formal analysis.
The designer does not have to be aware of the underlying mathematics. The
techniques and rules presented provide guidance to the development of concurrent
software architectures. One can detect and reason about compositional conflicts
(errors in design), potential deadlocks (errors at run-time), and priority
inversion problems (performance burden) at a high level of abstraction.
The CSP diagram collaborates with object-oriented and structured methods.
The CSP diagram is UMLable and can be presented as a new process diagram
for the UML to capture concurrent, real-time, and event-flow oriented software
architectures. The extension to the UML is not presented in this paper.
Consolidating The Agreement Problem Protocol Verification Environment
James Pascoe and Roger Loader (Department of Computer Science, University of Reading, UK)
Abstract: The Agreement Problem Protocol Verification
Environment (APPROVE) has become a mature and robust platform for the automated
verification of proposed solutions to agreement problems. APPROVE is based
on the Spin model checker and leverages the literate programming tool noweb
to provide Promela code supplied with L A T E X documentation. The APPROVE
discussion opened in Communicating Process Architectures 2001 and described
the initial project phases and summarised some preliminary results. This
paper presents a follow up, providing a canonical report on the development,
application and insight gained from the project.
Languages, Libraries and Kernels (soft and hard):
Prioritised Dynamic Communicating Processes - Part I
Fred Barnes and Peter Welch (Computing Laboratory, University of Kent)
Abstract: This paper reports continuing research on
language design, compilation and kernel support for highly dynamic concurrent
reactive systems. The work extends the occam multiprocessing language,
which is both sufficiently small to allow for easy experimentation and
sufficiently powerful to yield results that are directly applicable
to a wide range of industrial and commercial practice. Classical occam
was designed for embedded systems and enforced a number of constraints
- such as statically pre-determined memory allocation and concurrency limits
- that were relevant to that generation of application and hardware technology.
Most of these constraints have been removed in this work and a number of
new facilities introduced (channel structures, mobile channels, channel
ends, dynamic process creation, extended rendezvous and process priorities)
that significantly broaden occam?s field of application and raise the level
of concurrent system design directly supported. Four principles were set
for modifications/enhancements of the language. They must be useful and
easy to use. They must be semantically sound and policed (ideally, at compile-time)
to prevent mis-use. They must have very lightweight and fast implementation.
Finally, they must be aligned with the concurrency model of the original
core language, must not damage its security and must not add (significantly)
to the ultra-low overheads. These principles have all been observed. All
these enhancements are available in the latest release (1.3.3) of KRoC,
freely available (GPL/open source) from: http://www.cs.ukc.ac.uk/projects/ofa/kroc/
Prioritised Dynamic Communicating Processes - Part II
Fred Barnes and Peter Welch (Computing Laboratory, University of Kent)
Abstract: This paper illustrates the work presented
in 'Part I', giving additional examples of use of channel-types, extended
rendezvous and FORKs that lean towards real applications. Also
presented are a number of other additions and extensions to the occam
language that correct, tidy up or complete facilities that have long existed.
These include fixing the PRI ALT bug, allowing an unconditional
guard as the last in a PRI ALT, replicator
sizes, run-time computed PAR replication counts,
parameters and abbreviations, nested PROTOCOL definitions, inline
array constructors and parallel recursion. All are available in the latest
release (1.3.3) of KRoC,
freely available (GPL/open source) from: http://www.cs.ukc.ac.uk/projects/ofa/kroc/
The "Honeysuckle" Programming Language: Event and Process
Ian East (School of Computing and Mathematical Sciences, Oxford
Brookes University, UK)
Abstract: A new language for programming systems with
Process Architecture  is introduced which builds upon the success
of occam . Some of the principal objectives are presented and justified.
The means employed to express behaviour are then described, including a
primitive, which conveys object ownership as well as value , and an
construct. The latter replaces PRI PAR and PRI ALT, and affords explicit
expression of conflict-free prioritized reactive (event-driven) behaviour,
including exception response . HPL also offers source-code modularity,
object encapsulation, and the recursive definition of both object and process.
Despite such ambition, a primary aim has been to retain simplicity in abstraction,
expression, and implementation.
The "Honeysuckle" Programming Language: Object and Protocol
Ian East (School of Computing and Mathematical Sciences, Oxford Brookes University, UK)
Abstract: Applications are demanding greater concurrency,
higher integrity, and more reactive behaviour. There is a need to meet
this demand without a requirement for additional tools and skills. To this
end, HPL adds to occam's inherent security by enforcing formal design
rules which guarantee deadlock-freedom . Service protocol
is introduced to constrain inter-process communication. Recursive data
type definition  is introduced to allow a little object-oriented
programming (OOP) within communicating process architecture (CPA)  and
restore the balance between process and object abstraction.
Synchronous Active Objects Introduce CSP's Primitives in Java
Claude Petitpierre (Laboratoire de Téléinformatique, EPFL,Switzerland)
Abstract: This paper presents a proposal of a language,
based on synchronous active objects that introduces the CSP primitives
into Java. The proposal does not use channels to realise the inter-process
communications, but is shown to offer the same expressive power as the
channel based solutions. The paper also shows that the rendezvous defined
by CSP or our synchronous objects are very well adapted to the industrial
development of event driven applications, handling simultaneously GUIs
and accesses to remote applications.
Cluster Computing and JCSP Networking
Brian Vinter (Department of Mathematics and Computer Science, University of Southern Denmark, Denmark)
Peter Welch (Computing Laboratory, University of Kent at Canterbury, UK)
Abstract: Hoare's algebra of Communicating Sequential
Processes (CSP) enables a view of systems as layered networks of concurrent
components, generating and responding to events communicated to each other
through channels, barriers and other (formally defined) synchronisation
primitives. The resulting image and discipline is close to hardware design
and correspondingly easy to visualise, reason about, compose and scale.
JCSP is a library
of Java packages providing an (occam) extended version of this model that
may be used alongside, or as a replacement for, the very different threads-and-monitors
concurrency mechanisms built into Java. The current release (JCSP 1.0)
supports concurrency within a single Java Virtual Machine (which may be
multi-processor). This paper reports early experiments with JCSP.net,
an extension of JCSP for the dynamic construction of CSP networks across
distributed environments. The aims of JCSP.net are to simplify
the construction and programming of dynamically distributed and parallel
systems. It provides high-level support for CSP architectures, unifying
concurrency logic within and between processors. The experiments are on
some classical HPC problems, an area of work for which JCSP.net
was not primarily designed. However, low overheads in the supporting infrastructure
were a primary consideration - along with an intuitive and high-level distributed
programming model (based on CSP). Results reported show JCSP holding up
well against - and often exceeding - the performance obtained from existing
tools such as mpiJava and IBM?s TSpaces. The experimental platform was
a cluster of 16 dual-processor PIII Linux machines. It is expected that
future optimisations in the pipeline for the JCSP.net infrastructure
will improve the results presented here. JCSP and JCSP.net
were developed at the University of Kent.
Configurable Collective Communication in LAM-MPI
John Markus Bjorndalen, Otto J. ANSHUS and Tore LARSEN (Department
of Computer Science, University of Tromsø, Norway)
Brian Vinter (Department of Mathematics and Computer Science, University
of Southern Denmark, Denmark)
Abstract: In an earlier paper, we observed that PastSet
(our experimental tuple space system) was 1.83 times faster on global reductions
than LAM-MPI. Our hypothesis was that this was due to the better resource
usage of the PATHS framework (an extension to PastSet that supports orchestration
and configuration) due to a mapping of the communication and operations
which matched the computing resources and cluster topology better. This
paper reports on an experiment to verify this and represents on-going work
to add some of the same configurability of PastSet and PATHS to MPI. We
show that by adding run-time configurable collective communication, we
can reduce the latencies without recompiling the application source code.
For the same cluster where we experienced the faster PastSet, we show that
with our con-figuration mechanism is 1.79 times faster than the original
LAM-MPI Allreduce. We also experiment with the configuration mechanism
on 3 different cluster platforms with 2-, 4-, and 8-way nodes. For the
cluster of 8-way nodes, we show an improvement by a factor of 1.98 for
Java PastSet - A Structured Distributed Shared Memory System
Kei Simon Pedersen and Brian Vinter (Department of Mathematics and
Computer Science, University of Southern Denmark, Denmark)
Abstract: The architecture and performance of a Java
implementation of a structured distributed shared memory system, PastSet,
is described. The PastSet abstraction allows programmers to write applications
that run efficiently on different architectures, from clusters to widely
distributed systems. PastSet is a tuple-based three-dimensional structured
distributed shared memory system, which provides the programmer with operations
to perform causally ordered reads and writes of tuples to a virtual structured
memory space called the PastSet. It has been shown that the original, native
code, PastSet was able to outperform MPI and PVM when running real applications
and we show that the design translates into Java so that Java PastSet is
a qualified competitor to other cluster application programming interfaces
Cache-Affinity Scheduling for Fine Grain Multithreading
Kurt Debattista, Kevin Vella and Joseph Cordina (Department of Computer
Science and Artificial Intelligence, University of Malta, Malta)
Abstract: Cache utilisation is often very poor in multithreaded
applications, due to the loss of data access locality incurred by frequent
context switching. This problem is compounded on shared memory multiprocessors
when dynamic load balancing is introduced and thread migration disrupts
cache content. In this paper, we present a technique, which we refer to
as ?batching?, for reducing the negative impact of fine grain multithreading
on cache performance. Prototype schedulers running on uniprocessors and
shared memory multiprocessors are described, and finally experimental results
which illustrate the improvements observed after applying our techniques
Reconnetics: A System for the Dynamic Implementation of Mobile Hardware Processes in FPGAs
Ralph Moseley (Computing Laboratory, University of Kent at Canterbury,
Abstract: The capacity to utilise FPGA designs in such
a way that they are compositional, mobile, and ?interactive?, offers many
new possibilities. This holds true for both research and commercial applications.
With the system described here, hardware becomes as easy to distribute
as software and the mediating links between the two domains allow for manipulation
of physical resources in real time. Designs are no longer monolithic ?images?
downloaded at one time, but mobile entities that can be communicated over
a distance and dynamically installed at run-time many times and at many
places. Such twin domain designs can be as complex as a processor, or as
small in scale as a few logic gates. The run-time system, Reconnetics,
provides an environment of high-level control over such elements, which
requires little knowledge of the underlying hardware technology.
Performance Analysis and Behaviour Tuning for Optimisation of Communicating Systems
Mark Green and Ali E. Abdallah (Centre for Applied Formal Methods,
South Bank University, UK)
Abstract: Improving performance is the main driving
force behind the use of parallel systems. Models for performance evaluation
and techniques for performance optimisation are crucial for effectively
exploiting the computational power of parallel systems. This paper focuses
on methods for evaluating the performance of parallel applications built
from components using the software architecture methodology. Minor differences
in the low-level behaviour of functionally equivalent processing elements
can have a dramatic effect upon the performance of the overall system;
this confounds attempts to predict performance at any step prior to implementation.
is a tool supporting the construction and graphical manipulation of interacting
systems built from components. It makes use of specifications in the formal
method CSP, which enables the relevant component behaviours and the linkage
between components to be concisely described. The tool allows the behaviour
of the whole system over time, and the patterns of interaction between
the components, to be visualised through graphical animation. The graphical
display produced facilitates the analysis and evaluation of performance,
and highlights areas where performance could be improved via better utilisation
of parallel resources. VisualNets also allows the timing properties
of the components and of the architecture that underlies them to be changed,
to represent different component implementations or platform configurations.
A case study, based on the dual pipeline architecture, is presented to
show how the graphical animation capability of VisualNets can be
used, firstly to evaluate performance, and secondly to guide the development
of more efficient versions of the parallel system.
Configuration Discovery and Mapping of a Home Network
Keith Pugh (Computer Science Department, Keele University, UK)
Abstract: A home network comprising many heterogeneous
devices requires a scaleable interconnect capable of satisfying the often
vastly different network resource demands of the devices. In addition,
it is likely that over time the configuration of the network will change
as new devices are added to the network and older ones removed or replaced.
While it is acceptable for enterprise networks to be managed by a trained
network administrator it is unreasonable to expect a home owner to have
such expertise. Consequently, a free topology network coupled with a capacity
for plug and play that allows nodes to be added anywhere, and at any time
without interruption to the operation of the networked system are essential
requirements. IEEE 1355 standard technology has the potential to satisfy
these criteria. The demand for a free topology and a capacity for plug
and play require that the configuration of the network is re-discovered
and mapped automatically, and at regular intervals. This paper describes
such a configuration mapping process.
A Communicating Threads Case Study: JIWY
Dusko Jovanovic, Gerald Hilderink, Jan Broenink (Department of Electrical
Engineering, University of Twente, The Netherlands)
Abstract: We strive to allow a control system designer
the power of designing control computer code concurrently and generating
it efficiently, in spite of her or his lack of skills in software engineering.
This is important for many reasons. Nowdays, it is impossible to separate
control engineering from software engineering. There is no way of implementing
control strategies other than to transform them into computer code for
the chosen processing target. Usually, there are not so many general-purpose
programmers available in control engineering research teams in companies
and universities. In these cases, software development techniques suffer
from insufficient knowledge in disciplines of software modelling, concurrency,
reusability and testing. This paper develops a case study whose solution
is based upon the CSP principles supported by the Communicating
Threads libraries developed at Twente and argues why
the techniques are accessible to non-computing-specialist control
Implementing a Distributed Algorithm for Detection of Local Knots and Cycles in Directed Graphs
Geraldo Pereira de Souza and Gerson Henrique Pfitscher (Department
of Computer Science, University of Brasilia, Brazil)
Abstract: In general, most deadlocks take form of cycles
(in database systems) and knots (in communication systems). Boukerche and
Tropper have proposed a distributed algorithm to detect cycles and knots
in generic graphs. Their algorithm has a message complexity of 2m vs. (at
least) 4m for the Chandy and Misra algorithm, where m is the number of
links in the graph, and requires O (n log n) bits of memory, where n is
the number of nodes. We have implemented Boukerche´s algorithm. Our
implementation of the algorithm is based on the construction of processes
of the CSP model. The implementation was done using JCSP, an implementation
of CSP for Java.