Communicating Process Architectures 2018 (CPA 2018)


  Editors: Kevin Chalmersa, Jan Bækgaard Pedersenb, Marc L. Smithc, Kenneth Skovheded, Peter H. Welche
      (a) School of Computing, Edinburgh Napier University, UK
      (b) Department of Computer Science, University of Nevada Las Vegas, USA
      (c) Computer Science Department, Vassar College, USA
      (d) Niels Bohr Institute, University of Copenhagen, Denmark
      (e) School of Computing, University of Kent, UK
 
  Publisher: IOS Press, Amsterdam, The Netherlands
  ISBN:

  Held at University of Dresden in Dresden, Germany, August 19 - 22, 2018 (Local chair: Rainer Spallek)
 
  Conference Program
 
  Award Winners
  Deriving Reusable Go Components From Verified CSP Prototypes (Best Student Paper)
  Towards Automatic Program Specification Using SME Models (Best Paper)
  Space Invaders Game Written in SME, Running on an FPGA (Best Fringe)
 
 Keynote Presentations
 
 Title: SpiNNaker-2: a 10 Million Core Processor System for Machine Learning and Brain Simulation
 Authors: Christian Mayr
VLSI-Systems and Neuromorphic Circuits, Technische Universitat Dresden
 Abstract: In the EU flagship HBP, TU Dresden and University of Manchester jointly develop the SpiNNaker-2 processor platform for brain simulation. Characteristics of the SpiNNaker-2 system are 144 ARM cores per chip (including various numerical accelerators), for overall 10 Mio cores with 5PFLOPS CPU performance and 0.6ExaOPS deep learning acceleration. The chips are implemented in a 22nm FDSOI technology, running as low as 0.4V with the usage of our Adaptive Body Biasing methodology. The SpiNNaker-2 system combines highly efficient machine learning, bio-inspired signal processing at millisecond latency and ultra-low energy operation. Thus, besides the main usage as a neuroscience platform, SpiNNaker-2 opens up completely new avenues in areas such as tactile internet, autonomous driving and industry 4.0.
 Bibliography:   [BibTeX]
 Abstract: [PDF] 
 Presentation:
 
 
 Papers
 
 Title: aPyCSP - Asynchronous PyCSP Using Python Coroutines and asyncio
 Authors: John Markus Bjørndalena, Brian Vinterb, Otto J. Anshusa
(a) Department of Computer Science, University of Tromsø
(b) Niels Bohr Institute, University of Copenhagen
 Abstract: PyCSP provides a CSP (Communicating Sequential Processes) based library for Python, where CSP processes can be executed as threads, operating system processes or greenlets. The main drawback of the thread and process implementations is that they limit the number of CSP processes that a program can use due to operating system limitations. Threads and processes also introduce memory and synchronisation overheads. This overhead is reduced when using greenlets, but requires the use of an external library. aPyCSP is an experimental implementation of PyCSP that uses recent support for coroutines in Python. This paper explores two implementation approaches for aPyCSP where the most recent version is shorter, simpler to read and understand and supports more functionality than the original version, including generic channel support and input and output guards. The aPyCSP implementation is also faster and uses less memory per CSP process, allowing us to scale to 15 million processes on a computer with 64 GB memory.
 Bibliography:   [BibTeX]
 Paper: [PDF] 
 Presentation:
 
 Title: Emit - Communicating Sequential Processes in Ruby
 Authors: Mads Ohm Larsen, Brian Vinter
Niels Bohr Institute, University of Copenhagen
 Abstract: CSP is an algebra for reasoning about concurrent systems of processes. Being able to do so has become a necessity for computer scientists. Having to think about abstractions like mutexes and threads in practice can be cumbersome, complex, and erroneous.
Ruby as a programming language have been described as fun to program in. It is however missing a CSP framework that it can call its own. Emit, which is presented in this paper, tries to mitigate this by providing such a CSP framework.
As a CSP framework, Emit makes it easy to think about processes, channels, and communication. It is not yet feature complete, however comparing it to its nearest peer, PyCSP, shows good performance for the commstime benchmark, where Emit is 100 times faster.
 Bibliography:   [BibTeX]
 Paper: [PDF] 
 Presentation:
 
 Title: A Transparent Thread and Fiber Framework in C++CSP
 Authors: Kevin Chalmers
School of Computing, Edinburgh Napier University
 Abstract: There are multiple low-level concurrency primitive types supported today, but these often require the programmer to be explicit in their implementation decisions at design time. This work illustrates how a process-oriented model written in C++CSP can hide the underlying primitives from the programmer to allow M:N style thread support. The objective was to provide integrated thread and fiber support in C++CSP without major changes to the process interface. To illustrate that threads and fibers are working together two experiments have been undertaken. The first experiment executes multiple stressed select processes to illustrate that that multiple fiber contexts are in operation. The second experiment executes a scaling number of processes syncing across hardware to illustrate the sync-time cost as the number of fibers increases in each context. The results illustrate that it is possible to build M:N style thread support transparently for process design, and that doing so provides a significant increase in the number of active processes in library supported CSP while still taking advantage of multicore hardware.
 Bibliography:   [BibTeX]
 Paper: [PDF] 
 Presentation:
 
 Title: MUMMERING Platform Idea's & Ubiquitous Data Analysis
 Authors: Rasmus Munk, Brian Vinter
Niels Bohr Institute, University of Copenhagen
 Abstract: The growth from terabytes of 3D imaging data and soon approaching petabytes from material analysis has left the scientists involved with a set of challenges. In particular, the ability to analyze this collection when going from developing an algorithm for a single image, to efficiently scale this analysis to 100s if not 1000s of images. The MUMMERING research project aims to solve this by providing ability to submit workflows to automate this process. We explore and present our initial design thoughts in this endeavor. This includes a proposal to utilize the IDMC system developed at UCPH, which includes job scheduling support to accomplish part of this. In addition, we introduce the mig_utils Python library that enables local IO like access data directly in an imaging analysis program.
 Bibliography:   [BibTeX]
 Paper: [PDF] 
 Presentation:
 
 Title: Bohrium.rb - The Ruby Front End
 Authors: Mads Ohm Larsen, Brian Vinter
Niels Bohr Institute, University of Copenhagen
 Abstract: The acceptance of Ruby in the scientific community lags a bit behind, partly because it is missing a good library for linear algebra and vector programming. It has a matrix class in its standard library, but its execution tends to be rather slow. Only a couple of actual scientific computing libraries like NumPy for Python exist for Ruby. In this paper we introduce a new library called Bohrium.rb. Bohrium.rb acts as a front end for the Bohrium framework, which generates and runs JIT-compiled OpenMP/OpenCL kernels. It currently supports Python/NumPy and C++, however as it is built of processes communicating hierarchically to each other, we can replace the front ends with new ones. This new Ruby front end is described with examples and is then compared to the standard library and an already established Ruby library Numo/Narray, where Bohrium.rb seems to be faster for still larger matrix calculations. This is also the trend we have seen in similar areas with Bohrium, being faster once its overhead has been amortized.
 Bibliography:   [BibTeX]
 Paper: [PDF] 
 Presentation:
 
 Title: Lossy Channels in Bohrium
 Authors: Mads R. B. Kristensen, James Avery
Niels Bohr Institute, University of Copenhagen
 Abstract: In this paper, we introduce a new channel to the Bohrium runtime system that supports lossy compression of data. This channel makes it possible for a local user to offload calculation and data heavy processing to a remote server and still be able to visualize the data on a modest network connection. The new lossy channel supports a fixed frame rate mode where the user can specify a desired frame rate that the lossy channel will maintain. In order to maintain a frame rate, the lossy channel will dynamically adjust the quality setting of the lossy encoding.
We assess how the lossy encoded data approximate the original data -- both visually and numerically -- and how the data compresses compared to popular lossless formats. For this assessment, we use the data from three different scientific simulations and the lossless formats PNG and zlib and the lossy format JPEG.
 Bibliography:   [BibTeX]
 Paper: [PDF] 
 Presentation:
 
 Title: A Majority Vote, Modelled by Asynchronous Readers and Asynchronous Writers
 Authors: Antoon H. Boode, Jan F. Broenink
Robotics and Mechatronics, CTIT Institute, University of Twente
 Abstract: Reading and writing is modelled in CSP using actions containing the symbols ? and !. These reading actions and writing actions are synchronous, and there is a one-to-one relationship between occurrences of pairs of these actions. In the CPA conference 2017, we introduced the extended half-synchronous alphabetised parallel operator X ↕ Y, which disconnects the writing to and reading from a channel in time; the reading processes are divided into sets which are set-wise asynchronous, but intra-set-wise synchronous, giving full flexibility to the reads.
In this paper, we allow multiple writers to write to the same channel set-wise asynchronously, but intra-set-wise synchronously and we study the impact on our (Extended) Vertex Removing Synchronised Product. The advantages we accomplish are that the extension of X ↕ Y gives more flexibility by indexing the writing actions and the reading actions, leading to a straightforward priority vote design. Furthermore, the extension of X ↕ Y preserves the advantages of the X ↕ Y operator.
 Bibliography:   [BibTeX]
 Paper: [PDF] 
 Presentation:
 
 Title: CSP at the Cyber-Physical Edge
 Authors: Lukasz Michalika, Michael Murphya, John Markus Bjørndalenb, Otto J. Anshusb
(a) University of Tromsø
(b) Department of Computer Science, University of Tromsø
 Abstract: Today, to do ground-based in-situ observations of the arctic tundra, researchers carry wild life cameras and other observation units into the field, manually configure the devices while on the arctic tundra, and fetch the collected data several months later. This approach does not scale. Instead, observing and reporting of data must be automated using a distributed wireless network of autonomous observation units.
We present the basic hardware and software architectures of the Distributed Arctic Observatory (DAO) observation units. A DAO observation unit is composed of both heavy and light computer cores. The idea is that the heavy cores are used for resource demanding tasks, and power them off and shift workload to light cores when possible to achieve energy efficiency.
We report on initial thoughts and experiences in applying a CSP network on an observation unit to ease development of advanced functionalities, while still achieving energy efficiency for observation units.
In order to rapidly develop prototype systems and learn from them, we have composed observation units with a combination of Raspberry Pi computers as the heavy cores, and Arduino, Nucleo, and Sleepy Pi microcontrollers as the light cores.
 Bibliography:   [BibTeX]
 Paper: [PDF] 
 Presentation:
 
 Title: SMEIL: A Domain-Specific Language for Synchronous Message Exchange Networks
 Authors: Truls Asheim
Niels Bohr Institute, University of Copenhagen
 Abstract: Synchronous Message Exchange (SME) is a CSP-derived model for hardware designs implementing globally synchronous message passing. SME implementations currently exist for several general-purpose languages, some of which, are translatable to VHDL for subsequent implementation on hardware. A common SME language could reduce the duplication and feature disparity present in these independent implementations. This paper introduces a domain-specific language for implementing SME designs. It is usable both as a primary implementation language for SME models and as an intermediate target for general-purpose languages. We describe the language, its implementation and its features. Furthermore, we explain the specific requirements for a language within this domain. Finally, we evaluate the language through a number of simple, but realistic, hardware designs by showing how they may be implemented and tested.
 Bibliography:   [BibTeX]
 Paper: [PDF] 
 Presentation:
 
 Title: Towards Automatic Program Specification Using SME Models (Best Paper)
 Authors: Alberte Thegler, Mads Ohm Larsen, Kenneth Skovhede, Brian Vinter
Niels Bohr Institute, University of Copenhagen
 Abstract: This paper introduces a method to simplify hardware modeling and verifi- cation thereof in order for software programmers to, more easily, meet the demands of the growing embedded device industry. We describe a simple method for transpiling from the new SME Implementation Language into CSPM and using formal verifica- tion to verify properties within the generated program. We present a small example consisting of a seven segment display clock network and introduce how to verify the widths of the channels in the network.
 Bibliography:   [BibTeX]
 Paper: [PDF] 
 Presentation:
 
 Title: Deriving Reusable Go Components From Verified CSP Prototypes (Best Student Paper)
 Authors: James Dibley, Karen Bradshaw
Department of Computer Science, Rhodes University
 Abstract: This paper presents a modelling approach and software tool support for automatically deriving reusable Go components from CSP prototypes that preserves compatibility with existing CSP verification tools. Using the example of a concurrent component that generates prime numbers, the paper demonstrates how to model and verify a parameterised concurrent component in machine-readable CSP, and presents the software tool’s automatically-derived implementation of this model as a reusable Go component.
 Bibliography:   [BibTeX]
 Paper: [PDF] 
 Presentation:
 
 Title: Evaluating CSP as a Programming Model to Build Distributed Systems
 Authors: Kenneth Skovhede, Brian Vinter
Niels Bohr Institute, University of Copenhagen
 Abstract: In this paper we investigate the use of the CSP programming model for im- plementing distributed systems in an educational setting. For a practical example we choose a well-studied classic distributed system: a distributed hash table. We describe our implementation and compare it to a number of existing open-source implemen- tations. We discuss a number of parameters for the example implementation, such as amount of code, potential for errors, and similar parameters.
 Bibliography:   [BibTeX]
 Paper: [PDF] 
 Presentation:
 
 Title: Use Case Driven Microservices Architecture Design
 Authors: Jeremy M. R. Martina, Peter Boggisb
(a) ITG Application Group, Lloyd's
(b) Lloyd's of London
 Abstract: We extend the Use Case Driven Object Modelling methodology to the decomposition of a system into microservices. We introduce the concept of the Aggregated Class Interaction Diagram as a tool for balancing the number and complexity of interfaces between services with the benefit of constructing a system as a set of simple, independent, single purpose components. This is illustrated with a practical example of a system for running stochastic financial models
 Bibliography:   [BibTeX]
 Paper: [PDF] 
 Presentation:
 
 Title: Solving the Santa Claus Problem Over a Distributed Network
 Authors: David Marchanta, Jon Kerridgeb
(a) Niels Bohr Institute, University of Copenhagen
(b) School of Computing, Edinburgh Napier University
 Abstract: This paper demonstrates a working solution to the Santa Claus Problem over a distributed system. The solutions was developed using the JCSP library for Java and demonstrated over a network of 20 desktop PCs. Each Santa, Reindeer and Elf process is broken down into numerous sub-processes to allow for a deadlock and livelock free system per the requirements of the client/server model. A novel approach to the Elves is presented using a chain-based architecture to allow non pre-determined sub-groups to communicate and consult with Santa in sets of 3 without any central controller.
 Bibliography:   [BibTeX]
 Paper: [PDF] 
 Presentation:
 
 Title: Testing and Verifying Parallel Programs Using Data Refinement
 Authors: Jeremy M. R. Martin
ITG Application Group, Lloyd's
 Abstract: We show how to apply the technique of data refinement from the field of Formal Methods to test or verify scientific parallel programs by generalizing the concept of abstraction functions to distributed data structures. This work is motivated by experience of developing computationally-intensive finance models to support the London Insurance Market and the application of test-driven development to achieve correctness by design.
 Bibliography:   [BibTeX]
 Paper: [PDF] 
 Presentation:
 
 Title: Verifying Mobile Systems: Take Two
 Authors: Frédéric Peschanski
Sorbonne University
 Abstract: This papers relates the design and development of two “proof of concept” analysis tools for the pi-calculus. The main functionality of the first tool, named PiEx- plorer, is to construct the state-space of a finitary pi-calculus process as a directed graph akin to a finite-state automaton. As counterintuitive as it may seem, this is a rather complex objective. The second tool is an implementation of the main pi-calculus features (especially name-passing) in the Promela language. This enables the use of the Spin model checker for the verification task. Although very different, these two prototypes share a common foundation: a notion of a stateful observer for pi-calculus processes. This is the central idea discussed in the paper.
 Bibliography:   [BibTeX]
 Paper: [PDF] 
 Presentation:
 
 Title: T42 - Transputer in FPGA
 Authors: Uwe Mielkea, Martin Zabelb, Michael Bruestlea
(a) Electronics Engineer
(b) Institute of Computer Engineering, Technische Universität Dresden
 Abstract: The T42 Transputer in FPGA is a full binary code-compatible open-source VHDL implementation of the inmos IMS-T425 32bit microprocessor. The T42 is mainly targeted for education and exploration purposes and many interlinked cores can be loaded onto any suitable sized FPGA board. The construction of larger parallel computing systems based on the Transputers distributed memory architecture is possible, e.g. within student semester projects. The programming model of occam can be evaluated and compared to other languages.
 Bibliography:   [BibTeX]
 Paper: [PDF] 
 Presentation:
 
 Title: Implementing a Transputer for FPGA in Less Than 800 Lines of Code
 Authors: Carl-Johannes Johnsena, Kenneth Skovhedeb, Brian Vinterb, Lindsay O'Brien Quarriec, Lawrence J. Dicksonc
(a) Department of Computer Science, University of Copenhagen
(b) Niels Bohr Institute, University of Copenhagen
(c) Space Sciences Corporation
 Abstract: By utilizing Synchronous Message Exchange (SME) for hardware de- sign, we see that going from a hardware schematic to an implementation becomes a much shorter process. This in turn shifts the focus to the architectural details of the implementation. This is shown by constructing an implementation of the Transputer in SME. This implementation has been made in less than 800 lines of code within the timeframe of ?4 months, where the majority of the time spent has been on the Transputer architecture. The resulting implementation is suboptimal compared to similar projects. However, since no optimizations have been made, reaching a more reasonable resource consumption and clockrate should be attainable within a few months.
 Bibliography:   [BibTeX]
 Paper: [PDF] 
 Presentation:
 
 
 Fringe Presentations
 
 Title: Unravelling XC concepts [[combine]],[[combinable]], [[distribute]], [[distributable]] and [[distributed(..)]] plus par and on ...
 Authors: Øyvind Teig

 Abstract: The XC programming language is designed to make runnable multitask programs for XMOS' xCore multicore microcontrollers, spread on tiles and cores. This presentation addresses only a few aspects of XC. What is a combinable and what is a distributable task, syntactically and semantically, but also resource wise and geographically on the chip? The xCore compiler handles the lowering of interfaces onto statically and dynamically allocated channel resources. The microcode and built-in scheduler also reflect them. Of the rather limited amount of resources, one must make it with 32 chanends per tile, with seemingly loose coupling between the XC code and the final number of chanends. Fiddling around, rather overwhelmed, with [[combine]], [[combinable]], [[distribute]], [[distributable]] and [[distributed(..)]] plus par and on reveals a lot of how unchanged functionality spreads differently over cores or across tiles. Thus the same functional code may require anything from, like, six to zero chanends! The goal of this fringe presentation is to trigger interest, not provide answers. Or rather, only those answers that the author might presently have acquired by surrendering to this can of reserved words and observe what might possibly happen. Like, there must be an electric motor inside, because the sound of it is so pleasant.
 Bibliography:   [BibTeX]
 Abstract: [PDF] 
 Presentation:
 
 Title: Challenges When Implementing a Battery-less Display
 Authors: Herman Roebbers
Intelligent Systems / Altran Nederland B.V.
 Abstract: Over the years it has become more and more important to reduce energy consumption, especially in the realm of IoT edge devices, which often must operate without a battery change for several years. At Altran we want to design and implement solutions that do not require batteries, but harvest energy from their surroundings. They use this energy to operate a radio for receiving and transmitting information, as well as operate sensors and actuators. This requires careful analysis of contributors to energy consumption in order to minimize the energy requirements. One aspect of this is the collaboration and coordination of hundreds of such nodes on the use of a shared medium, namely the radio frequency on which they operate. A prototype of a demonstrator to create and show understanding of the various technologies involved is under development. The presentation gives an overview of the many challenges and (proposed) solutions for the complete development of such a system and presents the current status of the prototype, capable of running forever on 2 small solar cells and interacting over Bluetooth Low Energy with a server node.

Of interest to this community is the development of a protocol to create and maintain a robust communication schedule, scalable to hundreds or thousands of nodes, such that the amount of energy to exchange information is minimized while preventing deadlock and starvation. Furthermore the network must be self-organizing, i.e. must automatically spread the edge-node communications such that these do not keep colliding when the network is in the construction phase. Finally there must be a guarantee that all nodes can exchange information with a server within a predefined time.
 Bibliography:   [BibTeX]
 Abstract: [PDF] 
 Presentation:
 
 Title: Space Invaders Game Written in SME, Running on an FPGA (Best Fringe)
 Authors: Carl-Johannes Johnsen
Department of Computer Science, University of Copenhagen
 Abstract: Given the presentation of the Pong inspired game at last years conference, we bring you another, more complex arcade game. This time we have partially implemented Space Invaders running on an FPGA (Field-Programmable Gate Array), which renders the screen directly to the VGA output of the board. Furthermore, to improve the usability of the game, we have connected the wires of an NES controller directly to the board, in order to control the game.
 Bibliography:   [BibTeX]
 Abstract:
 Presentation:
 
 Title: Java, JCSP and Crew (The Strange Case of the Spurious Spurious)
 Authors: Peter H. Welcha, Matthew Ewerb, Jan Bækgaard Pedersenc, Jeremy M. R. Martind
(a) School of Computing, University of Kent
(b) Department of Computer Science, Stanford University
(c) Department of Computer Science, University of Nevada Las Vegas
(d) ITG Application Group, Lloyd's
 Abstract:
 Bibliography:   [BibTeX]
 Abstract:
 Presentation:
 
 
 Workshops
 
 Title: Workshop on Translating CSP-Based Languages to Common Programming Languages
 Authors: Brian Vintera, Lawrence J. Dicksonb, James Dibleyc
(a) Niels Bohr Institute, University of Copenhagen
(b) Space Sciences Corporation
(c) Department of Computer Science, Rhodes University
 Abstract: Programming in CSP style has traditionally been done in occam, or languages such as csp_m for more formal verification. Occam has been in a frozen state for almost 30 years, though a runnable toolset is still available on transputer.net, and code can be targeted to Gavin Chase’s Transputer Emulator and to the Transterpreter. Getting to a setup where one may compile and run occam programs with the more recent KROC occam-Pi is non-trivial, and getting the compiled occam-Pi code to run on different platforms is even more work.
Alternatively to occam one may use one of the many library-based CSP approximations. However, these are typically not as easy to program as occam, and for portability may require large runtime environments. And both these and the occam-Pi approach lose a huge CSP and static occam advantage: the hardware-software equivalence that can use hardware channels (links) to program and run communicating independent devices in an extremely slim and understandable way.
One viable approach may be to implement a transpiler, where occam or related languages are translated rigorously into other, more common, languages that typically have highly maintained libraries on many platforms. This has been done for plain C, but languages like Go and Rust, which already have CSP-like primitives which have performed well in previous (CPA2017) projects, would be more attractive choices for such target languages.
The workshop will focus on Go, though not exclusively. It will motivate the problems, exhibit techniques for transforming near-CSP Go constructs into rigorously correct occam constructs, relate these to a complete lex/yacc grammar for the occam language, demonstrate a few applications that are not easily achievable with what is available today, and introduce ongoing work on producing transpilers, including the rigorous extension and porting of the link concept to drive more typical modern hardware communication such as block and network devices.
 Bibliography:   [BibTeX]
 Abstract:
 Presentation: