Book Chapters

Harrison M, Masci P, Campos JC, Curzon P.  In Press.  The specification and analysis of use properties of a nuclear control system. Formal Methods in Human Computer Interaction. Abstract

The chapter explores a layered approach to the analysis of the Nuclear Power Plant Control System described in Chapter 4. A model is specified to allow the analysis of use-centred properties based on generic templates. User interface properties include: the visibility of state attributes, the clarity of the mode structure and the ease with which an action can be recovered from. Property templates are used as heuristics to ease the construction of requirements for the control system interface.

Ruksenas R, Masci P, Curzon P.  2016.  Developing and Verifying User Interface Requirements for Infusion Pumps: A Refinement Approach. From Action Systems to Distributed Systems: The Refinement Approach. wp325.pdf
Barbosa LS, Martins MA, Madeira A, Neves R.  2016.  Reuse and Integration of Specification Logics: The Hybridisation Perspective. Theoretical Information Reuse and Integration. 446:1–30. Abstractbmmn16.pdf

Hybridisation is a systematic process along which the characteristic features of hybrid logic, both at the syntactic and the semantic levels, are developed on top of an arbitrary logic framed as an institution. It also captures the construction of first-order encodings of such hybridised institutions into theories in first-order logic. The method was originally developed to build suitable logics for the specification of reconfigurable software systems on top of whatever logic is used to describe local requirements of each system’s configuration. Hybridisation has, however, a broader scope, providing a fresh example of yet another development in combining and reusing logics driven by a problem from Computer Science. This paper offers an overview of this method, proposes some new extensions, namely the introduction of full quantification leading to the specification of dynamic modalities, and exemplifies its potential through a didactical application. It is discussed how hybridisation can be successfully used in a formal specification course in which students progress from equational to hybrid specifications in a uniform setting, integrating paradigms, combining data and behaviour, and dealing appropriately with systems evolution and reconfiguration.

Sanchez A, Barbosa LS, Madeira A.  2015.  Modelling and Verifying Smell-Free Architectures with the Archery Language. Software Engineering and Formal Methods. :147-163. Abstractsmellfreearch2014.pdf

Architectural (bad) smells are design decisions found in software architectures that degrade the ability of systems to evolve. This paper presents an approach to verify that a software architecture is smell-free using the Archery architectural description language. The language provides a core for modelling software architectures and an extension for specifying constraints. The approach consists in precisely specifying architectural smells as constraints, and then verifying that software architectures do not satisfy any of them. The constraint language is based on a propositional modal logic with recursion that includes: a converse operator for relations among architectural concepts, graded modalities for describing the cardinality in such relations, and nominals referencing architectural elements. Four architectural smells illustrate the approach.

Teixeira J, Couto M.  2015.  Automatic Distinction of Fernando Pessoas' Heteronyms. Progress in Artificial Intelligence: 17th Portuguese Conference on Artificial Intelligence, EPIA 2015, Coimbra, Portugal, September 8-11, 2015. Proceedings. :783–788. Abstractautomatic-distinction-fernando.pdf


Barbosa MB, Castro D, Silva P.  2014.  Compiling CAO: From Cryptographic Specifications to C Implementations. Principles of Security and Trust. 8414:240-244. Abstractpost14.pdf

We present a compiler for CAO, an imperative DSL for the cryptographic domain. The tool takes high-level cryptographic algorithm specifications and translates them into C implementations through a series of security-aware transformations and optimizations.
The compiler back-end is highly configurable, allowing the targeting of very disparate platforms in terms of memory requirements and computing power.

Oliveira JN.  2014.  Relational Algebra for Just Good Enough Hardware. RAMiCS - 14th International Conference on Relational and Algebraic Methods in Computer Science. 8428:119-138. Abstractramics14.pdf

Device miniaturization is pointing towards tolerating imperfect hardware provided it is “good enough”. Software design theories will have to face the impact of such a trend sooner or later.
A school of thought in software design is relational: it expresses specifications as relations and derives programs from specifications using relational algebra.
This paper proposes that linear algebra be adopted as an evolution of relational algebra able to cope with the quantification of the impact of imperfect hardware on (otherwise) reliable software.
The approach is illustrated by developing a monadic calculus for component oriented software construction with a probabilistic dimension quantifying (by linear algebra) the propagation of imperfect behaviour from lower to upper layers of software systems.

Silva A, Jacobs B.  2014.  Initial Algebras of Terms With Binding and Algebraic Structure. Categories and Types in Logic, Language, and Physics - Lecture Notes in Computer Science. 8222:211-234. Abstract

One of the many results which makes Joachim Lambek famous is: an initial algebra of an endofunctor is an isomorphism. This fixed point result is often referred to as ``Lambek's Lemma''. In this paper, we illustrate the power of initiality by exploiting it in categories of algebra-valued presheaves $\EM(T)^{\N}$, for a monad $T$ on $\Sets$.
The use of presheaves to obtain certain calculi of expressions (with variable binding) was introduced by Fiore, Plotkin, and Turi. They used set-valued presheaves, whereas here the presheaves take values in a category $\EM(T)$ of Eilenberg-Moore algebras. This generalisation allows us to develop a theory where more structured calculi can be obtained. The use of algebras means also that we work in a linear context and need a separate operation $\bang$ for replication, for instance to describe strength for an endofunctor on $\EM(T)$. We apply the resulting theory to give systematic descriptions of non-trivial calculi: we introduce non-deterministic and weighted lambda terms and expressions for automata as initial algebras, and we formalise relevant equations diagrammatically.

Silva A, Jacobs B.  2014.  Automata Learning: A Categorical Perspective. Horizons of the Mind. A Tribute to Prakash Panangaden - Lecture Notes in Computer Science. 8464 Abstract

Automata learning is a known technique to infer a finite state machine from a set of observations. In this paper, we revisit Angluin’s original algorithm from a categorical perspective. This abstract view on the main ingredients of the algorithm lays a uniform framework to derive algorithms for other types of automata. We show a straightforward generalization to Moore and Mealy machines, which yields an algorithm already know in the literature, and we discuss generalizations to other types of automata, including weighted automata.

Hofer B, Riboira A, Wotawa F, Abreu R, Getzner E.  2013.  On the empirical evaluation of fault localization techniques for spreadsheets. Fundamental Approaches to Software Engineering. :68–82. Abstract

Spreadsheets are by far the most prominent example of end-user programs of ample size and substantial structural complexity. In addition, spreadsheets are usually not tested very rigorously and thus comprise faults. Locating faults is a hard task due to the size and the structure, which is usually not directly visible to the user, i.e., the functions are hidden behind the cells and only the computed values are presented. Hence, there is a strong need for debugging support. In this paper, we adapt three program-debugging approaches that have been designed for more traditional procedural or object-oriented programming languages. These techniques are Spectrum-based Fault Localization, Spectrum-Enhanced Dynamic Slicing, and Constraint-based Debugging. Beside the theoretical foundations, we present a more sophisticated empirical evaluation including a comparison of these approaches. The empirical evaluation shows that Sfl (Spectrum-based Fault Localization) and Sendys (Spectrum ENhanced Dynamic Slicing) are the most promising techniques.

Gonzalez-Sanchez A, Piel É, Abreu R, Gross H-G, Van Gemund AJC.  2013.  Prioritizing Tests for Fault Localization. Situation Awareness with Systems of Systems. :247–257. Abstract

In practically all development processes, regression tests are used to detect the presence of faults after a modification. If faults are detected, a fault localization algorithm can be used to reduce the manual inspection cost. However, while using test case prioritization to enhance the rate of fault detection of the test suite (e.g., statement coverage), the diagnostic information gain per test is not optimal, which results in needless inspection cost during diagnosis. We present RAPTOR, a test prioritization algorithm for fault localization, based on reducing the similarity between statement execution patterns as the testing progresses. Unlike previous diagnostic prioritization algorithms, RAPTOR does not require false negative information, and is much less complex. Experimental results from the Software Infrastructure Repository's benchmarks show that RAPTOR is the best technique under realistic conditions, with average cost reductions of 40% with respect to the next best technique, with negligible impact on fault detection capability.

Cardoso N, Abreu R.  2013.  A Distributed Approach to Diagnosis Candidate Generation. Progress in Artificial Intelligence. :175–186. Abstract

Generating diagnosis candidates for a set of failing transactions is an important challenge in the context of automatic fault localization of both software and hardware systems. Being an NP-Hard problem, exhaustive algorithms are usually prohibitive for real-world, often large, problems. In practice, the usage of heuristic-based approaches trade-off completeness for time efficiency. An example of such heuristic approaches is Staccato, which was proposed in the context of reasoning-based fault localization. In this paper, we propose an efficient distributed algorithm, dubbed MHS2, that renders the sequential search algorithm Staccato suitable to distributed, Map-Reduce environments. The results show that MHS2 scales to larger systems (when compared to Staccato), while entailing either marginal or small runtime overhead.

Cardoso N, Abreu R.  2013.  MHS2: A Map-Reduce Heuristic-Driven Minimal Hitting Set Search Algorithm. Multicore Software Engineering, Performance, and Tools. :25–36. Abstract

Computing minimal hitting sets (also known as set covers) for a collection of sets is an important problem in many domains (e.g., model/reasoning-based fault diagnosis). Being an NP-Hard problem, exhaustive algorithms are usually prohibitive for real-world, often large, problems. In practice, the usage of heuristic based approaches trade-off completeness for time efficiency. An example of such heuristic approaches is Staccato, which was proposed in the context of reasoning-based fault localization. In this paper, we propose an efficient distributed algorithm, dubbed MHS2, that renders the sequential search algorithm Staccato suitable to distributed, Map-Reduce environments. The results show that MHS2 scales to larger systems (when compared to Staccato), while entailing either marginal or small runtime overhead.

Koca F, Sözer H, Abreu R.  2013.  Spectrum-Based Fault Localization for Diagnosing Concurrency Faults. Testing Software and Systems. :239–254. Abstract

Concurrency faults are activated by specific thread interleavings at runtime. Traditional fault localization techniques and static analysis fall short to diagnose these faults efficiently. Existing dynamic fault-localization techniques focus on pinpointing data-access patterns that are subject to concurrency faults. In this paper, we propose a spectrum-based fault localization technique for localizing faulty code blocks instead. We systematically instrument the program to create versions that run in particular combinations of thread interleavings. We run tests on all these versions and utilize spectrum-based fault localization to correlate detected errors with concurrently executing code blocks. We have implemented a tool and applied our approach on several industrial case studies. Case studies show that our approach can effectively and efficiently localize concurrency faults.

Martens C, Bosser A-G, Ferreira JF, Cavazza M.  2013.  Linear Logic Programming for Narrative Generation. Logic Programming and Nonmonotonic Reasoning (LNCS 8148). Abstract

In this paper, we explore the use of Linear Logic programming for story generation. We use the language Celf to represent narrative knowledge, and its own querying mechanism to generate story instances, through a number of proof terms. Each proof term obtained is used, through a resource-flow analysis, to build a directed graph where nodes are narrative actions and edges represent inferred causality relationships. Such graphs represent narrative plots structured by narrative causality. This approach is a candidate technique for narrative generation which unifies declarative representations and generation via query and deduction mechanisms.

Campos JC, Saraiva JA, Silva CE, Silva JC.  2012.  GUIsurfer: A Reverse Engineering Framework for User Interface Software. Reverse Engineering - Recent Advances and Applications. :31-54. Abstract

In the context of developing tool support to the automated analysis of interactive systems implementations, this chapter proposal aims to investigate the applicability of reverse engineering approaches to the derivation of user interfaces behavioural models. The ultimate goal is that these models might be used to reason about the quality of the system, both from an usability and an implementation perspective, as well as being used to help systems maintenance, evolution and redesign.

Santos J, Abreu R.  2012.  Lightweight Automatic Error Detection by Monitoring Collar Variables. Testing Software and Systems. :215–230. Abstract

Although proven to be an e ective way for detecting errors, generic program invariants (also known as fault screeners) entail a considerable runtime overhead, rendering them not useful in practice. This paper studies the impact of using simple variable patterns to detect the so-called system's collar variables to reduce the number of variables to be monitored (instrumented). Two di erent patterns were investigated to determine which variables to monitor. The rst pattern nds variables whose value increase or decrease at regular intervals and deems them not important to monitor. The other pattern veri es the range of a variable per (successful) execution. If the range is constant across executions, then the variable is not monitored. Experiments were conducted on three different real-world applications to evaluate the reduction achieved on the number of variables monitored and determine the quality of the error detection. Results show a reduction of 52.04% on average in the number of monitored variables, while still maintaining a good detection rate with only 3.21% of executions detecting non-existing errors (false positives) and 5.26% not detecting an existing error (false negatives).

Andrés C, Abreu R, Núñez A.  2012.  OCE: an online collaborative editor. Computational Collective Intelligence. Technologies and Applications. :89–98. Abstract

In this paper we present the development of an Online Collaborative Editor (OCE) software system. It allows several people, to edit and share computer files using different devices, such as mobiles, PDAs in an easy way.
We use formal methods in order to automatize and describe OCE. Its formalism is very suitable to specify time requirements (both time consumption due to the performance of tasks and timeouts) as well as to represent data communication among different components of the system.
This exercise convinced us that a formal approach to develop complex systems can facilitate some of the development phases. In particular, the testing and debugging phases, more precisely, how to chose those tests more suitable to be applied, is simplified since tests are automatically extracted from the specification.

Jin D, Liu D, Yang B, Moreno CB, He D.  2011.  Ant colony optimization with Markov random walk for community detection in graphs. Advances in Knowledge Discovery and Data Mining. 6635:123–134. Abstract

Network clustering problem (NCP) is the problem associated to the detection of network community structures. Building on Markov random walks we address this problem with a new ant colony optimization strategy, named as ACOMRW, which improves prior results on the NCP problem and does not require knowledge of the number of communities present on a given network. The framework of ant colony optimization is taken as the basic framework in the ACOMRW algorithm. At each iteration, a Markov random walk model is taken as heuristic rule; all of the ants’ local solutions are aggregated to a global one through clustering ensemble, which then will be used to update a pheromone matrix. The strategy relies on the progressive strengthening of within-community links and the weakening of between-community links. Gradually this converges to a solution where the underlying community structure of the complex network will become clearly visible. The performance of algorithm ACOMRW was tested on a set of benchmark computer-generated networks, and as well on real-world network data sets. Experimental results confirm the validity and improvements met by this approach.

Moreno CB, Gonçalves N, José R.  2011.  Privacy preserving gate counting with collaborative bluetooth scanners. Workshops On the Move to Meaningful Internet Systems. 7046:534–543. Abstract

Due to its pervasiveness and communication capabilities, Bluetooth can be used as an infrastructure for several situated interaction and massive sensing scenarios. This paper shows how Bluetooth scanning can be used in gate counting scenarios, where the main goal is to provide an accurate count for the number of unique devices sighted. To this end, we present an analysis of several stochastic counting techniques that not only provide an accurate count for the number of unique devices, but o er privacy guarantees as well.

Ferreira JF, Mendes A, Cunha A, Moreno CB, Silva P, Barbosa LS, Oliveira JN.  2011.  Logic Training through Algorithmic Problem Solving. Tools for Teaching Logic. 6680:62-69. Abstractticttl11.pdf

Although much of mathematics is algorithmic in nature, the skills needed to formulate and solve algorithmic problems do not form an integral part of mathematics education. In particular, logic, which is central to algorithm development, is rarely taught explicitly at pre university level, under the justi cation that it is implicit in mathematics and therefore does not need to be taught as an independent topic. This paper argues in the opposite direction, describing a one week workshop done at the University of Minho, in Portugal, whose goal was to introduce to high-school students calculational principles and techniques of algorithmic problem solving supported by calculational logic. The work shop resorted to recreational problems to convey the principles and to software tools, the Alloy Analyzer and Netlogo, to animate models.

Madeira A, Martins M, Barbosa LS.  2011.  Models as arrows: the role of dialgebras. Models of Computation in Context. :144–153. Abstract

A large number of computational processes can suitably be described as a combination of construction, i.e. algebraic, and observation, i.e. coalgebraic, structures. This paper suggests dialgebras as a generic model in which such structures can be combined and proposes a small calculus of dialgebras including a wrapping combinator and se- quential composition. To take good care of invariants in software design, the paper also discusses how dialgebras can be typed by predicates and proves that invariants are preserved through composition. This lays the foundations for a full calculus of invariant proof-obligation discharge for dialgebraic models.

Abreu R, Gonzalez-Sanchez A, Van Gemund AJC.  2011.  A diagnostic reasoning approach to defect prediction. Modern Approaches in Applied Intelligence. :416–425. Abstract

During software testing, defect prediction approaches measure current reliability status, forecasting future program failures, and provide information on how many defects need to be removed before shipping. Existing approaches often require faults to be detected and identified as a new one, before a model-based trend can be fitted. While during regression testing failures may frequently occur, it is not evident which are related to new faults. Consequently, reliability growth trending can only be performed in sync with fault identification and repair, which is often performed in between regression test cycles. In this paper we present a dynamic, reasoning approach to estimate the number of defects in the system early in the process of regression testing. Our approach, coined Dracon, is based on Bayesian fault diagnosis over abstractions of program traces (also known as program spectra). Experimental results show that Dracon systematically estimates the exact number of (injected) defects, provided sufficient tests cases are available. Furthermore, we also propose a simple, analytic performance model to assess the influence of failed test cases in the estimation. We observe that our empirical findings are in agreement with the model.

Correia A, Pereira JO, Rodrigues L, Oliveira R, Carvalho N.  2010.  Practical Database Replication. Replication: Theory and Practice. Abstract

This chapter illustrates how the concepts and algorithms described earlier in this book can be used to build practical database replication systems. This is achieved first by addressing architectural challenges on how required functionality is provided by generally available software componentes and then how different components can be efficiently integrated. A second set of practical challenges arises from experience on how performance assumptions map to actual environments and real workloads. The result is a generic architecture for replicated database management systems, focusing on the interfaces between key components, and then on how different algorithmic and practical optimization options map to real world gains. This shows how consistent database replication is achievable in the current state of the art.

Leitão J, Pereira JO, Rodrigues L.  2010.  Gossip-based broadcast. Handbook of Peer-to-Peer Networking. Abstract

Gossip, or epidemic, protocols have emerged as a powerful strategy to implement highly scalable and resilient reliable broadcast primitives on large scale peer-to-peer networks. Epidemic protocols are scalable because they distribute the load among all nodes in the system and resilient because they have an intrinsic level of redundancy that masks node and network failures. This chapter provides an introduction to gossip-based broadcast on large-scale unstructured peer-to-peer overlay networks: it surveys the main results in the field, discusses techniques to build and maintain the overlays that support efficient dissemination strategies, and provides an in-depth discussion and experimental evaluation of two concrete protocols, named HyParView and Plumtree.