Recent Publications

Silva A, Jacobs B.  2014.  Initial Algebras of Terms With Binding and Algebraic Structure. Categories and Types in Logic, Language, and Physics - Lecture Notes in Computer Science. 8222:211-234. Abstract

One of the many results which makes Joachim Lambek famous is: an initial algebra of an endofunctor is an isomorphism. This fixed point result is often referred to as ``Lambek's Lemma''. In this paper, we illustrate the power of initiality by exploiting it in categories of algebra-valued presheaves $\EM(T)^{\N}$, for a monad $T$ on $\Sets$.
The use of presheaves to obtain certain calculi of expressions (with variable binding) was introduced by Fiore, Plotkin, and Turi. They used set-valued presheaves, whereas here the presheaves take values in a category $\EM(T)$ of Eilenberg-Moore algebras. This generalisation allows us to develop a theory where more structured calculi can be obtained. The use of algebras means also that we work in a linear context and need a separate operation $\bang$ for replication, for instance to describe strength for an endofunctor on $\EM(T)$. We apply the resulting theory to give systematic descriptions of non-trivial calculi: we introduce non-deterministic and weighted lambda terms and expressions for automata as initial algebras, and we formalise relevant equations diagrammatically.

Hofer B, Riboira A, Wotawa F, Abreu R, Getzner E.  2013.  On the empirical evaluation of fault localization techniques for spreadsheets. Fundamental Approaches to Software Engineering. :68–82. Abstract

Spreadsheets are by far the most prominent example of end-user programs of ample size and substantial structural complexity. In addition, spreadsheets are usually not tested very rigorously and thus comprise faults. Locating faults is a hard task due to the size and the structure, which is usually not directly visible to the user, i.e., the functions are hidden behind the cells and only the computed values are presented. Hence, there is a strong need for debugging support. In this paper, we adapt three program-debugging approaches that have been designed for more traditional procedural or object-oriented programming languages. These techniques are Spectrum-based Fault Localization, Spectrum-Enhanced Dynamic Slicing, and Constraint-based Debugging. Beside the theoretical foundations, we present a more sophisticated empirical evaluation including a comparison of these approaches. The empirical evaluation shows that Sfl (Spectrum-based Fault Localization) and Sendys (Spectrum ENhanced Dynamic Slicing) are the most promising techniques.

Gonzalez-Sanchez A, Piel É, Abreu R, Gross H-G, Van Gemund AJC.  2013.  Prioritizing Tests for Fault Localization. Situation Awareness with Systems of Systems. :247–257. Abstract

In practically all development processes, regression tests are used to detect the presence of faults after a modification. If faults are detected, a fault localization algorithm can be used to reduce the manual inspection cost. However, while using test case prioritization to enhance the rate of fault detection of the test suite (e.g., statement coverage), the diagnostic information gain per test is not optimal, which results in needless inspection cost during diagnosis. We present RAPTOR, a test prioritization algorithm for fault localization, based on reducing the similarity between statement execution patterns as the testing progresses. Unlike previous diagnostic prioritization algorithms, RAPTOR does not require false negative information, and is much less complex. Experimental results from the Software Infrastructure Repository's benchmarks show that RAPTOR is the best technique under realistic conditions, with average cost reductions of 40% with respect to the next best technique, with negligible impact on fault detection capability.

Koca F, Sözer H, Abreu R.  2013.  Spectrum-Based Fault Localization for Diagnosing Concurrency Faults. Testing Software and Systems. :239–254. Abstract

Concurrency faults are activated by specific thread interleavings at runtime. Traditional fault localization techniques and static analysis fall short to diagnose these faults efficiently. Existing dynamic fault-localization techniques focus on pinpointing data-access patterns that are subject to concurrency faults. In this paper, we propose a spectrum-based fault localization technique for localizing faulty code blocks instead. We systematically instrument the program to create versions that run in particular combinations of thread interleavings. We run tests on all these versions and utilize spectrum-based fault localization to correlate detected errors with concurrently executing code blocks. We have implemented a tool and applied our approach on several industrial case studies. Case studies show that our approach can effectively and efficiently localize concurrency faults.

Martens C, Bosser A-G, Ferreira JF, Cavazza M.  2013.  Linear Logic Programming for Narrative Generation. Logic Programming and Nonmonotonic Reasoning (LNCS 8148). Abstract

In this paper, we explore the use of Linear Logic programming for story generation. We use the language Celf to represent narrative knowledge, and its own querying mechanism to generate story instances, through a number of proof terms. Each proof term obtained is used, through a resource-flow analysis, to build a directed graph where nodes are narrative actions and edges represent inferred causality relationships. Such graphs represent narrative plots structured by narrative causality. This approach is a candidate technique for narrative generation which unifies declarative representations and generation via query and deduction mechanisms.

Campos JC, Saraiva JA, Silva CE, Silva JC.  2012.  GUIsurfer: A Reverse Engineering Framework for User Interface Software. Reverse Engineering - Recent Advances and Applications. :31-54. Abstract

In the context of developing tool support to the automated analysis of interactive systems implementations, this chapter proposal aims to investigate the applicability of reverse engineering approaches to the derivation of user interfaces behavioural models. The ultimate goal is that these models might be used to reason about the quality of the system, both from an usability and an implementation perspective, as well as being used to help systems maintenance, evolution and redesign.

Santos J, Abreu R.  2012.  Lightweight Automatic Error Detection by Monitoring Collar Variables. Testing Software and Systems. :215–230. Abstract

Although proven to be an e ective way for detecting errors, generic program invariants (also known as fault screeners) entail a considerable runtime overhead, rendering them not useful in practice. This paper studies the impact of using simple variable patterns to detect the so-called system's collar variables to reduce the number of variables to be monitored (instrumented). Two di erent patterns were investigated to determine which variables to monitor. The rst pattern nds variables whose value increase or decrease at regular intervals and deems them not important to monitor. The other pattern veri es the range of a variable per (successful) execution. If the range is constant across executions, then the variable is not monitored. Experiments were conducted on three different real-world applications to evaluate the reduction achieved on the number of variables monitored and determine the quality of the error detection. Results show a reduction of 52.04% on average in the number of monitored variables, while still maintaining a good detection rate with only 3.21% of executions detecting non-existing errors (false positives) and 5.26% not detecting an existing error (false negatives).

Andrés C, Abreu R, Núñez A.  2012.  OCE: an online collaborative editor. Computational Collective Intelligence. Technologies and Applications. :89–98. Abstract

In this paper we present the development of an Online Collaborative Editor (OCE) software system. It allows several people, to edit and share computer files using different devices, such as mobiles, PDAs in an easy way.
We use formal methods in order to automatize and describe OCE. Its formalism is very suitable to specify time requirements (both time consumption due to the performance of tasks and timeouts) as well as to represent data communication among different components of the system.
This exercise convinced us that a formal approach to develop complex systems can facilitate some of the development phases. In particular, the testing and debugging phases, more precisely, how to chose those tests more suitable to be applied, is simplified since tests are automatically extracted from the specification.

Jin D, Liu D, Yang B, Moreno CB, He D.  2011.  Ant colony optimization with Markov random walk for community detection in graphs. Advances in Knowledge Discovery and Data Mining. 6635:123–134. Abstract

Network clustering problem (NCP) is the problem associated to the detection of network community structures. Building on Markov random walks we address this problem with a new ant colony optimization strategy, named as ACOMRW, which improves prior results on the NCP problem and does not require knowledge of the number of communities present on a given network. The framework of ant colony optimization is taken as the basic framework in the ACOMRW algorithm. At each iteration, a Markov random walk model is taken as heuristic rule; all of the ants’ local solutions are aggregated to a global one through clustering ensemble, which then will be used to update a pheromone matrix. The strategy relies on the progressive strengthening of within-community links and the weakening of between-community links. Gradually this converges to a solution where the underlying community structure of the complex network will become clearly visible. The performance of algorithm ACOMRW was tested on a set of benchmark computer-generated networks, and as well on real-world network data sets. Experimental results confirm the validity and improvements met by this approach.

Moreno CB, Gonçalves N, José R.  2011.  Privacy preserving gate counting with collaborative bluetooth scanners. Workshops On the Move to Meaningful Internet Systems. 7046:534–543. Abstract

Due to its pervasiveness and communication capabilities, Bluetooth can be used as an infrastructure for several situated interaction and massive sensing scenarios. This paper shows how Bluetooth scanning can be used in gate counting scenarios, where the main goal is to provide an accurate count for the number of unique devices sighted. To this end, we present an analysis of several stochastic counting techniques that not only provide an accurate count for the number of unique devices, but o er privacy guarantees as well.

Ferreira JF, Mendes A, Cunha A, Moreno CB, Silva P, Barbosa LS, Oliveira JN.  2011.  Logic Training through Algorithmic Problem Solving. Tools for Teaching Logic. 6680:62-69. Abstractticttl11.pdf

Although much of mathematics is algorithmic in nature, the skills needed to formulate and solve algorithmic problems do not form an integral part of mathematics education. In particular, logic, which is central to algorithm development, is rarely taught explicitly at pre university level, under the justi cation that it is implicit in mathematics and therefore does not need to be taught as an independent topic. This paper argues in the opposite direction, describing a one week workshop done at the University of Minho, in Portugal, whose goal was to introduce to high-school students calculational principles and techniques of algorithmic problem solving supported by calculational logic. The work shop resorted to recreational problems to convey the principles and to software tools, the Alloy Analyzer and Netlogo, to animate models.

Madeira A, Martins M, Barbosa LS.  2011.  Models as arrows: the role of dialgebras. Models of Computation in Context. :144–153. Abstract

A large number of computational processes can suitably be described as a combination of construction, i.e. algebraic, and observation, i.e. coalgebraic, structures. This paper suggests dialgebras as a generic model in which such structures can be combined and proposes a small calculus of dialgebras including a wrapping combinator and se- quential composition. To take good care of invariants in software design, the paper also discusses how dialgebras can be typed by predicates and proves that invariants are preserved through composition. This lays the foundations for a full calculus of invariant proof-obligation discharge for dialgebraic models.

Abreu R, Gonzalez-Sanchez A, Van Gemund AJC.  2011.  A diagnostic reasoning approach to defect prediction. Modern Approaches in Applied Intelligence. :416–425. Abstract

During software testing, defect prediction approaches measure current reliability status, forecasting future program failures, and provide information on how many defects need to be removed before shipping. Existing approaches often require faults to be detected and identified as a new one, before a model-based trend can be fitted. While during regression testing failures may frequently occur, it is not evident which are related to new faults. Consequently, reliability growth trending can only be performed in sync with fault identification and repair, which is often performed in between regression test cycles. In this paper we present a dynamic, reasoning approach to estimate the number of defects in the system early in the process of regression testing. Our approach, coined Dracon, is based on Bayesian fault diagnosis over abstractions of program traces (also known as program spectra). Experimental results show that Dracon systematically estimates the exact number of (injected) defects, provided sufficient tests cases are available. Furthermore, we also propose a simple, analytic performance model to assess the influence of failed test cases in the estimation. We observe that our empirical findings are in agreement with the model.

Correia A, Pereira JO, Rodrigues L, Oliveira R, Carvalho N.  2010.  Practical Database Replication. Replication: Theory and Practice. Abstract

This chapter illustrates how the concepts and algorithms described earlier in this book can be used to build practical database replication systems. This is achieved first by addressing architectural challenges on how required functionality is provided by generally available software componentes and then how different components can be efficiently integrated. A second set of practical challenges arises from experience on how performance assumptions map to actual environments and real workloads. The result is a generic architecture for replicated database management systems, focusing on the interfaces between key components, and then on how different algorithmic and practical optimization options map to real world gains. This shows how consistent database replication is achievable in the current state of the art.

Leitão J, Pereira JO, Rodrigues L.  2010.  Gossip-based broadcast. Handbook of Peer-to-Peer Networking. Abstract

Gossip, or epidemic, protocols have emerged as a powerful strategy to implement highly scalable and resilient reliable broadcast primitives on large scale peer-to-peer networks. Epidemic protocols are scalable because they distribute the load among all nodes in the system and resilient because they have an intrinsic level of redundancy that masks node and network failures. This chapter provides an introduction to gossip-based broadcast on large-scale unstructured peer-to-peer overlay networks: it surveys the main results in the field, discusses techniques to build and maintain the overlays that support efficient dissemination strategies, and provides an in-depth discussion and experimental evaluation of two concrete protocols, named HyParView and Plumtree.