Recent Publications

Silva A, SergeyGoncharov, Milius S.  2014.  Towards a Coalgebraic Chomsky Hierarchy. Abstract

The Chomsky hierarchy plays a prominent role in the foundations of the theoretical computer science relating classes of formal languages of primary importance. In this paper we use recent developments on coalgebraic and monad-based semantics to obtain a generic notion of a T-automaton, where T is a monad, which allows the uniform study of various notions of machines (e.g. finite state machines, multi-stack machines, Turing machines, valence automata, weighted automata). We use the generalized powerset construction to define a generic (trace) semantics for T-automata, and we show by numerous examples that it correctly instantiates for the known classes of machines/languages captured by the Chomsky hierarchy. Moreover, our approach provides new generic techniques for proving expressivity bounds of various machine-based models.

Silva A, Bonchi F, Milius S, Zanasi F.  2014.  How to Kill Epsilons with a Dagger - A Coalgebraic Take on Systems with Algebraic Label Structure. Abstract

We propose an abstract framework for modeling state-based systems with internal behavior as e.g. given by silent or ϵ-transitions. Our approach employs monads with a parametrized fixpoint operator † to give a semantics to those systems and implement a sound procedure of abstraction of the internal transitions, whose labels are seen as the unit of a free monoid. More broadly, our approach extends the standard coalgebraic framework for state-based systems by taking into account the algebraic structure of the labels of their transitions. This allows to consider a wide range of other examples, including Mazurkiewicz traces for concurrent systems.

Campos F, Pereira JO.  2014.  Improving the Scalability oWS- Based Networked Infrastructures. 1:1-28. Abstract1407.8546.pdf

The Devices Profile for Web Services (DPWS) specification enables seamless discovery, configuration, and interoperability of networked devices in various settings, ranging from home automation and multimedia to manufacturing equipment and data centers. Unfortunately, the sheer simplicity of event notification mechanisms that makes it fit for resource-constrained devices, makes it hard to scale to large infrastructures with more stringent dependability requirements, ironically, where self-configuration would be most useful. In this report, we address this challenge with a proposal to integrate gossip-based dissemination in DPWS, thus maintaining compatibility with original assumptions of the specification, and avoiding a centralized configuration server or custom black-box middleware components. In detail, we show how our approach provides an evolutionary and non-intrusive solution to the scalability limitations of DPWS and experimentally evaluate it with an implementation based on the the Web Services for Devices (WS4D) Java Multi Edition DPWS Stack (JMEDS).

Pacheco H, Macedo N, Cunha A, Voigtländer J.  2013.  A Generic Scheme and Properties of Bidirectional Transformations. CoRR. arXiv/1306.4473:19. Abstract1306.4473v2.pdf

The recent rise of interest in bidirectional transformations (BXs) has led to the development of many BX frameworks, originating in diverse computer science disciplines. From a user perspective, these frameworks vary significantly in both interface and predictability of the underlying bidirectionalization technique. In this paper we start by presenting a generic BX scheme that can be instantiated to different concrete interfaces, by plugging-in the desired notion of update and traceability. Based on that scheme, we then present several desirable generic properties that may characterize a BX framework, and show how they can be instantiated to concrete interfaces. This generic presentation is useful when exploring the BX design space: it might help developers when designing new frameworks and end-users when comparing existing ones. We support the latter claim, by applying it in a comparative survey of popular existing BX frameworks.

Shoker A, Bahsoun JP.  2013.  BFT Selection. :1-15. AbstractPaper

One-size-fits-all protocols are hard to achieve in Byzantine fault tolerance (BFT). As an alternative, BFT users, e.g., enterprises, need an easy and efficient method to choose the most convenient protocol that matches their preferences best. The various BFT protocols that have been proposed so far differ significantly in their characteristics and performance which makes choosing the ‘preferred’ protocol hard. In addition, if the state of the deployed system is too fluctuating, then perhaps using multiple protocols at once is needed; this requires a dynamic selection mechanism to move from one protocol to another. In this paper, we present the first BFT selection model and algorithm that can be used to choose the most convenient protocol according to user preferences. The selection algorithm applies some mathematical formulas to make the selection process easy and automatic. The algorithm operates in three modes: Static, Dynamic, and Heuristic. The Static mode addresses the cases where a single protocol is needed; the Dynamic mode assumes that the system conditions are quite ?uctuating and thus requires run-time decisions, and the Heuristic mode is similar to the Dynamic mode but it uses additional heuristics to improve user choices. We give some examples to describe how selection occurs. We show that our approach is automated, easy, and yields reasonable results that match reality. To the best of our knowledge, this is the first work that addresses selection in BFT.

Jin D, Yang B, Moreno CB, Liu D, He D, Liu J.  2013.  Markov random walk under constraint for discovering overlapping communities in complex networks. :21. Abstract1303.5675.pdf

The detection of overlapping communities in complex networks has motivated recent research in relevant fields. Aiming to address this problem, we propose a Markov-dynamics-based algorithm, called UEOC, which means 'unfold and extract overlapping communities'. In UEOC, when identifying each natural community that overlaps, a Markov random walk method combined with a constraint strategy, which is based on the corresponding annealed network (degree conserving random network), is performed to unfold the community. Then, a cutoff criterion with the aid of a local community function, called conductance, which can be thought of as the ratio between the number of edges inside the community and those leaving it, is presented to extract this emerged community from the entire network. The UEOC algorithm depends on only one parameter whose value can be easily set, and it requires no prior knowledge of the hidden community structures. The proposed UEOC has been evaluated both on synthetic benchmarks and on some real-world networks, and has been compared with a set of competing algorithms. The experimental result has shown that UEOC is highly effective and efficient for discovering overlapping communities.

Zawirsky M, Bieniusa A, Balegas V, Duarte S, Moreno CB, Shapiro M, Preguiça N.  2013.  SwiftCloud: Fault-tolerant geo-replication integrated all the way to the client machine. :24. Abstract1310.3107v1.pdf

Client-side logic and storage are increasingly used in web and mobile applications to improve response time and availability. Current approaches tend to be ad-hoc and poorly integrated with the server-side logic. We present a principled approach to integrate client- and server-side storage. We support mergeable and strongly consistent transactions that target either client or server replicas and provide access to causally-consistent snapshots efficiently. In the presence of infrastructure faults, a client-assisted failover solution allows client execution to resume immediately and seamlessly access consistent snapshots without waiting. We implement this approach in SwiftCloud, the first transactional system to bring geo-replication all the way to the client machine. Example applications show that our programming model is useful across a range of application areas. Our experimental evaluation shows that SwiftCloud provides better fault tolerance and at the same time can improve both latency and throughput by up to an order of magnitude, compared to classical geo-replication techniques.

Almeida PS, Moreno CB.  2013.  Scalable eventually consistent counters over unreliable networks. :1-32. Abstract1307.3207v1_1.pdf

Counters are an important abstraction in distributed computing, and play a central role in large scale geo-replicated systems, counting events such as web page impressions or social network "likes". Classic distributed counters, strongly consistent, cannot be made both available and partition-tolerant, due to the CAP Theorem, being unsuitable to large scale scenarios. This paper defines Eventually Consistent Distributed Counters (ECDC) and presents an implementation of the concept, Handoff Counters, that is scalable and works over unreliable networks. By giving up the sequencer aspect of classic distributed counters, ECDC implementations can be made AP in the CAP design space, while retaining the essence of counting. Handoff Counters are the first CRDT (Conflict-free Replicated Data Type) based mechanism that overcomes the identity explosion problem in naive CRDTs, such as G-Counters (where state size is linear in the number of independent actors that ever incremented the counter), by managing identities towards avoiding global propagation and garbage collecting temporary entries. The approach used in Handoff Counters is not restricted to counters, being more generally applicable to other data types with associative and commutative operations.

Murta D, Oliveira JN.  2013.  Calculating risk in functional programming. :38. Abstract1311.3687v1.pdf

In the trend towards tolerating hardware unreliability, accuracy is exchanged for cost savings. Running on less reliable machines, "functionally correct" code becomes risky and one needs to know how risk propagates so as to mitigate it. Risk estimation, however, seems to live outside the average programmer's technical competence and core practice. In this paper we propose that risk be constructively handled in functional programming by (a) writing programs which may choose between expected and faulty behaviour, and by (b) reasoning about them in a linear algebra extension to standard, a la Bird-Moor algebra of programming. In particular, the propagation of faults across standard program transformation techniques known as tupling and fusion is calculated, enabling the fault of the whole to be expressed in terms of the faults of its parts.

Riboira A, Abreu R.  2013.  END-A Lightweight Algorithm to Estimate the Number of Defects in Software. :8. Abstract955913_1.pdf

Defect precision provides information on how many defects a given software application appears to have. Existing approaches are usually based on time consuming model-based techniques. A viable alternative is the previously presented Abacus algorithm, which is based on Bayesian fault diagnosis.This paper presents a novel alternative approach - coined End that uses the same input and produces the same output as the Abacus algorithm, but is considerably more time ecient. An experiment was conducted to compare both the accuracy and performance of these two algorithms. The End algorithm presented the same accuracy as the Abacus algorithm, but outperformed it in the majority of executions.

Bieniusa A, Zawirsky M, Preguiça N, Shapiro M, Moreno CB, Balegas V, Duarte S.  2012.  An optimized conflict-free replicated set. CoRR. 1210.3368(8083):9. Abstract1210.3368v1.pdf

Eventual consistency of replicated data supports concurrent updates, reduces latency and improves fault tolerance, but forgoes strong consistency. Accordingly, several cloud computing platforms implement eventually-consistent data types. The set is a widespread and useful abstraction, and many replicated set designs have been proposed. We present a reasoning abstraction, permutation equivalence, that systematizes the characterization of the expected concurrency semantics of concurrent types. Under this framework we present one of the existing conflict-free replicated data types, Observed-Remove Set. Furthermore, in order to decrease the size of meta-data, we propose a new optimization to avoid tombstones. This approach that can be transposed to other data types, such as maps, graphs or sequences.

Macedo N, Pacheco H, Cunha A.  2012.  Relations as executable specifications: taming partiality and non-determinism using invariants. :146-161. Abstractndlenses12tech.pdf

The calculus of relations has been widely used in program specification and reasoning. It is very tempting to use such specifications as running prototypes of the desired program, but, even considering finite domains, the inherent partiality and non-determinism of relations makes this impractical and highly inefficient. To tame partiality we prescribe the usage of invariants, represented by coreflexives, to characterize the exact domains and codomains of relational specifications. Such invariants can be used as pre-condition checkers to avoid runtime errors. Moreover, we show how such invariants can be used to narrow the non-deterministic execution of relational specifications, making it viable for a relevant class of problems. In particular, we show how the proposed techniques can be applied to execute specifications of bidirectional transformations, a domain where partiality and non-determinism are paramount.

Shoker A, Yabandeh M, Guerraoui R, Bahsoun JP.  2012.  Obfuscated BFT. :1-15. AbstractPaper

A major assumption underlies the correctness of BFT services: failure independence. However, state of the art protocols rely on inter-replica communication in order to preserve consistency, and they implicitly require replicas to store access information about each other. This jeopardizes the independence of failure assumption since intruders can sneak from one replica to another and compromises the entire service. In this paper, we leverage this issue by exploring the idea of obfuscation in the BFT context: the replicas of a BFT service are completely unaware of each others identities. We present a new client-based replication BFT protocol, called OBFT, that assumes honest, but possibly crash-prone clients. The protocol imposes equivalent load on replicas to avoid bottlenecks, and reduces the load on the primary replica by distributing multi-cast and cryptographic tasks among clients. We evaluate OBFT on an Emulab cluster with a wide area topology and convey its scalability with respect to state of the art BFT protocols.

Kozen D, Silva A.  2012.  Practical Coinduction. :1-22. Abstractstructural.pdf

Induction is a well-established proof principle that is taught in most undergraduate programs in mathematics and computer science. In computer science, it is used primarily to reason about inductively-defined datatypes such as finite lists, finite trees, and the natural numbers. Coinduction is the dual principle that can be used to reason about coinductive datatypes such as infinite streams or trees, but it is not as widespread or as well understood. In this paper, we illustrate through several examples the use of coinduction in informal mathematical arguments. Our aim is to promote the principle as a useful tool for the working mathematician and to bring it to a level of familiarity on par with induction. We show that coinduction is not only about bisimilarity and equality of behaviors, but also applicable to a variety of functions and relations defined on coinductive datatypes.

Jeannin J-B, Kozen D, Silva A.  2012.  CoCaml: Programming with Coinductive Types. :1-24. Abstractcocaml.pdf

We present CoCaml, a functional programming language extending OCaml, which allows us to define functions on coinductive datatypes parameterized by an equation solver. We provide numerous examples that attest to the usefulness of the new programming constructs, including operations on infinite lists, infinitary lambda-terms and p-adic numbers.