Recent Publications

Carvalho N, Sousa CS, Pinto JS, Tomb A.  2014.  Formal Verification of kLIBC with the WP Frama-C plug-in. Proceedings of the 6th NASA Formal Methods Symposium . Abstract2014_nfm_14_a.pdf

This paper presents our results in the formal verification of kLIBC, a minimalistic C library, using the Frama-C/WP tool. We report how we were able to completely verify a significant number of func- tions from and . We discuss difficulties encoun- tered and describe in detail a problem in the implementation of common functions, for which we suggest alternative implementations. Our work shows that it is presently already viable to verify low-level C code, with heavy usage of pointers. Although the properties proved tend to be shallower as the code becomes of a lower-level nature, it is our view that this is an important direction towards real-world software verifica- tion, which cannot be attained by focusing on deep properties of cleaner code, written specifically to be verified.

Pasquet M, Maia F, Rivière E, Schiavoni V.  2014.  Autonomous Multi-Dimensional Slicing for Large-Scale Distributed Systems. Proceedings of the 14th International Conference on Distributed Applications and Interoperable Systems - DAIS. Abstractautonomous-multi-dimensional.pdf

Slicing is a distributed systems primitive that allows to autonomously partition a large set of nodes based on node-local attributes. Slicing is decisive for automatically provisioning system resources for different services, based on their requirements or importance. One of the main limitations of existing slicing protocols is that only single dimension attributes are considered for partitioning. In practical settings, it is often necessary to consider best compromises for an ensemble of metrics.
In this paper we propose an extension of the slicing primitive that allows multi-attribute distributed systems slicing. Our protocol employs a gossip-based approach that does not require centralized knowledge and allows self-organization. It leverages the notion of domination between nodes, forming a partial order between multi-dimensional points, in a similar way to SkyLine queries for databases. We evaluate and demonstrate the interest of our approach using large-scale simulations.

Cunha A, Macedo N, Guimarães T.  2014.  Target oriented relational model finding. 7th International Conference on Fundamental Approaches to Software Engineering - FASE. 8411 AbstractPaper

Model finders are becoming useful in many software engineering problems. Kodkod is one of the most popular, due to its support for relational logic (a combination of first order logic with relational algebra operators and transitive closure), allowing a simpler specification of constraints, and support for partial instances, allowing the specification of a priori (exact, but potentially partial) knowledge about a problem's solution. However, in some software engineering problems, such as model repair or bidirectional model transformation, knowledge about the solution is not exact, but instead there is a known target that the solution should approximate. In this paper we extend Kodkod's partial instances to allow the specification of such targets, and show how its model finding procedure can be adapted to support them (using both PMax-SAT solvers or SAT solvers with cardinality constraints). Two case studies are also presented, including a careful performance evaluation to assess the effectiveness of the proposed extension.

Macedo N, Cunha A, Pacheco H.  2014.  Towards a framework for multi-directional model transformations. 3rd International Workshop on Bidirectional Transformations - BX. 1133 AbstractPaper

The Query/View/Transformation Relations (QVT-R) standard for bidirectional model transformation is notorious for its underspecified semantics. When restricted to transformations between pairs of models, most of the ambiguities and omissions have been addressed in recent work. Nevertheless, the application of the QVT-R language is not restricted to that scenario, and similar issues remain unexplored for the multidirectional case (maintaining consistency between more than two models), that has been overlooked so far.
In this paper we first discuss ambiguities and omissions in the QVT-R standard concerning the mutidirectional transformation scenario, and then propose a simple extension and formalization of the checking and enforcement semantics that clarifies some of them. We also discuss how such proposal could be implemented in our Echo bidirectional model transformation tool.
Ours is just a small step towards making QVT-R a viable language for bidirectional transformation in realistic applications, and a considerable amount of basic research is still needed to fully accomplish that goal.

Cunha J, Fernandes JP, Mendes J, Pereira R, Saraiva JA.  2014.  Design and Implementation of Queries for Model-Driven Spreadsheets. CEFP - Central European Functional Programming School. Abstractdsl13_query.pdf

This paper presents a domain-specific querying language for model-driven spreadsheets. We briefly show the design of the language and present in detail its implementation, from the denormalization of data and translation of our user-friendly query language to a more efficient query, to the execution of the query using Google. To validate our work, we executed an empirical study, comparing QuerySheet with an alternative spreadsheet querying tool, which produced positive results.

Cunha J, Fernandes JP, Saraiva JA.  2014.  Spreadsheet Engineering. CEFP - Central European Functional Programming School. Abstractdsl13_notes.pdf

These tutorial notes present a methodology for spreadsheet engineering. First, we present data mining and database techniques to reason about spreadsheet data. These techniques are used to compute relationships between spreadsheet elements (cells/columns/rows). These relations are then used to infer a model de ning the business logic of the spreadsheet. Such a model of a spreadsheet data is a visual domain speci c language that we embed in a well-known spreadsheet system. The embedded model is the building block to de ne techniques for modeldriven spreadsheet development, where advanced techniques are used to guarantee the model-instance synchronization. In this model-driven environment, any user data update as to follow the the model-instance conformance relation, thus, guiding spreadsheet users to introduce correct data. Data re nement techniques are used to synchronize models and instances after users update/evolve the model. These notes brie y describe our model-driven spreadsheet environment, the MDSheet environment, that implements the presented methodology. To evaluate both proposed techniques and the MDSheet tool, we have conducted, in laboratory sessions, an empirical study with the summer school participants. The results of this study are presented in these notes.

Macedo N, Pacheco H, Sousa NR, Cunha A.  2014.  Bidirectional Spreadsheet Formulas. VL/HCC - IEEE Symposium on Visual Languages and Human-Centric Computing. 6120 Abstractvlhcc14ss.pdf

Bidirectional transformations have potential applications in a vast number of computer science domains. Spreadsheets, on the other hand, are widely used for developing business applications, but their formulas are unidirectional, in the sense that their result can not be edited and propagated back to their input cells. In this paper, we interpret such formulas as a well-known class of bidirectional transformations that go by the name of lenses. Being aimed at users that are not proficient with programming languages, we devote particular attention to the seamless embedding of the proposed bidirectional mechanism with the typical workflow of spreadsheet environments, allowing users to have a fine control and understanding of the behavior of the derived backward transformations.

Maia F, Matos M, Vilaça R, Pereira JO, Oliveira R, Rivière E.  2014.  DATAFLASKS: epidemic store for massive scale systems. The 33rd IEEE Symposium on Reliable Distributed Systems (SRDS 2014) . Abstractmain.pdf

Very large scale distributed systems provide some of the most interesting research challenges while at the same time being increasingly required by nowadays applications. The escalation in the amount of connected devices and data being produced and exchanged, demands new data management systems. Although new data stores are continuously being proposed, they are not suitable for very large scale environments. The high levels of churn and constant dynamics found in very large scale systems demand robust, proactive and unstructured approaches to data management. In this paper we propose a novel data store solely based on epidemic (or gossip-based) protocols. It leverages the capacity of these protocols to provide data persistence guarantees even in highly dynamic, massive scale systems. We provide an open source prototype of the data store and correspondent evaluation.

Felber P, Pasin M, Rivière E, Schiavoni V, Sutra P, Coelho F, Matos M, Oliveira R, Vilaça R.  2014.  On the Support of Versioning in Distributed Key-Value Stores. 33rd IEEE International Symposium on Reliable Distributed Systems - SRDS. Abstractpaper.pdf

The ability to access and query data stored in multiple versions is an important asset for many applications, such as Web graph analysis, collaborative editing platforms, data forensics, or correlation mining. The storage and retrieval of versioned data requires a specific API and support from the storage layer. The choice of the data structures used to maintain versioned data has a fundamental impact on the performance of insertions and queries. The appropriate data structure also depends on the nature of the versioned data and the nature of the access patterns. In this paper we study the design and implementation space for providing versioning support on top of a distributed key-value store (KVS). We define an API for versioned data access supporting multiple writers and show that a plain KVS does not offer the necessary synchronization power for implementing this API. We leverage the support for listeners at the KVS level and propose a general construction for implementing arbitrary types of data structures for storing and querying versioned data. We explore the design space of versioned data storage ranging from a flat data structure to a distributed sharded index. The resulting system, \system, is implemented on top of an industrial-grade open-source KVS, Infinispan. Our evaluation, based on real-world Wikipedia access logs, studies the performance of each versioning mechanisms in terms of load balancing, latency and storage overhead in the context of different access scenarios.

Campos F, Matos M, Pereira JO, Rua D.  2014.  A peer-to-peer service architecture for the Smart Grid (short paper). 14th IEEE International Conference on Peer-to-Peer Computing - P2P. Abstract222.p2p2014_063.pdf

The Smart Grid vision needs to address hard challenges such as interoperability, reliability and scalability before it can become fulfilled. The need to provide full interoperability between current and future energy and non-energy systems and its disparate technologies along with the problem of seamless discovery, configuration, and communication of a large variety of networked devices ranging from the resource constrained sensing devices to the large machines inside a data center requires an agnostic Service Oriented Architecture. Moreover, the sheer scale of the Smart Grid and the criticality of the communication among its subsystems for proper management, demands a scalable and reliable communication framework able to work in an heterogeneous and dynamic environment. In this position paper, we propose a generic framework, based on Web Services for interoperability, and epidemic or gossip based communication protocols for reliability and scalability, that can serve a general management substrate where several Smart Grid problems can be solved. We illustrate the flexibility of the proposed framework by showing how it can be used in two specific scenarios.

Matos M, Schiavoni V, Rivière E, Felber P, Oliveira R.  2014.  LayStream: composing standard gossip protocols for live video streaming. 14th IEEE International Conference on Peer-to-Peer Computing (P2P). Abstractp2p-laystream.pdf

Gossip-based live streaming is a popular topic, as attested by the vast literature on the subject. Despite the particular merits of each proposal, all need to implement and deal with common challenges such as membership management, topology construction and video packets dissemination. Well-principled gossip-based protocols have been proposed in the literature for each of these aspects. Our goal is to assess the feasibility of building a live streaming system, \sys, as a composition of these existing protocols, to deploy the resulting system on real testbeds, and report on lessons learned in the process. Unlike previous evaluations conducted by simulations and considering each protocol independently, we use real deployments. We evaluate protocols both independently and as a layered composition, and unearth specific problems and challenges associated with deployment and composition. We discuss and present solutions for these, such as a novel topology construction mechanism able to cope with the specificities of a large-scale and delay-sensitive environment, but also with requirements from the upper layer. Our implementation and data are openly available to support experimental reproducibility.

Campos F, Matos M, Pereira JO.  2014.  Coordenação de Serviços Web heterogéneos com tolerância a faltas. INForum - Simpósio de Informática. Abstractinforum-wsgossip.pdf

A norma Devices Profile for Web Services (DPWS) permite a descoberta, a configuração e a interoperabilidade de dispositivos heterogéneos ligados em rede com grande disparidade em termos de capacidade de processamento, desde pequenos eletrodomésticos inteligentes ou controladores em máquinas industriais, até servidores em centros de dados. Os mecanismos de notificação de eventos e de configuração automática previstos pelo DPWS são especialmente adequados a cenários Machine-to-Machine (M2M) devido à sua simplicidade e flexibilidade. No entanto, a escalabilidade em cenários com um elevado número de nós, nomeadamente, em grandes infraestruturas, é limitada. A coerência dos dados armazenados e manipulados pelos mecanismos de autodescoberta também não é adequada a aplicações críticas. Neste artigo, abordamos este problema com uma proposta assente na norma DPWS com os conceitos de difusão epidémica e de consenso, mostrando que é possível compor serviços adequados a diferentes aplicações assegurando garantias de tolerância a faltas e maior escalabilidade.

Coelho F, Cruz F, Vilaça R, Pereira JO, Oliveira R.  2014.  pH1: A Transactional Middleware for NoSQL. 33rd IEEE International Symposium on Reliable Distributed Systems - SRDS. Abstractph1.pdf

NoSQL databases opt not to offer important abstractions traditionally found in relational databases in order to achieve high levels of scalability and availability: transactional guarantees and strong data consistency.
In this work we propose pH1, a generic middleware layer over NoSQL databases that offers transactional guarantees with Snapshot Isolation. This is achieved in a non-intrusive manner,
requiring no modifications to servers and no native support for multiple versions. Instead, the transactional context is achieved by means of a multiversion distributed cache and an external
transaction certifier, exposed by extending the client’s interface with transaction bracketing primitives.
We validate and evaluate pH1 with Apache Cassandra and Hyperdex. First, using the YCSB benchmark, we show that the cost of providing ACID guarantees to these NoSQL databases
amounts to 11% decrease in throughput.
Moreover, using the transaction intensive TPC-C workload, pH1 presented an impact of 22% decrease in throughput. This contrasts with OMID, a previous proposal that takes advantage of
HBase’s support for multiple versions, with a throughput penalty of 76% in the same conditions.

Casanova P, Garlan D, Schmerl BR, Abreu R.  2014.  Diagnosing unobserved components in self-adaptive systems. SEAMS - 9th International Symposium on Software Engineering for Adaptive and Self-Managing Systems. :75–84. Abstractunobserved2013_casanova_unobserved.pdf

Availability is an increasingly important quality for today's software-based systems and it has been successfully addressed by the use of closed-loop control systems in self-adaptive systems. Probes are inserted into a running system to obtain information and the information is fed to a controller that, through provided interfaces, acts on the system to alter its behavior. When a failure is detected, pinpointing the source of the failure is a critical step for a repair action. However, information obtained from a running system is commonly incomplete due to probing costs or unavailability of probes. In this paper we address the problem of fault localization in the presence of incomplete system monitoring. We may not be able to directly observe a component but we may be able to infer its health state. We provide formal criteria to determine when health states of unobservable components can be inferred and establish formal theoretical bounds for accuracy when using any spectrum-based fault localization algorithm.

Perez A, Abreu R.  2014.  A diagnosis-based approach to software comprehension. ICPC - 22nd International Conference on Program Comprehension. :37–47. Abstracticpc2014preprint.pdf

Program comprehension is a time-consuming task performed during the process of reusing, reengineering, and enhancing existing systems. Currently, there are tools to assist in program comprehension by means of dynamic analysis, but, e.g., most cannot identify the topology and the interactions of a certain functionality in need of change, especially when used in large, real-world software applications. We propose an approach, coined Spectrum-based Feature Comprehension (SFC), that borrows techniques used for automatic software-fault-localization, which were proven to be ef ective even when debugging large applications in resource constrained environments. SFC analyses the program by exploiting run-time information from test case executions to compute the components that are important for a given feature (and whether a component is used to implement just one feature or more), helping software engineers to understand how a program is structured and what the functionality's dependencies are. We present a toolset, coined Pangolin, that implements SFC and displays its report to the user using an intuitive visualization. A user study with the open-source application Rhino is presented, demonstrating the eciency of Pangolin in locating the components that should be inspected when changing a certain functionality.