Publications

Jesus P, Moreno CB, Almeida PS.  2009.  Fault-Tolerant Aggregation by Flow Updating. Distributed Applications and Interoperable Systems. 5523:73–86. Abstractflow_updating.pdf

Data aggregation plays an important role in the design of scalable systems, allowing the determination of meaningful system-wide properties to direct the execution of distributed applications. In the particular case of wireless sensor networks, data collection is often only practicable if aggregation is performed. Several aggregation algorithms have been proposed in the last few years, exhibiting different properties in terms of accuracy, speed and communication tradeoffs. Nonetheless, existing approaches are found lacking in terms of fault tolerance. In this paper, we introduce a novel fault-tolerant averaging based data aggregation algorithm. It tolerates substantial message loss (link failures), while competing algorithms in the same class can be affected by a single lost message. The algorithm is based on manipulating flows (in the graph theoretical sense), that are updated using idempotent messages, providing it with unique robustness capabilities. Furthermore, evaluation results obtained by comparing it with other averaging approaches have revealed that it outperforms them in terms of time and message complexity.

Moreno CB, Almeida PS, Cardoso J.  2009.  Probabilistic estimation of network size and diameter. Fourth Latin-American Symposium on Dependable Computing - LADC. :33–40. Abstractartigoladcfinal.pdf

Determining the size of a network and its diameter are important functions in distributed systems, as there are a number of algorithms which rely on such parameters, or at least on estimates of those values. The Extrema Propagation technique allows the estimation of the size of a network in a fast, distributed and fault tolerant manner. The technique was previously studied in a simulation setting where rounds advance synchronously and where there is no message loss.This work presents two main contributions. The first, is the study of the Extrema Propagation technique under asynchronous rounds and integrated in the Network Friendly Epidemic Multicast (NeEM) framework. The second, is the evaluation of a diameter estimation technique associated with the Extrema Propagation. This study also presents a small enhancement the Extrema Propagation in terms of communication cost and points out some other possible enhancements. Results show that there is a clear trade-off between time and communication that must be considered when configuring the protocol—a faster convergence time implies a higher communication cost. Results also show that its possible to reduce the total communication cost by more than 18% using a simple approach. The diameter estimation technique is shown to have a relative error of less than10% even when using a small sample of nodes.

Jesus P, Moreno CB, Almeida PS.  2009.  Using Less Links to Improve Fault-Tolerant Aggregation. 4th Latin-American Symposium on Dependable Computing - LADC. Abstractfast_abstract

Data aggregation plays a basal role in the design of scalable distributed applications, allowing the determination of meaningful system-wide properties to direct the execution of the system. For instance, aggregation can be used to estimate: the size of the network to dimension of Distributed Hash Table (DHT) structures, or to set a quorum in dynamic settings; the average system load to guide local loadbalancing decisions; the total network disk space in a P2P sharing system. In the particular case of Wireless Sensor Networks (WSN), due to energy constraints, data collection is often only practicable if aggregation is performed.

Moreno CB, Lopes N.  2009.  Search Optimizations in Structured Peer-to-Peer Systems. 18th International Workshops on Enabling Technologies: Infrastructures for Collaborative Enterprises - WET ICE. :111–115. Abstract10.1.1.159.8669.pdf

DHT systems are structured overlay networks capable of using P2P resources as a scalable platform for very large data storage applications. However, their efficiency expects a level of uniformity in the association of data to index keys that is often not present in inverted indexes. Index data tends to follow non-uniform distributions, often power law distributions, creating intense local storage hotspots and network bottlenecks on specific hosts. Current techniques like caching cannot, alone, cope with this issue. We propose a distributed data structure based on a decentralized balanced tree to balance storage data and network load more uniformly across hosts. The results show that the data structure is capable of balancing resources, in particular when performing.

Jesus P, Moreno CB, Almeida PS.  2009.  Dependability in Aggregation by Averaging. INForum - Simpósio de Informática. :457–470. Abstractsimposio_de_informatica_inforum_2009_jesus.pdf

Aggregation is an important building block of modern distributed applications, allowing the determination of meaningful properties (e.g. network size, total storage capacity, average load, majorities, etc.) that are used to direct the execution of the system. In the recent years, several approaches have been proposed to compute aggregation functions on distributed settings, exhibiting different characteristics, in terms of accuracy, time and communication complexity. However, the majority of the existing aggregation algorithms exhibit relevant dependability issues, when prospecting their use in real application environments. In this paper, we reveal some dependability issues of aggregation algorithms based on iterative averaging techniques, giving some directions to solve them. This class of algorithms is considered robust (when compared to common tree-based approaches), being independent from the used routing topology and providing an aggregation result at all nodes. However, their robustness is strongly challenged and their correctness often compromised, when changing the assumptions of their working environment to more realistic ones. The correctness of this class of algorithms relies on the maintenance of a fundamental invariant, commonly designated as mass conservation. We will argue that this main invariant is often broken in practical settings, and that additional mechanisms and modifications are required to maintain it, incurring in some degradation of the algorithms performance. In particular, we discuss the behavior of three representative algorithms (Push-Sum Protocol [1], Push- Pull Gossip protocol [2] and Distributed Random Grouping [3]) under asynchronous and faulty (with message loss and node crashes) environments. More specifically, we propose and evaluate two new versions of the Push-Pull Gossip protocol, which solve its message interleaving problem (evidenced even in a synchronous operation mode).

Almeida PS, Moreno CB, Fonte V.  2008.  Interval tree clocks. 12th International Conference on Principles of Distributed Systems - OPODIS. 5401 Abstractitc.pdf

Causality tracking mechanisms, such as vector clocks and version vectors, rely on mappings from globally unique identifiers to integer counters. In a system with a well known set of entities these ids can be preconfigured and given distinct positions in a vector or distinct names in a mapping. Id management is more problematic in dynamic systems, with large and highly variable number of entities, being worsened when network partitions occur. Present solutions for causality tracking are not appropriate to these increasingly common scenarios. In this paper we introduce Interval Tree Clocks, a novel causality tracking mechanism that can be used in scenarios with a dynamic number of entities, allowing a completely decentralized creation of processes/replicas without need for global identifiers or global coordination. The mechanism has a variable size representation that adapts automatically to the number of existing entities, growing or shrinking appropriately. The representation is so compact that the mechanism can even be considered for scenarios with a fixed number of entities, which makes it a general substitute for vector clocks and version vectors.

Moreno CB, Lopes N.  2007.  Implementing range queries with a decentralized balanced tree over distributed hash tables. Proceedings of the 1st international conference on Network-based information systems - Nbis. 4658:197–206. Abstractnbis07.pdf

Range queries, retrieving all keys within a given range, is an important add-on for Distributed Hash Tables (DHTs), as they rely only on exact key matching lookup. In this paper we support range queries through a balanced tree algorithm, Decentralized Balanced Tree, that runs over any DHT system.

Our algorithm is based on the B+-tree design that efficiently stores clustered data while maintaining a balanced load on hosts. The internal structure of the balanced tree is suited for range queries operations over many data distributions since it easily handles clustered data without losing performance.

We analyzed, and evaluated our algorithm under a simulated environment, to show it's operation scalability for both insertions and queries. We will show that the system design imposes a fixed penalty over the DHT access cost, and thus inherits the scalability properties of the chosen underlying DHT.

Lopes N, Moreno CB.  2007.  Taming hot-spots in dht inverted indexes. The 11th International Workshop on. Large-Scale and Distributed Systems - LSDS-IR. 7 Abstractacmlsdsir07.pdf

DHT systems are structured overlay networks capable of using
P2P resources as a scalable platform for very large data storage
applications. However, their efficiency expects a level of uni-
formity in the association of data to index keys that is often
not present in inverted indexes. Index data tends to follow non-
uniform distributions, often power law distributions, creating in-
tense local storage hotspots and network bottlenecks on specific
hosts. Current techniques like caching cannot, alone, cope with
this issue.
We propose a new distributed data structure based on a decen-
tralized balanced tree to balance storage data and network load
more uniformly across all hosts. The approach is stackable with
standard DHTs and ensures that the DHT storage subsystem re-
ceives an uniform load by assigning fixed sized, or low variance,
blocks.

Machado D, Preguiça N, Moreno CB, Martins JL.  2007.  VC2-Providing Awareness in Off-The-Shelf Version Control Systems. Proceedings of the 9th International Workshop on Collaborative Editing Systems. 11 Abstractiwces-2007.pdf

Version control systems have been used to help groups of people working at the same or distributed sites to cooperatively create documents. In particular, these systems are very popular in distributed collaborative software development. However, even using these systems, users often perform concurrent changes that require manual con ict resolution. Important causes for this situation are the lack of mutual awareness and coordination, among developers, and reluctance to commit unstable modi cations. The paper addresses this problem by providing a tool that integrates with o the-shelf version control systems and monitors lesystem accesses to relevant les in order to enhance the awareness among developers. With VC2 users can be aware of uncommitted changes made by remote users;
receive request to commit their own changes; be advised to update their local versions. While the nal decision is always under user control, the team is made aware of the level of risk when delaying commits and updates.

Bento M, Preguiça N, Moreno CB, Martins JL.  2006.  Reconciliation for mobile computing environments with portable storage devices. Actas da Conferência sobre Sistemas Móveis e Ubíquos . Abstractbento-131_1.pdf

Mobile computing environments have changed in recent years with the increasing use of different types of portable devices, ranging from mobile phones to laptops, and from MP3 players to portable storage devices (e.g. flash disks). Many of these devices have large amounts of storage, allowing users to transport most of their data with them. In this paper we briefly present the FEW file management system, a system that aims to ease file management in this new mobile environment. In particular, we detail the automatic reconciliation approach used in this system based on operational transformation. We motivate our work with a study of conflicts in data managed by version control systems.

Jesus P, Moreno CB, Almeida PS.  2006.  ID Generation in Mobile Environments. Conference on Mobile and Ubiquitous Systems (CSMU). Abstractjesus-148.pdf

This work is focused in the ID generation problem in mobile environments. We discuss the suitability of traditional mechanisms and techniques to generate IDs in mobile environments. Based on the “Birthday Problem”, we deduced some formulas to evaluate the ID trust that is directly related to the number of entities in the system. The estimation of the system size revels to be the main problem of our approach. To deal with it, we develop a recursive schema that needs to be evaluated. Alternatively, we also design an aggregation algorithm to estimate the system size, which results are currently been analyzed.

Preguiça N, Moreno CB, Martins JL, Shapiro M, Almeida PS, Domingos H, Fonte V, Duarte S, others.  2005.  Few: File management for portable devices. International Workshop on Software Support for Portable Storage (IWSSPS). Abstractfew-iwssps2005-final2.pdf

In recent years, an increasing number of portable devices with large amounts of storage have become widely used. In this paper, we present the early design of the FEW system, a system that aims to ease file management in the new mobile environment. To this end, the system will manage file replicas stored in fixed and portable storage devices. It will provide an automatic mechanism to establish new file replicas by analyzing file system activity. The system will automatically and incrementally synchronize all file replicas exploring the available network connectivity and the availability of portable storage devices. To merge concurrent updates, operational transformation techniques will be used.

Almeida JB, Almeida PS, Moreno CB.  2004.  Bounded Version Vectors. International Conference on Distributed Computing - ICDCS. 3274:102-116. Abstract04bvv.pdf

Version vectors play a central role in update tracking under optimistic distributed systems, allowing the detection of obsolete or inconsistent versions of replicated data. Version vectors do not have a bounded representation; they are based on integer counters that grow indefinitely as updates occur. Existing approaches to this problem are scarce; the mechanisms proposed are either unbounded or operate only under specific settings. This paper examines version vectors as a mechanism for data causality tracking and clarifies their role with respect to vector clocks. Then, it introduces bounded stamps and proves them to be a correct alternative to integer counters in version vectors. The resulting mechanism, bounded version vectors, represents the first bounded solution to data causality tracking between replicas subject to local updates and pairwise symmetrical synchronization.

Almeida PS, Moreno CB, Fonte V.  2002.  Version stamps-decentralized version vectors. Proceedings 22nd International Conference on Distributed Computing Systems - ICDCS. :544–551. Abstract10.1.1.16.8235.pdf

Version vectors and their variants play a central role in update tracking in optimistic distributed systems. Existing mechanisms for a variable number of participants use a mapping from identities to integers, and rely on some form of global configuration or distributed naming protocol to assign unique identifiers to each participant. These approaches are incompatible with replica creation under arbitrary partitions, a typical mode of operation in mobile or poorly connected environments. We present an update tracking mechanism that overcomes this limitation; it departs from the traditional mapping and avoids the use of integer counters, while providing all the functionality of version vectors in what concerns version tracking.

Cunha M, Preguiça N, Martins JL, Domingos H, Moura F, Moreno CB.  2001.  Mobisnap: Um sistema de bases de dados para ambientes móveis. CRC' - Actas da Conferência de Redes de Computadores . Abstractmobisnap-crc01.pdf

O Mobisnap é um sistema de base de dados para ambientes móveis. O sistema é baseado numa arquitectura cliente/servidor utilizando replicação optimista. Os clientes mantêm uma cópia parcial dos dados e o sistema combina um mecanismo que evita os conflitos – reservas – com outro que permite a sua resolução – transacções móveis. Estes mecanismos permitem que os clientes mantenham a sua autonomia mesmo em situações de desconexão.