Francisco Almeida Maia

Cruz F, Maia F, Matos M, Oliveira R, Paulo J, Pereira JO, Vilaça R.  2016.  Resource Usage Prediction in Distributed Key-Value Datastores. Distributed Applications and Interoperable Systems: 16th IFIP WG 6.1 International Conference, DAIS 2016, Held as Part of the 11th International Federated Conference on Distributed Computing Techniques, DisCoTec 2016, Heraklion, Crete, Greece, June 6-9, 2. :144–159. Abstract

n/a

Maia F, Matos M, Coelho F.  2016.  Towards Quantifiable Eventual Consistency. Proceedings of the 6th International Conference on Cloud Computing and Services Science. :368-370. Abstractdatadiversityconvergence_2016_5.pdf

In the pursuit of highly available systems, storage systems began offering eventually consistent data models. These models are suitable for a number of applications but not applicable for all. In this paper we discuss a system that can offer a eventually consistent data model but can also, when needed, offer a strong consistent one.

Jorge T, Maia F, Matos M, Pereira JO, Oliveira R.  2015.  Practical Evaluation of Large Scale Applications. Proceedings of the 15th IFIP International Conference on Distributed Applications and Interoperable Systems (DAIS). :124–137. Abstractdais15minhajsr.pdf

Designing and implementing distributed systems is a hard endeavor, both at an abstract level when designing the system, and at a concrete level when implementing, debugging and evaluating it. This stems not only from the inherent complexity of writing and reasoning about distributed software, but also from the lack of tools for testing and evaluating it under realistic conditions. Moreover, the gap between the protocols’ specifications found on research papers and their implementations on real code is huge, leading to inconsistencies that often result in the implementation no longer following the specification. As an example, the specification of the popular Chord DHT comprises a few dozens of lines, while its Java implementation, OpenChord, is close to twenty thousand lines, excluding libraries. This makes it hard and error prone to change the implementation to reflect changes in the specification, regardless of programmers’ skill. Besides, critical behavior due to the unpredictable interleaving of operations and network uncertainty, can only be observed on a realistic setting, limiting the usefulness of simulation tools. We believe that being able to write an algorithm implementation very close to its specification, and evaluating it in a real environment is a big step in the direction of building better distributed systems. Our approach leverages the MINHA platform to offer a set of built in primitives that allows one to program very close to pseudo-code. This high level implementation can interact with off-the-shelf existing middleware and can be gradually replaced by a production-ready Java implementation. In this paper, we present the system design and showcase it using a well-known algorithm from the literature.

Maia F.  2015.  Epidemic Store for Massive Scale Systems. Abstractmain.pdf

Considering the state-of-the-art systems for data management, it is observable that they exhibit two main frailties when deployed in a large scale system. On one hand, coordination protocols used in traditional relational database management systems do not perform well when the system grows beyond tens of nodes. On the other hand, data management approaches that relax consistency guarantees, thus avoiding coordination, struggle with high levels of system churn. In this dissertation, we present a completely decentralized, coordinationfree, scalable and robust data store. Our design is aimed at environments with several thousands of nodes and high levels of churn. Offering the current ubiquitous key-value data structures and programming interfaces, we describe how to overcome challenges raised by the need to distribute data -essential for load balancing, to replicate data - the crux of fault tolerance, and to route requests - key to performability. Alongside the design of our data store, we make several contributions in the context of distributed systems slicing. We propose a novel slicing protocol that overcomes state-of-the-art limitations. Additionally, we propose a novel epidemic algorithm for scalable and decentralized organization of system nodes into groups. This algorithm is used as an alternative to slicing at the core of our system. It organizes nodes into groups of parameterizable size without the need to have nodes knowing the system size. The contributions made on slicing protocols and the proposed group construction protocol are independent from the design of the data store. They are generic and can also be used as building blocks for other applications.

Cruz F, Maia F, Oliveira R, Vilaça R.  2014.  Workload-aware table splitting for NoSQL. SAC - Proceedings of the ACM Symposium on Applied Computing. Abstractdads14.pdf

Massive scale data stores, which exhibit highly desirable scalability and availability properties are becoming pivotal systems in nowadays infrastructures. Scalability achieved by these data stores is anchored on data independence; there is no clear relationship between data, and atomic inter-node operations are not a concern. Such assumption over data allows aggressive data partitioning. In particular, data tables are horizontally partitioned and spread across nodes for load balancing. However, in current versions of these data stores, partitioning is either a manual process or automated but simply based on table size. We argue that size based partitioning does not lead to acceptable load balancing as it ignores data access patterns, namely data hotspots. Moreover, manual data partitioning is cumbersome and typically infeasible in large scale scenarios. In this paper we propose an automated table splitting mechanism that takes into account the system workload. We evaluate such mechanism showing that it simple, non-intrusive and e fective.

DataFlasks

DataFlasks is a key-value data store built solely based on gossip-based protocols. It leverages the inherent characteristics of such protocols to be able to provide data persistence guarantees even in highly dynamic, massive scale systems.

Pasquet M, Maia F, Rivière E, Schiavoni V.  2014.  Autonomous Multi-Dimensional Slicing for Large-Scale Distributed Systems. Proceedings of the 14th International Conference on Distributed Applications and Interoperable Systems - DAIS. Abstractautonomous-multi-dimensional.pdf

Slicing is a distributed systems primitive that allows to autonomously partition a large set of nodes based on node-local attributes. Slicing is decisive for automatically provisioning system resources for different services, based on their requirements or importance. One of the main limitations of existing slicing protocols is that only single dimension attributes are considered for partitioning. In practical settings, it is often necessary to consider best compromises for an ensemble of metrics.
In this paper we propose an extension of the slicing primitive that allows multi-attribute distributed systems slicing. Our protocol employs a gossip-based approach that does not require centralized knowledge and allows self-organization. It leverages the notion of domination between nodes, forming a partial order between multi-dimensional points, in a similar way to SkyLine queries for databases. We evaluate and demonstrate the interest of our approach using large-scale simulations.

Photo

Bio

I am a Post Doc researcher at HASLab - High Assurance Lab. at University of Minho and INESC TEC. My research interests are distributed systems, cloud computing, large scale data management and gossip-based protocols.
Cruz F, Maia F, Matos M, Oliveira R, Paulo J, Pereira JO, Vilaça R.  2013.  MeT: Workload aware elasticity for NoSQL. Proceedings of the 8th European Conference on Computer Systems - EuroSys. :183–196. Abstractmet.pdf

NoSQL databases manage the bulk of data produced by modern Web applications such as social networks. This stems from their ability to partition and spread data to all available nodes, allowing NoSQL systems to scale. Unfortunately, current solutions' scale out is oblivious to the underlying data access patterns, resulting in both highly skewed load across nodes and suboptimal node configurations.
In this paper, we first show that judicious placement of HBase partitions taking into account data access patterns can improve overall throughput by 35%. Next, we go beyond current state of the art elastic systems limited to uninformed replica addition and removal by: i) reconfiguring existing replicas according to access patterns and ii) adding replicas specifically configured to the expected access pattern.
MeT is a prototype for a Cloud-enabled framework that can be used alone or in conjunction with OpenStack for the automatic and heterogeneous reconfiguration of a HBase deployment. Our evaluation, conducted using the YCSB workload generator and a TPC-C workload, shows that MeT is able to i) autonomously achieve the performance of a manual configured cluster and ii) quickly reconfigure the cluster according to unpredicted workload changes.

Maia F, Matos M, Rivière E, Oliveira R.  2012.  Slead: low-memory, steady distributed systems slicing. Proceedings of the 12th international conference on Distributed Applications and Interoperable Systems - IFIP. :1–15. Abstractslead.pdf

Slicing a large-scale distributed system is the process of autonomously partitioning its nodes into k groups, named slices. Slicing is associated to an order on node-specific criteria, such as available storage, uptime, or bandwidth. Each slice corresponds to the nodes between two quantiles in a virtual ranking according to the criteria.
For instance, a system can be split in three groups, one with nodes with the lowest uptimes, one with nodes with the highest uptimes, and one in the middle. Such a partitioning can be used by applications to assign different tasks to different groups of nodes, e.g., assigning critical tasks to the more powerful or stable nodes and less critical tasks to other slices.
Assigning a slice to each node in a large-scale distributed system, where no global knowledge of nodes’ criteria exists, is not trivial. Recently, much research effort was dedicated to guaranteeing a fast and correct convergence in comparison to a global sort of the nodes.
Unfortunately, state-of-the-art slicing protocols exhibit flaws that preclude their application in real scenarios, in particular with respect to cost and stability. In this paper, we identify steadiness issues where nodes in a slice border constantly exchange slice and large memory requirements for adequate convergence, and provide practical solutions for the two. Our solutions are generic and can be applied to two different state-of-the-art slicing protocols with little effort and while preserving the desirable properties of each. The effectiveness of the proposed solutions is extensively studied in several simulated experiments.

Maia F, Matos M, Pereira JO, Oliveira R.  2011.  Worldwide consensus. Proceedings of the 11th international conference on Distributed applications and interoperable systems - IFIP. :257–269. Abstractworldwide_consensus.pdf

Consensus is an abstraction of a variety of important challenges in dependable distributed systems. Thus a large body of theoretical knowledge is focused on modeling and solving consensus within different system assumptions. However, moving from theory to practice imposes compromises and design decisions that may impact the elegance, trade-offs and correctness of theoretical appealing consensus protocols.
In this paper we present the implementation and detailed analysis, in a real environment with a large number of nodes, of mutable consensus, a theoretical appealing protocol able to offer a wide range of trade-offs (called mutations) between decision latency and message complexity. The analysis sheds light on the fundamental behavior of the mutations, and leads to the identification of problems related to the real environment. Such problems are addressed without ever affecting the correctness of the theoretical proposal.

Maia F, Inigo A-J, Ruiz-Fuertes M, Oliveira R.  2010.  Scalable Transactions in the Cloud: Partitioning Revisited. On the Move to Meaningful Internet Systems. 6427:785-797. Abstractpartition-revisited.pdf

Cloud computing is becoming one of the most used paradigms to deploy highly available and scalable systems. These systems usually demand the management of huge amounts of data, which cannot be solved with traditional nor replicated database systems as we know them. Recent solutions store data in special key-value structures, in an approach that commonly lacks the consistency provided by transactional guarantees, as it is traded for high scalability and availability. In order to ensure consistent access to the information, the use of transactions is required. However, it is well-known that traditional replication protocols do not scale well for a cloud environment. Here we take a look at current proposals to deploy transactional systems in the cloud and we propose a new system aiming at being a step forward in achieving this goal. We proceed to focus on data partitioning and describe the key role it plays in achieving high scalability.

MeT

MeT is a Cloud-enabled framework that can be used alone or in conjunction with OpenStack for the automatic and heterogeneous reconfiguration of HBase. MeT is an workload aware system that provides automatic elasticity for the HBase NoSQL database. MeT not only adds and removes nodes automatically and according to system load, but also reconfigures them according to the observed workloads. As a result, it achieves a significant increase in overall system performance.

Position: 
External Researcher