Ricardo Pereira Vilaça




Ricardo Vilaça has held a PhD from MAP-i since 2012. He is currently a senior researcher at HASLab and MACC, University of Minho & INESC TEC, and was formerly query engine technical director at LeanXcale. His research interests are energy-efficient and scalable data management in both parallel and distributed systems. He currently participates in the EuroCC2 and SustainableHPC research projects. He represents MACC in the national network (RNCA) and in the EuroHPC hosting entities forum. He was the INESC TEC coordinator of the AIDA CMU large-scale collaborative project and has participated in more than 15 national and international research projects. He co-supervised 2 PhD students and has supervised several research grant holders and master's theses. He has published more than 40 research papers or articles and has served as a reviewer for several highly reputed conferences.



Details

Name: Ricardo Pereira Vilaça
Role: Assistant Researcher
Since: 1 November 2011
Nationality: Portugal
Centre: High-Assurance Software
Contacts: +351253604440, ricardo.p.vilaca@inesctec.pt


Publications


2023

TiQuE: Improving the Transactional Performance of Analytical Systems for True Hybrid Workloads

Authors
Faria, N; Pereira, J; Alonso, AN; Vilaça, R; Koning, Y; Nes, N

Publication
Proceedings of the VLDB Endowment

Abstract
Transactions have been a key issue in database management for a long time and there are a plethora of architectures and algorithms to support and implement them. The current state-of-the-art is focused on storage management and is tightly coupled with its design, leading, for instance, to the need for completely new engines to support new features such as Hybrid Transactional Analytical Processing (HTAP). We address this challenge with a proposal to implement transactional logic in a query language such as SQL. This means that our approach can be layered on existing analytical systems but that the retrieval of a transactional snapshot and the validation of update transactions run in the server and can take advantage of advanced query execution capabilities of an optimizing query engine. We demonstrate our proposal, TiQuE, on MonetDB and obtain an average 500x improvement in transactional throughput while retaining good performance on analytical queries, making it competitive with the state-of-the-art HTAP systems.


2022

AIDA-DB: A Data Management Architecture for the Edge and Cloud Continuum

Authors
Faria, N; Costa, D; Pereira, J; Vilaça, R; Ferreira, L; Coelho, F

Publication
19th IEEE Annual Consumer Communications & Networking Conference, CCNC 2022, Las Vegas, NV, USA, January 8-11, 2022

Abstract
There is an increasing demand for stateful edge computing for both complex Virtual Network Functions (VNFs) and application services in emerging 5G networks. Managing a mutable persistent state in the edge does, however, bring new architectural, performance, and dependability challenges. Not only does it have to be integrated with existing cloud-based systems, but it must also cope with both operational and analytical workloads and be compatible with a variety of SQL and NoSQL database management systems. We address these challenges with AIDA-DB, a polyglot data management architecture for the edge and cloud continuum. It leverages recent developments in distributed transaction processing for a reliable mutable state in operational workloads, with a flexible synchronization mechanism for efficient data collection in cloud-based analytical workloads.


2022

Adaptive Database Synchronization for an Online Analytical Cloud-to-Edge Continuum

Authors
Costa, D; Pereira, J; Vilaça, R; Faria, N

Publication
37th Annual ACM Symposium on Applied Computing

Abstract
Wide availability of edge computing platforms, as expected in emerging 5G networks, enables a computing continuum between centralized cloud services and the edge of the network, close to end-user devices. This is particularly appealing for online analytics as data collected by devices is made available for decision-making. However, cloud-based parallel-distributed data processing platforms are not able to directly access data on the edge. This can be circumvented, at the expense of freshness, with data synchronization that periodically uploads data to the cloud for processing. In this work, we propose an adaptive database synchronization system that makes distributed data in edge nodes available dynamically to the cloud by balancing between reducing the amount of data that needs to be transmitted and the computational effort needed to do so at the edge. This adapts to the availability of CPU and network resources as well as to the application workload.


2022

Scalable transcriptomics analysis with Dask: applications in data science and machine learning

Authors
Moreno, M; Vilaça, R; Ferreira, PG

Publication
BMC Bioinformatics

Abstract
Background: Gene expression studies are an important tool in biological and biomedical research. The signal carried in expression profiles helps derive signatures for the prediction, diagnosis and prognosis of different diseases. Data science and specifically machine learning have many applications in gene expression analysis. However, as the dimensionality of genomics datasets grows, scalable solutions become necessary. Methods: In this paper we review the main steps and bottlenecks in machine learning pipelines, as well as the main concepts behind scalable data science including those of concurrent and parallel programming. We discuss the benefits of the Dask framework and how it can be integrated with the Python scientific environment to perform data analysis in computational biology and bioinformatics. Results: This review illustrates the role of Dask for boosting data science applications in different case studies. Detailed documentation and code on these procedures is made available at https://github.com/martaccmoreno/gexp-ml-dask. Conclusion: By showing when and how Dask can be used in transcriptomics analysis, this review will serve as an entry point to help genomic data scientists develop more scalable data analysis procedures.


2021

Detailed Black-Box Monitoring of Distributed Systems

Authors
Neves, F; Vilaça, R; Pereira, J

Publication
Applied Computing Review

Abstract
Modern containerized distributed systems, such as big data storage and processing stacks or micro-service based applications, are inherently hard to monitor and optimize, as resource usage does not directly match hardware resources due to multiple virtualization layers. For instance, although inter-application traffic is an important factor, as it directly indicates how components interact, it has not been possible to accurately monitor it in an application-independent way and without severe overhead, thus putting it out of reach of cloud platforms. In this paper we present an efficient black-box monitoring approach for gathering detailed structural information of collaborating processes in a distributed system that can be queried for various purposes, as it includes information about processes, containers, and hosts, as well as resource usage and amount of data exchanged. The key to achieving high detail and low overhead without custom application instrumentation is to use a kernel-aided event-driven strategy. We validate a prototype implementation by applying it to multi-platform microservice deployments, evaluate its performance with micro-benchmarks, and demonstrate its usefulness for container placement in a distributed data storage and processing stack (i.e., Cassandra and Spark).


Supervised Theses

2022

Como é possível melhorar o desempenho da utilização de listas encadeadas em React Native? [How can the performance of linked lists in React Native be improved?]

Author
Diogo André Pinto Batista

Institution
IPP-ISEP

2022

Data Lakes em ambientes híbridos Cloud/Edge [Data Lakes in hybrid Cloud/Edge environments]

Author
Daniel Vilar da Costa

Institution
UM

2021

ClimateCollab: A collaborative graph for reproducible evidence of climate change

Author
Lázaro Gabriel Barros da Costa

Institution
UP-FEUP

2021

Holistic performance and scalability analysis for large-scale distributed systems

Author
Francisco Nuno Teixeira Neves

Institution
UM

2018

Controlo das trajetórias de um robô móvel de alto desempenho [Trajectory control of a high-performance mobile robot]

Author
Sandro Augusto Costa Magalhães

Institution
UP-FEUP



Projects

AIDA

BigHPC

RISC2

EuroCC2
