Recent Publications

Macedo N.  2010.  Translating Alloy specifications to the point-free style. Abstractnunomacedomsc.pdf

Every program starts from a model, an abstraction, which is iteratively re ned until we reach the
nal result, the implementation. However, at the end, one must ask: does the nal program resemble
in anyway the original model? Was the original idea correct to begin with? Formal methods
guarantee that those questions are answered positively, resorting to mathematical techniques. In
particular, in this thesis we are interested on the second factor: veri cation of formal models.
A trend of formal methods defends that they should be lightweight, resulting in a reduced
complexity of the speci cation, and automated analysis. Alloy was proposed as a solution for this
problem. In Alloy, the structures are described using a simple mathematical notation: relational
logic. A tool for model checking, automatic veri cation within a given scope, is also provided.
However, sometimes model checking is not enough and the need arises to perform unbounded
veri cations. The only way to do this is to mathematically prove that the speci cations are correct.
As such, there is the need to nd a mathematical logic expressive enough to be able to represent
the speci cations, while still being su ciently understandable.
We see the point-free style, a style where there are no variables or quanti cations, as a kind
of Laplace transform, where complex problems are made simple. Being Alloy completely relational,
we believe that a point-free relational logic is the natural framework to reason about Alloy
speci cations.
Our goal is to present a translation from Alloy speci cations to a point-free relational calculus,
which can then be mathematically proven, either resorting to proof assistants or to manual proving.
Since our motivation for the use of point-free is simplicity, we will focus on obtaining expressions
that are simple enough for manipulation and proofs about them.

Silva A.  2010.  Kleene Coalgebra. Abstractthesis.pdf

Computer systems have widely spread since their appearance, and they now play a crucial role in many daily activities, with their deployment ranging from small home appliances to safety critical components, such as airplane or automobile control systems. Accidents caused by either hardware or software failure can have disastrous consequences, leading to the loss of human lives or causing enormous financial drawbacks. One of the greatest challenges of computer science is to cope with the fast evolution of computer systems and to develop formal techniques which facilitate the construction of dependable software and hardware systems. Since the early days of computer science, many scientists have searched for suitable models of computation and for specification languages that are appropriate for reasoning about such models. When devising a model for a particular system, there is a need to encode different features which capture the inherent behavior of the system. For instance, some systems have deterministic behavior (a calculator or an elevator), whereas others have inherently non-deterministic or probabilistic behavior (think of a casino slot machine). The rapidly increasing complexity of systems demands for compositional and unifying models of computation, as well as general methods and guidelines to derive specification languages.

Ferreira JF.  2010.  Principles and Applications of Algorithmic Problem Solving. Abstractthesis-a4-colour.pdf

Algorithmic problem solving provides a radically new way of approaching and solving problems in general by using the advances that have been made in the basic principles of correct-by-construction algorithm design. The aim of this thesis is to provide educational material that shows how these advances can be used to support the teaching of mathematics and computing.
We rewrite material on elementary number theory and we show how the focus on the algorithmic content of the theory allows the systematisation of existing proofs and, more importantly, the construction of new knowledge in a practical and elegant way. For example, based on Euclid’s algorithm, we derive a new and efficient algorithm to enumerate the positive rational numbers in two different ways, and we develop a new and constructive proof of the two-squares theorem. Because the teaching of any subject can only be effective if the teacher has access to abundant and sufficiently varied educational material, we also include a catalogue of teaching scenarios. Teaching scenarios are fully worked out solutions to algorithmic problems together with detailed guidelines on the principles captured by the problem, how the problem is tackled, and how it is solved. Most of the scenarios have a recreational flavour and are designed to promote self-discovery by the students. Based on the material developed, we are convinced that goal-oriented, calculationa algorithmic skills can be used to enrich and reinvigorate the teaching of mathematics and computing.

Paulo J.  2009.  Efficient storage of data in cloud computing. Abstractpp09.pdf

Keeping critical data safe and accessible from several locations has become a global preoccupation, either being this data personal, organizational or from applications. As a consequence of this issue, we verify the emergence of on-line storage services. In addition, there is the new paradigm of Cloud Computing, which brings new ideas to build services that allow users to store their data and run their applications in the Cloud. By doing a smart and efficient management of these services' storage, it is possible to improve the quality of service offered, as well as to optimize the usage of the infrastructure where the services run. This management is even more critical and complex when the infrastructure is composed by thousand of nodes running several virtual machines and sharing the same storage. The elimination of redundant data at these services' storage can be used to simplify and enhance this management.
This dissertation presents a solution to detect and eliminate duplicated data between virtual machines that run on the same physical host and write their virtual disks' data to a shared storage. A prototype that implements this solution is introduced and evaluated. Finally, a study that compares the efficiency of two different approaches used to eliminate redundant data in a personal data set is described.

Silva P.  2009.  On the Design of a Galculator. Abstractthesis.pdf

The increased complexity of software systems together with the lack of tools and technique to support their development led to the so-called "software crisis". Different views about the problem originated diverse approaches to a possible solution, although it is now generally accepted that a "silver bullet" does not exist.
The formal methods view considers mathematical reasoning as fundamental to fulfill the most important property of software systems: correctness. However, since correctness proofs are generally difficult and expensive, only critical applications are regarded as potential targets for their use. Developments in tool support such as proof assistants, model checkers and abstract interpreters allows for reducing this cost and making proofs affordable to a wider range of applications. Nevertheless, the effectiveness of a tool is highly dependent of the underlying theory.
This dissertation follows a calculational proof style in which equality plays the fundamental role. Instead of the traditional logical approach, fork algebras, an extension of relation algebras, are used. In this setting, Galois connections are important because they reinforce the calculational nature of the algebraic approach, bringing additional structure to the calculus. Moreover, Galois connections enjoy several valuable proper- ties and allow for transformation in the domain of problems. In this dissertation, it is shown how fork algebras and Galois connections can be integrated together with the indirect equality principle. This combination offers a very powerful, generic device to tackle the complexity of proofs in program verification. This power is enhanced with the design of an innovative proof assistant prototype, the Galculator.

Matos M.  2009.  Network-Aware Epidemic Broadcast. Abstractmasterthesis.pdf

Epidemic multicast is an emerging resilient and scalable approach to the reliable dissemination of application data in the context of very large scale distributed systems. Unfortunately, the resilience and scalability come at the cost of considerable redundancy which led to high resource consumption on both links and nodes. In environments with resource constrained links, such as in Cloud Computing infrastructure composed by data centers organized in a federation around the globe, the high resource consumption precludes the use of this class of protocols. The goal of this dissertation is therefore to cope with the constraints of these scenarios, by reducing the network load imposed on the constrained long distance links. This is achieved by constructing an overlay that reflects the characteristics of the links, and by using a dissemination protocol that takes into account locality when transmitting the message payloads. We conducted an extensive experimental evaluation that presents an improvement over an order of magnitude in the number of messages that traverse the costlier links, without endangering the resilience and scalability properties that make epidemic based protocols so attractive.

Nunes A.  2009.  P2P content-push in the Internet. AbstractP2P_content-push_in_the_Internet

Syndicated content-push in the Internet is a huge success, and web feeds are being used to convey various types of content: from news headlines, to podcasts and video podcasts, to being a feature in Web 2.0 websites. This diversity lead to the appearance of several frameworks, each tailored to a specific content type. At the same time, interest in social networking exploded, as more and more websites of this purpose were launched. Syndicated content and social networking websites are now intimately connected.
In this work, a generic, modular, p2p content-push architecture is proposed. It provides an evolutionary upgrade path based on the technologies already in use in the Internet for interoperability, thus without disrupting current infrastructure or changing participants’ habits. It also leverages social networks for content discovery and aggregation, using peer-to-peer protocols for distribution. A partial implementation of this architecture, dubbed SEEDS, is also presented.

Abreu R.  2009.  Spectrum-based fault localization in embedded software. Abstractthesis_rui_abreu.pdf

Modern daily devices such as televisions rely increasingly on (embedded) software. Features implemented in software are often cheaper, easier, flexible to future modifications, and more portable than when implemented in hardware. Such properties are extremely important as, nowadays, many devices serve no single purpose but, instead, have several functionalities which need to be easily modified or upgraded to adhere to the high expectations of the consumers. As an example, a mobile phone is used not only to make phone calls but also, e.g., as a navigation system which has to contain the most upto-date navigation engine and maps.

Oliveira N.  2009.  Improving Program Comprehension Tools for Domain Specific Languages. Abstract

Since the dawn of times, curiosity and necessity to improve the quality of their life, led humans to find means to understand everything surrounding them, aiming at improving it. Whereas the creating abilities of some was growing, the capacity to comprehend of others follow their steps. Disassembling physical objects to comprehend the connections between the pieces in order to understand how they work together is a common human behavior. With the computers arrival, humans felt the necessity of applying the same techniques (disassemble to comprehend) to their programs.
Traditionally, these programs are written resorting to general-purpose programming languages. Hence, techniques and artifacts, used to aid on program comprehension, were built to facilitate the work of software programmers on maintaining and improving programs that were developed by others. Generally, these generic languages deal with concepts at a level that the human brain can hardly understand.
So understanding programs written in this languages is an hard task, because the distance between the concepts at the program level and the concepts at the problem level is too big.
Thus, as in politics, justice, medicine, etc. groups of words are regularly used facilitating the comprehension between people, also in programming, languages that address a specific domain were created. These programming languages raise the abstraction of the program domain, shortening the gap to the concepts of the problem domain.
Tools and techniques for program comprehension commonly address the program domain and they took little advantage of the problem domain. In this master’s thesis, the hypothesis that it is easier to comprehend a program when the underlying problem and program domains are known and a bridge between them is established, is assumed. Then, a program comprehension technique for domain specific languages,
is conceived, proposed and discussed. The main objective is to take advantage from the large knowledge about the problem domain inherent to the domain specific language, and to improve traditional program comprehension tools that only dealt, until then, with the program domain. This will create connections between both program and problem domains. The final result will show, visually, what happens internally
at the program domain level, synchronized with what happens externally, at problem level.

Rodrigues N.  2008.  Slicing techniques applied to architectural analysis of legacy software. Abstracttese.pdf

Program understanding is emerging as a key concern in software engineering. In a situation in which the only quality certificate of the running software artifact still is life-cycle endurance, customers and software producers are little prepared to modify or improve running code. However, faced with so risky a dependence on legacy software, managers are more and more prepared to spend resources to increase confidence on — i.e., the level of understanding of — their (otherwise untouchable) code. In fact the technological and economical relevance of legacy software as well as the complexity of their re-engineering entails the need for rigour.
Addressing such a scenario, this thesis advocates the use of direct source code analysis for both the process of understanding and transformation of software systems. In particular, the thesis focuses on the development and application of slicing techniques at both the “micro” and “macro” structural levels of software.
The former, deals with fine-grained structures of programs, slicing operating over elementary program entities, such as types, variables or procedure identifiers. The latter, on the other hand, addresses architectural issues and interaction modes across modules, components or services upon which a system is decomposed. At the “micro” level this thesis delves into the problem of slicing functional programs, a paradigm that is gaining importance and was generally neglected by the slicing community. Three different approaches
to functional slicing are proposed, accompanied by the presentation of the HaSlicer application, a software tool developed as a proof-of-concept for some of the ideas discussed. A comparison between the three approaches, their practical application and the motivational aspects for keeping investivigating new functional slicing processes are also discussed. Slicing at a “macro” level is the theme of the second part of this thesis,
which addresses the problem of extracting from source code the system’s coordination model which governs interaction between its components. This line of research delivers two approaches for abstracting software systems coordination models, one of the most vital structures for software architectural analysis. Again, a software tool – CoordInspector – is introduced as a proof-of-concept.

Cunha A.  2005.  Point-free Program Calculation. Abstracttese.pdf

Due to referential transparency, functional programming is particularly appropriate for equational reasoning. In this thesis we reason about functional programs by calculation, using a program calculus built upon two basic ingredients. The first is a set of recursion patterns that allow us to define recursive functions implicitly. These are encoded as higher-order operators that encapsulate typical forms of recursion, such as the well-known foldr operator on lists. The second is a point-free style of programming in which programs are expressed as combinations of simpler functions, without ever mentioning their arguments. The basic combinators are derived from standard categorical constructions, and are characterized by a rich set of equational laws. In order to be able to apply this calculational methodology to real lazy functional programming languages, a concrete category of partial functions and elements is used.
While recursion patterns are already well accepted and a lot of research has been carried out on this topic, the same cannot be said about point-free programming. This thesis addresses precisely this component of the calculus. One of the contributions is a mechanism to translate classic pointwise code into the point-free style. This mechanism can be applied to a λ-calculus rich enough to represent the core functionality of a real functional programming language. A library that enables programming in a pure pointfree style within Haskell is also presented. This library is useful for understanding the expressions resulting from the above translation, since it allows their direct execution and, where applicable, the graphical visualization of recursion trees. Another contribution of the thesis is a framework for performing point-free calculations with higher-order functions. This framework is based on the internalization of some basic combinators, and considerably shortens calculations in this setting. In order to assess the viability of mechanizing calculations, several issues are discussed concerning the decidability of equality and the implementation of fusion laws.

Frade MJ.  2004.  Type-Based Termination of Recursive Definitions and Constructor Subtyping in Typed Lambda Calculi. Abstractthesis.pdf

In type systems, a combination of subtyping and overloading is a way to achieve more precise typings. This thesis explores how to use these mechanisms in two directions: (i) as a way to ensure termination of recursive functions; (ii) as a way to capture in a type-theoretic context the use of subtyping as inclusion between inductively defined sets.
The first part of the thesis presents a mechanism that ensures termination through types and defines a system that incorporates it. More precisely, we formalize the notion of type-based termination using a restricted form of type dependency (also known as indexed types). Every datatype is replaced by a family of approximations indexed over a set of stages; then being in a certain approximation means that a term can be seen as having a certain bound on constructor usage. We introduce λˆ, a simply typed λ-calculus à la Curry, supporting parametric inductive datatypes, case-expressions and letrec-expressions with termination ensured by types. We show that λˆ enjoys important meta-theoretical properties, including confluence, subject reduction and strong normalization. We also show that the calculus is powerful enough to encode many recursive definitions rejected by existing type systems, and give some examples. We prove that this system encompasses in a strict way Giménez' λς, a system in which termination of typable expressions is ensured by a syntactical condition constraining the uses of recursive calls in the body of definitions.
The second part of the thesis studies properties of a type system featuring constructor subtyping. Constructor subtyping is a form of subtyping in which an inductive type σ is viewed as a subtype of another inductive type τ if each constructor c of σ is also a constructor of τ (but τ may have more constructors), and whenever c : θ'→σ is a declaration for τ, then c : θ'→τ is a declaration for τ with θ'→≤θ'. In this thesis we allow for this form of subtyping in the system λcs, which is a simply typed λ-calculus à la Curry, supporting mutually recursive parametric datatypes, case-expressions and letrec-expressions. We establish the properties of confluence, subject reduction and decidability of type checking for this calculus. As the system features general recursion, the reduction calculus is obviously non-terminating. However, we sketch two ways of achieving strong normalization. One way is to constrain the system to guard-by-destructors recursion, following what is done for λς. The other way is to enrich the type system with stages (following the ideas presented for λˆ) and enforcing termination through typing. Potential uses of constructor subtyping include proof assistants and functional programming languages. In particular, constructor subtyping provides a suitable foundation for extensible datatypes, and is specially adequate to re-usability. The combination of subtyping between datatypes and overloading of constructors allows the definition of new datatypes by restricting or by expanding the set of constructors of an already defined datatype. This flexibility in the definition of datatypes induces a convenient form of code reuse for recursive functions, allowing the definition of new functions by restricting or by expanding already defined ones. We enrich a calculus featuring constructor subtyping with a mechanism to define extensible overloaded recursive functions by pattern-matching, obtaining the system λcs+fun. We formalize the concept of well-formed environment of function declarations and establish that under such environments the properties of confluence, subject reduction and decidability of type-checking hold. Moreover, we prove that the requirements imposed for the well-formed environments are decidable and show how standard techniques can still be used for compiling pattern-matching into case-expressions.

Pereira JO.  2002.  Semantically Reliable Group Communication. Abstractphd.pdf

Current usage of computers and data communication networks for a variety of daily tasks, calls for widespread deployment of fault tolerance techniques with inexpensive off-the-shelf hardware and software. Group communication is in this context a particularly appealing technology, as it provides to the application programmer reliability guarantees that highly simplify many fault tolerance techniques. It has however been reported that the performance of group communication toolkits in large and heterogeneous systems is frequently disappointing. Although this can be overcome by relaxing reliability guarantees, the resulting protocol is often much less useful than group communication, in particular, for strong consistent replication. The challenge is thus to relax reliability and still provide a convenient set of guarantees for fault tolerant programming. This thesis addresses models and mechanisms that by selectively relaxing reliability guarantees, offer both the convenience of group communication for fault tolerant programming and high performance. The key to our proposal is to use knowledge about the semantics of messages exchanged to determine which messages need to be reliably delivered, hence semantic reliability. In many applications, some messages implicitly convey or overwrite other messages sent recently before, making them obsolete while still in transit. By omitting only the delivery of obsolete messages, performance can be improved without impact on the correctness of the application. Specifications and algorithms for a complete semantically reliable group communication protocol suite are introduced, encompassing ordered and view synchronous multicast. The protocols are then evaluated with analytical and simulation models and with a prototype implementation. The discussion of a concrete application illustrates the resulting programming interface and performance.

Campos JC.  1999.  Automated Deduction and Usability Reasoning. Abstractycst-2000-09.pdf

Reasoning about the usability of a given interactive system's design is a difficult task. However it is one task that must be performed if downstream costs with extensive redesign are to be avoided. Traditional usability testing alone cannot avoid these costs since it too often is performed late in development life cycle. In recent years approaches have been put forward that attempt to reason about the usability of a design from early in development. Mainstream approaches are based on some kind of (more or less structured) inspection of a design by usability experts. This type of approach breaks down for systems with large and complex user interfaces, and again extensive testing is required. In an attempt to deal with these issues there have been attempts to apply software engineering reasoning tools to the development of interactive systems. The article reviews this work and puts forward some ideas for the future.

Pereira JO.  1998.  Comunicação Fiável em Sistemas Distribuídos em Grande Escala. Abstracttese.pdf

Protocolos de difusão fiável em grupos de processo são uma ferramenta de programação utilizada em sistemas distribuídos e confiáveis para isolar cada aplicação da complexidade decorrente da comunicação e das diversas faltas que podem ocorrer no sistema. Isto é conseguido oferecendo às aplicações uma abstração de um canal de difusão que oferece garantias muito fortes quanto à qualidade de serviço, o que é suficiente para garantir a correção de um largo espectro de aplicações com um esforço mínimo por parte do programador.
A concretização de protocolos de difusão fiável sobre redes em grande escala introduz um nível adicional de complexidade, pois as características destas redes dificultam a concretização correta e eficiente de serviços que oferecem garantias muito fortes.
Diversas propostas procuram ultrapassar este problema disponibilizando serviços que oferecem menos garantias de qualidade de serviço, mas que apresentam desempenho superior nas situações em que são utilizáveis. No entanto, quanto menos garantias se oferecerem, menos aplicações podem ser correctamente suportadas, uma vez que diferentes aplicações toleram diferentes relaxamentos da qualidade de serviço.
Nesta tese propõe-se um protocolo de difusão fiável que possa ser configurado para cada aplicação, de modo a poder ser simultaneamente eficiente e correcto porque adequado à aplicação em causa.
Para o efeito, começa-se por decompor a especificação de protocolos de difusão segundo um conjunto de parâmetros aplicáveis em separado a cada uma das mensagens de uma sessão, que podem ser combinados em especificações de protocolos adequados a cada aplicação.
Propõe-se então uma estratégia de concretização aberta orientada por objetos, que permite compor protocolos a partir de módulos independentes para cada mensagem, correspondentes aos parâmetros de especificação.