Recent Publications

Proenca J.  2011.  Synchronous Coordination of Distributed Components. (thesis-colour.pdf)

Coordination is a relatively recent field, considerably inspired by concurrency theory [83, 85, 91]. Coordination languages and models [90] are based on the philosophy that an application or a system should be divided into the parts that perform computations, typically components or services, and the parts that coordinate the results and resources required to perform the computations. The coordination aspect focuses on the latter, describing how the components or services are connected. In this thesis we study a specific class of coordination models, namely synchronous, exogenous, and composable models, and we exploit implementation techniques for such models in distributed environments. Our work concentrates on the Reo coordination model [8] as the main representative of this class of coordination models.
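For illustration (a toy sketch in the exogenous spirit of Reo, with hypothetical names, not the machinery developed in the thesis): the producer and consumer below only perform computation; a separate connector object, reminiscent of Reo's FIFO1 channel, decides when data may flow between them.

```python
import queue
import threading

class Fifo1:
    """A one-place buffered channel: accepts a write only when empty and a
    take only when full (loosely modelled on Reo's FIFO1 primitive)."""
    def __init__(self):
        self._slot = queue.Queue(maxsize=1)

    def write(self, item):
        self._slot.put(item)      # blocks while the buffer is full

    def take(self):
        return self._slot.get()   # blocks while the buffer is empty

def producer(port, items):
    for x in items:
        port.write(x)             # computation code: no coordination logic

def consumer(port, n):
    return [port.take() for _ in range(n)]

if __name__ == "__main__":
    ch = Fifo1()                  # the connector is composed externally
    t = threading.Thread(target=producer, args=(ch, [1, 2, 3]))
    t.start()
    print(consumer(ch, 3))        # prints [1, 2, 3]
    t.join()
```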

Macedo N.  2010.  Translating Alloy specifications to the point-free style. (nunomacedomsc.pdf)

Every program starts from a model, an abstraction, which is iteratively refined until we reach the final result, the implementation. However, at the end, one must ask: does the final program resemble in any way the original model? Was the original idea correct to begin with? Formal methods guarantee that those questions are answered positively, resorting to mathematical techniques. In particular, in this thesis we are interested in the second factor: verification of formal models.

One trend in formal methods advocates that they should be lightweight, resulting in reduced specification complexity and automated analysis. Alloy was proposed as a solution to this problem. In Alloy, structures are described using a simple mathematical notation: relational logic. A tool for model checking, i.e., automatic verification within a given scope, is also provided. However, sometimes model checking is not enough and the need arises to perform unbounded verifications. The only way to do this is to mathematically prove that the specifications are correct. As such, there is the need to find a mathematical logic expressive enough to represent the specifications, while still being sufficiently understandable.

We see the point-free style, a style in which there are no variables or quantifications, as a kind of Laplace transform, where complex problems are made simple. Since Alloy is completely relational, we believe that a point-free relational logic is the natural framework to reason about Alloy specifications.

Our goal is to present a translation from Alloy specifications to a point-free relational calculus, which can then be mathematically proven correct, either by resorting to proof assistants or to manual proofs. Since our motivation for the use of the point-free style is simplicity, we focus on obtaining expressions that are simple enough for manipulation and proofs about them.
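As a small illustration of the style (our example, reading relational composition as $a\,(R \cdot S)\,c \equiv \langle\exists\, b \;::\; a\,R\,b \wedge b\,S\,c\rangle$): the quantified, pointwise implication on the left is captured by the variable-free inclusion on the right,

$$\langle \forall\, a, c \;::\; \langle \exists\, b \;::\; a\,R\,b \wedge b\,S\,c \rangle \Rightarrow a\,T\,c \rangle \;\;\equiv\;\; R \cdot S \subseteq T$$

which is the kind of formula that a proof assistant or a manual calculation can manipulate by algebraic laws alone, with no quantifier reasoning.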

Silva A.  2010.  Kleene Coalgebra. (thesis.pdf)

Computer systems have spread widely since their appearance, and they now play a crucial role in many daily activities, with their deployment ranging from small home appliances to safety-critical components, such as airplane or automobile control systems. Accidents caused by either hardware or software failure can have disastrous consequences, leading to the loss of human lives or causing enormous financial losses. One of the greatest challenges of computer science is to cope with the fast evolution of computer systems and to develop formal techniques which facilitate the construction of dependable software and hardware systems. Since the early days of computer science, many scientists have searched for suitable models of computation and for specification languages that are appropriate for reasoning about such models. When devising a model for a particular system, there is a need to encode different features which capture the inherent behavior of the system. For instance, some systems have deterministic behavior (a calculator or an elevator), whereas others have inherently non-deterministic or probabilistic behavior (think of a casino slot machine). The rapidly increasing complexity of systems demands compositional and unifying models of computation, as well as general methods and guidelines to derive specification languages.

Paulo J.  2009.  Efficient storage of data in cloud computing. (pp09.pdf)

Keeping critical data safe and accessible from several locations has become a global concern, whether this data is personal, organizational, or belongs to applications. As a consequence, we are witnessing the emergence of on-line storage services. In addition, the new paradigm of Cloud Computing brings new ideas to build services that allow users to store their data and run their applications in the Cloud. By managing these services' storage in a smart and efficient way, it is possible to improve the quality of service offered, as well as to optimize the usage of the infrastructure where the services run. This management is even more critical and complex when the infrastructure is composed of thousands of nodes running several virtual machines and sharing the same storage. The elimination of redundant data in these services' storage can be used to simplify and enhance this management.
This dissertation presents a solution to detect and eliminate duplicated data between virtual machines that run on the same physical host and write their virtual disks' data to a shared storage. A prototype that implements this solution is introduced and evaluated. Finally, a study that compares the efficiency of two different approaches used to eliminate redundant data in a personal data set is described.
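A minimal sketch of the underlying technique, block-level deduplication (this is our own illustration, not the dissertation's actual prototype): identical blocks across virtual disks are stored once in the shared store and referenced by content hash.

```python
import hashlib

BLOCK_SIZE = 4096          # assumed block granularity

store = {}                 # digest -> block contents (the shared storage)

def write_disk(data: bytes) -> list:
    """Split a virtual disk image into blocks; return a recipe of digests."""
    recipe = []
    for i in range(0, len(data), BLOCK_SIZE):
        block = data[i:i + BLOCK_SIZE]
        digest = hashlib.sha256(block).hexdigest()
        store.setdefault(digest, block)   # duplicate blocks are stored once
        recipe.append(digest)
    return recipe

def read_disk(recipe: list) -> bytes:
    """Rebuild a virtual disk image from its recipe of digests."""
    return b"".join(store[d] for d in recipe)

vm1 = write_disk(b"A" * 8192)                  # two identical blocks
vm2 = write_disk(b"A" * 4096 + b"B" * 4096)    # one block shared with vm1
assert read_disk(vm1) == b"A" * 8192
print(len(store))                              # 2 unique blocks stored, not 4
```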

Silva P.  2009.  On the Design of a Galculator. (thesis.pdf)

The increased complexity of software systems, together with the lack of tools and techniques to support their development, led to the so-called "software crisis". Different views about the problem originated diverse approaches to a possible solution, although it is now generally accepted that a "silver bullet" does not exist.
The formal methods view considers mathematical reasoning as fundamental to fulfill the most important property of software systems: correctness. However, since correctness proofs are generally difficult and expensive, only critical applications are regarded as potential targets for their use. Developments in tool support such as proof assistants, model checkers and abstract interpreters allow for reducing this cost, making proofs affordable to a wider range of applications. Nevertheless, the effectiveness of a tool is highly dependent on the underlying theory.
This dissertation follows a calculational proof style in which equality plays the fundamental role. Instead of the traditional logical approach, fork algebras, an extension of relation algebras, are used. In this setting, Galois connections are important because they reinforce the calculational nature of the algebraic approach, bringing additional structure to the calculus. Moreover, Galois connections enjoy several valuable properties and allow for transformations in the problem domain. In this dissertation, it is shown how fork algebras and Galois connections can be integrated together with the indirect equality principle. This combination offers a very powerful, generic device to tackle the complexity of proofs in program verification. This power is enhanced with the design of an innovative proof assistant prototype, the Galculator.
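As a small worked instance (whole division is a standard example in this line of work, sketched here on our own account): for natural numbers with $b > 0$, multiplication and whole division form a Galois connection,

$$c \times b \leq a \;\equiv\; c \leq a \div b$$

and the indirect equality principle lets one conclude equalities from such inclusions alone,

$$x = y \;\equiv\; \langle \forall\, z \;::\; z \leq x \equiv z \leq y \rangle$$

so properties of $\div$ can be calculated from properties of $\times$ without ever unfolding an explicit definition of division.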

Matos M.  2009.  Network-Aware Epidemic Broadcast. (masterthesis.pdf)

Epidemic multicast is an emerging resilient and scalable approach to the reliable dissemination of application data in the context of very large scale distributed systems. Unfortunately, the resilience and scalability come at the cost of considerable redundancy, which leads to high resource consumption on both links and nodes. In environments with resource-constrained links, such as a Cloud Computing infrastructure composed of data centers organized in a federation around the globe, this high resource consumption precludes the use of this class of protocols. The goal of this dissertation is therefore to cope with the constraints of these scenarios by reducing the network load imposed on the constrained long-distance links. This is achieved by constructing an overlay that reflects the characteristics of the links, and by using a dissemination protocol that takes locality into account when transmitting the message payloads. We conducted an extensive experimental evaluation that shows an improvement of over an order of magnitude in the number of messages that traverse the costlier links, without endangering the resilience and scalability properties that make epidemic-based protocols so attractive.
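A hedged sketch of the locality idea (our own simplification, not the dissertation's actual protocol): full payloads prefer cheap local links, while only small message identifiers cross the costly long-distance links, so that remote peers can fetch the payload from a nearby copy later.

```python
import random

FANOUT = 4  # assumed gossip fanout

def gossip_round(node, msg_id, payload, peers, is_local):
    """Forward one message from `node` to FANOUT randomly chosen peers."""
    for peer in random.sample(peers, min(FANOUT, len(peers))):
        if is_local(node, peer):
            send(peer, msg_id, payload)   # full payload on cheap local links
        else:
            send(peer, msg_id, None)      # identifier only on costly links

def send(peer, msg_id, payload):
    # Stand-in for the network layer: just trace what would be transmitted.
    print(f"-> {peer}: {msg_id}" + ("" if payload is None else " + payload"))

# Peers in the same data center (same prefix) count as local.
gossip_round("dc1/n1", "m42", b"payload",
             ["dc1/n2", "dc1/n3", "dc2/n1"],
             lambda a, b: a.split("/")[0] == b.split("/")[0])
```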

Nunes A.  2009.  P2P content-push in the Internet. (P2P_content-push_in_the_Internet)

Syndicated content-push in the Internet is a huge success, and web feeds are being used to convey various types of content: from news headlines, to podcasts and video podcasts, to being a feature in Web 2.0 websites. This diversity led to the appearance of several frameworks, each tailored to a specific content type. At the same time, interest in social networking exploded, as more and more websites for this purpose were launched. Syndicated content and social networking websites are now intimately connected.
In this work, a generic, modular, p2p content-push architecture is proposed. It provides an evolutionary upgrade path based on the technologies already in use in the Internet for interoperability, without disrupting current infrastructure or changing participants’ habits. It also leverages social networks for content discovery and aggregation, using peer-to-peer protocols for distribution. A partial implementation of this architecture, dubbed SEEDS, is also presented.

Abreu R.  2009.  Spectrum-based fault localization in embedded software. (thesis_rui_abreu.pdf)

Modern daily devices such as televisions rely increasingly on (embedded) software. Features implemented in software are often cheaper, easier to develop, more flexible to future modifications, and more portable than when implemented in hardware. Such properties are extremely important as, nowadays, many devices serve not a single purpose but, instead, have several functionalities which need to be easily modified or upgraded to meet the high expectations of consumers. As an example, a mobile phone is used not only to make phone calls but also, e.g., as a navigation system, which has to contain the most up-to-date navigation engine and maps.

Rodrigues N.  2008.  Slicing techniques applied to architectural analysis of legacy software. (tese.pdf)

Program understanding is emerging as a key concern in software engineering. In a situation in which the only quality certificate of the running software artifact still is life-cycle endurance, customers and software producers are little prepared to modify or improve running code. However, faced with so risky a dependence on legacy software, managers are more and more prepared to spend resources to increase confidence in — i.e., the level of understanding of — their (otherwise untouchable) code. In fact, the technological and economical relevance of legacy software, as well as the complexity of its re-engineering, entails the need for rigour.
Addressing such a scenario, this thesis advocates the use of direct source code analysis for both the process of understanding and transformation of software systems. In particular, the thesis focuses on the development and application of slicing techniques at both the “micro” and “macro” structural levels of software.
The former deals with fine-grained structures of programs, slicing operating over elementary program entities such as types, variables or procedure identifiers. The latter, on the other hand, addresses architectural issues and interaction modes across the modules, components or services into which a system is decomposed. At the “micro” level, this thesis delves into the problem of slicing functional programs, a paradigm that is gaining importance but has been generally neglected by the slicing community. Three different approaches to functional slicing are proposed, accompanied by the presentation of HaSlicer, a software tool developed as a proof-of-concept for some of the ideas discussed. A comparison between the three approaches, their practical application and the motivation for further investigating new functional slicing processes are also discussed. Slicing at the “macro” level is the theme of the second part of this thesis, which addresses the problem of extracting from source code the coordination model that governs interaction between a system’s components. This line of research delivers two approaches for abstracting the coordination models of software systems, one of the most vital structures for software architectural analysis. Again, a software tool – CoordInspector – is introduced as a proof-of-concept.
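For readers unfamiliar with slicing, the classic sum/product illustration (Weiser's textbook example, not code from the thesis) shows what a backward slice keeps: slicing on the final value of `total` retains only the statements that can affect it.

```python
def sums_and_products(xs):
    total = 0
    product = 1            # not in the slice on `total`
    for x in xs:
        total += x
        product *= x       # not in the slice on `total`
    print(product)         # not in the slice on `total`
    return total           # slicing criterion: value of `total` here

# The backward slice w.r.t. (`total`, return) is the smaller program:
def sums_only(xs):
    total = 0
    for x in xs:
        total += x
    return total
```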

Frade MJ.  2004.  Type-Based Termination of Recursive Definitions and Constructor Subtyping in Typed Lambda Calculi. (thesis.pdf)

In type systems, a combination of subtyping and overloading is a way to achieve more precise typings. This thesis explores how to use these mechanisms in two directions: (i) as a way to ensure termination of recursive functions; (ii) as a way to capture in a type-theoretic context the use of subtyping as inclusion between inductively defined sets.
The first part of the thesis presents a mechanism that ensures termination through types and defines a system that incorporates it. More precisely, we formalize the notion of type-based termination using a restricted form of type dependency (also known as indexed types). Every datatype is replaced by a family of approximations indexed over a set of stages; belonging to a certain approximation then means that a term has a certain bound on constructor usage. We introduce λˆ, a simply typed λ-calculus à la Curry, supporting parametric inductive datatypes, case-expressions and letrec-expressions with termination ensured by types. We show that λˆ enjoys important meta-theoretical properties, including confluence, subject reduction and strong normalization. We also show that the calculus is powerful enough to encode many recursive definitions rejected by existing type systems, and give some examples. We prove that this system strictly encompasses Giménez' λς, a system in which termination of typable expressions is ensured by a syntactic condition constraining the uses of recursive calls in the body of definitions.
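A standard motivating example for this line of work (sketched here in informal notation; the thesis's own syntax may differ): subtraction can be given a type saying that its result stays within the approximation of its first argument,

$$\mathsf{minus} : \mathsf{Nat}^{\imath} \rightarrow \mathsf{Nat}^{\infty} \rightarrow \mathsf{Nat}^{\imath}$$

so that in the clause $\mathsf{div}\,(\mathsf{s}\,x)\,y = \mathsf{s}\,(\mathsf{div}\,(\mathsf{minus}\,x\,y)\,y)$ the recursive call receives an argument of a smaller stage than $\mathsf{s}\,x$, and division by repeated subtraction is accepted, even though $\mathsf{minus}\,x\,y$ is not a syntactic subterm of the original argument.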
The second part of the thesis studies the properties of a type system featuring constructor subtyping. Constructor subtyping is a form of subtyping in which an inductive type σ is viewed as a subtype of another inductive type τ if each constructor c of σ is also a constructor of τ (but τ may have more constructors), and whenever c : θ → σ is a declaration for σ, then c : θ' → τ is a declaration for τ with θ ≤ θ'. In this thesis we allow for this form of subtyping in the system λcs, which is a simply typed λ-calculus à la Curry, supporting mutually recursive parametric datatypes, case-expressions and letrec-expressions. We establish the properties of confluence, subject reduction and decidability of type checking for this calculus. As the system features general recursion, the reduction calculus is obviously non-terminating. However, we sketch two ways of achieving strong normalization. One way is to constrain the system to guard-by-destructors recursion, following what is done for λς. The other way is to enrich the type system with stages (following the ideas presented for λˆ) and to enforce termination through typing.

Potential uses of constructor subtyping include proof assistants and functional programming languages. In particular, constructor subtyping provides a suitable foundation for extensible datatypes, and is especially adequate for re-usability. The combination of subtyping between datatypes and overloading of constructors allows the definition of new datatypes by restricting or by expanding the set of constructors of an already defined datatype. This flexibility in the definition of datatypes induces a convenient form of code reuse for recursive functions, allowing the definition of new functions by restricting or by expanding already defined ones. We enrich a calculus featuring constructor subtyping with a mechanism to define extensible overloaded recursive functions by pattern-matching, obtaining the system λcs+fun. We formalize the concept of a well-formed environment of function declarations and establish that under such environments the properties of confluence, subject reduction and decidability of type-checking hold. Moreover, we prove that the requirements imposed on well-formed environments are decidable and show how standard techniques can still be used for compiling pattern-matching into case-expressions.
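For instance (a standard example of the discipline, not necessarily the thesis's own running example): mutually defined even and odd numbers are subtypes of the naturals, since each of their constructor declarations reappears at $\mathsf{Nat}$ with argument and result types enlarged accordingly:

$$\mathsf{zero} : \mathsf{Even} \qquad \mathsf{succ} : \mathsf{Odd} \rightarrow \mathsf{Even} \qquad \mathsf{succ} : \mathsf{Even} \rightarrow \mathsf{Odd}$$
$$\mathsf{zero} : \mathsf{Nat} \qquad\; \mathsf{succ} : \mathsf{Nat} \rightarrow \mathsf{Nat} \qquad\text{hence}\qquad \mathsf{Even} \leq \mathsf{Nat},\;\; \mathsf{Odd} \leq \mathsf{Nat}$$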

Pereira JO.  2002.  Semantically Reliable Group Communication. (phd.pdf)

Current usage of computers and data communication networks for a variety of daily tasks calls for widespread deployment of fault tolerance techniques with inexpensive off-the-shelf hardware and software. Group communication is in this context a particularly appealing technology, as it provides the application programmer with reliability guarantees that greatly simplify many fault tolerance techniques. It has however been reported that the performance of group communication toolkits in large and heterogeneous systems is frequently disappointing. Although this can be overcome by relaxing reliability guarantees, the resulting protocol is often much less useful than group communication, in particular for strongly consistent replication. The challenge is thus to relax reliability and still provide a convenient set of guarantees for fault tolerant programming.

This thesis addresses models and mechanisms that, by selectively relaxing reliability guarantees, offer both the convenience of group communication for fault tolerant programming and high performance. The key to our proposal is to use knowledge about the semantics of the messages exchanged to determine which messages need to be reliably delivered, hence semantic reliability. In many applications, some messages implicitly convey or overwrite other messages sent shortly before, making them obsolete while still in transit. By omitting only the delivery of obsolete messages, performance can be improved without impact on the correctness of the application.

Specifications and algorithms for a complete semantically reliable group communication protocol suite are introduced, encompassing ordered and view-synchronous multicast. The protocols are then evaluated with analytical and simulation models and with a prototype implementation. The discussion of a concrete application illustrates the resulting programming interface and performance.
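A hedged sketch of the core idea (our own, not the thesis's protocol suite): the sender declares an obsolescence relation, and the send buffer may drop any message made obsolete by a newer one, e.g. an older quote for the same stock symbol.

```python
class SemanticBuffer:
    """Send buffer that purges messages made obsolete by newer ones."""
    def __init__(self, obsoletes):
        self.obsoletes = obsoletes     # obsoletes(new, old) -> bool
        self.pending = []

    def publish(self, msg):
        # Drop buffered messages that the new message makes obsolete.
        self.pending = [m for m in self.pending
                        if not self.obsoletes(msg, m)]
        self.pending.append(msg)

    def drain(self):
        out, self.pending = self.pending, []
        return out

# A newer quote for the same symbol makes the older quote obsolete.
buf = SemanticBuffer(lambda new, old: new["symbol"] == old["symbol"])
buf.publish({"symbol": "XYZ", "price": 10})
buf.publish({"symbol": "ABC", "price": 7})
buf.publish({"symbol": "XYZ", "price": 11})   # makes the first quote obsolete
print(buf.drain())   # only the ABC quote and the latest XYZ quote remain
```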

Campos JC.  1999.  Automated Deduction and Usability Reasoning. (ycst-2000-09.pdf)

Reasoning about the usability of a given interactive system's design is a difficult task. However, it is one task that must be performed if downstream costs of extensive redesign are to be avoided. Traditional usability testing alone cannot avoid these costs, since it is too often performed late in the development life cycle. In recent years, approaches have been put forward that attempt to reason about the usability of a design from early in development. Mainstream approaches are based on some kind of (more or less structured) inspection of a design by usability experts. This type of approach breaks down for systems with large and complex user interfaces, and again extensive testing is required. In an attempt to deal with these issues, there have been efforts to apply software engineering reasoning tools to the development of interactive systems. This work reviews these efforts and puts forward some ideas for the future.

Pereira JO.  1998.  Comunicação Fiável em Sistemas Distribuídos em Grande Escala [Reliable Communication in Large-Scale Distributed Systems]. (tese.pdf)

Reliable broadcast protocols over process groups are a programming tool used in dependable distributed systems to isolate each application from the complexity arising from communication and from the various faults that may occur in the system. This is achieved by offering applications the abstraction of a broadcast channel with very strong quality-of-service guarantees, which suffices to ensure the correctness of a wide range of applications with minimal effort from the programmer.
Implementing reliable broadcast protocols over large-scale networks introduces an additional level of complexity, since the characteristics of such networks hinder the correct and efficient implementation of services offering very strong guarantees.
Several proposals attempt to overcome this problem by providing services with weaker quality-of-service guarantees but higher performance in the situations where they are usable. However, the fewer the guarantees offered, the fewer the applications that can be correctly supported, since different applications tolerate different relaxations of quality of service.
This thesis proposes a reliable broadcast protocol that can be configured for each application, so that it can be simultaneously efficient and correct because it is tailored to the application at hand.
To this end, the specification of broadcast protocols is first decomposed into a set of parameters applicable separately to each message of a session, which can be combined into protocol specifications suited to each application.
An open object-oriented implementation strategy is then proposed, which allows protocols to be composed from independent per-message modules corresponding to the specification parameters.
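A minimal sketch of the decomposition idea (hypothetical names; the thesis's actual parameters and module structure differ in detail): each message carries its own specification parameters, and an application combines them into the profile it needs.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class MsgSpec:
    reliable: bool    # must the message survive omission faults?
    ordered: bool     # must it respect delivery order?

def broadcast(msg, spec: MsgSpec):
    # A real stack would plug in one independent module per parameter;
    # here we only trace which guarantees each message asked for.
    print(f"send {msg!r} with reliable={spec.reliable} ordered={spec.ordered}")

broadcast("state update", MsgSpec(reliable=True, ordered=True))
broadcast("periodic ping", MsgSpec(reliable=False, ordered=False))
```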

Almeida PS.  1998.  Control of object sharing in programming languages. (psa-phd-thesis.pdf)

Current data abstraction mechanisms are not adequate to control sharing of state in the general case involving objects in linked structures. They only prevent the direct access to the state variables of single objects, as opposed to considering the state reachable by an object and the inter-object references, neglecting the fact that an object is not, in general, self-contained. The pervading possibility of sharing is a source of errors and an obstacle both to reasoning about programs and to language implementation techniques.
This thesis presents balloon types, a general extension to programming languages which makes the ability to share state a first class property of a data type, resolving a long-standing flaw in existing data abstraction mechanisms. Balloon types provide the balloon invariant, which expresses a strong form of encapsulation of state: it is guaranteed that no state reachable (directly or transitively) by an object of a balloon type is referenced by any `external' object.
The mechanism is syntactically very simple, relying on a non-trivial static analysis to perform checking. The static analysis is presented as an abstract interpretation based on a denotational semantics of a simple imperative first-order language with constructs for creating and manipulating objects.
Balloon types are applicable in a wide range of areas such as program transformation, memory management and distributed systems. They are the key to obtaining self-contained composite objects, truly opaque data abstractions and value types, which are important concepts for the development of large-scale, provably correct programs.
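The flaw being addressed can be illustrated in a language without balloon types (our example): conventional encapsulation hides the field itself, but not the state reachable through it, so an outside alias can still mutate the "private" representation.

```python
class Polygon:
    def __init__(self, points):
        self._points = points          # intended to be private state

    def vertex_count(self):
        return len(self._points)

pts = [(0, 0), (1, 0), (0, 1)]
poly = Polygon(pts)
pts.append((5, 5))                     # external alias mutates hidden state
print(poly.vertex_count())             # 4: the abstraction was not opaque

# Were Polygon declared a balloon type, retaining the external reference
# `pts` into its internal state would be rejected by the static analysis.
```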

Frade MJ.  1995.  Comportamento e Estado [Behaviour and State]. (compest.pdf)

In this work we introduce a formalism for specifying the behaviour of computational systems, based on logic.
Our basic unit of software specification is called an agent and is nothing more than a particular logical system. The agent's logical language allows us to express the relationship between the occurrence of events and the properties an agent exhibits. Events are seen as logical connectives, and the logical assertions reflect the causality between events and properties, i.e., the essential cause of a property in the face of an event.
Because we are interested in knowing which sequences of events "justify" a property, we have traces coming from the past to the present. Our perspective is that of knowing which behaviour could justify the characteristics of an agent at some instant. Obviously, these characteristics will condition the future behaviour of the agent. This dependency is expressed in a deduction system that reflects the notion of causality.
For the purpose of characterizing the agents semantically, a connection is made between the logical system and its topological models. Stone duality provides a junction between the logical system and its semantics, offering two alternative perspectives: the algebraic or logical perspective and the topological perspective.
We present two types of models for agents: the trace model, a denotational model intimately connected to the notions of event and trace, and the state model, an operational model based on the notions of state and state transition.
We establish a relation between these two types of models: we show how a state model may be obtained from a trace model. States arise as equivalence classes of past behaviours. We also study the properties of the state models which result from the conversion of trace models. We see how topological properties — such as separation conditions, sobriety and coherence — characterize the state space of these models, and how from behaviour we can generate a minimal state space.
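Concretely (our paraphrase of the construction just described): two past behaviours are identified exactly when no property of the logic distinguishes them, and the states of the derived model are the resulting equivalence classes,

$$t_1 \sim t_2 \;\equiv\; \langle \forall\, \varphi \;::\; t_1 \models \varphi \;\Leftrightarrow\; t_2 \models \varphi \rangle \qquad\quad \mathit{States} \;=\; \mathit{Traces}/{\sim}$$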