Other Projects


These are projects in which VisLab is involved, ordered by starting date (descending), funded by non-European agencies.

SPARSIS — Sparse Modeling and Estimation of Motion Fields

Funding: FCT. Reference: PTDC/EEIPRO/0426/2014. Dates: July 2016 – June 2019. Website: http://users.isr.ist.utl.pt/~jsm/SPARSIS/

Recently, we have proposed an insightful description for the motion of pedestrians and vehicles in images based motion fields. Since the motion of pedestrians and vehicles is unpredictable, to a certain extent, we assume that a single motion field is not enough to represent a wide variety of patterns and, therefore, adopt multiple motion fields. In order to increase the model flexibility, each object is allowed to switch from one motion regime to another, according to switching probabilities computed from the video data.

In this project we will explore the use of sparse techniques to estimate multiple motion fields as well as the space-varying switching matrix (stochastic matrix) that describes the switching process associated to the movement of each target. We will also consider the use of sparse representations in the analysis of other signals and dynamical systems.

INSIDE — Intelligent Networked Robot Systems for Symbiotic Interaction with Children with Impaired Development

Funding: CMU-Portugal. Reference: CMUP-ERI/HCI/0051/2013. Dates: Aug. 2014 – 2018. Website: http://gaips.inesc-id.pt/inside

The INSIDE initiative explores symbiotic interactions between humans and robots in joint cooperative activities, and addresses the following key research problems:

How can robots plan their course of action to coordinate with and accommodate for the co-actions of their human teammates? In particular, is it possible to foresee, at planning time, the possibility of having the robots assist humans in actions that the latter cannot perform single-handedly (due to the complexity of the action or to limitations of the human agent) as well as having the humans assist the robots in those actions that the robots cannot perform on their own (for example, due to perceptual or actuation limitations)?

How can task, context and environment information collected from a set of networked sensors (such as cameras and biometric sensors) be exploited to create more natural and engaging interactions between humans and robots involved in a joint cooperative activity in a physical environment?

AHA — Augmented Human Assistance

Development of novel robotic assistance platforms to support healthy lifestyle, sustain active aging, and support those with motor deficits

Funding: CMU-Portugal. Reference: CMUP-ERI/HCI/0046/2013. Dates: Aug. 2014 – July 2018. Website: http://aha.isr.tecnico.ulisboa.pt/

Aging and sedentarism are two main challenges for social and health systems in modern societies. To face these challenges a new generation of ICT based solutions are being developed to promote active aging, prevent sedentarism and find new tools to support the large populations of patients that suffer chronic conditions as result of aging. Such solutions have the potential to transform healthcare by optimizing resource allocation, reducing costs, improving diagnoses and enabling novel therapies, thus increasing quality of life. In this context, the “AHA: Augmented Human Assistance” project proposes the development and deployment of a novel Robotic Assistance Platform designed to support healthy lifestyle, sustain active aging, and support those with motor deficits.

BIOMORPH — Bio­-Inspired Self­-Organizing Robot Computational Morphologies

Funding: FCT. Reference: XPL/EEI­AUT/2175/2013. Dates: Feb. 2014 – July 2015. Website: n/a


Micro­ and mini­robots are raising significant interest in embedded applications. Due to their small size and scarce energy availability, a great constraint on theses systems is the amount of computation they can do. Many simple biological systems are able to survive and exhibit advanced behaviour with very limited neuronal resources due to very adapted sensorimotor systems to their particular environment. Following the same paradigm, and inspired in some solutions found in biological systems, we are working to provide robots with highly optimized sensorimotor processing systems through the joint optimization of their different subsystems.

SEAGULL — Sistemas Inteligentes de Suporte ao Conhecimento Situacional Marítimo Baseados em Veículos Aéreos Não Tripulados

Intelligent Systems to Support Maritime Situational Knowledge Based on Unmanned Aerial Vehicles

Funding: ADI — Agência de Inovação. Reference: contract number 20131034063. Dates: July 2013 – Dec. 2015. Website: http://www.criticalsoftware.com/pt/seagull


See also the Seagull dataset page.

Portugal is a maritime nation having the largest exclusive economic zone (EEZ) of Europe. Upon acceptance of the proposal submitted to the United Nations (UN), the Portuguese EEZ will be extended to an area larger than 2 million square kilometers. In addition, Portugal is known as a specialist in maritime matters not only by historical reasons but also due to its contemporary activities on the sea. It is crucial for Portugal to maintain the safety of human life at sea, maintain the safety of its maritime area, and protect the sea and coastline natural environments. The seagull project consists on the use of unmanned air vehicles (UAV’s) as a not only viable but also cheaper solution to help with these difficult tasks on such a large area. The objective is to develop an intelligent system that couples the UAV with sensors like visual, infra-red and hyperspectral cameras along with GPS receivers and other sensors. Its purpose is to fly autonomously over the sea while scanning it to automatically detect, identify, report and follow targets like boats, shipwrecks, lifeboats, boats with suspicious behavior, and also hydrocarbyls and other chemical pollutants.

HD-Analytics — New Video Analytics Tools That Will Take Full Advantage of HD

Funding: QREN. Reference: 13750?. Dates: March 2011 – ?. Website: http://www.hd-analytics.net

HD-Analytics will address two different sets of objectives: vertical objectives, which will result in concrete video analytics applications for high resolution multi-megapixel video, and horizontal objectives, which will create a lower-level basis for the vertical objectives.

AuReRo — Human-Robot Interaction with Field Robots Using Augmented Reality and Interactive Mapping

Funding: FCT. Reference: PTDC/EIA-CCO/113257/2009. Dates: Apr. 2011 – March 2014. Website: http://aurero.isr.ist.utl.pt

Field robotics is the use of sturdy robots in unstructured environments. One important example of such a scenario is in Search And Rescue (SAR) operations to seek out victims of catastrophic events in urban environments. While advances in this domain have the potential to save human lives, many challenging problems still hinder the deployment of SAR robots in real situations. This project tackles one such crucial issue: effective real time mapping. To address this problem, we adopt a multidisciplinary approach by drawing on both Robotics and Human Computer Interaction (HCI) techniques and methodologies.

MAIS+S — Multiagent Intelligent Surveillance System

Funding: CMU-Portugal. Reference: CMU-PT/SIA/0023/2009. Dates: Sept. 2010 – Aug. 2013. Website: http://gaips.inesc-id.pt/mais-s/

With the generalized use of “intelligent technology”, the interaction between multiple smart devices poses interesting challenges both in terms of engineering and research. One interesting aspect of this phenomenon in the context of MAIS+S is the appearance of networks of heterogeneous devices that must operate in a fully distributed manner while sharing information necessary to complete some preassigned task. In MAIS+S, such complex networks are modelled as multiagent systems, by interpreting individual nodes or groups of nodes with autonomous agents. This interpretation suggests several interesting research avenues, some of which are the focus of the project.

DCCAL — Discrete Cameras Calibration using Properties of Natural Scenes

Funding: FCT. Reference: PTDC/EEA-CRO/105413/2008. Dates: Jan. 2010 – Dec. 2012. Website: http://users.isr.ist.utl.pt/~jag/project_dccal/

The learning capability of the human’s vision system clearly contrasts with current calibration methodologies of artificial vision systems, which are still strongly grounded to the a priori knowledge of the projection models. Usually the imaging device is assumed to have a known topology and in many cases to have a uniform resolution. This knowledge allows detecting and localizing edges, extrema, corners and features in uncalibrated images, just as well as on calibrated images. In short, it is undeniable that locally, and for many practical purposes, an uncalibrated image looks just like a calibrated image. This project departs from traditional computer vision by exploring tools for calibrating general central imaging sensors using only data and variations in the natural world. In particular we consider discrete cameras as collections of pixels, photocells, organized as pencils of lines with unknown topologies. Note that distinctly from common cameras, discrete cameras can be formed just by some sparse, non regular, sets of pixels. The main idea is that photocells that view the same direction of the world will have higher correlations. As distinct from many conventional calibration methods in use today, calibrating discrete cameras requires moving them within a diversified (textured) natural world. This project follows approaches based on information theory and computer learning methodologies.

3D Modelling from Video

Funding: FCT. Reference: SAPIENS PROJECT 34121/99. Dates: 2000 – 2003.


This project proposes to develop a methodology for video coding by using intermediate 3D representations In other words we propose to code a sequence of video images into a 3D representation and  then generate back another sequence. The main objective is to create a high compression rate and make possible mechanisms for video indexing and interactive video.

The main topics we propose to tackle are: (1) Image to image matching, (2) image to model matching and (3) 3D modeling and generation of images from the 3D model.
The main idea is to create a feedback loop of “a priori” knowledge provided by the existent 3D scene model into the matching process, in a global way. By formalizing image to image and image to model matching as an integer programming problem which is then relaxed to a concave programming problem, authors believe the whole process of matching and 3D reconstruction can be integrated into a single recursive framework.

SIVA — Integrated System for Active Surveillance

Funding: FCT. Reference: PRAXIS 2/2.1/TPAR/2074/95. Dates: Jan. 1998 – Apr. 2001. Partners: ISR-Coimbra Pole.


With this project we aim to develop an autonomous system able to perform surveillance tasks in a structured environment. The system as a whole will be made up of several mobile platforms (“agents”). Each of them will be equipped with an active vision system, an odometry system and communication links enabling the platforms to communicate among themselves. As a whole the full system (including all the platforms) will have a high degree of autonomy. Each of the platforms will also be autonomous in terms of the navigation and surveillance tasks. The system will be designed to operate off-hours in structured environments such as supermarkets, military installations, public buildings, power plants, etc.. The implication of its use during off-hours is that the primary condition for the detection of an intruder will be the occurrence of motion. Each one of the “agents” will perform the surveillance tasks based on the active vision system. Navigation will also be performed based on the active vision system and on the odometry system. The environment map will be known “a priori” and landmarks both natural and artificial will be used for localization. Both navigation and surveillance do not have to be performed with accuracy. Indeed each “agent” will only be concerned with not repeating its trajectory so that all the map area is covered periodically. Agents will have to cooperate so that they do not survey the same area simultaneously. Therefore there will be communication links enabling the “agents” to inform one another from their activities. Cooperation will occur at several levels namely to ensure that the “agents” survey different areas, to avoid collisions and to enable the execution of common taks such as pursuing an intruder.
We propose to demonstrate this system by using two mobile and active “agents” that cooperate in a structured environment.

Visual Behaviours for Mobile Robots

Funding: JNICT. Reference: PBIC/C/TPR/2550/95. Dates: Jan. 1996 – Dec. 1998.


One of the major limitations of the current robotics systems is their reduced capabilities of perceiving the surrounding space. This limitation determines the maximum complexity of the tasks they may perform, and reduces the robustness when the current tasks are performed.
Therefore, by increasing the perceptual capabilities, these systems could react to environmental changes, and accomplish the desired tasks. For many living species, namely the human being, visual perception plays a key role in their behavior. Very often we rely intensively on our visual capabilities to move around in the world, track moving objects, handle tools, avoid obstacles, etc.
To improve the flexibility and robustness of robotics systems, this project aims at studying and implementing Computer Vision techniques in various tasks for Mobile Robotic Systems. The goal is to study not only the visual perception techniques, per se, but also to explore the intimate relationship between perception and the control of action : the Perception-Action cycle.
For many years, most research efforts on Computer Vision for Robotic Agents were focused on recovering a simbolic model of the surrounding environment. This model could then be used by higher level cognitive systems to plan the actions for the agent, based on the pursued goals and the world state. This approach, however, has revealed many problems in dealing with dynamic environments where unpredictable events may occur.
More recent approaches, trying to achieve robust operation in dynamic, weakly structured environments, consider a set of behaviours, in which perception (vision in this case) and action are tightly connected and mutually constraining, similarly to many successful biological systems. Therefore visual information is fed directly into the various control systems of the different behaviours, thus leading to a more robust performance.
Specifically, in this project, an Architecture for Visual Behaviours for a mobile vehicle equipped with an agile camera, will be considered. Each behaviour allocates the perception and action resources strictly needed for the associated task, such as detecting and tracking a moving target, detecting interesting points in the scene, docking to a specific point, detecting obstacles, navigating along corridors, self- localization, etc. The robustness and performance of the overall system emerges from the coordination, integration and competition between these various visual behaviours.

Medusa Stereo Head





Medusa is a stereo head designed for research on active vision applications. One of our major interest areas is the link between perception and action and the continuous use of vision for the control of autonomous agents. Medusa has 4 mechanical degrees of freedom:

  • Two independent vergences
  • Common tilt
  • Common pan

Additionally, the interocular distance (baseline) can be set manually, which is a useful feature when working with different ranges of depth.

We have implemented some of the main ocular movements available in the human oculomotor system: saccades, smooth pursuit, vergence, etc. The maximum speeds and accelerations are comparable to those of the human eye-system (over 200 deg per second and 6 pi rad/s^2).


Funding: ?. Reference: ?. Dates: ?.



GPOD stands for Ground Plane Obstacle Detection. This system uses vision algorithms to detect obstacles in front of a mobile robot, lying on a flat ground floor. The methods we use are based on stereo algorithms. However, we use a single camera and a mirror set, in such a way that we grab two half-images of two virtual cameras of a traditional stereo setup. The advantages are increased compactness and robustness (since we do not have to bother about different camera settings).

The main features are the following:

  • Purposive approach for obstacle detection
  • Use of a single camera and a set of mirrors instead of a traditional stereo setup.
  • Qualitative measurements (depth is not determined explicitly).
  • Initialization phase to determine the expected disparities of the ground floor points.
  • Real-time (1.5 Hz!) obstacle detection by checking inconsistencies of the expected disparity map.

One of the main options in the system design was the definition of a purposive visual behavior, exclusively tailored for the problem in hand (detecting obstacles over the ground floor). We do not recognize the object nor even determine depth explicitly.


Cyclope is a system developed for vision-based obstacle avoidance for mobile robots. The underlying philosophy is to design a purposive visual behavior solely for obstacle avoidance in indoor scenes. We rely on the assumption that the robot is maneuvering on a flat ground plane.

The main features are the following:

  • Use of single camera.
  • Use of the ground plane constraint.
  • Use of the normal flow as sensor stimulus (there is no need to deterine both components of the optical flow field).
  • No calibration needed (camera orientation or parameters).
  • Reactive control strategy.

The system is up and running at a speed of about 2 Hz (depending on system settings) and has been tested on top of a mobile platform for obstacle avoidance.