

Datasets and Resources


HDA Dataset: High Definition Analytics Camera Network at ISR, July 2013. Dataset page.
HDA is a multi-camera, high-resolution image sequence dataset for research on high-definition surveillance. 18 cameras (VGA, HD and Full HD resolution) were recorded simultaneously for 30 minutes in a typical indoor office scenario at a busy hour (lunch time), involving more than 80 people. Each frame is labeled with bounding boxes tightly adjusted to the visible body of each person, the unique identification of each person, and flag bits indicating occlusion and crowd.
The following figures show examples of labeled frames: (a) an unoccluded person; (b) two occluded people; (c) a crowd with three people in front.
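The per-frame labels described above (tight bounding box, unique person ID, and occlusion/crowd flag bits) can be pictured as a simple record. This is an illustrative sketch only; the field names and layout below are assumptions, not the dataset's actual annotation file format.

```python
from dataclasses import dataclass

@dataclass
class PersonLabel:
    """One labeled person in one frame (field names are illustrative)."""
    frame: int        # frame index in the sequence
    person_id: int    # unique identification of the person
    x: int            # bounding box, tightly adjusted to the visible body
    y: int
    width: int
    height: int
    occluded: bool    # flag bit: person is partially occluded
    in_crowd: bool    # flag bit: person is part of a crowd

# Example: an occluded person in frame 120.
label = PersonLabel(frame=120, person_id=7, x=310, y=95,
                    width=64, height=180, occluded=True, in_crowd=False)
print(label.person_id, label.occluded)
```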

Cleaning tasks dataset, September 2017. Download link: tar.gz (200MB)

In this work, the robot automatically cleans a table of different types of dirt (scribbles, lentils). To create the cleaning trajectory, the system needs three waypoints, whose information is automatically extracted from robot camera images using a deep neural network previously trained on images taken during human demonstrations (kinesthetic teaching) of a cleaning task.

The dataset for the network is composed of 1000 cleaning demonstrations (500 for wiping off scribbles and 500 for sweeping up lentils). The labeling consists of the 2D positions (x, y) and orientations (sin, cos) of the initial, intermediate and final waypoints. For further information, please contact j.kim at sssup dot it.
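Since each waypoint orientation is stored as a (sin, cos) pair, the angle itself can be recovered with a two-argument arctangent. This is a minimal sketch of that decoding, not code distributed with the dataset.

```python
import math

def waypoint_angle(sin_theta: float, cos_theta: float) -> float:
    """Recover the orientation angle in radians from its (sin, cos) encoding."""
    return math.atan2(sin_theta, cos_theta)

# A waypoint oriented 90 degrees counter-clockwise: sin = 1, cos = 0.
angle = waypoint_angle(1.0, 0.0)
print(math.degrees(angle))  # → 90.0
```

The (sin, cos) encoding avoids the wrap-around discontinuity at ±180° that a raw angle label would have, which makes it a common choice when training a network to regress orientations.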

ACTICIPATE Ball Placing and Giving Dataset, September 2017. Download link: zip (1.4 GB), md5 checksum.

ACTICIPATE project, synchronised motion capture and gaze data with labeled video of actor performing ball placing and giving actions. For further information, please contact rakovicm at uns dot ac dot rs.

Hand Posture Affordances Dataset, September 2017. Available on this GitHub repository.

iCub performing actions on objects with different hand postures

This dataset contains results of trials in which the robot executes different actions with multiple hand postures on various objects. In the image we show the experimental setup, with the iCub humanoid robot at the beginning of a robot–object interaction trial, and the visual perception routines in the background screen. For more information, please contact gsaponaro at isr dot tecnico dot ulisboa dot pt.
If you use this dataset in your work, please cite the following publication(s):

  • Giovanni Saponaro, Pedro Vicente, Atabak Dehban, Lorenzo Jamone, Alexandre Bernardino, José Santos-Victor. Learning at the Ends: From Hand to Tool Affordances in Humanoid Robots. IEEE International Conference on Development and Learning and on Epigenetic Robotics (ICDL-EpiRob 2017).

A Dataset on Visual Affordances of Objects and Tools, September 2016. Available on this GitHub repository.

iCub with objects and tools

This dataset contains results of trials in which the iCub humanoid robot executes different motor actions with multiple tools on various objects. In total, there are 11 objects, 4 actions, 3 tools and at least 10 repetitions of each trial, which amounts to ~1320 unique trials and 5280 unique images at a resolution of 320×200 pixels. The images are captured from the left and right cameras of the robot before and after executing a successful action. We also provide the foreground segmented images of each trial. Moreover, the 3D position of the object together with some extracted features are also available. For more information, please contact adehban at isr dot tecnico dot ulisboa dot pt.
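The trial and image counts above follow directly from the experimental grid; this snippet just reproduces the arithmetic, assuming exactly 10 repetitions per object/action/tool combination and 4 images per trial (left/right camera, before/after the action).

```python
objects, actions, tools, repetitions = 11, 4, 3, 10
trials = objects * actions * tools * repetitions  # 1320 unique trials
images_per_trial = 2 * 2                          # left/right camera × before/after
images = trials * images_per_trial                # 5280 unique images
print(trials, images)  # → 1320 5280
```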
If you use this dataset in your work, please cite the following publication(s):

  • A Moderately Large Size Dataset to Learn Visual Affordances of Objects and Tools Using iCub Humanoid Robot. Dehban, A., Jamone, L., Kampff, A. R. and Santos-Victor, J. (2016). 1st Workshop on Action and Anticipation for Visual Learning, European Conference on Computer Vision (ECCV).

CAD models (PDMS molds) for tactile sensors of Vizzy humanoid robot, September 2016. Download link: tactile_sensors_pdms_molds.zip.

KS20 VisLab Multi-View Kinect skeleton dataset, Athira Nambiar, 2017.

The KS20 VisLab Multi-View Kinect skeleton dataset is a set of Kinect skeleton (KS) data sequences comprising 300 skeletal gait samples collected from 20 walking subjects, in the context of long-term person re-identification using biometrics. The dataset contains the skeleton sequences of 20 subjects walking along 5 different viewpoints (0°, 30°, 90°, 130°, 180°), collected in-house using Kinect v2.


SEAGULL Dataset: Multi-camera multi-spectrum Ocean images from UAV point of view, 2013 – 2015. Dataset page.

A multi-camera multi-spectrum (visible, infra-red, near infra-red and hyperspectral) image sequences dataset for research on sea monitoring and surveillance. The image sequences are recorded from the point of view of a fixed wing unmanned aerial vehicle (UAV) flying above the Atlantic Ocean and over boats, life rafts and other objects, as well as over fish oil spills.

Improved annotation for the INRIA person data set, Matteo Taiana, 2014
The INRIA person data set is very popular in the pedestrian detection community, both for training detectors and for reporting results. Yet its labelling has some limitations: some of the pedestrians are not labelled, there is no specific label for ambiguous cases, and the information on the visibility ratio of each person is missing. We collected a new labelling that overcomes these limitations and can be used to evaluate the performance of detection algorithms in a more truthful way. It also allows researchers to test the influence of partial occlusion and of pedestrian height during training on the detection performance. The labels are encoded in the format used for the Caltech Pedestrian Detection Benchmark, in two vbb files.

Related papers: Neurocomputing 2014, IbPRIA 2013. Related video talk.

Real-World Data for 3D Reconstruction Algorithms: Tracked features in uncalibrated images, Etienne Grossmann, July 1999