Current projects

The project's primary goal is to develop next-generation maritime environment perception methods that harness the power of end-to-end trainable deep models for essential challenges of safe operation, such as general obstacle detection with re-identification, implicit detection of hazardous areas, and sensor fusion for improved detection.
The challenge that we address in this project is a robust design of deep generative models, their training and application to a visual tracking scenario. We believe that a generative appearance model of the entire object is a crucial step towards grounding visual object tracking in high-level concepts behind raw pixel values.
Postdoctoral ARRS project.
Project duration: 2019 - 2021
The project's primary goal is to develop the functionalities required for robust autonomous navigation of USVs in uncontrolled environments, relying primarily on captured visual information. The project focuses on obstacle detection using monocular and stereo systems, the development of efficient visual tracking algorithms for marine environments, and environment representation through sensor fusion.
The objective of the proposed project is to develop novel deep learning methods for modelling complex consistency and detecting inconsistencies in visual data using training images annotated with different levels of accuracy.
ARRS Basic research project.
Project duration: 2018 - 2021
The research group is involved in basic research in computer vision, with emphasis on visually enabled cognitive systems involving visual learning and recognition. Topics include recognition and tracking of objects, scenes, and activities in visual cognitive tasks such as smart vision-based detection and positioning using wearable computing as well as for mobile robots and cognitive assistants.
Project duration: 2009-2014.

Past projects

The aim of the GOSTOP programme is to accelerate the development of the Factories of the Future concept in Slovenia. The main goal of our research is to develop flexible and adaptable technologies that would allow for fast and simple adaptation to a new product in the production process, mainly utilising machine vision and machine learning techniques.
Structural Funds Project.
Project duration: 2016 - 2020
The project addresses the use of computer vision algorithms for object recognition and augmented reality on smart mobile devices. PKP project, 2018 (pages in Slovenian).
The main goal of the project is to develop a framework for semi-supervised interactive incremental learning, as well as specific methods for visual learning and recognition, that will increase the quality and efficiency of maintaining large visual information databases.
Applied ARRS project.
Project duration: 2014 - 2017
The project addresses the use of computer vision algorithms for contactless foot measuring techniques that facilitate a reliable online recommendation system for footwear purchasing.
PKP project, 2014 (pages in Slovenian).
The project aims at a holistic approach towards learning, detection and recognition / categorisation of visual motion and the phenomena derived from it. The approach is based on a novel and powerful paradigm of learning multi-layer compositional hierarchies. While individual ingredients, such as hierarchical processing, compositionality and incremental learning, have already been subjects of research, they have, to the best of our knowledge, never been treated in a unified motion-related framework. Such a framework is crucial for robustness, versatility, ease of learning and inference, generalisation, real-time performance, transfer of knowledge, and scalability for a variety of cognitive vision tasks. ARRS Project. Project duration: 2011 - 2014.
Our challenge is to develop a methodology that would bridge the gap between the computer-centered low-level image features and the high-level human-centered semantic meanings.
ARRS project (J2-3607)
Project duration: 2010 - 2013.
The high-level aim of this project was to develop a unified theory of self-understanding and self-extension with a convincing instantiation and implementation of this theory in a robot. By self-understanding we mean that the robot has representations of gaps in its knowledge or uncertainty in its beliefs. By self-extension we mean the ability of the robot to extend its own abilities or knowledge by planning learning activities and carrying them out. The project involved six universities and about 30 researchers.
POETICON is a research project in the Seventh Framework Programme that explores the “poetics of everyday life”, i.e. the synthesis of sensorimotor representations and natural language in everyday human interaction.
We are developing methods for image interpretation on mobile platforms with the specific aim of direct interaction with smart objects.
The MOBVIS project identified the key issue for the realisation of smart mobile vision services to be the application of context to solve otherwise intractable vision tasks. In order to achieve this challenging goal, MOBVIS claimed that three components, (1) multi-modal context awareness, (2) vision-based object recognition, and (3) intelligent map technology, should be combined for the first time into a completely innovative system: the attentive interface.
Visiontrain Project addressed the problem of understanding vision from both computational and cognitive points of view. The research approach was based on formal mathematical models and on the thorough experimental validation of these models. 11 academic partners worked cooperatively on a number of targeted research objectives: (i) computational theories and methods for low-level vision, (ii) motion understanding from image sequences, (iii) learning and recognition of shapes, objects, and categories, (iv) cognitive modelling of the action of seeing, and (v) functional imaging for observing and modelling brain activity.
The main goal of the project was to advance the science of cognitive systems through a multi-disciplinary investigation of requirements, design options and trade-offs for human-like, autonomous, integrated, physical (e.g., robot) systems, including requirements for architectures, for forms of representation, for perceptual mechanisms, for learning, planning, reasoning and motivation, for action and communication.
The main objective of the EU FP5 project CogVis was to provide the methods and techniques that enable the construction of vision systems that can perform task-oriented categorization and recognition of objects and events in the context of an embodied agent.


The main goal of this project is the development of computer-vision-based automated counters applicable to the domain of scyphistoma census in underwater imagery. Such counters are crucial for processing extremely large datasets, vastly reducing the required manual labor and facilitating censuses orders of magnitude greater than what is possible with today's semi-manual techniques. The methods apply a learning-based methodology, which allows training a general polyp counter applicable to a large variety of images as well as training for a specific type of image to maximize task-specific detection performance.
We have developed algorithms for robust and fast dent detection and characterization on reflective surfaces that do not require reference fringe-pattern images.
Visual tracking algorithms used for tracking with drones.