
Workshop on cloud-based evaluation approaches

Overview

Many data analysis techniques and machine learning approaches developed over the past 30 or more years are increasingly important and feasible in the context of data science and big data. However, comparing approaches is not always easy, as the data are often not shared and the implementations are not always reproducible. Typically, publications contain only a mathematical description of the method and the data and do not share the specific implementation. The goal of this workshop is to consider evaluation as a service (EaaS) for benchmarking data analysis tools, and the sustainability and extensibility of such services, in order to facilitate reproducibility and advance the biomedical sciences.

In several fields, such as medicine and information retrieval, challenges or competitions have been developed to allow for a comparison of approaches and to advance techniques. These are often held in the context of dedicated workshops (TREC in information retrieval) or at existing conferences (MICCAI in medical imaging). More recently, platforms such as Kaggle have enabled companies to crowdsource their challenges with data and prize money, and these challenges have elicited great interest from the research community. Algorithmic challenges of this kind often have a training phase, in which both the data and the reference solutions are made available to the participants, and a test phase, in which the data but not the solutions are shared. Participants typically download the data and upload their results to the platform, as sketched below.
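
As a concrete illustration, a participant-side workflow under this model might look like the following minimal Python sketch. The file names, the submission format, and the use of scikit-learn are assumptions for illustration only, not part of any specific challenge.

    # Minimal sketch of a typical challenge workflow (file names and the
    # submission format are hypothetical, for illustration only).
    import csv

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    # Training phase: both the features and the reference labels are released.
    train_features = np.loadtxt("train_features.csv", delimiter=",")
    train_labels = np.loadtxt("train_labels.csv", delimiter=",")

    model = LogisticRegression(max_iter=1000)
    model.fit(train_features, train_labels)

    # Test phase: only the features are released; the labels stay with the
    # organizers, who score the uploaded predictions.
    test_features = np.loadtxt("test_features.csv", delimiter=",")
    predictions = model.predict(test_features)

    # Participants upload this file to the challenge platform for scoring.
    with open("submission.csv", "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["id", "prediction"])
        for i, p in enumerate(predictions):
            writer.writerow([i, p])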

Although these computational challenges have been successful in stimulating the scientific community and enhancing transparency, problems with the reproducibility of the underlying research remain. Often, the data of interest cannot be shared for reasons of confidentiality or legal liability, or the data are too big to be downloaded or shipped on hard disks. Furthermore, the availability of the test data can allow participants to optimize their results for those specific data, or to use manual methods when the evaluation of automatic methods was intended. Such approaches can produce surprisingly good results with low generalizability. Without sharing the implementation, the effectiveness of informatics tools for analyzing data cannot easily be validated by others.

The idea of providing evaluation as a service (EaaS) has led to several approaches, used in challenges, that address these problems. These include keeping the test data of challenges hidden from participants, or having a trusted third party run software delivered in the form of virtual machines or installable packages. Such approaches allow greater reproducibility and enable challenges to persist for a longer period of time, as the data and the algorithms can be retained and re-executed on different computing infrastructures or on different data. This has potential advantages for data providers, researchers and funding agencies, as these techniques can maximize the impact of the data sets and tools developed in funded projects. The organizer-side pattern is sketched below.
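
For instance, an organizer-side evaluation harness might run a submitted container against a hidden test set and release only an aggregate score. The sketch below assumes Docker and a simple CSV prediction format; the image name, the mount paths, and the accuracy metric are hypothetical choices for illustration.

    # Minimal sketch of the organizer-side EaaS pattern: the submitted
    # algorithm arrives as a container image and is executed against test
    # data that never leave the evaluation infrastructure. Image name,
    # mount paths and metric are illustrative assumptions.
    import json
    import subprocess

    import numpy as np

    SUBMISSION_IMAGE = "challenge/participant-entry:latest"  # hypothetical
    HIDDEN_TEST_DIR = "/secure/test_data"  # never exposed to participants
    OUTPUT_DIR = "/secure/outputs"

    # Run the submitted container with the hidden test set mounted
    # read-only and without network access, so the data cannot leave.
    subprocess.run(
        [
            "docker", "run", "--rm", "--network=none",
            "-v", f"{HIDDEN_TEST_DIR}:/input:ro",
            "-v", f"{OUTPUT_DIR}:/output",
            SUBMISSION_IMAGE,
        ],
        check=True,
    )

    # Score the container's predictions against the reference labels and
    # release only the aggregate result, never the labels themselves.
    predictions = np.loadtxt(f"{OUTPUT_DIR}/predictions.csv", delimiter=",")
    reference = np.loadtxt(f"{HIDDEN_TEST_DIR}/labels.csv", delimiter=",")
    score = float(np.mean(predictions == reference))  # e.g. accuracy

    print(json.dumps({"image": SUBMISSION_IMAGE, "accuracy": score}))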

In this workshop, we will begin by discussing EaaS approaches currently being considered in a range of communities, including information retrieval, image analysis, bioinformatics and space research. After discussing both the technical and the human-factors aspects of these approaches, we will seek to identify best practices as well as some cautionary tales. We will also seek to identify some limitations of EaaS, such as support for interactive tools or for distributed storage, and consider potential solutions.

Another objective is to work on the impact and sustainability of such approaches. Big data can have a big impact in many areas of science, and we believe that it is important to create a framework to better leverage existing data and to limit the propagation of the incorrect or inadequate results that small or highly selected datasets can produce.

Creating an infrastructure for such systems has the potential to change the way data science is conducted.

The workshop will include funding bodies, researchers and companies that provide infrastructure and organize challenges, as well as scientists who have studied the social aspects of challenges, including participant motivation and incentivization.

Workshop Organizers


Jayashree Kalpathy-Cramer, PhD
Athinoula A. Martinos Center for Biomedical Imaging, MGH, USA

Henning Müller, PhD
Athinoula A. Martinos Center for Biomedical Imaging, MGH, USA &
HES-SO, Sierre, Switzerland

Keyvan Farahani, PhD
National Cancer Institute, USA