Labelling user data is a central part of the design and evaluation of pervasive systems that aim to support
the user through situation-aware reasoning. It is essential both in designing and training the system to recog- nize and reason about the situation, either through the definition of a suitable situation model in knowledge- driven applications, or though the preparation of training data for learning tasks in data-driven models. Hence, the quality of annotations can have a significant impact on the performance of the derived systems. Labelling is also vital for validating and quantifying the performance of applications. With pervasive systems relying in- creasingly on large datasets for designing and testing models of users’ activities, the process of data labelling is becoming a major concern for the community. This also reflects the increasing need of intelligent inter- active annotation tools, which can reduce the manual annotation effort and improve the annotation perfor- mance and quality in large datasets.
To address these problems, this year’s workshop has a particular focus on:
- intelligent and interactive tools and automated methods for annotating large user datasets.
Furthermore, we aim to address the general problems of:
- the role and impact of annotations in designing pervasive applications,
- the process of labelling, and the requirements to produce high-quality annotations, especially in the context of large datasets.
The goal of the workshop is to provide a ground for researchers from interdisciplinary backgrounds to reflect on their experiences, challenges, and possible resolutions of the related problems.
We invite you to submit papers with a maximum of 6 pages that offer new empirical or theoretical insights on the challenges and innovative solutions associated with the labelling of user data, as well as on the impact that labelling choices have on the user and the developed system. The topics of interest include, but are not limited to:
- methods and intelligent tools for annotating user data for pervasive systems;
- processes of and best practices in annotating user data;
- methods towards automating the annotation process;
- improving and evaluating the annotation quality;
- ethical issues concerning the annotation of user data;
- beyond the labels: ontologies for semantic annotation of user data;
- high-quality and re-usable annotation for publicly available datasets;
- impact of annotation on a ubiquitous and intelligent system’s performance;
- building classifier models that are capable of dealing with multiple (noisy) annotations and/or making use of taxonomies/ontologies;
- the potential value of incorporating modelling of the annotators into predictive models.
Furthermore, we encourage the discussion on the challenges and requirements for annotating textual resources so that they can be automatically interpreted and utilised by ubiquitous applications.
Examples of such resources are textual instructions such as recipes and manuals, natural language conversations, and social media posts.
Example datasets are listed below:
- http://www.wikihow.com (textual instructions)
- recipes (cooking recipes)
- http://www.cs.cmu.edu/~enron/ (emails NLP dataset)
- html (blog and email datasets)
- sentiment-analysis-training-corpus-dataset-2012-09-22/ (twitter sentiment analysis training corpus)
- search/docs/data-types/recipes (annotated recipes for search in Google)