Designing ground truth for Machine Learning - conceptualisation of a collaborative design process between medical professionals and data scientists
Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review
The development of Machine Learning (ML) models is a complex process consisting of several iterative steps, such as problem definition, data collection and processing, feature engineering, model training, and evaluation. While the amount of research on ML model development is growing, little is known about the design process of ground truth in datasets that serve as the backbone of many ML-based systems. Design choices made before the labelling process often become invisible, and the ground truth becomes an infrastructural part of the data, which prevents it from being inspected when problems arise at later stages of the data science cycle. I conducted observations of the collaborative work of radiologists and data scientists on ground truth design. I report on the adopted process, divided into three stages: Stage 1 - assessment of data requirements and labelling practices; Stage 2 - design and evaluation of the label structure; and Stage 3 - design and evaluation of the labelling tool. Moreover, I introduce two Stage 2 activities, ideation and stress test, which are used to design high-quality labels. Finally, I pose outstanding questions to unpack the tensions and motivations observed during the ethnographic work.
Original language | English |
---|---|
Title of host publication | Proceedings of the 20th European Conference on Computer-Supported Cooperative Work: The International Venue on Practice-centred Computing on the Design of Cooperation Technologies - Posters and Demos |
Number of pages | 9 |
Publisher | European Society for Socially Embedded Technologies |
Publication date | 2022 |
Publication status | Published - 2022 |
Event | 20th European Conference on Computer-Supported Cooperative Work, Coimbra, Portugal (27 Jun 2022 → 1 Jul 2022) |
Conference
Conference | 20th European Conference on Computer-Supported Cooperative Work |
---|---|
Country | Portugal |
City | Coimbra |
Period | 27/06/2022 → 01/07/2022 |
Series | Reports of the European Society for Socially Embedded Technologies |
---|---|
Number | 2 |
Volume | 6 |
ISSN | 2510-2591 |
ID: 362456226