Working towards more universal skin cancer identification with AI 

JGI Seed Corn Funding Project Blog 2023/24: James Pope

9 examples of malignant/benign cancer marks on different skin types
Images from the International Skin Imaging Collaboration (https://www.isic-archive.com/)

Introduction

Open-source skin cancer datasets contain predominantly lighter skin tones, potentially leading to biased artificial intelligence (AI) models. This study aimed to analyse these datasets for skin tone bias.

What were the aims of the seed corn project? 

The project’s aims were to perform an exploratory data analysis of open-source skin cancer datasets and to evaluate potential skin tone bias in models developed with these datasets. Assuming biases were found, and time permitting, a secondary goal was to mitigate the bias using data pre-processing and modelling techniques.

What was achieved? 

Dataset collection

The project focused on the International Skin Imaging Collaboration (https://www.isic-archive.com/) archive, which contains over 20 datasets totalling more than 100,000 images. The analysis required that each image provide some indication of skin tone. We found that only 3,623 images recorded the Fitzpatrick Skin Type, on a scale from 1 (lighter) to 6 (darker). For each of these images, we mapped the Fitzpatrick Skin Type to a light or dark skin tone label. To expand the number of usable images, the project also began exploring automated skin tone classification techniques as future work.
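For illustration, a minimal sketch of how this mapping might look in Python; the metadata file name, the column name and the 1–3 versus 4–6 split are assumptions rather than details recorded in the project:

```python
import pandas as pd

# Load the ISIC metadata export (file and column names are assumed here;
# adjust them to match the actual download).
metadata = pd.read_csv("isic_metadata.csv")

def to_tone_group(fitzpatrick_type: int) -> str:
    """Map Fitzpatrick Skin Type 1-6 to a binary light/dark label."""
    return "light" if fitzpatrick_type <= 3 else "dark"

# Keep only images with a recorded Fitzpatrick Skin Type, then add the label.
labelled = metadata.dropna(subset=["fitzpatrick_skin_type"]).copy()
labelled["tone_group"] = (
    labelled["fitzpatrick_skin_type"].astype(int).map(to_tone_group)
)
```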

Artificial Intelligence Modelling

We then developed a typical artificial intelligence model, specifically a deep convolutional neural network, to classify whether each image is malignant (i.e. cancerous) or benign. The model was trained on 2/3 of the images and evaluated on the remaining 1/3. Due to computational limits, the model was only trained for 50 epochs. The model’s accuracy (the proportion of correct benign or malignant classifications out of all the tumours it was evaluated on) was comparatively poor at only 82%.
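As a rough sketch of this kind of model, a small convolutional network in Keras might look like the one below; the exact architecture, image size and training settings used in the project are not described here, so these details are assumptions:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

IMG_SIZE = (224, 224)

# Load images from a folder with "benign" and "malignant" subfolders,
# holding out 1/3 of them for evaluation (the directory layout is assumed).
train_ds = tf.keras.utils.image_dataset_from_directory(
    "isic_images", validation_split=1/3, subset="training",
    seed=42, image_size=IMG_SIZE, batch_size=32, label_mode="binary")
test_ds = tf.keras.utils.image_dataset_from_directory(
    "isic_images", validation_split=1/3, subset="validation",
    seed=42, image_size=IMG_SIZE, batch_size=32, label_mode="binary")

# A small convolutional network for benign vs malignant classification.
model = models.Sequential([
    layers.Rescaling(1.0 / 255, input_shape=IMG_SIZE + (3,)),
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(128, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])

model.fit(train_ds, epochs=50)
loss, accuracy = model.evaluate(test_ds)
```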

Bias Analysis

The model was then evaluated separately on light and dark skin tones. We found that the model was better at identifying cancer in light skin tone images than in dark skin tone images. The recall (true positive rate) for dark skin tones was 0.26, while for light skin tones it was 0.45. The resulting disparate impact (the ratio of these rates, a measure used to indicate whether a test is biased against certain groups) was 0.58, well below the commonly used 0.8 threshold, which indicates the model is potentially biased.
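The disparate impact figure is simply the ratio of the two recall values (0.26 / 0.45 ≈ 0.58). A small sketch of how group-wise recall and disparate impact can be computed with scikit-learn follows; the variable names and toy labels are illustrative only, not the project’s data:

```python
import numpy as np
from sklearn.metrics import recall_score

def disparate_impact(y_true, y_pred, groups,
                     unprivileged="dark", privileged="light"):
    """Ratio of true positive rates (recall) between two skin tone groups."""
    y_true, y_pred, groups = map(np.asarray, (y_true, y_pred, groups))
    recalls = {}
    for g in (unprivileged, privileged):
        mask = groups == g
        recalls[g] = recall_score(y_true[mask], y_pred[mask])
    return recalls[unprivileged] / recalls[privileged], recalls

# Toy usage with made-up labels (1 = malignant); in the project these would
# come from the held-out third of the images.
y_true = [1, 1, 0, 1, 1, 0, 1, 1]
y_pred = [0, 1, 0, 0, 0, 0, 1, 1]
groups = ["dark", "dark", "dark", "dark", "light", "light", "light", "light"]

di, per_group = disparate_impact(y_true, y_pred, groups)
print(per_group, di)  # ratios below ~0.8 are commonly read as potential bias
```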

Future plans for the project 

The project results were limited by the subset of images with skin tone labels and by constrained computational resources. Future work is to further develop the tone classifier to expand the number of labelled images. Converting colour values from images into values more closely related to skin tone, and then comparing these with the image’s tone label, might help train an AI model to exclude the tumour itself when classifying the skin tone of the whole image. This is important because we know that the tone of tumours themselves is often different from that of the surrounding skin.

Heat map showing where the skin tone matches the label
An example image from ISIC which had its Fitzpatrick Skin Type labelled. The light green indicates where individual pixels correspond with expected colours associated with the labelled skin type. Notice that the centre of the image, where the tumour is, does not match.
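One candidate for such a colour-to-tone conversion is the Individual Typology Angle (ITA), computed from the CIELAB colour space. Whether the project will use ITA is not settled, so the sketch below, including the example ITA band and file name, is purely illustrative:

```python
import numpy as np
from skimage import io, color

def ita_degrees(rgb_image):
    """Per-pixel Individual Typology Angle from CIELAB L* and b* channels."""
    lab = color.rgb2lab(rgb_image)
    L, b = lab[..., 0], lab[..., 2]
    # arctan2 avoids division by zero; for typical skin pixels b* > 0, so this
    # matches the usual definition ITA = arctan((L* - 50) / b*) in degrees.
    return np.degrees(np.arctan2(L - 50.0, b))

image = io.imread("isic_example.jpg")[:, :, :3]  # hypothetical file name
ita = ita_degrees(image)

# Mark pixels whose ITA falls in a range plausible for the labelled skin type
# (the 28-41 degree band for an "intermediate" tone is illustrative only).
match_mask = (ita > 28) & (ita < 41)
# Plotting match_mask as an overlay gives a heat map like the one above, where
# the lesion itself typically falls outside the expected range.
```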

More powerful computational resources will be acquired so that the model can be trained sufficiently. Future work will also employ explainable AI techniques to identify the source of the bias.


Contact details and links 

James Pope: https://research-information.bris.ac.uk/en/persons/james-pope

Ayush Joshi: https://research-information.bris.ac.uk/en/persons/ayush-joshi