A real-time map-based traffic and air-quality dashboard for Bristol

JGI Seed Corn Funding Project Blog 2023/24: James Matthews

Example screenshot of the Bristol Air Quality and Traffic (AQT) dashboard in use, with key.

A reduction in urban air quality is known to be detrimental to health, affecting many conditions including cardiorespiratory health. Sources of poor air quality in urban areas include industry, traffic and domestic wood burning. Air quality can be tracked by many government, university and citizen-held pollution sensors. Bristol has implemented a clean air zone, but non-traffic sources, such as domestic wood burning, are not affected by it.

The project came about through the initiative of Christina Biggs, who approached academics in the School of Engineering Mathematics and Technology (Nikolai Bode) and the School of Chemistry (James Matthews and Anwar Khan) with a vision for an easy-to-use data dashboard that could empower citizens by drawing data from citizen science, university and council air quality and traffic sensors, in order to better understand the causes of poor air quality in their area. The aims were to: (1) work with community groups to inform the dashboard design; (2) create an online dashboard bringing together air quality and traffic data; (3) use council air quality sensors to enable comparison with citizen science sensors for validation; and (4) use this to identify likely sources of poor air quality.

An online dashboard was created using R with Shiny and Leaflet, collecting data through APIs, and was first tested offline. The latest version of the dashboard has been named the Bristol Air Quality and Traffic (AQT) dashboard. The dashboard allows PM2.5 data and traffic counts to be investigated in specific places and plotted as a time series. We are able to compare citizen sensor data to council and government data, and to compare measurements against known safety limits.
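The kind of comparison the dashboard supports can be sketched in a few lines of Python (the dashboard itself is written in R; the readings below are invented stand-ins, and only the WHO 2021 24-hour PM2.5 guideline value of 15 µg/m³ is a real figure):

```python
from datetime import datetime, timedelta
from statistics import mean

# Hypothetical 15-minute PM2.5 readings (µg/m³) for one day, with a synthetic
# morning rush-hour peak; real data would come from the sensor APIs.
start = datetime(2024, 6, 26)
readings = [(start + timedelta(minutes=15 * k),
             28.0 if 28 <= k <= 36 else 8.0) for k in range(96)]

# Aggregate to hourly means, as a dashboard 24-hour time series might.
by_hour = {}
for ts, pm in readings:
    by_hour.setdefault(ts.hour, []).append(pm)
series = {hour: mean(vals) for hour, vals in sorted(by_hour.items())}

# Compare the 24-hour mean against the WHO (2021) guideline of 15 µg/m³.
guideline = 15.0
day_mean = mean(pm for _, pm in readings)
peak_hour = max(series, key=series.get)
print(f"peak hour {peak_hour:02d}:00, 24 h mean {day_mean:.2f} µg/m³ "
      f"({'above' if day_mean > guideline else 'within'} the guideline)")
```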

The dashboard collates traffic data from several sources, including Telraam and Vivacity traffic sensors, which provide vehicle counts from local sensors; and PM2.5 data from different sources, including Defra air quality stations and SensorCommunity (previously known as Luftdaten) citizen air quality stations. Clicking on a data point brings up the previous 24-hour time series of measurements. For example, in the screenshots below, one sensor shows a clear PM2.5 peak during the morning rush hour of 26th June 2024 (a), which is likely related to traffic, while a second shows a higher PM2.5 peak in the evening (b), which could be related to domestic fuel burning, such as an outdoor barbecue. A nearby traffic sensor shows that the morning peak and the smaller afternoon peak agree with traffic numbers (c), but the evening peak might be unrelated. Data can be selected from historic datasets and is available to download for further interrogation.

Figure (a) Example of data output from the dashboard showing PM2.5, midnight to midnight on 26/06/2024
Figure (b) Example of data output from the dashboard showing PM2.5, midnight to midnight on 26/06/2024
Figure (c) Example of data output from the dashboard showing traffic measured using local Bristol sensors

We hope that these snapshots will provide an intuitive way for communities to understand the air quality in their location. Throughout the project, the project team held regular meetings with Stuart Phelps from Baggator, a community group based in Easton, Bristol, so that community needs were put at the forefront of the dashboard design.

We are currently planning a demonstration event with local stakeholders to allow them to interrogate the data and provide feedback that can be used to add explanatory text to the dashboard and enable easy and intuitive analysis of the data. We will then engage with academic communities to consider how to use the data on the dashboard to answer deeper scientific questions.


Contact details and links

Details of the dashboard can be found at the link below, and further questions can be sent to James Matthews at j.c.matthews@bristol.ac.uk

https://github.com/christinabiggs/Bristol-AQT-Dashboard/tree/main

Using Machine Learning to Correct Probe Skew in High-frequency Electrical Loss Measurements 

JGI Seed Corn Funding Project Blog 2023/24: Jun Wang & Song Liu

Introduction 

This project develops a machine learning approach to address the probe skew problem in high-frequency electrical loss measurements. 

Fig.1 (main figure) A pipeline using an ML model to correct the probe skew in measuring a magnetic hysteresis loop, showing both model training and model deployment.

What were the aims of the seed corn project? 

In tackling the net-zero challenge through electrification, power electronic converters play an important role in modern electrical systems, such as electric vehicles and utility grids. Accurate characterisation of each individual component's losses is essential for the virtual prototyping and digital twins of these converters. Making loss measurements requires two different probes, one for voltage and one for current, each with its own propagation delay. The difference between the probes' delays, known as skew, causes inaccurate timing measurements, which leads to incorrect loss measurements. Incorrectly measured losses will misinform the design process and the digital twin, which can lead to wrongly sized cooling components and potential failure of the converter systems in safety-critical applications, e.g. electric passenger cars.
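To see why nanosecond-scale skew matters, here is a small numerical sketch (the frequency, phase angle, and skew values are illustrative stand-ins, not the project's measurements): a few nanoseconds of delay between the voltage and current channels produces a large relative error in the measured average power when the true phase is close to 90°.

```python
import numpy as np

f = 1e6                      # excitation frequency (Hz), illustrative
phi = np.deg2rad(80.0)       # true voltage-current phase (low-loss, near 90°)
fs = 1e10                    # oscilloscope sample rate (Hz)
t = np.arange(0, 5 / f, 1 / fs)   # five full periods

v = np.sin(2 * np.pi * f * t)     # voltage waveform (unit amplitude)
true_loss = 0.5 * np.cos(phi)     # exact average power for unit amplitudes

def measured_loss(skew_s):
    """Average of v*i when the current channel lags by `skew_s` seconds."""
    i = np.sin(2 * np.pi * f * (t - skew_s) - phi)
    return np.mean(v * i)

for skew_ns in (0, 2, 5):
    p = measured_loss(skew_ns * 1e-9)
    err = 100 * (p - true_loss) / true_loss
    print(f"skew {skew_ns} ns -> loss error {err:+.1f} %")
```

With these numbers, even a 2 ns skew shifts the apparent phase by 0.72° and distorts the measured loss by several percent, growing rapidly as the true phase approaches 90°.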

The aim of this project was to develop a machine-learning-based solution that learns from experimentally measured datasets and subsequently generates a prediction model to compensate for the probe skew. This interdisciplinary project treats the challenge as an image recognition problem with special shape constraints. The goal is a tool for the engineering community which takes in raw measurements and outputs the corrected data/image.

What was achieved? 

Joint research efforts were made by the interdisciplinary team across two schools (EEME and Mathematics) for this project with the following achievements: 

  1. We explored the options and made a design choice to use an open-source database as the foundation for this project (the MagNet database produced by PowerLab Princeton), which provides rich datasets of experimentally measured waveforms. We then developed an approach to artificially augment the data to create training data for our ML model. 
  2. We successfully developed a shape-aware ML algorithm based on a Convolutional Neural Network to capture the shape irregularity in measured waveforms and find its complex correlation to the probe skew in nanoseconds. 
  3. We subsequently developed a post-processing approach to retrospectively compensate the skew and reproduce the corrected image/data. 
  4. We evaluated the proposed ML-based method against testing datasets, which demonstrated high accuracy and effectiveness. We also tested the model on our own testing rig in the laboratory as a real-life use case. 
  5. We developed a web-based demonstrator to visualise the process and showcase the correction tool's capability to the public. The web demonstrator is hosted on Streamlit and accessible through the link below.
Fig.2 Snapshot of the web application demo, showing a phase-shift prediction at test index 620.
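The data-augmentation step in the first achievement can be sketched as follows, assuming skew can be emulated by shifting the current channel by a known number of samples, which then serves as the training label (the waveforms, label range, and sample counts below are synthetic stand-ins, not the MagNet data or the project's actual pipeline):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a measured (voltage, current) waveform pair from a database
# such as MagNet: a synthetic single-period pair of 1000 samples.
n = 1000
t = np.linspace(0, 1, n, endpoint=False)
voltage = np.sign(np.sin(2 * np.pi * t))      # square-wave excitation
current = np.cumsum(voltage) / n              # triangular current (integral)
current -= current.mean()

def augment(v, i, max_skew=20, count=200):
    """Create (waveform pair, skew label) samples by shifting the current
    channel by a known number of samples (the label the CNN would learn)."""
    samples = []
    for _ in range(count):
        skew = int(rng.integers(-max_skew, max_skew + 1))
        samples.append((np.stack([v, np.roll(i, skew)]), skew))
    return samples

dataset = augment(voltage, current)
print(len(dataset), dataset[0][0].shape)
```

Because each artificial skew is known exactly, the shift is invertible: rolling the current channel back by the predicted label reproduces the aligned waveform, which is the basis of the post-processing correction.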

Future plans for the project 

This completed project is revolutionary in applying ML to correct the imperfections of hardware instruments through software-based post-processing, in contrast to conventional calibration approaches using physical tools. This pilot project will initiate a long-term stream of research leveraging ML/AI to solve challenges in power electronics engineering. The proposed method can be packaged into a software tool as a direct replacement for, or alternative to, commercial calibration tools, which cost around £1,000 per unit. Our plans for the next steps include:

  1. Create documentation for the approach and the pipeline 
  2. Write a conference/journal paper for dissemination 
  3. Explore commercialisation possibilities for the developed approach 
  4. Further improve the approach to make it more versatile for wider use cases 
  5. Evaluate the approach more comprehensively by testing it on extended sets of data 

Contact details and links 

Dr Jun Wang, Jun.Wang@bristol.ac.uk 

Dr Song Liu, Song.Liu@bristol.ac.uk 

Web demo: https://skewtest-jtjvvx7cyvduheqihdtjqv.streamlit.app/ 

The project was assisted by University of Bristol Research IT.

Understanding the growth of human brains 

JGI Seed Corn Funding Project Blog 2023/24: James Armstrong

The human brain is a highly complex structure and an inaccessible organ to study, which has hampered our understanding of how the brain grows, becomes diseased, and responds to drugs. In the last ten years, a new method has been developed that uses stem cells to grow miniature brain tissues in the lab. These “brain organoids” have proven to be an incredibly useful tool for scientists studying the human brain. 

However, a well-known limitation of this tissue model is its unpredictable growth: within the same batch, some organoids will undergo typical neural development with large cortical buds (Figure 1A) while others will fail to produce these important structural features (Figure 1B). This Jean Golding Institute funded project sought to answer the question: how can seemingly identical stem cell cultures undergo such different growth? To this end, we aimed to track the growth of ~600 brain organoids over 20 days, then to use computer vision / machine learning methods to pick out key structural features that could be used to predict tissue growth. 

Figure 1. (A-B) Examples of two brain organoids, grown using the same methods, that were identical at day 3 but underwent very different growth from day 3 to day 20. (C) An example of the images acquired during this project. 

This work was led by Dr James Armstrong, a Senior Research Fellow who runs a tissue engineering research group at Bristol Medical School (www.TheArmstrongGroup.co.uk). Members of his team (Martha Lavelle, Dr Aya Elghajiji, with help from Carolina Gaudenzi) have so far grown ~200 organoids, with another member of his team (Sammy Shorthouse) collecting microscopy images at ten intervals throughout the growth (Figure 1C). As expected, we saw tremendous variation in the growth of the brain organoids, in terms of their size, shape, and budding. Sammy has developed a program that takes these images and automatically processes them (indexing, identifying correct focal plane, centring, and cropping). He is now developing this script into a user-friendly “app”. For the next stages, Dr Qiang Liu in the Faculty of Engineering has been working with Sammy to develop computer vision methods that can pick out the key structural features of the organoids at the different stages of their growth. We are now growing the next batch of organoids and hope to reach the ~600 mark by the end of the summer. This should provide us with our target dataset, which should be large enough to start drawing links and making predictions of tissue growth. 
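One of the steps above, identifying the correct focal plane, can be illustrated with a common heuristic: score each image in a stack by the variance of its Laplacian and keep the sharpest. This is only a sketch of the general technique, not necessarily the method used in Sammy's program, and the images below are synthetic.

```python
import numpy as np

def focus_score(img):
    """Variance of a discrete Laplacian: high for sharp, detail-rich images."""
    lap = (img[:-2, 1:-1] + img[2:, 1:-1] + img[1:-1, :-2] + img[1:-1, 2:]
           - 4 * img[1:-1, 1:-1])
    return lap.var()

def pick_focal_plane(stack):
    """Return the index of the sharpest image in a z-stack."""
    return int(np.argmax([focus_score(img) for img in stack]))

# Synthetic demo: a sharp checkerboard versus box-blurred copies of it.
sharp = np.indices((64, 64)).sum(axis=0) % 2 * 1.0

def box_blur(img, r):
    out = np.zeros_like(img)
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            out += np.roll(np.roll(img, dy, axis=0), dx, axis=1)
    return out / (2 * r + 1) ** 2

stack = [box_blur(sharp, 3), sharp, box_blur(sharp, 1)]
print("sharpest plane:", pick_focal_plane(stack))
```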


Contact details and links

If you wish to contact us about this study, please email james.armstrong@bristol.ac.uk

Working towards more universal skin cancer identification with AI 

JGI Seed Corn Funding Project Blog 2023/24: James Pope

Nine examples of malignant/benign cancer marks on different skin types. Images from the International Skin Imaging Collaboration (https://www.isic-archive.com/)

Introduction

Open-source skin cancer datasets contain predominantly lighter skin tones potentially leading to biased artificial intelligence (AI) models. This study aimed to analyse these datasets for skin tone bias. 

What were the aims of the seed corn project? 

The project’s aims were to perform an exploratory data analysis of open-source skin cancer datasets and evaluate potential skin tone bias resulting from the models developed with these datasets.  Assuming biases were found and time permitting, a secondary goal was to mitigate the bias using data pre-processing and modelling techniques. 

What was achieved? 

Dataset collection

The project focused on the International Skin Imaging Collaboration (https://www.isic-archive.com/) archive, which contains over 20 datasets totalling over 100,000 images. The analysis required that the images provide some indication of skin tone. We found that only 3,623 images recorded the Fitzpatrick Skin Type, on a scale from 1 (lighter) to 6 (darker). For each image, we mapped the Fitzpatrick Skin Type to a light or dark skin tone. As future work, the project began exploring tone-classification techniques to expand the number of images considered. 
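The mapping step might look like the sketch below. The blog does not state where the light/dark cut-off falls, so the 1-3/4-6 split is an assumption, and the metadata field name `fitzpatrick_skin_type` is a hypothetical stand-in, not the ISIC archive's actual schema.

```python
# Assumed cut-off: Fitzpatrick types 1-3 -> "light", 4-6 -> "dark".
FITZPATRICK_TO_TONE = {1: "light", 2: "light", 3: "light",
                       4: "dark", 5: "dark", 6: "dark"}

def tone_label(record):
    """Return 'light'/'dark' for a metadata record, or None if untyped."""
    fst = record.get("fitzpatrick_skin_type")   # hypothetical field name
    return FITZPATRICK_TO_TONE.get(fst)

records = [{"image_id": "example-1", "fitzpatrick_skin_type": 2},
           {"image_id": "example-2", "fitzpatrick_skin_type": 5},
           {"image_id": "example-3"}]           # no skin-type label recorded
labelled = [(r["image_id"], tone_label(r)) for r in records]
print(labelled)
```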

Artificial Intelligence Modelling

We then developed a typical artificial intelligence model, specifically a deep convolutional neural network, to classify whether the images are malignant (i.e. cancerous) or benign. The model was trained on 2/3 of the images and evaluated on the remaining 1/3. Due to computational limits, the model was only trained for 50 epochs. The model's accuracy (how many correct classifications it made of either benign or malignant tumours out of all the tumours it was evaluated on) was comparatively poor at only 82%. 

Bias Analysis

The model was then evaluated relative to light and dark skin tones.  We found that the model was better at identifying cancer in light versus dark skin tone images.  The recall/true positive rate for dark skin tones was 0.26 while for light skin tones it was 0.45.  The resulting disparate impact (a measure used to indicate if a test is biased for certain groups) was found to be 0.58, which indicates the model is potentially biased.
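Assuming disparate impact is computed here as the ratio of the disadvantaged group's recall to the advantaged group's, the reported figure follows directly from the two recalls (0.8 is the conventional "four-fifths rule" threshold below which a test is flagged as potentially biased):

```python
# Per-group recalls (true positive rates) reported in the blog.
recall_dark = 0.26
recall_light = 0.45

# Disparate impact as the ratio of the two recalls.
disparate_impact = recall_dark / recall_light
print(f"disparate impact: {disparate_impact:.2f}")  # 0.58, below the 0.8 threshold
```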

Future plans for the project 

The project's results were limited by the subset of images with skin-tone labels and by constrained computational resources. Future work is to further develop the tone classifier to expand the number of labelled images. Converting colour values from images into values more closely related to skin tone, and then comparing these with the tone labels of the image, might help train an AI model to exclude the tumour itself when classifying the skin tone of the whole image. This is important as we know that the tone of tumours themselves is often different to that of the surrounding skin.

Heat map of an example image from ISIC which had its Fitzpatrick Skin Type labelled. The light green indicates where individual pixels correspond with expected colours associated with the labelled skin type. Notice that the centre of the image, where the tumour is, does not match.

More powerful computational resources will be acquired and used to train the model sufficiently. Future work will also employ explainable AI techniques to identify the source of the bias. 


Contact details and links 

James Pope: https://research-information.bris.ac.uk/en/persons/james-pope

Ayush Joshi: https://research-information.bris.ac.uk/en/persons/ayush-joshi

First Steps Towards a Crowd-Sourced Ancient Greek Encyclopaedia

JGI Seed Corn Funding Project Blog 2023/24: Naomi Scott

A page from a 10th century manuscript of Julius Pollux’s Onomasticon

In the second century A.D., Julius Pollux, Professor of Rhetoric at the Academy in Athens, wrote the Onomasticon (‘Book of Words’), and dedicated it to the Emperor Commodus. The work sits somewhere between an encyclopaedia and a lexicon. Chapters are organised by topic, and Pollux lists appropriate words on diverse themes such as ‘The Gods’, ‘Bakery Equipment’, ‘Diseases of Dogs’, and ‘Objects Found On Top Of Tables’. Throughout his work, Pollux quotes canonical authors such as Homer, Aeschylus, and Sappho in support of what he considers correct and elegant linguistic usage. This means that in addition to providing a wealth of information on everyday life in the ancient world, the Onomasticon is also one of our best sources of quotations from otherwise lost works of ancient Greek literature.   

Despite Pollux’s obvious importance, his work has not been translated into any modern language. The vast size of the Onomasticon (10 books in total, each comprised of around 250 chapters) means that it is unwieldy even for researchers able to study the original ancient Greek text. With seed-corn funding from the Jean Golding Institute, my project ‘Crowd Sourcing Julius Pollux’s Onomasticon’ has set to work on filling this gap. Eventually, my aim is to use crowd-sourcing to produce not only a translation of the Onomasticon, thereby making it accessible to researchers in a wide variety of disciplines, but an edition of the work which is fully data-tagged, so that researchers can better navigate the text, and produce key data about it: Which ancient authors and genres are most frequently cited as sources and in what contexts; what topics are granted the most or least coverage within the text; and how are different lexical categories distributed within the encyclopaedia? Without the answers to questions such as these, any individual chapter or citation within the Onomasticon cannot be placed in the wider context of the work as a whole.  

Creating a New Digital Edition 

While a digitised version of the ancient Greek text of the Onomasticon exists, it is based on the work of Erich Bethe, whose early twentieth-century edition of Pollux removed all the chapter titles which have been used to organise the text since it was first published as a printed book in 1502. Bethe did this because he did not consider the chapter titles to be Pollux’s own. Both for the purpose of splitting the text up into manageable short chunks for translation, and for the purpose of data-tagging, I decided it was essential to reinstate the titles. Additionally, my own examination of manuscripts of the Onomasticon dating as far back as the 10th century has revealed that the chapter titles are in fact much older than first thought, and that the text as we currently have it (abridged from Pollux’s even longer original!) may even have been conceived with the chapter titles. 

The first step in producing a digital edition suitable for crowd-sourcing and data-tagging is therefore to reinsert the titles into the text. This would be an enormous undertaking if done manually. Working with a brilliant team from Bristol's Research IT department, led by Serena Cooper, Keiran Pitts, and Mike Jones, we have set about automating this process. Using Ancient Greek OCR (Optical Character Recognition) software designed by Professor Bruce Robertson at Mount Allison University in Canada, two editions of the text were scanned: one Bethe's chapterless version, and the other by Karl Wilhelm Dindorf, whose 1824 edition of the text includes the titles. The next step is to use digital mapping software to combine the two texts, inserting the titles from Dindorf into the otherwise superior version of the text produced by Bethe. 
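The combination step can be sketched as a line-level alignment of the two editions, in which lines present only in Dindorf (the chapter titles) are inserted into Bethe's text while the shared running text keeps Bethe's reading. This is only an illustration of the general technique, and the transliterated lines below are invented placeholders, not real text from either edition.

```python
import difflib

# Toy stand-ins for the two OCR'd editions: Dindorf's lines include chapter
# titles (upper case), Bethe's do not, and the running text aligns.
bethe = ["peri theon logos", "artoi kai klibanoi", "kynon nosoi"]
dindorf = ["PERI THEON",            # chapter title kept by Dindorf
           "peri theon logos",
           "SKEUE ARTOPOIIKA",      # another chapter title
           "artoi kai klibanoi",
           "kynon nosoi"]

merged = []
sm = difflib.SequenceMatcher(None, bethe, dindorf, autojunk=False)
for tag, i1, i2, j1, j2 in sm.get_opcodes():
    if tag == "insert":          # lines only in Dindorf: treat as titles
        merged.extend(dindorf[j1:j2])
    else:                        # otherwise keep Bethe's (superior) reading
        merged.extend(bethe[i1:i2])

print(merged)
```

In a real run the alignment would be fuzzier (OCR errors, orthographic variation), so the matcher would need a tolerant line-equality function rather than exact string comparison.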

Next Steps 

Once the issue of the chapter titles has been resolved, the next step will be to create a prototype of around 20 chapters, which can then be made available to the scholarly community to begin translating and data-tagging the text. A prototype would allow us to get feedback from researchers around the world working with Pollux, and to better understand what kinds of data would be most useful to those seeking to understand the text. This feedback can then be integrated into an eventual complete edition of the text which can then be translated and data-tagged as a whole.  

Eventually, this project will not only make the Onomasticon more accessible to researchers and help to revolutionise our understanding of this important work. A complete translation and data-tagged edition, complete with chapter titles, will also allow the Onomasticon to have an impact beyond the academic community. The eventual plan is to train arts professionals engaging with the ancient Greek world to use the digital edition and translation. The Onomasticon's remarkably detailed picture of ordinary life and ordinary stuff in antiquity makes it a vital resource for anyone trying to recreate the ancient Greek world on stage, on screen, or in novels. The hope is that this project will therefore not only change the way that scholars understand the Onomasticon and its place in the history of the encyclopaedia. It can also offer artists a window onto antiquity, and through its impact on art, shape the public understanding of the ancient world.