Secret Life of Data Competition

The Jean Golding Institute’s Secret Life of Data Competition and Awards Ceremony

When we think about the security of data on our phones and computers, we might think about passwords and permissions, or about data encryption – but we rarely think about what our data looks like, or what it does as it moves around hidden inside our phones, computers, digital devices, apps and networks. This secret life of data – the traces, bits, and fragments of personal information that we leave behind us online – was the focus of a short story competition hosted by the Jean Golding Institute in collaboration with the Digital Security by Design (DSbD) Futures programme, delivered by the ESRC-funded Discribe Hub+.

The competition sought creative stories that brought the secret life of data to life. The stories could imagine this life as a journey, a quest, a romance, or a tragedy; picture a computer’s internal architecture as a house, a jungle, a zoo, or a city; and cast the data as characters facing danger in the form of various digital threats and vulnerabilities.

The Jean Golding Institute were proud to host an awards ceremony on 2nd November, with readings of extracts from all ten shortlisted stories, and the JGI extend their congratulations to the winners and runners-up:

  • 1st place: Guy Russell – The Task in the Eight-Bit Pyramid
  • 2nd place: Fiona Ritchie Walker – Mini-Me
  • 3rd place: Ben Marshall – The Courier

All ten shortlisted stories have been published in a Secret Life of Data Anthology, available to buy from Bristol Books.

JGI Seed Corn Funding Project Blog 2021: Michael Rumbelow and Alf Coles

Introduction

This seed corn project aimed to explore and extend uses of AI-generated data in an educational context. We have worked on an AI-based app to recognise, gather data on and respond to children’s arrangements of wooden blocks in mathematical block play. The project was inter-disciplinary in two ways. Firstly, the people involved crossed disciplines (teachers, academics, programmers) and, secondly, the app itself provokes engagement in creative activities involving music, chemistry and mathematics.

Developing an app to recognise real-world block play

Block play is a popular activity among children. There has also been a resurgence in the use of physical blocks in primary mathematics classrooms (drawing on some East Asian practices of using physical blocks as concrete models of abstract mathematical concepts). We were interested in researching children’s interactions with physical blocks, with the aim of supporting their learning across the curriculum, and one of the key challenges was how to capture data on these interactions for analysis.

Previous studies of block play have gathered data variously through sketching or taking photos or videos of children’s block constructions, or by embedding radio transmitters in blocks to transmit their positions and orientations. Recent developments in computer vision technology offer novel ways of capturing data on block play. For example, photogrammetry apps such as 3D Scanner can now create 3D digital models from images or video of objects taken on mobile phones, and AI-based object recognition apps are increasingly able to detect objects they have been trained to ‘see’.

With funding from the JGI we were able to form a small project team of two researchers in the School of Education, a software developer and the head of a local primary school, in order to develop an app to trial with children in the school (see Figure 1).

Figure 1. The experimental set-up as used in the initial trial in a primary school

Technical Developments

Over the course of the JGI project we have developed the app in the following ways:

  • We have rebuilt the app architecture robustly around the Detectron-2 AI algorithm, to facilitate reliable data gathering, training and feature development.
  • We have developed a new mode to enable gathering of data on mathematical block play around proportion (i.e. detection of relative block sizes as well as adjacency) and carbon chemistry modelling (i.e. detection of multiple-row block adjacency).
  • We have made improvements to the user interface (e.g. removal of text from the screen when testing with pre-literate users).
  • We have tested the app with 5-6 year olds in a primary school.
  • The new version generates snapshots of children’s block arrangements and exports data on their positions to a spreadsheet, which allows further analysis.
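
As a rough illustration of what detecting adjacency and relative block size and exporting positions to a spreadsheet could involve, here is a minimal Python sketch. It is not the project’s code: the Block record, the pixel tolerance, the column names and the idea of taking the shortest block as the unit length are all assumptions made for the example, and the object detector itself is not called (the boxes are typed in by hand).

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class Block:
    """One detected block as an axis-aligned box in image pixels (assumed format)."""
    label: str
    x: float  # left edge
    y: float  # top edge
    w: float  # width
    h: float  # height


def are_adjacent(a, b, tol=5.0):
    """True if two laid-flat blocks sit side by side, within `tol` pixels."""
    # Horizontal gap between the boxes (negative if they overlap).
    gap = max(a.x, b.x) - min(a.x + a.w, b.x + b.w)
    # Vertical overlap: positive only if the boxes share some rows of pixels.
    overlap_y = min(a.y + a.h, b.y + b.h) - max(a.y, b.y)
    return abs(gap) <= tol and overlap_y > 0


def to_csv(blocks):
    """Return CSV text with one row per block, ordered left to right."""
    unit = min(b.w for b in blocks)  # assumption: shortest block has length 1
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["label", "x", "y", "width", "height", "relative_length"])
    for b in sorted(blocks, key=lambda blk: blk.x):
        writer.writerow([b.label, b.x, b.y, b.w, b.h, round(b.w / unit, 2)])
    return buf.getvalue()


# Hand-made example: a short red block touching a longer blue one.
example = [Block("red", 0, 0, 10, 10), Block("blue", 12, 0, 20, 10)]
print(are_adjacent(example[0], example[1]))
print(to_csv(example))
```

The CSV text can be written to a file and opened in any spreadsheet application.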

Lessons Learned

As well as developing a prototype, we have been able to trial this phase of development in school, giving us several valuable insights into both the technical development of AI computer vision apps for gathering anonymous data on block play in schools and the usability and potential of apps controlled by children via the arrangement of physical blocks on a tabletop. In particular we have found:

  • Benefits of using platforms available to the target audience as and when feasible. Our aim was to develop an app which is ultimately usable by schools. At the time of development, the AI algorithms used required processing power beyond standard laptops to run at reasonable speeds, and dedicated AI processing hardware such as the Nvidia Jetson NX offered sufficient processing power at a fraction of the cost of higher-end GPU-equipped laptops. However, during development, due largely to global chip shortages, this price difference disappeared, Jetson NX units became scarce, and we decided to switch to higher-end GPU-equipped Windows laptops. This has simplified installation and portability of the app without the need for specialist hardware and opened a route to incremental optimization for the types of standard lower-spec laptops used in schools, as well as easing technical maintenance, and sharing and processing of the data gathered in standard apps such as spreadsheets.
  • The resilience of trained artificial neural network algorithms in practice, as well as the importance of responsively optimising training image datasets. The app was trained to recognise blocks using training datasets originally gathered with a specific higher-spec webcam at a fixed distance from the table, which required a separate support apparatus. In practice when we tried using low-cost webcams with their own built-in gooseneck support these worked relatively well, at a variety of heights, and in a variety of lighting and tabletop environments in the field, and were much more practical to set up. However, dips in reliability became apparent in certain lighting conditions, for instance in distinguishing red and pink blocks, which highlighted the need for fresh training datasets using the new webcam, focusing on these areas of ambiguity apparent in field-testing.
  • Children’s patience, curiosity and creativity in using novel technology. We had minimised the textual buttons in the interface designed for the researcher (to change modes etc.), on the assumption that young children would not want to bother with them and that their presence might be confusing. In practice children, having seen the buttons used during set-up, were curious to bring up and explore all of the interface buttons themselves. They were also patient when the app occasionally did not immediately detect a block, ‘helping’ it to ‘see’ the block by nudging its position or re-laying it. And rather than copying what they had seen researchers or other children doing, the children were creative in exploring the affordances of the app, for example laying blocks horizontally rather than vertically, or reversing the order of a melody played by placing blocks.

Above all, this phase of development and trialling has provided evidence of the feasibility of producing an app which can use AI to detect and respond to block placements by young children in the field, and highlighted several of the key challenges for next steps.

Future Challenges

The potential uses of the app are extensive and, following on from the successes of this JGI project, we now want to:

  • Develop our app, which is currently a prototype, into something potentially ready to move into production.
  • Engage with Research Software Engineering (RSE) at the University of Bristol, to support further app development.
  • Trial and hone the tools and games to support learning using the app.
  • Extend the dataset of images used to train the app from several hundreds to several thousands, aligned with the diverse webcams and conditions likely in the field.
  • Pilot the app with visually impaired and blind children.
  • Pilot the app with teachers interested in teaching climate chemistry.
  • Develop an anonymised dataset of children’s block play, including creative free play and guided mathematical block play (inspired by the UoB’s EPIC-KITCHENS data set https://epic-kitchens.github.io/2020-100).
  • Enable upload, storage and visualisation of data on block arrangements on a server, for potential research analysis using AI to detect patterns.
  • Extend the app to recognise stacked as well as laid-flat block constructions, making use of LIDAR technology.

We are currently taking part in a training programme (SHAPE “Pre-accelerator” course) to help us plan the next stages of development.

JGI Seed Corn Funding Project Blog 2021: Conor Houghton

Bayesian methods in Neuroscience – Conor Houghton

For the last century science has relied on a statistical framework based on hypothesis testing and frequentist inference. Despite its convenience in simple contexts, this approach has proved to be intricate, obtuse and sometimes misleading when applied to more difficult problems, particularly problems with the sort of large, complex and untidy datasets that are vital for applications like climate modelling, finance, bioinformatics, epidemiology and neuroscience.

Bayesian inference solves this; the Bayesian approach is easy to interpret and returns science to its traditional reliance on evidence and description rather than a false notion of significance and truth. With a rigorous handling of uncertainty Bayesian inference can dramatically improve statistical efficiency, allowing us to squeeze more insight out of finite, hard-won data which in turn reduces animal and biological tissue use and reduces costs for scientific projects.

With support from the Jean Golding Institute we ran a workshop about Bayesian Modelling. The workshop had lots of different elements: a tutorial for people unfamiliar with the approach, short talks by people in the University who use these methods, a few talks by external speakers and a data study group. In retrospect, we did try to do too much, but the workshop was very helpful. The short talks brought together the local community around Bayesian Modelling, and the two external speakers, Hong Ge and Mike Peardon, were excellent and provided real, unexpected insight into the current and potential future state of Bayesian Modelling.

We next hope to host a workshop on Hybrid / Hamiltonian Monte Carlo (HMC). HMC has quickly become a very useful tool in data science, allowing us to perform Bayesian inference for a host of real-world problems that would not have been tractable a few years ago. Perhaps surprisingly, HMC has its origins in high energy particle physics and was invented to perform the high-dimensional integrals involved in Quantum Chromodynamics, the calculations required to predict the results of collider experiments at CERN.
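
For readers who have not met HMC before, the following is a minimal one-dimensional sketch in Python, using only the standard library. It is a textbook-style illustration rather than anything presented at the workshop: the function name hmc_sample, the step size, the number of leapfrog steps and the standard normal target are all choices made for the example.

```python
import math
import random


def hmc_sample(log_prob, grad_log_prob, x0, n_samples,
               step=0.2, n_leapfrog=10, seed=0):
    """Draw samples from a 1-D density given its log and the gradient of its log."""
    rng = random.Random(seed)
    x = x0
    samples = []
    for _ in range(n_samples):
        p = rng.gauss(0.0, 1.0)  # fresh momentum for each trajectory
        x_new, p_new = x, p
        # Leapfrog integration of Hamilton's equations.
        p_new += 0.5 * step * grad_log_prob(x_new)
        for i in range(n_leapfrog):
            x_new += step * p_new
            if i < n_leapfrog - 1:
                p_new += step * grad_log_prob(x_new)
        p_new += 0.5 * step * grad_log_prob(x_new)
        # Metropolis correction for the integrator's energy error.
        h_old = -log_prob(x) + 0.5 * p * p
        h_new = -log_prob(x_new) + 0.5 * p_new * p_new
        if rng.random() < math.exp(min(0.0, h_old - h_new)):
            x = x_new
        samples.append(x)
    return samples


# Example target: a standard normal, log p(x) = -x^2 / 2 up to a constant.
draws = hmc_sample(lambda x: -0.5 * x * x, lambda x: -x, x0=0.0, n_samples=2000)
print(sum(draws) / len(draws))
```

The gradient is what lets each proposal travel a long way through the distribution while still being accepted with high probability, which is why the method scales to high-dimensional problems better than a simple random walk.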

We believe that there is still a lot these two communities – particle physics and applied data science – can learn from each other when exploring and developing the power and scope of HMC.

JGI Seed Corn Funding Project Blog 2021: Lucy Biddle

Can sharing app data facilitate communication between young people and their mental health practitioner?

Bridget Ellis, Lucy Biddle, Helen Bould, Jon Bird

Mental health problems are increasing among young people, who have the highest prevalence of mental health problems among all age groups [1]. Despite the adverse outcomes that result from this, young people access mental health services at a lower rate than other age groups [3], with barriers including communication, poor mental health literacy, embarrassment, fear of stigma and confidentiality concerns.

Research illustrates that digital peer support can help people with mental health difficulties [2] and the increased availability of mobile technologies is now being harnessed to deliver mental health support.

Our project was a collaboration with the company that created the award-winning, NHS-endorsed young person’s mental health app, ‘Tellmi’ (www.Tellmi.help). The app is a fully moderated peer support environment, where young people anonymously share ‘tweet’ style posts about their emotional and mental health difficulties. A holistic dataset builds up for each individual, which could have clinical value if shared with a healthcare practitioner. For example, the posts can be tagged for content, rated for severity, displayed longitudinally and presented in a shareable summary document.

Previous feasibility survey and interview data investigated the views of young people who used the Tellmi app, and of Child and Adolescent Mental Health Services (CAMHS) clinicians, about the acceptability and utility of sharing such a summary document during mental health consultations as a means of enhancing the clinical exchange. Our current study had two aims: i) to carry out in-depth thematic analysis of this previously collected data; and ii) to form a multidisciplinary working group and convene a one-day workshop to present and discuss our findings as preparation for a full-scale research proposal.

We conducted thematic analysis on interviews with five young people and four healthcare practitioners, and 120 survey responses from users of Tellmi.

“So I think finding the words and putting them on Tellmi makes it easier to be able to say them to someone who is in front of you”

A theme was identified surrounding communication and how a summary document could be utilised to facilitate this between young people and healthcare practitioners. A concern raised by young people was that the way they communicate varies depending on the audience, meaning a summary of posts which they intended to be seen by peers may contain information they would not usually present to a clinician. Young people appear to value the written communication of Tellmi and were enthusiastic about how this could help to provide a focus and inform clinical sessions. For young people who struggle to communicate their levels of distress to a clinician, this could be overcome by making it possible to share their experiences through their Tellmi posts. Additionally, providing a written account of how young people have been feeling may help to bridge the gap between the more honest and open information that is disclosed anonymously and that which is disclosed face-to-face with a clinician. However, young people did raise concerns about how this written information could be misconstrued or misinterpreted.

“If I feel comfortable with them then I’ll be more likely to share but if I don’t feel comfortable then I would not share”

We found that trust would play a key role in the process of sharing. This was not only trust between a young person and their clinician but also trust between a young person and Tellmi and how sharing could change how young people engaged with the app going forward. Clinicians also raised questions about trusting the Tellmi app, in particular how successfully an algorithm can identify risk or how the data being shared may be monetised.

“Tellmi posts do tend to be quite personal and honest and open because you expect to be talking to someone who isn’t really there so you can say whatever you like and there’s no judgement”

Young people seem to really value Tellmi as a safe space. This safety appeared particularly facilitated by the anonymity it provides. Young people were concerned about how their data may be handled if it was no longer anonymous and being shared with clinicians.

We also found practicalities surrounding sharing that would need to be addressed. For example, young people required control over their data and how it is used and shared. The potential for young people to censor the information they present to their clinician was also discussed. Additionally, the impact that revisiting old posts may have on young people was considered. Factors specific to clinicians could also impact sharing, with time being a concern for both clinicians and young people.

“I think it’s [sharing a Tellmi summary] a great idea but the young people would need to have complete control of the information that is included to avoid endangering young people”

Workshop

Our multidisciplinary working group consisted of three researchers from computer science and health sciences, two child and adolescent psychiatrists, representatives from Tellmi, and two young people with lived experience of mental health difficulties. We presented our findings from the thematic analysis then used discussion sessions and group work to consider implications for the design of future research. We discussed how data sharing is likely to be most beneficial; how acceptability can be enhanced for young people and clinicians; stakeholders’ evaluations of the dummy data summary document of Tellmi posts, including methods of data visualisation; and potential barriers to data sharing in practice.

Discussion of the design for a user study of Tellmi data-sharing in practice identified that this would involve varied stakeholders, including Tellmi users, researchers and clinicians. It was noted that recruitment could bring challenges, and discussion sought to identify the most appropriate pathways for recruiting clinicians and young people in paired groups so that both perspectives can be captured for each case of data sharing. If recruiting through clinicians, it was noted that young people may not be Tellmi users or have enough data to produce a summary document. A suggestion for overcoming this was to ask young people to engage with Tellmi while on a waiting list. However, one of the lived experience advisors highlighted a challenge: “I think another issue with recruiting people through NHS is that no matter how good an app is, if you are young person on a waiting list and a clinician says, ‘Use this app’, it’s like, ‘No, I want you to help me and why am I going to use an app?’”. Alternatively, we discussed recruiting through the Tellmi platform, with young people approaching their clinicians to get involved. Again, however, this approach would bring challenges, such as obtaining ethical approvals and clinical ‘buy in’ when the relevant young people could be based all over the country.

We also discussed the practicalities of sharing and how a study procedure would be designed, focusing this discussion around the implications for design highlighted in our thematic analysis. This encompassed details determining how the process of sharing would actually take place. For example, we considered whether the summary would be shared as a physical document or an electronic copy, and whether this should be given to the young person to present to their clinician or be sent directly to the clinician. When to share is also a key consideration: our data show that young people have varied views on this, on whether sharing should be repeated and, if so, at what frequency. Additionally, methods of improving and encouraging sharing were discussed, as well as the overall design of a summary document and how this could be altered to ensure inclusivity for special educational needs.

Key to designing a research study were methods of evaluation and establishing outcome measures. Young people and clinicians flagged a range of potential outcomes. These included completing clinical tasks such as goal setting, and how successful a young person may consider a session: “something else to measure would be how the young person feels coming out of the appointment. Has it empowered them or let them take control of their healthcare”. The view of the young person was considered key in determining how outcomes would be measured: “it’s just making me think what is the actual point of sharing the data again? I guess that depends on the young person”.

The workshop provided a space for exciting discussion with input from stakeholders from different backgrounds. While we hoped it would allow elements of co-design to inform development of a data sharing document and research plans to evaluate this, challenges were raised which suggest further development work may be necessary before the process of sharing can be evaluated. The ideas and issues raised at our workshop will be explored through our continued collaboration with Tellmi.

“The workshop was incredibly insightful. It provided us the opportunity to discuss the findings of the study with a diverse group of experts including academics, clinicians and young people with lived experiences of poor mental health. It has helped us to completely rethink how to approach the problem and we look forward to continuing to work with the Bristol team.” Kerstyn Comley, Tellmi Co-CEO

JGI Seed Corn Funding Project Blog 2021: Dr Josh Hoole

Exploiting Data to Support UK Search and Rescue

Dr Josh Hoole, Dr Oliver Andrews, Dr Steve Bullock

Introduction

Various UK organisations provide 24/7 Search and Rescue (SAR) capability year-round across land, sea and air. Data analytics provides a key route to supporting SAR operations and aerospace system design in the future.

Aims of the Project

The aim of this project was to explore what data is available to capture the variability present in SAR operations (including mission characteristics and weather) to help support the future design of aerial systems to support SAR. This aim was to be achieved using the following objectives:

Engagement with search and rescue organisations to establish:

  • Availability of data for characterising SAR mission profiles
  • Perceptions on developing Unmanned Aerial Vehicles (UAVs or ‘drones’) to support SAR

Data fusion across asset tracking data to characterise SAR mission profiles:

  • Exploitation of aircraft and vessel trajectories
  • Combining mission profiles with meteorological data

This project therefore lay at the exciting and valuable intersection between data science, aerospace systems, weather and climate analysis and SAR.

Results

To date, the following project activities have been carried out with the support of the Seedcorn Funding:

Data Workshop with the Royal National Lifeboat Institution (RNLI)

A one-day workshop was held with the RNLI Data team at the RNLI College in Poole. Within this workshop, areas of interest and ideas were shared spanning the exploitation of data for mission analysis, future planning and the use of computer vision to support lifesaving activities. The University of Bristol team were simply amazed at the large amount of data-driven work performed by the RNLI and look forward to establishing stronger links between the RNLI and research institutions in the future (see contact details below).

There was also a tour of the RNLI’s training and lifeboat manufacturing facilities as part of the workshop to provide context to the RNLI’s activities. The Bristol team were overwhelmed by the vast and diverse capabilities present in a single location and thoroughly recommend a tour of the RNLI College and All-weather Lifeboat Centre.

RNLI Workshop Participants at the RNLI memorial

RNLI All-weather Lifeboat Centre for Lifeboat Manufacture and Maintenance

Initial Assessment of Vessel Tracking Data

Maritime vessels are equipped with real-time tracking capability via Automatic Identification System (AIS) installations. Historic AIS data provides vessel trajectories which can be post-processed to characterise the mission performed. Building on prior work in the literature, an initial investigation into processing the AIS trajectories of RNLI lifeboats has been performed using data sourced from MarineTraffic. Using simple algorithms, AIS trajectories can be processed to identify the occurrence of lifeboat search manoeuvres and generate characteristics regarding the search operation (e.g. search time, search area, etc.). It is intended that such characteristics can be used in the future to support the post-mission reporting performed by the RNLI.
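
To give a flavour of what a “simple algorithm” of this kind might look like, the Python sketch below flags the parts of a track where the vessel turns a lot over a few consecutive fixes, then reports a search time and a bounding-box area. It is an assumption-laden illustration and not the method used in the project: the fix format (time in seconds, latitude, longitude), the window length, the 200-degree turning threshold and the bounding-box area estimate are all hypothetical choices.

```python
import math


def bearing(p, q):
    """Initial bearing in degrees from fix p to fix q; a fix is (t_s, lat, lon)."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[1], p[2], q[1], q[2]))
    dlon = lon2 - lon1
    y = math.sin(dlon) * math.cos(lat2)
    x = (math.cos(lat1) * math.sin(lat2)
         - math.sin(lat1) * math.cos(lat2) * math.cos(dlon))
    return math.degrees(math.atan2(y, x)) % 360.0


def turn(b1, b2):
    """Smallest absolute angle between two bearings, in degrees."""
    d = abs(b2 - b1) % 360.0
    return min(d, 360.0 - d)


def search_fixes(track, window=4, min_turn=200.0):
    """Indices of fixes inside windows whose summed turning is at least min_turn."""
    bearings = [bearing(track[i], track[i + 1]) for i in range(len(track) - 1)]
    # turns[i] is the heading change at fix i + 1.
    turns = [turn(bearings[i], bearings[i + 1]) for i in range(len(bearings) - 1)]
    flagged = set()
    for i in range(len(turns) - window + 1):
        if sum(turns[i:i + window]) >= min_turn:
            flagged.update(range(i + 1, i + window + 1))
    return sorted(flagged)


def summarise(track, idx):
    """Search time (s) and approximate bounding-box area (km^2) of flagged fixes."""
    if not idx:
        return None
    pts = [track[i] for i in idx]
    lats = [p[1] for p in pts]
    lons = [p[2] for p in pts]
    mean_lat = math.radians(sum(lats) / len(lats))
    height_km = (max(lats) - min(lats)) * 111.32  # about 111 km per degree
    width_km = (max(lons) - min(lons)) * 111.32 * math.cos(mean_lat)
    return {"search_time_s": pts[-1][0] - pts[0][0],
            "bbox_area_km2": height_km * width_km}
```

A straight transit produces almost no turning and is left unflagged, whereas a zigzag or expanding-square pattern accumulates turning quickly. Real AIS data would also need cleaning (irregular reporting intervals, position jitter while stationary) before a rule this simple could be trusted.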

Identification and characterisation of search areas within lifeboat trajectories (data source: MarineTraffic)

Data Fusion to Enhance SAR Helicopter Tracking Data

A large number of aerospace vehicles are also equipped with real-time tracking capability via Automatic Dependent Surveillance-Broadcast (ADS-B) equipment. However, as a line-of-sight system, ADS-B derived trajectories are often lacking in the regions where SAR operations take place, such as at low altitude, close to obstructions or out at sea. SAR helicopters are also equipped with AIS equipment, permitting ADS-B and AIS data sources to be fused to greatly increase the coverage of SAR helicopter trajectories. The ADS-B/AIS fused trajectories can then be further processed to generate mission characteristics as for the maritime vessel trajectories.
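
The fusion step can be pictured as merging two time-ordered lists of position fixes into one. The Python sketch below is a minimal illustration under stated assumptions, not the project’s pipeline: each fix is assumed to be a (time in seconds, latitude, longitude) tuple, both inputs are assumed to be already sorted by time, and the function name fuse and the five-second de-duplication gap are hypothetical.

```python
import heapq


def fuse(adsb, ais, min_gap_s=5.0):
    """Merge two time-sorted fix lists into one list of (t, lat, lon, source)."""
    # Tag each fix with its source; ADS-B (priority 0) wins ties on time.
    tagged = heapq.merge(
        ((f[0], 0, f, "ADS-B") for f in adsb),
        ((f[0], 1, f, "AIS") for f in ais),
    )
    fused = []
    for t, _, fix, source in tagged:
        if fused and t - fused[-1][0] < min_gap_s:
            continue  # near-duplicate of the fix just kept
        fused.append((t, fix[1], fix[2], source))
    return fused


# Hand-made example: ADS-B coverage ends and AIS carries the track on.
adsb = [(0, 50.0, -1.0), (10, 50.1, -1.0), (20, 50.2, -1.0)]
ais = [(21, 50.2, -1.0), (60, 50.3, -1.1), (120, 50.4, -1.2)]
for row in fuse(adsb, ais):
    print(row)
```

Keeping the source label on every fix makes it easy to see afterwards which parts of a mission were only visible through one of the two systems.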

Fusion of ADS-B and AIS trajectories for SAR Helicopters (data sources: Opensky Network, MarineTraffic)

Future Plans

Exploitation of Meteorological Data Products

Following completion of the SAR mission characterisation via AIS and ADS-B data sources, the project intends to couple the trajectories to meteorological data products to fully characterise the SAR operational environment. This level of data fusion could support automated post-mission reporting, draw correlations between the search characteristics and operating environment, as well as support future planning with respect to the impacts of climate change on UK SAR operations.

Engage further with Inland SAR Organisations (PhD projects)

So far, the project has focused on maritime SAR. Future work will engage with inland SAR organisations to a greater extent, and initial links have been formed with the relevant organisations. Dr Steve Bullock has successfully secured funding for two PhD students in the area of SAR planning for UAVs, and these projects will aim to leverage the expertise from the SAR connections made during this project.

Future SAR Data Research Partnerships

The workshop with the RNLI highlighted a significant number of data-centric avenues that could be pursued within future research projects, including aspects of machine learning, computer vision, weather and climate, along with mission analysis. A future workshop is planned, and researchers from across the data community at the University of Bristol are encouraged to participate, so please get in touch via the contact details below. The University of Bristol team are also very keen to explore collaborative partnerships within this area with other research institutions (GW4 and beyond) and SAR organisations. Please send any expressions of interest regarding future opportunities to the contact details below.

Contact Details: Dr Josh Hoole, Department of Aerospace Engineering, University of Bristol, josh.hoole@bristol.ac.uk