Brunel’s Network project

Photo courtesy of SS Great Britain Trust

A blog written by James Boyd, Brunel Institute

Isambard Kingdom Brunel

As many Bristolians know, during the 19th century, Isambard Kingdom Brunel was a major force in the development of the city, and the wider region. He worked on railways, docks, bridges and revolutionary ships that forged connections across Britain and the world, and many of his works are still with us today. One of the most significant is the SS Great Britain, which lies in the heart of Bristol’s old city docks, on the very spot it was built. The first ocean-going vessel in the world to be built of iron, and the first to be moved by a propeller, it is the prototype of the modern ship. Restored as a museum vessel, and visited by hundreds of thousands every year, it is one of the city’s major visitor attractions, and an icon of maritime history.

Brunel Institute

In 2010, the SS Great Britain Trust and the University of Bristol entered into a collaboration to promote and support educational, academic and professional studies in maritime, scientific, industrial and technological history, archaeology and ethnography, through the creation of the Brunel Institute. Situated in the Great Western Dockyard alongside the ship, the Institute is a dedicated facility encompassing archives, reading and teaching space and a state-of-the-art conservation suite. It houses the collections and resources of the SS Great Britain Trust, and the University of Bristol’s Brunel Collection, donated by the family in 1950.

Since 2018, James Boyd, resident research fellow at the Institute, has been using its collections, in combination with other major national collections, to piece together the wider network of engineers, investors and patrons involved in the creation of I K Brunel’s three path-breaking steamships. These ships, the Great Western (1838), Great Britain (1843) and Great Eastern (1859), were each critical to global history. The first of them was the very first ocean-going vessel purpose-built to steam non-stop between Britain and North America; the second, as mentioned, transferred ocean ships to modern materials and propulsion; the last was a costly failure for its investors, a hugely oversized vessel for voyaging to Asia Pacific, but it ended up laying the first continual, working telegraph cable connecting Europe and America.

Brunel’s Network

DM162/10/1 – Isambard Kingdom Brunel Letter Books, by Courtesy of the Brunel Institute – a collaboration of the SS Great Britain Trust & University of Bristol

Brunel’s Network is a project that aims to find, record, assess and weight the influence of all the individuals with whom Brunel collaborated in order to deliver these projects. It is an analytical enquiry into communities of innovation, and how they functioned in the past, with Brunel at the epicentre. The analysis initially utilised some basic static network visualisations, built in Gephi, in order to construct a picture from the source material of who provided significant contributions and connections within each project. However, the visualisations and data needed greater dexterity, and the project also aimed to have a significant public engagement angle, by making interactive, temporal network diagrams available for public exploration.

In a great example of the active collaboration fostered by the Institute, this dynamic element has been generated and continues to be developed by Christopher Woods, Head of Research Software Engineering at Bristol. Christopher has taken the opportunity to create software capable of temporal network analysis that not only comprehends Victorian social, political and professional interactions, but has significant ongoing potential in general temporal analysis of human networks and their development.
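For readers curious about what a temporal network involves, here is a minimal sketch in Python. It is not the project's software: the node names, the years and the helper functions `snapshot` and `degree` are all hypothetical placeholders, chosen only to show how a network can be sliced by year and how each person's number of connections can change over time.

```python
from collections import defaultdict

# Hypothetical records: (person_a, person_b, first_year, last_year).
# These are placeholders for illustration, not data from the project.
CONTACTS = [
    ("hub", "engineer_1", 1836, 1843),
    ("hub", "investor_1", 1838, 1859),
    ("engineer_1", "investor_1", 1840, 1841),
    ("hub", "patron_1", 1852, 1859),
]


def snapshot(contacts, year):
    """Return the connections that were active in the given year."""
    return [(a, b) for a, b, start, end in contacts if start <= year <= end]


def degree(edges):
    """Count how many connections each person has within one snapshot."""
    counts = defaultdict(int)
    for a, b in edges:
        counts[a] += 1
        counts[b] += 1
    return dict(counts)


# Step through a few years to see the network change over time.
for year in (1837, 1840, 1855):
    print(year, degree(snapshot(CONTACTS, year)))
```

Counting connections per year is the simplest possible measure; a fuller analysis would presumably weight connections by their type or strength, which this sketch does not attempt.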

With funding support from the Jean Golding Institute, Christopher has also been able to add a third team member, Gareth Jones, so that a public exhibition of Brunel’s Network will be available in the Brunel Institute from the 19th of July 2020 – the 50th anniversary of the day the historic SS Great Britain was towed back to Bristol for restoration. The project team are aiming to have all data, visualisations and analysis prepared for the launch of an interactive app by the close of 2020, before the findings, methods, outcomes (and lessons learned!) are collated and presented to both the digital humanities and data science communities. Hopefully, the project will demonstrate ways in which historical source material and digital methodologies can work in harmony to help both the academic world and wider public comprehend the past, whilst generating present-day software innovations that expand the analytical tools available in Bristol and beyond.

Further progress of this project will be reported in future blog posts.

To keep up to date with this and other projects, news, events, funding and other opportunities, please join the JGI mailing list.

A Challenge Owner’s perspective of the inaugural Turing Network Data Study Group – Part 2

The challenge team with Challenge Owner Simon de Lusignan and Data Science Principal Investigator Mark Joy

Understanding and improving the reliability of disease monitoring in GP surgeries is the extensive, yet critical task taken on by the team of researchers at the Royal College of General Practitioners (RCGP) and the University of Surrey headed by Professor Simon de Lusignan and Dr Mark Joy. With this goal in mind, the team challenged participants of the first Turing Network Data Study Group to attempt to develop a predictive algorithm using machine learning that corrects sub-optimal data, allowing for better disease monitoring. In part two of our blog series focusing on a Challenge Owner’s perspective of the Turing Network Data Study Group, Professor de Lusignan and his team tell us about their experience of the DSG and the challenge they presented: Improving our ability to use routine data to inform the management of key disease areas. You can read part one of this series, where we spoke to another Challenge Owner, University of Bristol’s Danielle Paul, about her experience of the event, on the JGI blog.

Challenge Owner team – who was involved?

  • Prof Simon de Lusignan, University of Oxford/University of Surrey/Royal College of General Practitioners
  • Dr Mark Joy, University of Surrey
  • Rachel Byford, University of Oxford/University of Surrey
  • Dr John Williams, University of Oxford/University of Surrey
  • Dr Nadia Smith, University of Surrey/National Physical Laboratory

Can you give us a brief overview of the challenge you presented to the Data Study Group participants?

It is essential to monitor blood pressure in various chronic diseases (e.g. heart disease, diabetes, etc). However, GPs tend to show certain biases in recording measurements, for example a preference for round numbers. We have 47 million blood pressure readings and 7 million glycated haemoglobin (HbA1c) readings (a measure of diabetes control) and we were interested in finding the true blood pressure and HbA1c trends from the inaccurate data, comparing trends for different groups of patients (e.g. on various medications). Participants were challenged to attempt to develop a predictive algorithm using machine learning that corrects suboptimal data, allowing for better disease monitoring. The challenge ended up being split into three sub-challenges:

  1. Identifying whether a case is new (incident) or a follow-up (prevalent) when this information is not recorded in the computerised medical record
  2. What is the true underlying blood pressure (BP) in a population where there is marked end-digit preference for zero, when data are recorded?
  3. What is the trend in diabetes control when there is additional testing at the time of ill health?
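To make the idea of end-digit preference in sub-challenge 2 concrete, here is a minimal Python sketch. It only measures the bias and does not correct for it, and it is not the method used at the event. The function names and the sample readings are made up for illustration, and the 10% baseline assumes that, without any recording bias, each final digit would be equally likely.

```python
from collections import Counter


def end_digit_shares(readings):
    """Return the share of readings that end in each digit from 0 to 9."""
    if not readings:
        raise ValueError("at least one reading is required")
    counts = Counter(int(r) % 10 for r in readings)
    return {digit: counts.get(digit, 0) / len(readings) for digit in range(10)}


def zero_excess(readings):
    """Share of readings ending in zero, minus the 10% expected under
    the assumption that all final digits are equally likely."""
    return end_digit_shares(readings)[0] - 0.1


# Made-up systolic blood pressure values, for illustration only.
sample = [120, 130, 140, 128, 120, 135, 150, 142, 130, 117]
print(end_digit_shares(sample)[0])
print(zero_excess(sample))
```

A large positive excess suggests that readings are being rounded to the nearest ten when recorded, which is the kind of distortion the challenge set out to address.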

What kind of solutions did the challenge team come up with?

The solutions suggested to the three sub-challenges were as follows:

  1. Tree classifiers for classification as this is essentially a binary classification problem (is a GP visit a follow-up or a new, incident visit?); decision trees and random forests for classification of episodes into new and ongoing; data-driven approaches to finding the threshold and min-max range of the number of days between two episodes per disease.
  2. Latent variables, time series ideas
  3. Bayesian-type approach with an iterative procedure for uncovering the posterior (incorporating Neural Network classifiers for patient characteristics)
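The threshold idea mentioned under the first solution can be sketched in a few lines of Python. This is a simplified illustration rather than the team's implementation: the function name `label_episodes`, the 28-day threshold and the visit dates are all assumptions, and the first recorded visit is treated as new by convention.

```python
from datetime import date


def label_episodes(visit_dates, threshold_days):
    """Label each visit 'new' or 'follow-up' from the gap to the previous visit.

    Assumption: a gap longer than threshold_days starts a new episode,
    and the first recorded visit always counts as new.
    """
    labels = []
    previous = None
    for visit in sorted(visit_dates):
        if previous is None or (visit - previous).days > threshold_days:
            labels.append("new")
        else:
            labels.append("follow-up")
        previous = visit
    return labels


# Made-up visit dates for one patient and one condition.
visits = [date(2019, 1, 3), date(2019, 1, 17), date(2019, 6, 1)]
print(label_episodes(visits, threshold_days=28))
```

In a data-driven version, the threshold would be estimated separately for each disease from the observed gaps between visits, rather than fixed by hand as it is here.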

What are your hopes for the potential applications of the team’s findings from this week?

The team had to work together to find solutions to the three sub-challenges

We have two members of the group interested in carrying on this work. We hope to explore further the team’s approach to Sub-challenge 1, as we feel this is a promising area. The team’s contribution to Sub-challenge 2 is already planned to be incorporated into the RCGP report to Public Health England. It increases the scope and applicability of this report on the nation’s health in certain key disease areas. Sub-challenge 3 was arguably the most difficult challenge, and the team’s feedback has led us to reconsider how we engineer our data to better address this prediction problem.

As a Challenge Owner, what was your favourite part of the Data Study Group week?

New perspectives, and the opportunity to make more use of our data. We enjoyed engaging with the enthusiasm and energy of the team. Our favourite part was listening to the presentations at the end of the week.

Were there any surprises for you at the event?

How narrow population health and epidemiological techniques are compared with the wealth of ideas and approaches available.

Is there anything else you would like to tell us?

Two members of the group have been in contact about continuing this work: one to work on “episode types”, the other on end-digit preference in blood pressure recording. The event was immensely enjoyable, truly challenging for the team members, and a joy to participate in.

The Alan Turing Institute and Data Study Groups

The inaugural Turing Network Data Study Group was hosted by the Jean Golding Institute at the University of Bristol – one of The Alan Turing Institute’s 13 partner universities – in August 2019. The event united six Challenge Owners with 50 students, postdocs and senior academics to tackle real-world data science challenges spanning a variety of fields, from spectroscopy and analytical chemistry to text mining and digital humanities. Building on the popular Data Study Groups (DSGs), held three times a year at Turing HQ in London, this ‘Turing Network’ event was the first of its kind to be hosted by a partner university. It followed the tried-and-tested format of a five-day collaborative hackathon. The Challenge Owners – organisations from industry, government and the third sector – provided real-world data challenges that were tackled by small groups of highly talented researchers. The results were presented on the final day. Find out more about Data Study Groups, including how you can get involved as a researcher or Challenge Owner, on The Alan Turing Institute website.

JGI Seed corn funding call winners 2020 announced!

The Jean Golding Institute are delighted to announce the winners of the Seed corn funding call 2020.

This funding call has been running successfully for the last three years and aims to support activities that foster interdisciplinary research in data science (including AI) and data-intensive research.

The Jean Golding Institute has funded a total of 32 seed corn projects since 2016. This year, we have been able to fund 10 projects and are grateful to have received funds from the Faculty of Arts and Strategic Funding in order to offer additional awards. Our winners this year are:

  • Oliver Davis, Claire Haworth and Nina Di Cara with ‘Mood music: using Spotify to infer wellbeing’
  • Brendan Smith and Mike Jones with ‘Digital humanities meets Medieval financial records: the receipt rolls of the Irish exchequer’
  • Zoi Toumpakari, Ivan Palomares Carrascosa, Daniele Quercia and Luca Maria Aiello with ‘Automating food aggregation for nutrition and health research’
  • Avon Huxor, Emma Turner, Eleanor Walsh and Raul Santos-Rodriguez with ‘Elements of free text used in decision making: an exemplar from death reviews in prostate cancer and learning disabilities’
  • Jim Dunham, Gethin Williams, Nathan Lepora, Tony Pickering and Manuel Martinez Perez with ‘Decoding pain: development of a clinical tool to enable real-time data visualisation and analysis of human pain nerve activity’
  • Ranjeet Bhamber, Andrew Dowsey, Febe Van Maldegem and Julian Downward with ‘Super-charging single cell imaging pathology’
  • Elaine McGirr and Julian Warren with ‘Mapping Oliver Messel’
  • Liz Washbrook with ‘Mental health and educational achievement in two national contexts: a machine learning approach’
  • Ella Gale, Natalie Fey, Craig Butts, Varinder Aggarwal with ‘Chemspeed data capture and curation’
  • Pierangelo Gobbo and Lars Bratholm with ‘Machine learning assisted polymer design’.

We will be interested to hear how all these projects progress this year and will report back on them in the summer of 2020. Our next Seed corn funding call will be in the autumn of 2020.

To ensure you keep up to date with any other funding calls, news, events and other opportunities, please join the JGI mailing list.

A Challenge Owner’s perspective of the inaugural Turing Network Data Study Group

The DSG challenge team with Danielle Paul (far right)

Using AI and machine learning to increase understanding of cardiac muscle proteins – the molecular basis of heart disease – is a potentially daunting challenge. But it was one that Turing Fellow Danielle Paul, from the School of Physiology, Pharmacology & Neuroscience at the University of Bristol, was keen to explore. Danielle took part in the first Turing Network Data Study Group, hosted by the Jean Golding Institute at the University of Bristol – one of The Alan Turing Institute’s 13 partner universities. The event united six Challenge Owners with 50 students, postdocs and senior academics to tackle real-world data science challenges spanning a variety of fields, from spectroscopy and analytical chemistry to text mining and digital humanities.

Building on the popular Data Study Groups (DSGs), held three times a year at Turing HQ in London, this ‘Turing Network’ event was the first of its kind to be hosted by a partner university. It followed the tried-and-tested format of a five-day collaborative hackathon. The Challenge Owners – organisations from industry, government and the third sector – provided real-world data challenges that were tackled by small groups of highly talented researchers. The results were presented on the final day.

Here, Danielle tells us about her experience of the DSG, and the challenge she presented: Applying AI and machine learning to reveal the molecular basis of heart disease.

What was your challenge about? 

An example of the image data the challenge team were working on

This was an image-processing challenge with potential outcomes that could improve our fundamental understanding of cardiac muscle proteins. The proteins in our images are susceptible to mutations that cause hypertrophic cardiomyopathy, which is a known cause of adult sudden death.

To obtain high-resolution molecular models of these proteins we need to collect hundreds of thousands of images of our protein from noisy data obtained via cryogenic electron microscopy (cryo-EM). It took us six months to manually annotate the small dataset we used in the DSG. It’s a laborious process and it highlights the pressing need for a robust, machine-based approach. If we can automate the protein-identification step, it would overcome a significant bottleneck in our image-processing workflow.

What solutions did the challenge team generate?

The team implemented several deep learning algorithms, known as Convolutional Neural Networks, which were then trained to recognise our proteins in the images. The automated methods that they presented to us performed very well, providing as much as 90% accuracy upon testing.

What are your hopes for potential applications of the findings from the week?

The protein hunters: the challenge team hard at work finding a way to automate a laborious image-processing task

I hope that the various methods that were implemented and trained using our challenge dataset can now be put through their paces on our larger datasets. It will be interesting to see how they perform with slightly different data and imaging conditions. This methodology will be built upon as part of a Turing-funded research project to make a more general software tool to identify proteins in cryo-EM images.

As a Challenge Owner, what was your favourite part of the Turing Network DSG?

My favourite part was having discussions with the participants. In particular, hearing the ideas and thoughts they had in response to the problems described in our image processing workflow. I appreciate how much effort they put in and their enthusiasm towards addressing the challenge.

Were there any surprises during the event?

That the participants were keen to do more at the end!

Find out more about Data Study Groups, including how you can get involved as a researcher or Challenge Owner

The Alan Turing Institute 

The Alan Turing Institute’s goals are to undertake world-class research in data science and artificial intelligence, apply its research to real-world problems, drive economic impact and societal good, lead the training of a new generation of scientists and shape the public conversation around data. 

Find out more about The Alan Turing Institute.