Dr Dan Lawson appointed as interim Director of the Jean Golding Institute

Dr Dan Lawson has been appointed as the interim Director of the Jean Golding Institute for an initial period of six months. As well as the interim Director role, he will assume the role of Academic Liaison for The Alan Turing Institute, the UK’s national institute for Data Science and AI, on behalf of the University of Bristol.

“Dan Lawson is a visionary academic leader whose prominent work in data science has transgressed disciplinary boundaries. I am delighted that he is taking up the position of Director of the Jean Golding Institute, and greatly look forward to working with him,” said Pro Vice Chancellor for Research and Enterprise, Professor Phil Taylor.

Dr Lawson is Associate Professor in Data Science in the School of Mathematics, University of Bristol. He has been a longstanding friend of the Jean Golding Institute: he has been Academic Lead for the JGI Data Science Seminar Series since 2018 and a JGI Steering Group Member since 2021, bringing his experience and knowledge of data science to advise the JGI on where to focus its activities and contributing to our five-year plan.

Dr Lawson is a member of the Royal Statistical Society, a Fellow of the Higher Education Academy, and a Turing Fellow in Data Science with The Alan Turing Institute. He also co-directs Compass, the EPSRC centre for Computational Statistics and Data Science.

“It is a great honour to be guiding the Data Science community at Bristol as interim Director of the Jean Golding Institute. This is a great time to celebrate and build upon the monumental impact that the JGI has already had on Data Intensive Research, both within the University and beyond,” said Associate Professor Lawson on his appointment.

“I am looking forward to speaking to people across the University about their ideas for finding more ways to interact with, learn from, and understand the world with data. With so much exciting research being done, this is a great time to be a data scientist, and change is a good opportunity to start new discussions.”

“The JGI is open for business as ever. From the Ask JGI service for getting advice, to providing data expertise on grants, we are here to serve – and to inspire – our community,” he added.

Associate Professor Lawson began his career at Imperial College London, receiving his PhD in Mathematics and Computer Science in 2007.

In 2014, he joined the University of Bristol as a Sir Henry Dale Wellcome Trust Research Fellow before progressing to Lecturer, Senior Lecturer and then Associate Professor in Data Science.

Associate Professor Lawson’s dedication to diversity and outreach initiatives is commendable. Through pioneering initiatives like “Access to Data Science,” he increased the gender, ethnic, and social diversity in academia, thereby contributing to a more inclusive research environment. His engagement in data science outreach efforts, spanning from Bayesian interpretation of Ghost Stories, to COVID modelling, and “What to know before studying Data Science” showcases his commitment to making complex concepts accessible to broader audiences. Additionally, his research, which ranges from landscape management policy documents to industry applications like the finestructure software, underscores his impact across diverse domains, cementing his status as a trailblazer in the field of data science and beyond.

His membership of advisory boards, such as that of the Transdisciplinary Centre of Excellence Estonian Roots (CoEER), showcases his commitment to fostering interdisciplinary collaborations and advancing scientific endeavours.

The Jean Golding Institute is the central hub for data science and data-intensive research at the University of Bristol. We connect a multidisciplinary community of experts across the University and beyond. We offer one day of free support through our Ask-JGI “ask a data scientist” service to all staff and doctoral students at the University of Bristol, as well as a calendar of events and training throughout the year, such as the annual Bristol Data Week, held in early June and packed with interactive talks, training, and workshops, open to all and completely free of charge. Save the date for this year’s Bristol Data Week, which will be held 3rd – 7th June 2024.

Associate Professor Lawson will commence his role as Director of the Jean Golding Institute on 19th February 2024.

Hear the JGI’s first monthly podcast: Data Hazards and Digital Phenotyping 

The JGI is delighted to launch the JGI Podcast, where the team at the Jean Golding Institute talk to different members of the University of Bristol’s data science research community. Each episode aims to highlight both the variety of backgrounds and paths that our guests come from as well as the diversity in methods, approaches and applications in data science research at the university.

Nina Di Cara shown on the left, Huw Day shown in the middle and Léo Gorman on the right.

This month, Huw Day and Léo Gorman (Data Scientists at the JGI) talk to Nina Di Cara about Data Hazards and Digital Phenotyping.

Visit our podcast website or find it in your usual podcast catalogue.

Assessing the pathogenic potential of novel bacterial lineages: Towards an early warning system for problem pathogens

Read our latest JGI Seed Corn Funding Project Blog by Sion Bayliss and Daniel Lawson

The future of disease control relies heavily on understanding the evolution and emergence of bacterial strains in real time, a daunting yet crucial task. This project marked a significant step towards this goal, aiming to rapidly unmask potentially problematic ‘hybrids’ caused by the crossing of already established disease lineages and demystify the evolution and adaptation of harmful bacteria. We aim to provide the first steps in an early warning system against potential public health threats posed by new bacterial strains.

Understanding Bacterial Evolution: The Key to Future Disease Outbreaks

Genomic sequencing has become a key part of research programmes worldwide, notably those studying pathogens that affect human and animal health and related strains that exist in the wild. This has led to the deposition of vast numbers of disease genomes in public repositories, a rich resource for the study of the evolutionary processes underpinning the emergence of new bacterial lineages. During this project we developed a tool which could be used to search within these large collections and identify potentially problematic strains of disease-causing bacteria.

The focus of the work was on determining whether emerging lineages had predominantly evolved through hybridisation between existing lineages or were a product of transfer from novel sources, such as animals or the environment. This differentiation was achieved by scrutinising the bacterium’s genome sequences, enabling us to identify hybridised DNA sections, a critical step towards comprehending bacterial evolution.

Outcomes: A Rapid Detection System for Dangerous Hybrid Lineages

To classify lineages, we clustered samples based upon their ‘family tree’. By examining these ‘phylogenetic trees’ we are able to identify candidate hybrid lineages present on distinctive long branches (Figure 1). These ‘long-branch lineages’ could have evolved in various ways: by having an increased evolutionary rate, by being imported from a previously unknown source, or by having recently emerged via hybridisation between two or more parent lineages.
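As a rough illustration of the long-branch idea only, the sketch below flags isolates whose terminal branch is much longer than is typical for the tree. It is not the project’s pipeline: the function name `flag_long_branches`, the cutoff `k`, the use of the median absolute deviation, and the branch lengths are all assumptions made for this example.

```python
from statistics import median


def flag_long_branches(tip_lengths, k=5.0):
    """Return the names of isolates whose terminal branch is unusually long.

    tip_lengths: dict mapping isolate name -> terminal branch length.
    k: hypothetical cutoff, in units of the median absolute deviation (MAD).
    """
    lengths = list(tip_lengths.values())
    med = median(lengths)
    mad = median(abs(x - med) for x in lengths)
    # If at least half the branches equal the median, MAD is 0 and the
    # cutoff falls back to the median itself.
    cutoff = med + k * mad
    return sorted(name for name, x in tip_lengths.items() if x > cutoff)


# Made-up branch lengths (substitutions per site), for illustration only.
example = {
    "iso_A": 0.010,
    "iso_B": 0.012,
    "iso_C": 0.011,
    "iso_D": 0.009,
    "iso_E": 0.095,
}
flagged = flag_long_branches(example)
print(flagged)
```

A flagged isolate is only a candidate: as noted above, a long branch may reflect a faster evolutionary rate or an unsampled source rather than hybridisation, so further analysis is needed to tell these scenarios apart.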

Our groundbreaking achievement lies in the development of a software pipeline capable of rapidly differentiating between these various scenarios. Tested on both simulated and real-world genomic datasets, our software can identify otherwise cryptic ‘hybrid’ lineages (Figure 2). This development lays the foundation for a software tool to ‘flag’ potentially alarming strains, which could pose a significant public health threat, as they are routinely added to large genomic databases.

Future Developments

Our ultimate goal is to equip researchers and healthcare professionals with a tool that can provide early warnings for new and potentially dangerous bacterial strains. To this end, our project’s next steps include testing the tool on large and diverse databases of pathogens of public health concern, updating and streamlining the codebase for broader and easier use by other researchers, and developing a web portal that would allow users to upload their samples for testing against comprehensive example datasets.

Staying Connected

For more details about our project and future updates, please contact either Sion Bayliss or Daniel Lawson, and feel free to engage with us for any queries or discussions.

Figure 1. Example of a phylogenetic tree or ‘family tree’ of bacterial genomes with an example of a long-branch isolate (red circle).
Figure 2. Example output of the software tool on small dataset with a simulated hybrid strain. The simulated hybrid strain is indicated in red. The correctly identified recipient strain is shown in blue and the donor strain is shown in green. The donor strain contributed approximately 20% of their genome to the recipient strain in randomly sized DNA tracts.

Developing an Integrated and Intelligent Algo-trading System

JGI Seed Corn Funding Project Blog 2022-2023: Jin Zheng

Introduction:

The financial trading landscape is constantly evolving, driven by advancements in technology and the need for faster, more efficient decision-making. Traditional algo-trading strategies have become central features of modern financial markets due to their speed, accuracy, and cost-effectiveness. However, these strategies often rely solely on the analysis of present and past quantitative data, neglecting the importance of incorporating qualitative data in the decision-making process. To address this limitation, we aim to develop an integrated and intelligent algo-trading system that combines advanced technology, data integration, and intelligent decision-making.

Data Integration:

In financial trading, individuals collect diverse information from various sources, including market-based data, past performance, financial reports, public opinion, and news. Our integrated system seeks to leverage the power of data integration by combining market-based quantitative data with qualitative data sources. By integrating these different data types, we can gain a more comprehensive understanding of the financial landscape and make informed trading decisions.

Advanced Technology:

The integrated system will harness advanced technologies such as artificial intelligence (AI), cloud computing, and machine learning algorithms. These technologies enable the analysis and processing of vast amounts of data in real-time. AI algorithms can identify patterns, trends, and correlations that may not be immediately apparent to human traders. Cloud computing provides the scalability and computing power necessary to handle large volumes of data and perform complex calculations. By leveraging these advanced technologies, our system can enhance decision-making and improve trading performance.

Intelligent Decision-Making:

The core objective of our system is to enable intelligent decision-making in the trading process. While traditional algo-trading strategies focus on quantitative analysis, our integrated approach incorporates qualitative data, allowing traders to better assess potential risks and identify market trends. By factoring in qualitative data, traders can make more informed decisions and adjust their strategies accordingly. Intelligent decision-making is achieved through the application of AI and machine learning algorithms, which can analyze vast amounts of data and provide valuable insights to traders.

With support from the Jean Golding Institute, we successfully ran a hybrid workshop on machine learning and data science in finance. The workshop aimed to bring together experts and enthusiasts in the field to exchange knowledge, share insights, and explore the intersection of machine learning and finance. We were fortunate to have a line-up of esteemed speakers, including four external experts and one internal expert, who are renowned in their respective areas of expertise. Their diverse backgrounds and experiences enriched the workshop and provided valuable perspectives on the application of machine learning in finance.

We have finished developing a robust data pipeline and have created a unified API that efficiently retrieves data from various sources. We have implemented effective data cleaning techniques and measures to filter out spam. Additionally, we have used Graph Neural Networks (GNN) to determine the influence rate of each account and to calculate the daily sentiment rate for several stocks with significant market capitalization. Furthermore, we have incorporated predictive models into our system.
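To make the idea of a daily sentiment rate concrete, here is a minimal sketch of one way per-account influence could be combined with per-post sentiment. It assumes the influence weights have already been computed elsewhere (for example by a GNN) and simply takes an influence-weighted average per day and stock. The function name, data layout, tickers, and numbers are hypothetical and are not taken from the project’s system.

```python
from collections import defaultdict


def daily_sentiment(posts, influence):
    """Influence-weighted mean sentiment for each (day, ticker) pair.

    posts: iterable of (day, ticker, account, score), score in [-1, 1].
    influence: dict mapping account -> non-negative weight
               (assumed to be precomputed, e.g. by a separate model).
    """
    weighted_sum = defaultdict(float)
    weight_total = defaultdict(float)
    for day, ticker, account, score in posts:
        w = influence.get(account, 0.0)  # unknown accounts carry no weight
        weighted_sum[(day, ticker)] += w * score
        weight_total[(day, ticker)] += w
    # Skip days on which no weighted account posted.
    return {
        key: weighted_sum[key] / weight_total[key]
        for key in weighted_sum
        if weight_total[key] > 0
    }


# Made-up posts and weights, for illustration only.
posts = [
    ("2023-05-01", "AAA", "acct1", 1.0),
    ("2023-05-01", "AAA", "acct2", -1.0),
    ("2023-05-02", "AAA", "acct1", 0.5),
]
influence = {"acct1": 3.0, "acct2": 1.0}
rates = daily_sentiment(posts, influence)
print(rates)
```

Weighting by influence means that a single widely followed account can outweigh several minor ones, which is one reason filtering out spam before this step matters.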

Moving forward, our next objective is to create a cloud-based web service that empowers users to build their own trading robots, develop unique trading strategies, and design customized trading algorithms. To enhance the user experience, we will incorporate advanced data visualization techniques, allowing traders to effortlessly interpret and analyze the vast array of information available. Moreover, we aim to enhance the system’s capabilities by integrating machine learning algorithms for improved decision-making and risk management. Our ultimate goal is to create a user-friendly and versatile platform that caters to the needs of researchers, individual traders, and students alike. Through this platform, users will be able to gain practical experience, enhance their financial knowledge, and utilize cutting-edge technologies in the field of algorithmic trading.

Seeking ground truth in data: prevalence of pain and its impacts on well-being and workplace productivity

JGI Seed Corn Funding Project Blog 2022-2023: Neo Poon

Chronic pain is a major health issue across the globe. Researchers have estimated that at least 10% of the population in the United Kingdom suffer from pain conditions. Worldwide, some estimate that over 20% of the population have chronic pain, resulting in more than 20 million ‘pain days’ per year. Naturally, it is important to examine how pain conditions affect people’s well-being and their productivity in the workplace.

Our research team (Digital Footprints Lab at the Bristol Medical School, led by Dr Anya Skatova) specialises in using Big Data to investigate human behaviours and social issues. In our previous work, we have already established a link between the purchase of pain medicines and the proportion of people working part-time across geographical regions of the United Kingdom, which suggests an economic cost of chronic pain and an impact on national productivity.

With the funds provided by the Jean Golding Institute (JGI), we decided to directly investigate the ‘ground truth’. That is, instead of examining pain at geographical levels, we designed a survey to ask individuals about their pain conditions, well-being, physical health states, and employment status. Importantly, and relevant to JGI’s focus on data science, the survey also asks individuals to share their shopping history data with us. With the General Data Protection Regulation (GDPR) in place, residents in the United Kingdom have the right to data portability, which means people can choose to share their data held by companies with external organisations, such as a university or a research team. In our design, participants are asked to donate to us the loyalty card data from their shopping at a major supermarket. This study allows us to ask important questions, such as how the frequency and types of pain relief purchases are related to different types of pain conditions reported by participants. We further ask questions including how pain conditions affect people’s life satisfaction and their ability to work, which might collectively have an impact on their shopping patterns beyond just the purchases of pain relief products.

The JGI funds facilitate the data collection process, which is being finalised at the time of writing. Moving forward, this study will allow us to define chronic pain with shopping patterns alone, which can drive future research: by connecting the frequency and types of pain medicine purchases with self-reported pain conditions from this study, we can find a way to define a metric and more accurately compute the prevalence of chronic pain from transaction data itself. Our research team has ongoing partnerships with other supermarket and pharmacy chains, which provide us with access to commercial data for research purposes. When we conduct similar research using these external data and it is not possible to directly involve participants with surveys, we can then employ our metric to estimate the proportion of people suffering from chronic pain. Furthermore, our study also includes questions about menstrual pain, an important but seldom studied aspect of pain experience, which opens up a further avenue for research: potentially, we can examine how menstrual pain impacts quality of life and workplace productivity. Finally, our study also controls for Covid-19 history, which might have a long-term effect on pain conditions and subjective well-being, paving the way for research studying the longitudinal effect of Covid-19.
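The metric itself has not yet been defined, so the following is only a toy sketch of the general idea: calibrate a rule on survey participants, whose pain status is self-reported, then apply it to transaction data where no survey is available. It assumes, purely for illustration, that the metric is a single cutoff on the number of pain relief purchases. All names and numbers are hypothetical.

```python
def best_threshold(purchases, has_pain):
    """Pick the purchase-count cutoff that best matches self-reported pain.

    purchases: pain relief purchases per survey participant.
    has_pain: matching list of booleans taken from the survey answers.
    """
    best_t, best_correct = 0, -1
    for t in sorted(set(purchases)):
        # Count participants for whom "purchases >= t" agrees with the survey.
        correct = sum((n >= t) == y for n, y in zip(purchases, has_pain))
        if correct > best_correct:
            best_t, best_correct = t, correct
    return best_t


def estimated_prevalence(purchases, threshold):
    """Share of customers at or above the calibrated cutoff."""
    return sum(n >= threshold for n in purchases) / len(purchases)


# Made-up survey data: purchase counts and self-reported chronic pain.
survey_purchases = [0, 1, 2, 6, 8, 10]
survey_pain = [False, False, False, True, True, True]
t = best_threshold(survey_purchases, survey_pain)

# Made-up external transaction data with no survey attached.
external = [0, 0, 3, 7, 9, 1, 12, 2]
prev = estimated_prevalence(external, t)
print(t, prev)
```

A real metric would need to account for the types of medicine purchased, household shopping, and differences between the surveyed and external populations, none of which this sketch attempts.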