Meet the Ask-JGI team – Adrianna, Fahd, Yujie & Huw

The new Ask-JGI helpdesk cohort started in September 2024 and have been busy answering queries from researchers across the university! We introduced half of the team in our January blog. Meet the other half of the team below:

Adrianna Jezierska (she/her) – Ask-JGI PhD Student

Headshot of Adrianna Jezierska
Adrianna Jezierska, PhD candidate in the School of Business

I’m a PhD student at the University of Bristol Business School. My project focuses on social media influencers and their vegan content on YouTube. Using language derived from video transcripts, I analyse to what extent they legitimise veganism so that it becomes popular and desirable in society. Whilst most organisation and management scholars have developed theories based on qualitative data, resulting in small datasets and case study approaches, in my work, I highlight the role of computational social sciences and big data in helping social scientists answer their research questions.

Coming from a social science background, I was initially hesitant about joining the Ask-JGI team. However, this decision has turned out to be the most rewarding and challenging experience. Being part of the team is a continuous learning journey. The questions we receive span various disciplines, often pushing us out of our comfort zones. The most exciting part of the job is the opportunity to communicate with other researchers and receive their positive feedback. We also constantly collaborate with other team members and learn from each other, which makes it a very supportive environment. I’m pleased to see more queries from social scientists and humanities researchers. The growing popularity of computational approaches and the shift towards interdisciplinary research is a trend that I find inspiring and exciting.

Fahd Abdelazim (he/him) – Ask-JGI PhD Student

Headshot of Fahd Abdelazim
Fahd Abdelazim, PhD student on the Interactive AI CDT in the School of Computer Science

I am a PhD student in the Interactive Artificial Intelligence CDT, specializing in model understanding for Vision-Language models. My research focuses on introducing improvements to Vision-Language models that allow for better linking of specific ideas or attributes to physical items, in order to help models recognize and understand the properties of objects in images.

I first heard of the Ask-JGI team through fellow PhD students, and it was recommended to me as a way to apply data science skills to real-world applications. Joining the Ask-JGI helpdesk has been a unique experience where I’ve been able to delve into various domains and learn about topics that I would otherwise not have had the chance to learn about. The team truly values cross-functional collaboration and encourages tackling new challenges and learning on the job.

Working at Ask-JGI is incredibly rewarding. I enjoy the diversity of challenges presented by each query, which gives me the chance to improve as a data scientist and gain a better understanding of how data science can help improve academic research. I really enjoy the collaborative spirit within the team. The Ask-JGI team are from many different disciplines and interacting with them allows for interesting exchanges of ideas and problem-solving approaches. This allows me to grow not just as a data scientist but as a researcher as well.

Yujie Dai (she/her) – Ask-JGI PhD Student

Headshot of Yujie Dai
Yujie Dai, PhD student in the Digital Health and Care CDT

I am a PhD student in the Digital Health and Care CDT, specializing in population health data science. My research focuses on leveraging large-scale real-world health data to address critical challenges in infectious diseases. Specifically, I utilize explainable AI (XAI) techniques to characterize and diagnose diseases, aiming to bridge the gap between data science and public health.

My journey with Ask-JGI began with a recommendation from a friend who was previously part of the team. They spoke highly of the collaborative and dynamic environment, and I was intrigued by the opportunity to apply my skills in real-world research settings. Joining Ask-JGI is an extension of my academic and research pursuits. I was drawn to the idea of supporting researchers across diverse disciplines, helping them navigate technical challenges in their projects, and learning from their different perspectives. The chance to engage with cutting-edge problems and contribute to solutions beyond the scope of my own research was exciting.

There’s so much to love about being part of Ask-JGI. I love the variety of work. Each question I encounter presents a new challenge, whether it’s developing a data analysis pipeline, troubleshooting code, or brainstorming creative solutions for a computational problem. The variety keeps me constantly learning and growing as a data scientist. I also love the collaborative atmosphere. Working closely with researchers from different fields exposes me to diverse ways of thinking and problem-solving. It’s an opportunity not only to apply my skills but also to learn more about the scientific community.

Huw Day (he/him) – Ask-JGI Lead

Headshot of Huw Day
Huw Day, JGI Data Scientist

I am a JGI Data Scientist with a background in mathematics, working on a variety of data science projects with researchers across the university using a variety of data science methodologies and techniques. I also help run the Data Ethics Club.

As Ask-JGI Lead, I am responsible for recruiting, training and the general managing of the Ask-JGI team. They’re a fantastic group and I consider myself really lucky to be able to work with them. I support some of the general queries and I’m also responsible for talking with researchers interested in costing out data science support in grant applications.

To me, the Ask-JGI helpdesk is based on the idea that any researcher who wants to do data science should be empowered to do so. Whilst we often do the data science for people, I think the most rewarding outputs from our helpdesk are when we empower researchers to do data science themselves, guiding and validating their work. It’s also a wonderful opportunity for me and the rest of the helpdesk to learn about research across the university.


All University of Bristol researchers (including PhDs) are entitled to a day of free data science support from the Ask-JGI helpdesk. Just email ask-jgi@bristol.ac.uk with your query and one of our team will get back to you to see how we can support you.

If you’re a PhD student interested in joining the Ask-JGI team, we will do recruiting for the next academic year in summer of 2025 so keep an eye on the JGI mailing list for when we have our recruiting call. We recruit a new cohort every year but do not accept speculative applications outside of the recruiting call.

Meet the Research Data Advocate team

We are delighted to announce a new pilot training scheme led by our newly-appointed JGI Research Data Science Advocates. This is a new way to take part in training in a low-stress, collaborative and supportive environment, and at the same time form a community of data scientists in your area. 

The pilot will run JGI training events over a whole week in Schools, supported by a local Data Science Advocate. They will run sessions to support a cohort to undertake the training together, over the course of a week. The formal training takes only around 2-3 hours to complete, but it is anticipated that this format will allow deeper learning and more useful application to research.  

To take part in the pilot (which is aimed at relatively inexperienced coders within a discipline), please email jgi-training@bristol.ac.uk. If your school doesn’t have a volunteer, you would be welcomed at a research-adjacent community. Bios for our Advocates are below and even if you don’t need this particular training, they would love to include you in an ongoing data science community, so please get in touch.

Ruolin Wu

Headshot of Ruolin Wu

I am a PhD student of paleobiology diving into the mysteries of evolutionary history. Armed with code, fossils, and molecular data, I craft stories about topological and temporal patterns of animals and plants. Outside of academia, I like climbing, handcrafts, succulents and ferns of any kind.

Zhiyuan Xu

Headshot of Zhiyuan Xu

I am a 1st year PhD student focusing on data science and artificial intelligence, with a particular focus on large language models and their applications. My background includes experience in machine learning, data-driven research, and interdisciplinary collaboration to address complex problems.

Bryony Clifton

Headshot of Bryony Clifton

I’m a PhD student in Biochemistry, studying the molecular details underpinning neurotransmission. My project focuses on identifying the biological role for an uncharacterised intramembrane protease found in the human brain. During my PhD, I have become aware of the importance of developing tools to present complex datasets in a clear and informative way. I am excited to begin my role with the JGI where I can support others to build these skills too.

Catherine Upex

Headshot of Catherine Upex

I’m Catherine and I’m a first year PhD student based in the medical school. I’m using data science and AI to understand the shape and movement patterns of the heart over different disease states. I’m also currently working on a mini-project using AI protein folding tools, like AlphaFold, and computer simulations to uncover interactions between synthetic cannabinoids and the hERG potassium channel and their relation to arrhythmia risk.

Kaan Deniz

Headshot of Kaan Deniz

I am an aerospace engineer with intensive industrial experience in numerical modelling and an MSc in Aerospace Engineering from the University of Bristol. I am currently a PhD student in Aerospace Engineering at the University of Bristol. My research focus is numerical modelling of composite manufacturing processes.

Boy Li

Headshot of Boy Li

I study how to synergize domain-specific knowledge with data-driven deep learning models to extract information from remote sensing imagery.

Vaishnudebi Dutta

Headshot of Vaishnudebi Dutta

I am an Engineering Mathematics PhD student working on model and data-driven design of combination therapies for non-small cell lung cancer. Beyond my research, I serve as the School of Engineering Mathematics and Technology (SEMT) PhD Student Representative, advocating for and supporting the academic community. I also hold a key position as the PhD Representative for the Bristol Cancer Research Network, where I get the opportunity to share research updates with clinicians and others in the network. Additionally, I manage the network’s official X (formerly Twitter) presence, helping to disseminate research developments and maintain engagement with the broader scientific community.

Zhengzhe Peng

Headshot of Zhengzhe Peng

I am a PhD student with a diverse background in computer science, business, and over a year of IT work experience. My research applies advanced data science methods, with a focus on AI, to explore real-world challenges. I am dedicated to expanding my knowledge in these fields and eager to help others who are new to data science, working together to advance and explore new possibilities in this ever-evolving domain.

Winfred Gatua

Headshot of Winfred Gatua

Winfred Gatua is a PhD Fellow at the University of Bristol, specializing in Molecular Genetics and Life Course Epidemiology. Her research focuses on the triangulation of evidence between Mendelian randomization and randomized controlled trials for complex diseases. She holds an MSc in Bioinformatics, a Postgraduate Diploma in Health Research Methods, and a BSc in Biomedical Science and Technology. Transitioning from wet lab biomedical sciences to dry lab bioinformatics, Winfred is a self-taught coder passionate about open science, automation, and reproducible research in genetics. Beyond research, Winfred is dedicated to capacity building, particularly in increasing computational and data literacy among non-computer science researchers. Since 2021, she has been a volunteer instructor with The Carpentries, securing funding, hosting and instructing carpentries lessons that equip researchers with essential skills in data analysis, open science, reproducible research and best practices in scientific computing in different institutions across the globe.

The Royal Statistical Society Annual Conference 2024

The Royal Statistical Society meets annually for their internationally attended conference. It serves as the UK’s annual showcase for statistics and data science. This year they met in Brighton for a conference with over 600 attendees from around the world, including JGI Data Scientist Dr Huw Day.

The conference had over 250 presentations, including contributed talks, rapid-fire talks, and poster presentations. At any one time, there could be as many as 6 different talks going on, so it was impossible to go to everything but below are some of Huw’s highlights of the conference.

Pre-empting misunderstandings is part of trustworthy communication

From left to right; Dr Huw Day, Professor Sir David Spiegelhalter and Dr Simon Day (RSS Fellow and Huw’s dad) at the RSS International Conference 2024.

As part of a session on communicating data to the public, Professor Sir David Spiegelhalter talked about his experiences trying to pre-bunk misinformation when displaying data.

Data in June 2021 showed that the majority of Covid deaths were in the vaccinated group. The Brazilian president Jair Bolsonaro used this data to support claims that Covid vaccines were killing people. Spiegelhalter and his colleague Anthony Masters explained why this wasn’t a sign the vaccine was bad in an article in The Observer, “Why most people who now die with Covid in England have had a vaccination”.

Consider the following analogy: most car passengers who die in car accidents are wearing seatbelts. Intuitively, we understand that just because these two variables are associated, it doesn’t mean that one causes the other. Having a story like that means you don’t have to talk about base rates, stratification or even start to use numbers in your explanations.
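For readers who do want to see the base rate effect in numbers, here is a minimal Python sketch. Every figure in it (population size, vaccination coverage, baseline risk and vaccine effectiveness) is a made-up illustration value, not data from the talk or from the Observer article.

```python
def deaths_by_group(population, coverage, risk_unvaccinated, effectiveness):
    """Return expected deaths among (vaccinated, unvaccinated) people.

    All inputs are hypothetical illustration values, not real Covid data.
    """
    vaccinated = population * coverage
    unvaccinated = population - vaccinated
    # Vaccinated people face the same baseline risk, reduced by the vaccine.
    deaths_vaccinated = vaccinated * risk_unvaccinated * (1 - effectiveness)
    deaths_unvaccinated = unvaccinated * risk_unvaccinated
    return deaths_vaccinated, deaths_unvaccinated


# Made-up scenario: 95% coverage and a vaccine that cuts the risk of death by 90%.
population, coverage = 1_000_000, 0.95
deaths_vax, deaths_unvax = deaths_by_group(population, coverage, 0.001, 0.90)

share_vaccinated = deaths_vax / (deaths_vax + deaths_unvax)
rate_vax = deaths_vax / (population * coverage)
rate_unvax = deaths_unvax / (population * (1 - coverage))

print(f"Share of deaths in the vaccinated group: {share_vaccinated:.0%}")
print(f"Death rate, vaccinated:   {rate_vax:.5f}")
print(f"Death rate, unvaccinated: {rate_unvax:.5f}")
```

In this invented scenario, roughly two thirds of deaths fall in the vaccinated group, even though a vaccinated person's risk is ten times lower. The vaccinated group is simply much larger.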

We should try to make the caveats of data clearer before we present them. We should be upfront about what you can and can’t conclude from the data.

Spiegelhalter pointed to an academic paper: “Transparent communication of evidence does not undermine public trust in evidence” where participants were shown either persuasive or balanced messages about the benefits of Covid vaccines and nuclear power. It’s perhaps not surprising to read that those who already had positive opinions about either topic continued to have positive views after reading either message. Far more interesting is that the paper concluded that “balanced messages were consistently perceived as more trustworthy among those with negative or neutral prior beliefs about the message content.”

Whilst we should pre-empt misconceptions and caveats, being balanced and more measured might prove to be an antidote to those who are overly sceptical. Standard overly positive messaging is actively reducing trust in groups with more sceptical views.

Digital Twins of the Human Heart fueled Synthetic 3D Image Generation

A digital twin is a digital replica or simulator of something from the real world. Typically it includes some sort of virtual model which is informed by real world data.

Dr Dirk Husmeier at the University of Glasgow has been exploring the application of digital twins of the human heart and other organs to investigate the behaviour of the heart during heart attacks, as well as trying to use ultrasound to measure blood flow to estimate pulmonary blood pressure (blood pressure in the lungs). Usually, measuring pulmonary blood pressure is an extremely invasive procedure, so using ultrasound methods has clear utility.

One of the issues of building a digital twin is having data about what you’re looking at. In this case, the data takes the form of MRI scans of the human heart, taken at several “slices”. Because of limitations in existing data, Dr Vinny Davies and Dr Andrew Elliot (both colleagues of Husmeier at the University of Glasgow) have been attempting to develop methods of making synthetic 3D models of the human heart, based on their existing data. They broke the problem down into several steps, working to generate synthetic versions of the slices of the heart (which are 2D images) first.

The researchers were using a method called Generative Adversarial Networks (GANs), where two systems compete against each other. The generator system generates the synthetic model and the discriminator system tries to distinguish between real and synthetic images. You can read more about using GANs for synthetic data generation in a recent JGI blog about Chakaya Nyamvula’s JGI placement.

Slide on “Generating Deep Fake Left Ventricle Images for Improved Statistical Emulation”.
A slide from Dr Vinny Davies and Dr Andrew Elliot’s talk on “Generating Deep Fake Left Ventricle Images for Improved Statistical Emulation”. The slide depicts how progressive GANs work, where the generator learns how to generate smaller, less detailed images first and gradually improves until it can reproduce 2D slices of MRIs of the human heart.

Because the job of the generator is far harder than that of the discriminator (consider the task of reproducing a famous painting, versus spotting the difference between an original painting and a version drawn by an amateur), it’s important to find ways to make the generator’s job easier early on, and the discriminator’s job harder so that the two can improve together.

The researchers used a method called a Progressive GAN. Initially they gave the generator the task of drawing a lower resolution version of the image. This is easier and so the generator did better. Once the generator could do this well, they then used the lower resolution versions as the new starting point and gradually increased the resolution. Consider trying to replicate a low resolution image – all you have to do is colour in a few squares in a convincing way. This then naturally makes the discriminator’s job harder, as it’s tasked with telling the difference between two extremely low resolution images. This allows the two systems to gradually improve in proficiency.
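To make the "start small, then add detail" idea concrete, here is a rough Python sketch of a resolution schedule: the same image average-pooled to 4x4, then 8x8, then 16x16. It illustrates only the sequence of training targets, not a GAN, and it is not the researchers' code. The function names, the starting size and the toy 16x16 "image" are assumptions made for illustration.

```python
def downsample(image, factor):
    """Average-pool a square 2D list of pixel values by an integer factor."""
    size = len(image) // factor
    return [
        [
            sum(
                image[r * factor + dr][c * factor + dc]
                for dr in range(factor)
                for dc in range(factor)
            ) / factor ** 2
            for c in range(size)
        ]
        for r in range(size)
    ]


def resolution_schedule(image, start=4):
    """Return versions of `image` from coarse to full size, doubling each step.

    Assumes a square image whose side is `start` times a power of two.
    """
    full = len(image)
    schedule = []
    size = start
    while size <= full:
        schedule.append(downsample(image, full // size))
        size *= 2
    return schedule


# Hypothetical 16x16 "image": a simple gradient standing in for an MRI slice.
toy_image = [[r * 16 + c for c in range(16)] for r in range(16)]
stages = resolution_schedule(toy_image)
print([len(stage) for stage in stages])
```

In a real Progressive GAN the generator and discriminator networks also grow as the resolution increases. This sketch only shows how the target images get more detailed from stage to stage.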

The work is ongoing and the researchers at Glasgow are looking for a PhD student to get involved with the project!

Data Hazards

On the last day of the conference, Huw alongside Dr Nina Di Cara from the School of Psychology at the University of Bristol presented to participants about the Data Hazards project.

Participants (including Hadley Wickham, keynote speaker and author of the famous R package tidyverse) were introduced to the project, shown examples of how it has been used and then shown an example project where they were invited to take part in discussions about which different data hazards might apply and how you might go about mitigating those hazards. They also discussed the importance of focussing on which hazards are most relevant and prominent.

Dr Huw Day (left) and Dr Nina Di Cara in front of a screen that says 'Data Hazards Workshop'
Dr Huw Day (left) and Dr Nina Di Cara (right) about to give their Data Hazards workshop talk at the RSS International Conference 2024.

All the participants left with their own set of the Data Hazard labels and a new way to think about communicating hazards of data science projects, as well as invites to this term’s first session of Data Ethics Club.

Chakaya Nyamvula’s JGI Placement 

Hi, I’m Chakaya. I am currently pursuing my MSc in AI and Data Science at Keele University and working as a Business Intelligence Analyst at iLabAfrica at Strathmore University in Nairobi, Kenya. This summer, thanks to the partnership between iLabAfrica and JGI, I had an amazing opportunity to work with JGI for my Master’s placement. I wanted to immerse myself in a research environment and connect with people in academia to help figure out my future career path. Working under the guidance of Dr Huw Day, I gained valuable insights into the world of research and expanded my professional network, all while experiencing life in the UK. 

Chakaya Nyamvula in front of a body of water
Chakaya Nyamvula, JGI Intern

What was the project about? 

Previously, for a JGI-funded Seedcorn project, Mark Mumme, Eleanor Walsh, Dan Smith, Huw Day, and Debbie Johnson had surveyed researchers on how they might want to use synthetic data to help with their research.

Synthetic data is when you take an existing dataset and create a synthetic (i.e. fake) version of it. You might want to do this so you can share something that looks like the data but preserves the privacy of individuals in it, whilst still having a flavour of what the data looks like and what statistical patterns might be present within it. This is useful for writing data pipelines whilst you go through necessary ethics checks to access sensitive data, amongst other things. 

For my summer placement with JGI, I worked with the MIMIC IV dataset of electronic health records and explored methods of generating synthetic versions of some of this data. It was also important to understand how you could measure or benchmark how successful your synthetic data generation has been, based on how well you had preserved privacy or how well the statistics of your synthetic data emulated those of your real data. 
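As a rough illustration of what benchmarking can look like, the Python sketch below compares simple per-column statistics between a real and a synthetic table, and counts how many synthetic rows are exact copies of real rows as a very crude privacy check. These are not the measures used in the placement. The function names, column names and values are all invented for the example and have nothing to do with MIMIC IV.

```python
from statistics import mean, stdev


def column_summary_gaps(real, synthetic):
    """Absolute gap in mean and standard deviation for each shared column."""
    return {
        column: {
            "mean_gap": abs(mean(real[column]) - mean(synthetic[column])),
            "std_gap": abs(stdev(real[column]) - stdev(synthetic[column])),
        }
        for column in real
    }


def exact_copy_rate(real_rows, synthetic_rows):
    """Fraction of synthetic rows that exactly duplicate a real row."""
    real_set = set(real_rows)
    copies = sum(1 for row in synthetic_rows if row in real_set)
    return copies / len(synthetic_rows)


# Invented toy tables (hypothetical columns and values).
real = {"age": [34, 51, 67, 45], "heart_rate": [72, 88, 95, 80]}
synthetic = {"age": [36, 49, 70, 46], "heart_rate": [75, 85, 97, 78]}

gaps = column_summary_gaps(real, synthetic)
real_rows = list(zip(real["age"], real["heart_rate"]))
synthetic_rows = list(zip(synthetic["age"], synthetic["heart_rate"]))
copy_rate = exact_copy_rate(real_rows, synthetic_rows)

for column, gap in gaps.items():
    print(column, gap)
print("exact copy rate:", copy_rate)
```

Small gaps suggest the synthetic data keeps the basic statistics of the real data, and a copy rate of zero means no real record has been reproduced verbatim. Neither check is sufficient on its own: matching means and spreads says nothing about relationships between columns, and near-copies can still leak information.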

What else did you do as part of your placement? 

Alongside my main work, I attended JGI Data Science meetings and learnt about some of the data science projects at the JGI including a project on antimicrobial resistance and another on 3D image analysis of CT scanned zebrafish to study bone development. 

For some of the more computationally demanding aspects of the project, I got taught how to make use of the JGI’s server (known within the office as “Jeeves”). 

I also had the opportunity to meet some PhD students at the University of Bristol, ask them about their research, and get advice on applying for PhDs in the future. 

Left to right, Huw Day, Elena Fillola Mayoral, Yujie Dai and Chakaya Nyamvula sat at a table at an ice cream shop
From left to right: Huw Day (JGI Data Scientist), Elena Fillola Mayoral (PhD student in AI for Climate), Yujie Dai (CDT in Digital Health) and Chakaya Nyamvula (JGI Intern) discussing PhDs over ice cream

What did you learn about? 

One deep learning method we used was something called a Generative Adversarial Network (GAN). Prior to this project, I had never worked with GANs before, so diving into this methodology was both challenging and exciting.  

A GAN works by having two competing neural networks, a generator and a discriminator. The generator’s job in this case was to take the original data and generate synthetic versions of that data. The discriminator’s job is to try and spot the difference between the real and the synthetic data that has been generated. One of the advantages of such a system is that you have two outputs: 1) a neural network which can generate synthetic data based on some training data and 2) a second neural network which can discriminate between real and synthetic data. This has advantages for applications where people might maliciously generate synthetic data, for example deep fake images. 
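The competition between the two networks can be written down as a pair of loss functions. The Python sketch below uses the standard textbook GAN losses, where the discriminator outputs a score between 0 and 1 for how real it thinks a sample is. The post does not say which loss was used in the project, so this is an assumption for illustration, and the function names are hypothetical.

```python
import math


def discriminator_loss(score_on_real, score_on_fake):
    """Low when real samples score near 1 and synthetic samples near 0."""
    return -(math.log(score_on_real) + math.log(1.0 - score_on_fake))


def generator_loss(score_on_fake):
    """Low when the discriminator is fooled into scoring synthetic data near 1."""
    return -math.log(score_on_fake)


# Hypothetical discriminator scores, strictly between 0 and 1.
confident_discriminator = discriminator_loss(score_on_real=0.9, score_on_fake=0.1)
confused_discriminator = discriminator_loss(score_on_real=0.5, score_on_fake=0.5)
print(confident_discriminator, confused_discriminator)

convincing_generator = generator_loss(score_on_fake=0.9)
unconvincing_generator = generator_loss(score_on_fake=0.1)
print(convincing_generator, unconvincing_generator)
```

Each network is trained to lower its own loss, and anything that lowers the generator's loss raises the discriminator's, which is what makes the two networks competitors.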

A good analogy for GANs is two people learning chess by playing against one another. If both start at similar skill levels, then as one person improves, the other slowly improves too. If you lose a chess game, you know you made a mistake and you might be able to work out how to improve for the next time. If you win, then you know you were doing something right.  

However, if you pit a chess grandmaster against a complete beginner, then the beginner will lose every time and will struggle to understand where they are going wrong, making it difficult to improve. Because the task of making synthetic data is quite complicated, when we began the process of training the GAN, the generator was frequently getting it wrong and wasn’t really able to figure out how to improve. 

To combat this, we did two things. First, you can handicap the discriminator a bit to give the generator a head start (imagine making your grandmaster play blindfolded). This helped, but still wasn’t enough. 

One of the pair plots showing generated vs real data at epoch 0
One of the pair plots showing generated vs real data at epoch 25000
Pair plots showing how well the real and the synthetic data match by comparing each column. Real data is in blue, synthetic data is in red. The diagonal plots show histogram density plots of each column and how it compares between real and synthetic data. The off-diagonal plots show scatter plots between pairs of variables. The left pair plot shows the output at the start of training, where the synthetic generator just randomly samples a scatter of points. You can see that this is not a good match for the original data. The right pair plot shows that after training, the generator does a much more convincing job of emulating the real data. It is still not perfect, but it is particularly good at identifying clumps of data.

Secondly, you can start to think about how you inform your neural networks whether or not they were successful. Imagine if instead of “win” or “lose” as your outcome of the chess games, you got a measure of how well you performed, say a measure of how many good moves you made. With this more specific information, it becomes easier to decipher why you lost and how you might improve.  

To Be Continued? 

To finish my placement, I shared my experience with my placement supervisors at Keele University through a presentation and a report. I then had the opportunity to present my work to the Data Science Seminar at the University of Bristol, with several lecturers from the data science community in attendance, alongside JGI Data Scientists and some friends I made along the way.  

Additionally, all the code we worked on can be found in a public repository on Chakaya’s GitHub for other researchers to use and experiment with.

Chakaya Nyamvula and Huw Day standing in front of a projector presenting at the Data Science Seminar. The projector has a slide on it that says 'Introduction to synthetic data' 
Chakaya Nyamvula (left) and Huw Day (right) presenting at the Data Science Seminar 

Reflecting on my placement at JGI, I can confidently say it was an incredible learning experience. I had the privilege of working with a fantastic supervisor, Dr Huw Day, who provided guidance throughout the project. Co-working with the talented data scientists at JGI was both inspiring and rewarding, and I thoroughly enjoyed networking with professionals in academia. The challenges I faced, particularly working with GANs for the first time, pushed me to grow and expand my skill set. Overall, this experience not only deepened my technical expertise but also solidified my interest in pursuing a career that bridges research and data science.

New Turing Liaison Officers join the JGI team

As an active member of the Turing University Network, we have appointed a Turing Liaison Manager and two Turing Liaison Academics to support and enhance the partnership between the Alan Turing Institute and the University of Bristol. These roles will focus on increasing engagement from Turing, developing external and internal networks around data science and AI, and supporting relevant interest groups, Enrichment students and Turing Fellows at the University of Bristol.

Turing Liaison Manager, Isabelle Halton and Turing Academic Liaisons, Conor Houghton and Emmanouil Tranos, are keen to build communities around data science and AI, providing support to staff and students who want to be more involved in Turing activity.

Isabelle previously worked in the Professional Liaison Network in the Faculty of Social Sciences and Law. She has extensive experience in building relationships and networks, project and event management and streamlining activities connecting academics and external organisations.

Conor is a Reader in the School of Engineering Mathematics and Technology, interested in linguistics and the brain. Conor is a Turing Fellow and a member of TReX, the Turing ethics committee.

Emmanouil is currently a Turing Fellow and a Professor of Quantitative Human Geography, specialising primarily in the spatial dimensions of the digital economy.


If you’re interested in becoming more involved with Turing activity or have any questions about the partnership, please email Isabelle Halton, Turing Liaison Manager, via the Turing Mailbox.