Ask-JGI Recruitment Blog 

We are recruiting a new team of PhD students for the Ask-JGI helpdesk to work from October 2025 until September 2026! 

The Jean Golding Institute (JGI) for data science and AI offers a consultancy service to researchers via its Ask-JGI helpdesk. We offer one day of free support to all staff and doctoral students at the University of Bristol, for queries relating to data science, AI, and software engineering. The helpdesk is run by PhD students and supported by the JGI’s own team of data scientists and research software engineers. 

What we’re looking for 

New recruits will be part of a team with overlapping and complementary skills, who will work together to support researchers in a range of ways. 

You are not expected to start with all the skills/experience that we are looking for the team to cover; however, you should be enthusiastic about continuous learning and working outside your subject area. 

Typical queries (and skills/experience you may want to highlight in your application) include:  

  • Troubleshooting – Collaborating with researchers from different disciplines and of varying expertise, to find out what they need to do to solve their problem. 
  • Study design and planning – Providing statistical advice on experimental design. Identifying potential data hazards and ethical issues. 
  • Data cleaning and management – Helping to develop pipelines to make raw data ready for analysis. Advice on data management plans and data governance. 
  • Data analysis – Recommending or providing support with tools and methods for modelling, AI/machine learning and statistics. This might involve multilevel modelling, bioinformatics, GIS, NLP, random forests, deep learning, use of LLMs, or mixed/qualitative methods. 
  • Programming – Technical support and coding, primarily in Python or R, though this could include other tools like SQL, MATLAB, SPSS, STATA, NVivo, Excel, C, Rust, Bash scripts etc. Code review and code optimisation. Deployment to HPC. 
  • Best practices – Giving advice on best practices for writing reproducible research code and creating packages. Support with tools like Git, GitHub, virtual environments and Conda, Docker. 
  • Data communication – Help with data visualisation. Providing advice with dashboards or websites. 

Applicants will need to be current full-time PhD students at the University of Bristol and will need to obtain approval from their primary supervisor. It is expected that applicants can commit on average 5-10 hours per month for 12 months. The team rotates responsibilities every fortnight and there are periods with a higher/lower volume of queries, so time commitments can vary throughout the year. 

Expected start date is the week commencing Monday 29 September 2025, working ad-hoc approximately 5-10 hours per month for 12 months. 

What’s in it for you? 

You will gain experience/skills which will be useful for your future research or career outside academia: 

  • Technical skills – learning from one another and developing best-practice skills in data science, AI and research software engineering. 
  • Project management – managing and prioritising multiple queries and allocating them to fellow team members. 
  • Team working – chairing team meetings, minute-taking, and collaborating with other team members on queries. 
  • Communication – sharing your expertise with researchers (of all levels) from different disciplines. 
  • Adaptability – developing and applying your skills to new and difficult problems, outside your immediate subject.  

This is a paid opportunity at Graduate Teacher – Level 1 for PhD students. 

How to apply 

Complete an online application form 

The deadline to apply is Thursday 31 July 2025. We will assess applications at the start of August and hope to communicate a decision in mid-August. 

The JGI aims to make data science, statistics and software engineering expertise accessible to all. We value diversity in our teams and so applicants from communities traditionally under-represented in data science, AI or research software engineering are strongly encouraged to apply. 

If you have any questions about the role, email jgi-reseng@bristol.ac.uk with the subject “Ask-JGI recruitment”. 

Testimonials from Ask-JGI team members 


“Over the past year, I had the pleasure of working with the Ask-JGI team, and it was a truly enjoyable experience. The team was welcoming and supportive, and I had the opportunity to engage with researchers from a wide range of departments across the university, which broadened my perspective on different fields of study and enhanced my personal skills. I highly recommend joining this team!” Yujie Dai, Digital Health CDT 

 “What I enjoy most about working at the Ask-JGI helpdesk is the chance to connect with and assist researchers from all kinds of academic backgrounds. I may not always have the immediate answer to queries, but what really counts is doing my best to help and being willing to keep learning along the way.” Yueying Li, PhD student in Genetic Epidemiology 


“Working with the Ask-JGI service has been incredibly rewarding. I genuinely enjoy contributing directly to researchers’ projects, witnessing the tangible impact of our support. The variety of challenges, from diving into complex data analysis to helping visualize findings, keeps every day engaging and fulfilling.” –  Fahd Abdelazim, PhD student in Interactive AI, specializing in model understanding for Vision-Language models

“Being part of the Ask-JGI team is an excellent opportunity to improve communication skills over statistics/data science tasks. As PGR students, most of us are accustomed to working within specialized areas of research; it is easy to overlook efforts and skills necessary for collaborating outside of those narrow fields of expertise. I have benefitted from working on the team to improve those skills.” Mirah Zhang, PhD student in Geographic Data Science 


“Working as an Ask-JGI data scientist has been a hugely rewarding experience. Each query involves supporting researchers from diverse specialisms across the University. It’s a great way to expose yourself to different technical challenges and research areas, and to explore new technologies that you haven’t worked with before.” Daniel Collins, PhD student in Interactive AI focussed on multi-agent AI systems 

Tracing Voices: A Visual Journey through Latin American Debates about Race  

JGI Seed Corn Funding Project Blog 2023/24: Jo Crow

I’m a historian who is keen to learn how digital tools can strengthen our analysis of the material we find in the archives. I research histories of race, racism and anti-racism in Latin America. I’m particularly interested in how ideas about race travelled across borders in the twentieth century, and how these cross-border conversations impacted on nation-state policies in the region.  

The book I am currently writing investigates four international congresses that took place between the 1920s and 1950s: the First Latin American Communist Conference in Buenos Aires, Argentina (1929); the XXVII International Congress of Americanists in Lima, Peru (1939); the First Inter-American Conference on Social Security in Santiago, Chile (1942); and the Third Inter-American Indigenista Congress, in La Paz, Bolivia (1954). These were very different kinds of international meetings, but they all dedicated a significant amount of time to debating the problem of racial inequality, especially the ongoing marginalisation of indigenous peoples. 

Who was at these congresses? Who spoke to whom, and what conversations did they have? Where did the conversations take place? What did the rooms look like? How were they set up? And what about the spaces outside the formal discussion sessions – the drinks receptions that delegates attended, the archaeological sites and museums they visited, the film screenings and book exhibitions they were invited to, the restaurants they frequented, the hotels they stayed in? Luckily, I have found a great variety of source materials – conference proceedings, newspaper reports, personal and institutional correspondence, memoirs of participating delegates – that help me begin to answer these questions.

Black and white photos from a newsletter of men sat down in a room for the XXVII International Congress of Americanists in Lima
Photographs of the XXVII International Congress of Americanists in Lima. Published in El Comercio newspaper, 11 September 1939.
Black and white photo of three delegates at the III Inter-American Indigenista Congress in La Paz.
Photograph of three delegates at the III Inter-American Indigenista Congress in La Paz. Included in an International Labour Organization report of 1954. 

As part of my JGI seed-corn project, I’ve been able to work with two brilliant researchers: Emma Hazelwood and Roy Youdale. Emma helped me to explore the uses of digital mapping for visualising the “who” and “where” of these congresses, and Roy helped me to experiment with machine-reading. In this blog, I share a few of the things we achieved and learnt.   

Digital Mapping

Emma started by inputting the data I had on the people who attended these congresses – their names, nationalities, where they travelled from – into Excel spreadsheets. She then found the coordinates of their origins using an online resource, and displayed them on a map using a coding language called Python. Below are a few of the results for Lima, 1939. The global map (Map 1) shows very clearly that this was a forum bringing together delegates from North, Central, and South America, and several countries in Europe too. We can zoom in to look more closely at the regional spread of delegates (Map 2), and further still to see what parts of Peru the Peruvian delegates came from (Map 3). For those delegates that were based in Lima – because we have their addresses – we can map precisely where in the city they or their institutions were based (Map 4).

Global map with red dots to show delegate locations and a green dot to highlight Peru
Map 1. The global map shows very clearly that this was a forum bringing together delegates from North, Central, and South America, and several countries in Europe.
Map of South America on the left and a zoomed in version on the right with red dots to show delegate locations and a green dot to highlight Peru
Map 2 (left) shows a zoomed in version of the global map to see the regional spread of delegates. Map 3 (right) shows what parts of Peru the Peruvian delegates came from.
Satellite image of Lima with different colour dots to symbolise different institute locations
Map 4. For delegates in Lima, the satellite image maps where in the city they or their institutions were based. 
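
For readers curious about what the mapping step might look like in code, here is a minimal sketch in Python. It is not Emma's actual script: the delegate names are placeholders, the small lookup table stands in for the online geocoding resource, and the function name `delegates_to_geojson` is invented for illustration. It groups delegates by place of origin and writes the points out as GeoJSON, a format that most mapping tools can display.

```python
import json
from collections import Counter

# Hypothetical lookup table standing in for the online geocoding resource.
# Coordinates are (latitude, longitude) in decimal degrees.
CITY_COORDS = {
    "Lima": (-12.0464, -77.0428),
    "Buenos Aires": (-34.6037, -58.3816),
    "Paris": (48.8566, 2.3522),
}

# Illustrative rows only: these names are placeholders, not real delegates.
delegates = [
    {"name": "Delegate A", "origin": "Lima"},
    {"name": "Delegate B", "origin": "Lima"},
    {"name": "Delegate C", "origin": "Buenos Aires"},
    {"name": "Delegate D", "origin": "Paris"},
    {"name": "Delegate E", "origin": "Cusco"},  # not in the lookup table
]


def delegates_to_geojson(rows, coords):
    """Group delegates by origin and return (GeoJSON dict, unmatched origins)."""
    counts = Counter(row["origin"] for row in rows)
    features, unmatched = [], []
    for origin, n in sorted(counts.items()):
        if origin not in coords:
            # Keep track of places that still need coordinates.
            unmatched.append(origin)
            continue
        lat, lon = coords[origin]
        features.append({
            "type": "Feature",
            # GeoJSON orders coordinates as [longitude, latitude].
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": {"origin": origin, "delegates": n},
        })
    return {"type": "FeatureCollection", "features": features}, unmatched


geojson, unmatched = delegates_to_geojson(delegates, CITY_COORDS)
print(json.dumps(geojson, indent=2))
print("Origins still to geocode:", unmatched)
```

Keeping a list of unmatched origins is a deliberate choice: with historical sources, some place names will not geocode cleanly and are better checked by hand than silently dropped.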

In some ways, these visualisations don’t tell me anything I didn’t already know. From the list of conference attendees I compiled, for instance, I already had a sense of the spread of the countries represented in Lima in 1939. What the maps do do, however, is tell the story of the international nature of the conference much more clearly and speedily than a list or table can. With the city map, showing where Lima-based delegates lived and worked, we do learn something new. By plotting the addresses, I can envisage the contours of the space they occupied. I couldn’t do that in my head with just a list of the addresses, especially without knowing road names.   

The digital maps also help with comparative analysis. If we look at the global map (like Map 1) of all four congresses together we get a clear view of their very similar reach; most delegates to all of them were from South America. We are also able to swiftly detect the differences – for example, that the Lima conference attracted more delegates from Europe than the other meetings, or that there were no delegates from Europe at the 1954 congress in La Paz. We can then think about the reasons why.  

Satellite image of Lima with an old map layered on top with different colour dots to symbolise different locations
Map 5. Shows the main venues for the XXVII International Congress of Americanists.

Map 5 above takes us back to Lima. It shows the main venues for the XXVII International Congress of Americanists. It visualizes a circuit for us. I don’t think we can perceive this so clearly from a list of venues, especially if we are not very familiar with the city. Here we can see that most of the conference venues and the hotels where delegates stayed were clustered quite closely together, in Lima’s historic centre. Delegates could easily walk between them. There are a few outliers, though: one of the archaeological sites that delegates visited, the museum that threw a reception for delegates, and a couple of restaurants too. This prompts further questions and encourages us to imagine the delegates moving through the city.  

Machine Reading

As well as digital mapping, I’ve been keen to explore what machine or distant reading can add to our analysis of debates about race in early twentieth century Latin America. It’s widely known, for example, that, in the context of the Second World War, many academic and government institutions rejected the scientific validity of the term race (“raza” in Spanish). A machine reading of the proceedings of these four congresses gives us concrete, empirical evidence of how the word race was, in practice, used less and less from 1929, to 1939, to 1942, to 1954. Text analysis software like Sketch Engine, which Roy introduced me to, also enables us to scrutinise how the term was used when it was used. For instance, in the case of the 1929 conference in Buenos Aires, Sketch Engine processes 300+ pages of conference discussions in milliseconds and shows us in a systematic way which so-called “races” were being talked about, the fact that “race” was articulated as an object and a subject of the verb, and how delegates associated the term race with hostile relations, nationhood, indigenous communities, exploitation, and cultural tradition (see below). In short, it provides a really useful, methodical snapshot of the many different languages of race being spoken in Buenos Aires. It is then up to me to reflect on the significance of the detail, and to go back to specific moments in the text, for example the statement of one delegate about converting the “race factor” into a “revolutionary factor”.  

Results from a text analysis in Sketch Engine for the 1929 conference in Buenos Aires. The result shows us in a systematic way which so-called “races” were being talked about.
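
The simplest part of this kind of analysis, counting how often a term appears relative to the length of each text, can be sketched in a few lines of Python. This is not how Sketch Engine works internally, and the two sentences below are invented placeholders rather than quotations from the proceedings; the function name `rate_per_thousand` is also just an illustration.

```python
import re

# Placeholder snippets only: they stand in for the full proceedings text.
proceedings = {
    1929: "La raza indígena y la raza negra sufren la misma explotación.",
    1954: "La población indígena necesita seguridad social y educación.",
}


def rate_per_thousand(text, term):
    """Return occurrences of `term` per 1,000 words (case-insensitive, whole words)."""
    words = re.findall(r"\w+", text.lower())
    if not words:
        return 0.0
    return 1000 * words.count(term.lower()) / len(words)


for year, text in sorted(proceedings.items()):
    print(year, round(rate_per_thousand(text, "raza"), 1))
```

Normalising by text length matters here because the four sets of proceedings are unlikely to be the same size, so raw counts alone could mislead.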

In all, I’ve learnt how digital tools and methodologies can productively change how we’re able to look at things, in this case “race-talk” and who was speaking it. By looking differently we see differently too. What I’d like to do now is to trace where the conversations went from these congresses, and see how much they shifted and transformed in the process of travel.  


Jo Crow, Professor of Latin American Studies, School of Modern Languages 

Using ‘The Cloud’ to enhance UoB laboratory data security, storage, sharing, and management

JGI Seed Corn Funding Project Blog 2023/24: Peter Martin, Chris Jones & Duncan Baldwin

Introduction

As a world-leading research-intensive institution, the University of Bristol houses a multi-million-pound array of cutting-edge analytical equipment of all types, ages, functions, and sensitivities, distributed across its Schools, Faculties, Research Centres and Groups, as well as in dozens of individual labs. However, as more and more data are captured, how can they be appropriately managed to meet the needs of both researchers and funders?  

What were the aims of the seed corn project? 

When an instrument is purchased, the associated computing, data storage/resilience, and post-capture analysis are seldom, if ever, considered beyond the standard Data Management Plans. 

Before this project, there was no centralised or officially endorsed mechanism at UoB, supported by IT Services, to manage long-term instrument data storage and internal/external access to this resource. Every group, lab, and facility individually managed its own data retention, access, archiving, and security policies. This is not just a UoB challenge, but one that is endemic to the entire research sector. As the value of data is now becoming universally realised, not just in academia but across society, the challenge is more pressing than ever: an institution-wide solution is critically required, ideally one that is readily exportable to other universities and research organisations. At its core, this Seed Corn project sought to develop a ‘pipeline’ through which research data could be: (1) securely stored within a unified online environment/data centre in perpetuity, and (2) accessed via an intuitive, streamlined and equally secure online ‘front-end’ such as Globus, akin to how OneDrive and Google Drive seamlessly facilitate document sharing.   

What was achieved? 

The Interface Analysis Centre (IAC), a University Research Centre in the School of Physics, currently operates a large and ever-growing suite of surface and materials science equipment with considerable numbers of both internal (university-wide) and external (industry and commercial) users. Over the past six months, working with leading solution architects, network specialists, and security experts at Amazon Web Services (AWS), the IAC/IT Services team have successfully developed a scalable data warehousing system that has been deployed within an autonomous segment of the UoB’s network. This removes the need to keep single-copy data stored locally (at significant risk) and to move it around on portable HDDs or by email across the network. In addition to efficiently “getting the data out” from within the UoB network, using native credential management within Microsoft Azure/AWS, the team have developed a web-based front-end akin to Google Drive/OneDrive where specific experimental folders can be securely shared with specific users, compliant with industry and InfoSec standards. The proof of the pudding has been the positive feedback received from external users visiting the IAC, all of whom have been able to access their experiment data immediately following the conclusion of their work without the need to copy GBs or TBs of data onto external hard drives!  

Future plans for the project 

The success of the project has not only highlighted how researchers and various strands within UoB IT Services can together develop bespoke systems utilising both internal and external capabilities, but also how even a small amount of Seed Corn funding such as this can deliver the start of something powerful and exciting. Following the delivery of a robust ‘beta’ solution between the Interface Analysis Centre (IAC) labs and AWS servers, it is currently envisaged that the roll-out and expansion of this externally-facing research storage gateway facility will continue with the support of IT Services to other centres and instruments. Given the large amount of commercial and external work performed across the UoB, such a platform will hopefully enable and underpin data management across the University in future, adopting a scalable and proven cloud-based approach.  


Contact details and links

Dr Peter Martin & Dr Chris Jones (Physics) peter.martin@bristol.ac.uk and cj0810@bristol.ac.uk 

Dr Duncan Baldwin (IT Services) d.j.baldwin@bristol.ac.uk  

Ask-JGI Example Queries from Faculty of Health and Life Sciences 

All University of Bristol researchers (from PhD student and up) are entitled to a day of free data science support from the Ask-JGI helpdesk. Just email ask-jgi@bristol.ac.uk with your query and one of our team will get back to you to see how we can support you. You can see more about how the JGI can support data science projects for University of Bristol based researchers on our website (https://www.bristol.ac.uk/golding/supporting-your-research/data-science-support/). 

We support queries from researchers across all faculties and in this blog we’ll tell you about some of the researchers we’ve supported from the Faculty of Health and Life Sciences here at the University of Bristol. 

AI prediction on video data 

Example of AI video prediction using video data taken from the EPIC-KITCHENS-100 study. The image shows qualitative results of action detection. Predictions with confidence > 0.5 are shown with colour-coded class labels.

One particularly interesting query came from a PhD researcher with no prior experience in programming or AI. She was exploring the idea of using AI to predict how long doctors at different skill levels would need to train on medical simulators to reach advanced proficiency. Drawing inspiration from aviation cockpit simulators, her project involved analysing simulation videos to make these predictions. We provided guidance on the feasibility of using AI for this task, suggesting approaches that would depend on the availability of annotated data and introducing her to relevant computer vision techniques. We also recommended Python as a starting point, along with resources to help her build foundational skills. It was exciting to help someone new to AI navigate the early stages of their project and explore how AI could contribute to improving medical training. 

Species Classification with ML 

Bemisia tabaci (MED) (silverleaf whitefly); two adults on a watermelon leaf. Image by Stephen Ausmus.

Another engaging query came from a researcher in biological sciences aiming to classify different species of plant pest insects (Bemisia tabaci and two others) based on flight data. Her goal was not only to build machine learning classifiers but also to understand how different features contributed to species differentiation across various methods.

She approached the Ask-JGI data science support for guidance on refining her code and ensuring the accuracy of her analysis. We helped restructure the code to make it more modular and reusable, while also addressing bugs and improving its reliability. Additionally, we worked with her to create visualizations that provided clearer insights into model performance and feature importance. This collaboration was a great example of how machine learning can be applied to advancing research in ecological data analysis.  
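
To give a flavour of what "feature importance" means in practice, here is a minimal sketch of one common, model-agnostic approach: permutation importance. It is not the researcher's code or necessarily the method she used. The data are synthetic, the feature names (`wingbeat_hz`, `noise`) are hypothetical, and a simple nearest-centroid classifier stands in for whatever models were actually fitted. The idea is to shuffle one feature at a time and see how much the accuracy drops.

```python
import random

# Synthetic rows: [wingbeat_hz, noise] -> species label. All values are made up.
X = [[1.0 + 0.02 * i, (i * 7 % 10) / 10] for i in range(10)] + \
    [[5.0 + 0.02 * i, (i * 3 % 10) / 10] for i in range(10)]
y = ["species_a"] * 10 + ["species_b"] * 10
FEATURES = ["wingbeat_hz", "noise"]


def fit_centroids(X, y):
    """Nearest-centroid 'model': the mean feature vector of each class."""
    centroids = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


def predict(centroids, x):
    """Return the label whose centroid is closest to x (squared Euclidean distance)."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def accuracy(centroids, X, y):
    return sum(predict(centroids, x) == lab for x, lab in zip(X, y)) / len(y)


def permutation_importance(centroids, X, y, n_repeats=20, seed=0):
    """Mean drop in accuracy when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(centroids, X, y)
    importance = {}
    for j, name in enumerate(FEATURES):
        drops = []
        for _ in range(n_repeats):
            column = [x[j] for x in X]
            rng.shuffle(column)
            # Rebuild the rows with only column j replaced by shuffled values.
            X_perm = [x[:j] + [v] + x[j + 1:] for x, v in zip(X, column)]
            drops.append(baseline - accuracy(centroids, X_perm, y))
        importance[name] = sum(drops) / n_repeats
    return importance


model = fit_centroids(X, y)
importance = permutation_importance(model, X, y)
print(importance)
```

Because the approach only needs predictions, the same function could be pointed at different classifiers, which is one way to compare how features contribute across methods. For brevity the sketch scores the model on the data it was fitted to; in real work a held-out test set would be the safer choice.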

Providing guidance for HPC, RDSF, and statistical software users 

High performance computing (HPC) and the Research Data Storage Facility (RDSF) have been used by an increasing number of people at our university. We also recommend them to students and staff when these tools align with their projects’ needs. However, getting started can be challenging—each system has its own frameworks, rules, and workflows. Researchers often find themselves overwhelmed by extensive training materials or stuck on specific technical issues that aren’t easily addressed.  

We provide tailored guidance to make these tools more accessible and practical for our clients, which includes troubleshooting, script modifications, and directing researchers to relevant university services. 

Additionally, this year’s Ask-JGI Helpdesk has brought together experienced users of SPSS, Stata, R, and Python. For researchers transitioning to new statistical software or adapting their workflows, we’ve helped them navigate the subtle differences in syntax across platforms and achieve their analysis goals. 

Handling Group-Level Variability in Quantitative Effects: A Multilevel Modelling Perspective

A visualisation of a multilevel model, original figure produced by JGI Data Scientist, Dr Leo Gorman.

We had a client who was researching differences in fluorescence intensity. These differences may be due to factors such as antibody lot variation, differences in handling between researchers, or biological heterogeneity. This raises the question: how should such data be represented to ensure meaningful interpretation without misrepresenting the underlying biological processes? One of the key solutions that we recommend is multilevel modelling.  

Modelling fluorescence intensity at one or multiple levels (e.g., individual, batch, researcher) can help distinguish biological effects from biases. For example, by fitting a mixed-effects model, we can account for between-individual variation in baseline fluorescence levels (random intercept), as well as differential responses to experimental conditions (random slope). Sometimes, however, multilevel modelling is limited by the group-level sample size. If this is the case, as we discussed with the client, we don’t need to go as far as fitting multilevel models. To control for variation across such a small number of groups, we can use alternative strategies, such as correcting standard errors and introducing dummy variables, to achieve similar performance. 
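
For readers who like to see the structure written down, a random-intercept and random-slope model of the kind described above can be expressed in a generic textbook form. This is an illustrative sketch rather than the exact model discussed with the client, and the symbols are assumptions chosen for this example: y_ij is the fluorescence intensity of measurement i on individual j, and x_ij is the experimental condition.

```latex
y_{ij} = (\beta_0 + u_{0j}) + (\beta_1 + u_{1j})\,x_{ij} + \varepsilon_{ij},
\qquad
\begin{pmatrix} u_{0j} \\ u_{1j} \end{pmatrix} \sim \mathcal{N}(\mathbf{0}, \Sigma),
\qquad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
```

Here β0 and β1 are the average baseline and the average effect of the condition, u_0j is individual j's departure from the average baseline (the random intercept), u_1j is that individual's departure from the average response (the random slope), and ε_ij is the residual noise. The dummy-variable alternative mentioned above amounts to replacing u_0j with a separate fixed coefficient for each group.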

The Turing Seminars 2024-2025

From November 2024 to April 2025, we (the Turing Liaison Team at Bristol) ran a fruitful Turing Seminar Series. The series featured academics connected to the Alan Turing Institute, speaking about their cutting-edge research in data science and AI.

From Machine Learning, Large Language Models and Digital Twins, to early prediction of dementia, disambiguation in historical texts and evolutionary biology, the range of speaker specialisms reflected the breadth of research at Bristol in this space, reaching academics and early career researchers across the institution.

Below is a list of the talks and speakers:

Wednesday 6 November:

  • Title: Machine Learning and Dynamical Systems meet in Reproducing Kernel Hilbert Spaces
  • Speaker: Boumediene Hamzi, Marie Curie Fellow, Imperial College London.

Wednesday 20 November:

  • Title: Trustworthy Digital Twins: designing, developing, and deploying open and reproducible pipelines
  • Speaker: Chris Burr, Head of the Innovation and Impact Hub, Turing Research and Innovation Cluster for Digital Twins, Alan Turing Institute

Wednesday 4 December:

  • Title: What can your shopping basket say about your health?
  • Speaker: Anya Skatova, Senior Research Fellow, Bristol Medical School (PHS)

Wednesday 15 January:

  • Title: AI-guided tools for early prediction of brain and mental health disorders
  • Speaker: Zoe Kourtzi, Professor of Computational Cognitive Neuroscience, University of Cambridge

Wednesday 12 February:

  • Title: Temporal models for Word Sense Disambiguation in historical texts
  • Speaker: Barbara McGillivray, Lecturer in Digital Humanities and Cultural Computation, King’s College London

Wednesday 26 February:

  • Title: “If you can’t tell, does it matter?” What should the law say about humanlike AI?
  • Speaker: Colin Gavaghan, Professor of Digital Futures, Bristol Digital Futures Institute, University of Bristol

Wednesday 12 March:

  • Title: “Cognition-first evolution”
  • Speaker: Richard Watson, Professor (evolutionary biology and computer science), University of Southampton

Wednesday 26 March:

  • Title: “Big data as propeller for dynamic and time-sensitive service industries: a tourism sector perspective.”
  • Speaker: Nikolaos Stylos, Associate Professor in Marketing and Digital Innovation, Business School, University of Bristol

Wednesday 9 April:

  • Title: Can large language models reason about qualitative spatial information?
  • Speaker: Robert Blackwell, Senior Research Associate, Alan Turing Institute

These seminars have connected external researchers with relevant academics and departments at Bristol, and we have already seen these connections turn into longer-term collaborations. After the talk by Chris Burr, Alan Turing Institute, we organised a workshop between the Alan Turing Institute and the Bristol Digital Futures Institute (BDFI). This workshop provided an insight into digital twin projects run by both institutes, as well as facilitating connections. We took visitors on a tour of BDFI to show the incredible facilities, namely the Reality Emulator – the world’s first large-scale digital twin facility. Staff then went into roundtable discussions delving into shared areas of interest and what a longer-term collaboration could look like.

Over the series, we had 184 internal and external attendees, with 80% feeding back that they found the information / content provided during the event helpful. We are planning on running another series in the 2025-2026 academic year, building on our momentum and further increasing our external and internal networks.

If you have any suggestions of who you would like to see speak as part of next year’s series, please contact Isabelle Halton, Turing Liaison Manager – uob-turing@bristol.ac.uk

You can find out more about Turing events and opportunities at Bristol, including the previous Turing Seminar talks and slides on the Turing web pages.