JGI Seedcorn follow on funding 2023-25: Mike Jones and Brendan Smith
Introduction
This project was a follow-on extension of the JGI-funded seed-corn initiative titled ‘Digital Humanities meets Medieval Financial Records: The Receipt Rolls of the Irish Exchequer.’ The original project, along with a subsequent paper, ‘The Irish Receipt Roll of 1301–2: Data Science and Medieval Exchequer Practice,’ focused on a single receipt roll from the 1301–2 financial year. Building on this foundation, the follow-on project aimed to extend the software and techniques to a larger collection of receipt rolls from Edward I’s reign (1272–1307), offering broader insights into medieval financial practices. However, developing the scripts and troubleshooting errors took longer than expected, which reduced the time available for more in-depth analysis. Nevertheless, we managed to develop a data processing pipeline that allowed a broad analysis of the receipt rolls.
Data
The Irish Exchequer was a government institution responsible for collecting and disbursing income within the lordship of Ireland on behalf of the English Crown. Receipt rolls documented the money received each day by the Irish Exchequer from crown officials, private individuals, and communities. The entries in the rolls consisted of heavily abbreviated Medieval Latin.
There are forty surviving receipt rolls from the reign of Edward I held at the National Archives (TNA) in London. The Virtual Record Treasury of Ireland (VRTI) has translated the rolls from Latin into English for Edward I and later reigns. They have also encoded the translations into TEI/XML (https://tei-c.org), creating a machine-readable and structured digital corpus. The translations and high-quality images of the original documents are accessible to the public on the VRTI website. We gained early access to the TEI/XML documents for Edward I’s reign, which formed the foundation of our data corpus.
Data processing pipeline

To analyse the data, it was first necessary to parse the TEI/XML files and generate comma-separated values (CSV) files. These could then be processed with Pandas, the standard Python library for data analysis, and plotted with Matplotlib and Seaborn.
Each payment given to the Irish Exchequer is called a proffer. Each row in the CSV should represent an individual proffer and should include several pieces of information, including:
- The financial term. The year was divided into four terms – Michaelmas, Hilary, Easter and Trinity
- The date of the proffer, e.g., ‘1286-09-30’
- The day of the proffer, e.g., ‘Monday’
- The source of the proffer, which is a marginal heading in the roll, e.g., ‘Limerick’
- The details of the proffer, e.g., ‘From the debts of various people of Co. Limerick by James Keting: £40’
- The extracted monetary offering, e.g., £40
- The extracted monetary offering converted to pence, e.g., 9600.0
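The conversion to pence in the last item follows from the pre-decimal system, in which £1 was 20 shillings and a shilling was 12 pence, so £1 was 240 pence. The following is a minimal sketch of that step, not the project's own code. It assumes that the amount follows the last colon in the details and is written in the form ‘£1 6s. 8d.’; the function names `extract_amount` and `to_pence` are hypothetical, and sums expressed in marks or fractions of a penny are not handled.

```python
import re

PENCE_PER_SHILLING = 12
PENCE_PER_POUND = 20 * PENCE_PER_SHILLING  # 240 pence to the pound

# Hypothetical patterns for amounts written like "£1 6s. 8d."
UNIT_PATTERNS = (
    (re.compile(r"£\s*(\d+)"), PENCE_PER_POUND),
    (re.compile(r"(\d+)\s*s\b"), PENCE_PER_SHILLING),
    (re.compile(r"(\d+)\s*d\b"), 1),
)


def extract_amount(details: str) -> str:
    """Return the text after the last colon (assumed to be the amount)."""
    return details.rsplit(":", 1)[-1].strip()


def to_pence(amount: str) -> float:
    """Convert an amount in pounds, shillings and pence to pence."""
    total = 0
    matched = False
    for pattern, pence_per_unit in UNIT_PATTERNS:
        match = pattern.search(amount)
        if match:
            total += int(match.group(1)) * pence_per_unit
            matched = True
    if not matched:
        raise ValueError(f"no recognisable amount in {amount!r}")
    return float(total)


details = "From the debts of various people of Co. Limerick by James Keting: £40"
print(extract_amount(details), to_pence(extract_amount(details)))
```

Raising an error when nothing matches, rather than returning zero, keeps damaged or unusual entries visible for later checking.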
The pipeline consists of three stages: (1) generate a CSV for each roll; (2) categorise the proffers, for example, whether they relate to profits of justice or rents; and (3) merge all the CSV files into a single ‘mega’ file.
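The third stage, the merge, can be sketched as follows. This is an illustration under stated assumptions rather than the project's script: the column names in `FIELDS`, the function `merge_rolls`, and the roll labels are all hypothetical, and the example uses the standard `csv` module instead of Pandas so that it is self-contained. Columns missing from a roll, such as dates in the earlier rolls, are left empty so that every row shares one format.

```python
import csv
import io

# Hypothetical column names for the consistent CSV format.
FIELDS = ["roll", "term", "date", "day", "source", "details", "amount", "pence"]


def merge_rolls(roll_csvs):
    """Merge per-roll CSV files into one list of rows.

    roll_csvs maps a roll label to an open file-like object.
    """
    merged = []
    for roll, handle in roll_csvs.items():
        for row in csv.DictReader(handle):
            row["roll"] = roll  # record which roll each proffer came from
            merged.append({field: row.get(field, "") for field in FIELDS})
    return merged


# Two tiny in-memory files standing in for per-roll CSVs (invented rows).
roll_a = io.StringIO(
    "term,date,day,source,details,amount,pence\n"
    "Michaelmas,1286-09-30,Monday,Limerick,Example entry,£40,9600.0\n"
)
roll_b = io.StringIO(
    "term,source,details,amount,pence\n"
    "Easter,Dublin,Example entry,£1,240.0\n"
)
rows = merge_rolls({"roll_a": roll_a, "roll_b": roll_b})
print(len(rows), rows[1]["date"] == "")
```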
The development of the data processing pipeline in Python was an iterative process. The script was initially written to parse the 1301–2 roll. Although the TEI/XML encoding provided structure, not all the rolls adhered to the composition of the later receipt rolls. For instance, the earlier rolls do not record dates, and some rolls were only partially complete. Consequently, significant time was spent repeatedly refining the script to accommodate the different rolls, allowing us to establish a consistent CSV format.
Part of the iterative development involved error checking, that is, verifying the total income calculated from the CSV files against the totals given by the Exchequer clerk on the original roll. Ideally, the values should be either identical or have only minor differences. If the computed total is lower, this may be because details of the proffers have been lost through damage to the original roll. The computed total might be higher if additional proffers were added to the roll after the clerk provided the total. Either kind of discrepancy could equally indicate an error in parsing the TEI/XML, so every discrepancy requires investigation.
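A check of this kind can be sketched as below. The function name `compare_totals`, the tolerance parameter, and the sample figures are assumptions made for illustration; they are not taken from the project or from any roll.

```python
def compare_totals(proffers_pence, clerk_total_pence, tolerance_pence=0.0):
    """Compare the sum of the extracted proffers with the clerk's total.

    Returns a status string and the difference (computed minus clerk).
    """
    computed = sum(proffers_pence)
    difference = computed - clerk_total_pence
    if abs(difference) <= tolerance_pence:
        status = "match"
    elif difference < 0:
        status = "computed lower"   # e.g. proffers lost to damage, or a parsing error
    else:
        status = "computed higher"  # e.g. proffers added later, or a parsing error
    return status, difference


# Invented values, for illustration only.
sample = [9600.0, 320.0, 80.0]
print(compare_totals(sample, 10000.0))
print(compare_totals(sample, 10240.0))
```

The status only classifies the discrepancy; deciding whether it reflects the original roll or the markup still needs a person to look at the entry.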

The error checking facilitated a productive conversation between the project and VRTI, enabling the identification of errors caused by typos in the translations and markup. It also highlighted interesting features in the original rolls. For example, for E 101/230/28, the computed total was significantly greater than that provided by the clerk. The archivists at the TNA re-examined the roll and postulated that membranes from other rolls had been sewn onto this roll during repairs in the Victorian period or later.
Early access to the TEI/XML documents likely meant that more errors were encountered, as not all documents had undergone the whole VRTI editorial process. This resulted in significant time being spent tracking errors, which was not anticipated when the JGI project was conceived.
Analysis and Visualisations
Limitation in scope
After the data was processed, it became possible to analyse and visualise the proffers to the Irish Exchequer. There are forty surviving rolls for the reign. However, due to resource constraints, the analysis is limited to the 21 rolls that are ‘general’ in nature, meaning those relating to proffers from various sources and for different reasons. It does not cover the specialised rolls, such as those related to taxation.
The ‘landscape’ of the rolls
One of the first visualisations was created to understand the ‘landscape’ of the rolls, specifically what had survived and what had not. In the following plot, we display for each financial year whether we have data for each financial term or whether payments were received outside of those terms. A red box with a tick indicates we have data, and a white box with a cross indicates a gap. As you can see, there are gaps in survival (1281–82, 1283–84, 1289–90, 1297–98, 1302–03, and 1303–04), as well as years with only partial survival (1284–85, 1294–95, 1304–05).

However, even this does not provide a complete picture since 1280–1 has an incomplete entry for Michaelmas.
Annual and termly totals
Our dataset does not encompass all income received by the Crown. As noted, some years are missing or contain only partial data, and we do not include additional rolls related to specific sources of income, such as taxation. The subsequent plot depicts the total income from our available data for each financial year, not the actual income received by the Crown.

We can break down the total income into what was received per term for each financial year. The data is presented as a heatmap, with the darker colours indicating a greater amount of income received. The term that brought in the most income varied from year to year: for example, Michaelmas in 1285–86, 1286–87, and 1288–89; Easter in 1282–83, 1291–92, 1292–93, and 1301–02; and Trinity in 1306–07.
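The table behind such a heatmap is a year-by-term sum of the pence column. A minimal sketch follows, assuming rows with hypothetical `year`, `term` and `pence` keys and using invented sample values rather than figures from the rolls; in Pandas the same table could be produced with a pivot.

```python
TERMS = ["Michaelmas", "Hilary", "Easter", "Trinity"]


def termly_totals(rows):
    """Sum pence per term within each financial year."""
    totals = {}
    for row in rows:
        year_totals = totals.setdefault(row["year"], dict.fromkeys(TERMS, 0.0))
        # .get() also copes with payments recorded outside the four terms
        year_totals[row["term"]] = year_totals.get(row["term"], 0.0) + float(row["pence"])
    return totals


def richest_term(year_totals):
    """Return the term with the highest total for one year."""
    return max(year_totals, key=year_totals.get)


# Invented values, for illustration only.
sample_rows = [
    {"year": "1286-87", "term": "Michaelmas", "pence": "9600.0"},
    {"year": "1286-87", "term": "Easter", "pence": "240.0"},
    {"year": "1286-87", "term": "Michaelmas", "pence": "160.0"},
]
totals = termly_totals(sample_rows)
print(totals["1286-87"], richest_term(totals["1286-87"]))
```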

The following plot shows the number of proffers received as a percentage of the total extant proffers for each financial year.

Unlike the 1301–2 roll examined in the first project, Easter was not always the term that generated the highest income. However, similar to the 1301–2 roll, we can see in the following plot that, in terms of the number of proffers received each term as a percentage of the financial year, Michaelmas was often the busiest term.
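The percentage used here is the count of proffers in a term divided by the count for the whole financial year. As a sketch, with hypothetical `year` and `term` keys and invented rows:

```python
from collections import Counter


def term_share(rows):
    """Percentage of each year's proffers that fall in each term."""
    per_term = Counter((row["year"], row["term"]) for row in rows)
    per_year = Counter(row["year"] for row in rows)
    return {
        (year, term): 100.0 * count / per_year[year]
        for (year, term), count in per_term.items()
    }


# Invented rows, for illustration only.
share_rows = [
    {"year": "1286-87", "term": "Michaelmas"},
    {"year": "1286-87", "term": "Michaelmas"},
    {"year": "1286-87", "term": "Michaelmas"},
    {"year": "1286-87", "term": "Easter"},
]
shares = term_share(share_rows)
print(shares)
```

Because the denominator is the number of extant proffers, a year with only partial survival will overstate the share of the terms that do survive.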

Types of business
The proffers were categorised into five broad categories, namely, ‘farms and rents’, ‘profits of justice’, ‘customs’, ‘profits of escheatry, wardships, and temporalities’, and ‘other revenues’. The following plot shows the total income received per category for each financial year. By far the greatest source of income was the ‘profits of justice’ category.


Further work is required here, such as dividing the profits of justice into fines and amercements: a fine was a voluntary payment made to the king to gain favour or a privilege, such as obtaining a royal writ, whereas an amercement was a financial penalty imposed by the king or a court.
Sources of income
All the rolls specify the ‘source’ of a proffer, often a place, e.g., ‘Dublin.’ However, it can also refer to a group or other entity, e.g., ‘English debts of the merchants of Lucca’, or a specific cause, e.g., ‘By writ of England.’ The following plot shows the total income received per source in the dataset, for the twenty sources that recorded the most proffers. Dublin accounts for by far the largest number of individual proffers.
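Selecting the sources for such a plot takes two steps: rank the sources by number of proffers, then total the income for the top twenty. A minimal sketch, assuming hypothetical `source` and `pence` keys and using invented rows:

```python
from collections import Counter, defaultdict


def top_sources(rows, n=20):
    """Return (source, proffer count, total pence) for the n busiest sources."""
    counts = Counter(row["source"] for row in rows)
    income = defaultdict(float)
    for row in rows:
        income[row["source"]] += float(row["pence"])
    return [(source, count, income[source]) for source, count in counts.most_common(n)]


# Invented rows, for illustration only.
source_rows = [
    {"source": "Dublin", "pence": "240.0"},
    {"source": "Dublin", "pence": "120.0"},
    {"source": "Dublin", "pence": "60.0"},
    {"source": "Limerick", "pence": "9600.0"},
]
print(top_sources(source_rows, n=2))
```

In this invented sample the busiest source is not the one with the highest income, which is why the ranking criterion and the plotted value are kept as separate steps.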

Conclusion
Like other Digital Humanities projects, this initiative relied heavily on human labour, especially from archivists and historians who translated the original Latin documents into English and encoded those translations into TEI/XML documents. Although we could process machine-readable datasets, extra effort was needed to clean the data and ensure its accuracy. This additional work was understandable, as the VRTI TEI/XML was created to support a digital edition of the receipt rolls rather than for statistical analysis. However, this limited the time available for detailed analysis, with most work focusing on understanding what was present in the datasets, their limitations due to document loss, and providing a general overview of the payments received. Nonetheless, the project demonstrated opportunities to develop and explore further research questions with additional funding and time.
The project was undertaken by Mike Jones of Research IT and Brendan Smith of the Department of History, with the assistance of Elizabeth Biggs of the Virtual Record Treasury of Ireland and Paul Dryburgh of The National Archives, UK.


















