Dr Gavin Robinson: Services

Last updated: 5 June 2021. I will not be available until July 2021.

All of the petitions that I transcribed for The Power of Petitioning have now been published at British History Online.

Contact details.

I offer high-quality historical manuscript transcription services at a price that large academic projects can afford. I have 24 years' experience of palaeography from my own academic research, and have been doing professional manuscript transcription work for over 12 years. Some of this work has been published at British History Online. I specialize in transcribing English-language documents from the 16th century onwards, and can also extract data from formulaic Latin documents and transcribe printed text in any language that uses the Latin alphabet. See the headings below for more details of services and prices, and published examples of my work. I can give free estimates to help with project planning and funding applications, with no obligation to use my services if the application is successful. I am based in the UK but often work for overseas clients, especially in the US. Most of my work is for organizations, but I can also work for private individuals.

Historical Manuscript Transcription and XML Markup

I can deliver very accurate full-text transcriptions of historical manuscripts according to any transcription conventions you specify. I can usually expand abbreviations if required. Transcripts can be delivered as plain text, word processor files, or XML. I can add basic XML markup at no extra cost if you can supply a schema and human-readable tagging instructions. This markup can include the basic structure of the text, named entities, and dates. I am familiar with TEI P5.
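As a rough illustration of what basic markup for named entities and dates can look like, here is a minimal Python sketch. The sentence, the names in it, and the `tag_line` helper are invented for this example and are not taken from any client project; the element names (`persName`, `placeName`, `date` with a `when` attribute) follow TEI P5, but a real project would use whatever its own schema and tagging instructions require.

```python
import xml.etree.ElementTree as ET

def tag_line(segments):
    """Build a TEI-style <p> from (text, tag, attributes) segments.

    tag is None for plain text; otherwise the text is wrapped in that element.
    This helper is hypothetical and only handles one level of nesting.
    """
    p = ET.Element("p")
    last = None  # most recently added child element
    for text, tag, attrs in segments:
        if tag is None:
            # Plain text goes in p.text before the first child element,
            # or in the tail of the previous child afterwards.
            if last is None:
                p.text = (p.text or "") + text
            else:
                last.tail = (last.tail or "") + text
        else:
            last = ET.SubElement(p, tag, attrs)
            last.text = text
    return p

# Invented sample sentence, not a real transcript.
segments = [
    ("The humble petition of ", None, {}),
    ("John Smith", "persName", {}),
    (" of ", None, {}),
    ("Chester", "placeName", {}),
    (", dated ", None, {}),
    ("5 June 1648", "date", {"when": "1648-06-05"}),
    (".", None, {}),
]
xml_line = ET.tostring(tag_line(segments), encoding="unicode")
print(xml_line)
```

The `when` attribute holds a normalized date alongside the original wording, so the transcript stays faithful to the manuscript while the date remains machine-readable.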

You will have to supply digital images of the pages to be transcribed, and get copyright clearance if necessary. High-quality images are easier to use, but I can deal with whatever you've got, even if it's too difficult for HTR/OCR software or unskilled double keyers.

Prices and timescales for full transcripts vary according to the number of words per page, image quality, and difficulty of handwriting. Assuming an average of around 250 words per page, I could deliver up to 400 pages per month at the following prices per page:

These are full prices for small contracts. We may be able to negotiate lower prices for large contracts.

To give an accurate quote, I would ideally need to see all of the pages to be transcribed, but this isn't always necessary if the documents are in a very standard form or I'm already familiar with them. Prerogative Court of Canterbury wills in PROB 11 will cost three times the above prices per full page because they contain a very large amount of text and the scans available online are very low quality.

I use the following methods (which are included in the prices given above) to increase accuracy for unstructured full text transcripts:

I do not use double keying, because it is always likely to be at least one of:

Trying to fix one of these problems will inevitably make one of the others worse. Double keying is fundamentally flawed and should be avoided.

Data Entry

I can enter structured data into a spreadsheet, database, or XML file. This is easiest to do if the original document is already structured, but I can also extract structured data from unstructured or semi-structured documents.

I will need to test transcribing a sample of data before I can quote a price. Some basic data cleaning checks will be included at no extra charge.

Data Cleaning

I can check and correct existing structured data, or apply checks to data that I have entered myself. I use a combination of OpenRefine, LibreOffice spreadsheet, and custom Python scripts. Checks and corrections typically include:

I will need to see the whole dataset before quoting a price, because the cost depends more on the number of unique values than on the total number of records. This kind of work can be difficult to cost in advance, so it may have to be done as employment or casual work paid by the hour.
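To show why the number of unique values matters more than the number of records, here is a small Python sketch that groups variant spellings under a shared key, so that each distinct value only needs to be reviewed once. The sample names and the `fingerprint` and `cluster_variants` functions are invented for illustration; the keying is a simplified approach loosely in the spirit of key-collision clustering, not a description of my exact scripts.

```python
import string
from collections import defaultdict

def fingerprint(value):
    """Simplified key: lowercase, strip punctuation, sort the unique tokens."""
    no_punct = str.maketrans("", "", string.punctuation)
    cleaned = value.strip().lower().translate(no_punct)
    return " ".join(sorted(set(cleaned.split())))

def cluster_variants(values):
    """Group distinct values that share a key; keep only groups with variants."""
    clusters = defaultdict(set)
    for value in values:
        clusters[fingerprint(value)].add(value)
    return {key: sorted(group) for key, group in clusters.items() if len(group) > 1}

# Invented sample data: five records, four unique values.
records = ["Smith, John", "John Smith", "john  smith", "Smyth, John", "John Smith"]
unique_values = set(records)
clusters = cluster_variants(records)

print(len(records), "records,", len(unique_values), "unique values")
for key, variants in clusters.items():
    print(key, "->", variants)
```

A key like this only catches differences in case, punctuation, spacing, and word order. Genuine spelling variants such as "Smyth" still need a human decision, which is why this kind of work is hard to cost in advance.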

Data Wrangling

I can extract data and convert it to other formats for reuse elsewhere. For example, I developed a semi-automated process to create wiki pages for Linking Experiences of World War One. This involved extracting catalogue records for WO 95 war diaries from TNA's Discovery catalogue, manually cleaning and reconciling the data, and using custom Python scripts to generate wiki XML that could be imported into MediaWiki. This method created basic pages for around 7,000 individual military units with much less effort than creating pages manually. This kind of work can be difficult to cost in advance, so it may have to be done as employment or casual work to be paid by the hour.
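The final step of a process like this, generating import XML from cleaned records, could be sketched as follows. This is a simplified illustration, not the actual script: the unit names, the reference numbers, the page layout, and the `build_import_xml` function are all invented, and the XML is a stripped-down version of the MediaWiki import format. A real import file may need namespace and version attributes that match the target wiki.

```python
import xml.etree.ElementTree as ET

def build_import_xml(units):
    """Build a minimal MediaWiki-style import document with one page per unit."""
    root = ET.Element("mediawiki")
    for unit in units:
        page = ET.SubElement(root, "page")
        ET.SubElement(page, "title").text = unit["name"]
        revision = ET.SubElement(page, "revision")
        text = ET.SubElement(revision, "text")
        # Hypothetical page layout: bold unit name plus a catalogue reference.
        text.text = (
            f"'''{unit['name']}'''\n\n"
            f"* War diary reference: {unit['reference']}\n"
        )
    return ET.tostring(root, encoding="unicode")

# Invented placeholder records standing in for cleaned catalogue data.
units = [
    {"name": "1st Example Battalion", "reference": "WO 95/0000"},
    {"name": "2nd Example Field Company", "reference": "WO 95/0001"},
]
import_xml = build_import_xml(units)
print(import_xml)
```

Building the XML with a library rather than string concatenation means characters such as `&` or `<` in unit names are escaped automatically.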

Examples of my work

The identities of my clients and the work I do for them are kept confidential by default, but these clients have chosen to credit me on their websites or social media. These examples show that I am capable of producing high-quality work suitable for academic research and publication within the budgets of AHRC and ESRC grants.