Friday, 22 July 2016

The Manuscript Laboratory: The role of digital imaging in constructing the photographic biography of a manuscript

The following is the text of a talk given at a workshop organised by Professor Dr. Claudine Moulin and colleagues at the University of Trier Centre for Digital Humanities (the workshop was called Möglichkeiten der automatischen Manuskriptanalyse) on Feb 24th 2014. At the time, I was University of Wales Chair in Digital Collections at The National Library of Wales (NLW), running a collaborative research programme on the digital collections of Wales. I was interested in how cultural heritage organisations which hold cohesive national digital collections, together with collaborative frameworks to support interdisciplinary research, can create new opportunities to build a digital workspace that can effect the sort of transformative research that has been the promise of digital humanities for many years. In the talk, I looked at the history of 'copying' at NLW, and at how the digital dissemination of knowledge is part of a long tradition of copying and distributing manuscripts in Libraries. I also looked at the ways that some of NLW's most iconic material, including the Hengwrt Chaucer manuscripts, has been copied and disseminated over the past hundred years. As new methods of manuscript digitisation (including RTI and hyperspectral imaging) are developed, I make the argument that we should be moving away from 'mass digitisation' towards 'slow digitisation': that we can learn more from looking at all the copied versions of a manuscript than from a form of 'distant reading' of whole collections. Libraries should spend effort on bringing together all existing copies of a manuscript (rotograph, photostat, photograph) with images gathered using new technologies, making the complete 'biography' of a manuscript available to scholars. I've also written about some of the preparatory work for this in an article taken from a keynote I gave at the Sheffield HRI Digital Humanities Conference in 2012.
As this talk was written to be read, it's a little rough in places, but I do hope to continue this research now I am at Glasgow University and write this up as a fuller article in the near future. Some of these themes were also taken up at a workshop I organised with Andrew Prescott at NLW in 2015, and we hope to run a follow-on event next year.

As ever, I am enormously grateful to my former colleagues at NLW for their help with developing this research. 

Digitization at the National Library of Wales

The National Library of Wales is a legal deposit library, established by Royal charter in 1907. It is the preeminent repository of information for Wales, offering a world class collection of documentary heritage, including numerous rare, valuable and significant works. The foundation collections of NLW are its manuscript collections: the Peniarth, Llanstephan and Cwrtmawr collections. The Peniarth Collection is listed in UNESCO’s Memory of the World register.

The Library has always had an appreciation of the importance of state-of-the-art technologies for access to its collections for education and research, and infrastructures in Wales support connected communities linked to library and archive resources: since devolution, the Welsh Government has made cooperation, collaboration and digital delivery key areas of focus.

As a result, NLW created A National Digital Public Library of Wales: a distinct, unified national collection that is freely available to users as the “research data” for all disciplines. Most of our content is licensed through Creative Commons licenses for free use and re-use. The Library has built internal expertise and capacity in the entire digital lifecycle: selection, conservation, capture, management and preservation. Copyright and other intellectual property rights are cleared as a managed part of the digitisation process: where material is on deposit and/or the current rights holders are known, permission is requested, and where it is declined, materials are not used; when a current rights holder is unknown, reasonable efforts are made to identify and/or contact them. Digitised resources are licensed for re-use and re-purposing under an open license (ideally, BY-NC-SA: Creative Commons Attribution-NonCommercial-ShareAlike license). A fundamental principle is that free access is key to realising the potential community, social, research and economic benefits of digitised resources.

There are several strategic objectives for digitizing NLW collections. The paramount one is access: the Library has approximately 85,000 physical users per year, but over 2 million online users. The remote location of the Library is a primary driver for making resources accessible digitally – not everyone can get to Aberystwyth to work with the original materials! Online access ensures the material reaches researchers, students, and the public (especially the Welsh diaspora) worldwide. Digitization also offers enhanced access to primary sources, building in functionalities including the ability to search, browse, collate and annotate sources. Digitisation also supports preservation – while not a preservation medium, digital access protects rare and fragile materials from handling, and also identifies conservation needs, as selection for digitization is an opportunity to carry out an inventory of material that is not in circulation, or is uncatalogued. Another key reason for digitisation is collections enhancement and reunification – digitization is an opportunity to bring materials together. For example, the AHRC-funded research project Imaging the Bible in Wales brought together a collection of manuscripts from archives and special collections all around Wales. And finally, there is the potential for digital collections to effect a transformation of scholarship: the traditional library is now a digital research infrastructure, with reading rooms replaced by Internet browsers, and primary sources accessible for new types of analysis using computer tools and methods.


NLW Research Programme in Digital Collections

In 2011, in recognition of these developments in digital scholarship, the Library set up a research programme in Digital Collections with the establishment of my post, a research Chair funded by the University of Wales. We’ve developed a fairly large portfolio of projects with two main areas of focus: Better and increased use of our existing digital content for research, and creating new digital resources that address specific research challenges across the disciplines. We are building the programme on the principle of digital humanities: using digital humanities methods and tools to foster scholarship across the disciplines, and to act as a bridge between content and curators, building essential collaborative relationships that integrate research into all aspects of our collections.  

It’s also built on the assumption that digital humanities is about working with digital content, using tools and methods for the analysis and interpretation of this content, and communicating the results of this work to the widest possible audience using traditional and non-traditional publishing methods, allowing greater engagement with research and research data than was previously possible: this binds scholarship to research infrastructures in ways that are deeper and more explicit than we are generally accustomed to in scholarship, and makes it dependent on networks of people (Kirschenbaum, 2011).

This transforms humanities research in two ways:

- Firstly, by facilitating and enhancing existing research, by making research processes easier via the use of computational tools and methods,

- And secondly, by enabling research that would be impossible to undertake without digital resources and methods, and asking new research questions that are driven by insights only achievable through the use of new tools and methods.

Greg Crane, Humboldt Professor at the University of Leipzig, has referred to this work as e-Wissenschaft, reflecting that the best examples of digital humanities are a new intellectual practice with elements that distinguish qualitatively the practices of intellectual life in this emergent digital environment from print-based practices (Crane, 2009).

One of the key elements of divergence from traditional scholarly practice is that the digital humanities is collaborative: as the field matures, it is becoming recognized as one in which the best research is created through partnerships between different aspects of research, and indeed, between researchers from multiple disciplines and stakeholder communities – researchers across the arts and humanities and scientific disciplines, librarians, archivists, cultural heritage staff, funders, technical experts, data scientists… In many ways, the library is the ideal locus for digital humanities, as a place where all this comes together around the original source materials.

In order to think through some of these questions, and to see how the research library is becoming a digital research infrastructure, it’s useful to look at some of the National Library’s work with manuscripts. The next few images show examples of some of the things that happen to a manuscript in a Library, some of the conversations around its use, and the way that information about a Library’s manuscripts is collected and used.

Copying is, of course, part of this documentation and information gathering, and going back to the early histories of reproduction of manuscripts allows us to see digitization as a continuum of the use of current technologies to construct knowledge about specific manuscripts.

In 1919, the Library acquired a Photostat machine, and began advertising the possibility of supplying reproduction copies of its manuscripts. An advertisement in the journal Welsh Outlook, from January 1920, read: "The National Library of Wales: By means of the Photostat recently installed the National Library can supply at very reasonable rates facsimile reproductions from manuscripts, books, maps, prints, drawings, etc., for the use of students and others. Enquiries should be addressed to the Librarian, National Library of Wales, Aberystwyth".

One of the main reasons for the acquisition was the ability to make copies of manuscripts and to send them to schools in Wales for education purposes. The image below is a negative photostat print of NLW Peniarth 610 MS 191:  




Photostats were also popular with scholars, who used them for research and to illustrate journal articles. Foreshadowing Creative Commons licenses, the Library didn’t restrict re-use of these images, seeing getting its content out there as part of its mission.

The Library’s archive of correspondence shows who requested these images – and what they did with them. By 1926, a Professor John M. Manly in Chicago was writing to the Librarian for a Photostat of Peniarth MS 392D, the Hengwrt Chaucer…



…which the Library promptly sent, although for some reason it sent the positive copy to Chicago, keeping the negative:



Manly and the Librarian kept up a lively correspondence over the years, which forms an interesting record of ‘technology transfer’ – when Manly observed fluorescence technology in use at the Huntington in 1930, he wrote to NLW to suggest that this was a new technology the Library could use. NLW was equally enthusiastic about the new technology, and had invested in fluorescence cabinets. Ballinger wrote to Manly to report that staff and readers had good results with the cabinets, with some readers spending 'a whole day' reading 'difficult' manuscripts under the lights:



Manly and Edith Rickert’s research on Hengwrt found its way back to the Library – in 1939, their research into the Hengwrt Chaucer was published in the NLW Journal, illustrated with a new photograph of the manuscript (a record of the order for the photograph can be found in the Library's archives).



In the archive of correspondence, we see the original draft of the article, and the editorial comments by Manly and Rickert. Interestingly, the Chaucer “workshop” was originally called the “Chaucer Laboratory” – I like to think that this use was over-ruled as an unnecessarily scientific nomenclature, but it's an interesting way of conceptualising the ways that the research was dependent on early imaging technologies. 



Other documents in the NLW archives of correspondence between Manly and Ballinger, and between Manly and Ballinger’s successor as Librarian, William Llewelyn Davies, show that the connection between the University of Chicago and Aberystwyth was a mutually beneficial collaboration over many years. Manly and Rickert came to Wales to authenticate the ‘Merthyr fragment’ in 1936, and also to advise on the re-binding of Hengwrt, showing how fluid the relationship between library and scholar was around these manuscripts.

The next major technology to be adopted by the Library was, of course, microfilming, and the existence of the photographic section made it possible for the Library to adopt this technology. From 1941-45, the Library was the home of many of the treasures of the British Library and the British Museum, which were moved to Aberystwyth for safekeeping, and stored in the Library “Cave” for the duration of the war. During this time, many of these materials were microfilmed – partly as an insurance against the possible loss of the originals, but again, for access, preservation and as an attempt to better represent the information in collections to scholars. These microfilms found their way to the Library of Congress (British Manuscripts Project: A Checklist of the Microfilms Prepared in England and Wales for the American Council of Learned Societies, 1941-1945), promoting Welsh manuscripts to an even wider audience.

The 'Cave' at NLW today
The 'Cave' at NLW during the Second World War












The use of digitization technologies to increase and enhance access to the collections of Wales can be seen as a continuum of the enthusiasm and innovation attached to the adoption of new technologies – be they Photostat or microfilm – throughout the history of the National Library of Wales, and as a pragmatic response to particular issues associated with the Library’s mission, collections, history, and location. The National Library has been slowly digitizing the manuscript collection and putting it online, and exploring the use of emerging imaging technologies for analysis of manuscripts. 

In 2013, a team from the Mellon-funded ‘Digitally Enabled Scholarship with Medieval Manuscripts’ project at Yale came to NLW to carry out multispectral imaging of our three Chaucer manuscripts. This method of capture enables imaging across the colour spectrum, to highlight different aspects of an image.

A multispectral image is one that captures image data at specific frequencies across the electromagnetic spectrum. The wavelengths may be separated by filters or by the use of instruments that are sensitive to particular wavelengths, including light from frequencies beyond the visible light range, such as infrared. Spectral imaging can allow extraction of additional information the human eye fails to capture with its receptors for red, green and blue. It was originally developed for space-based imaging.

Multispectral imaging divides the spectrum into bands – in our case, seven. Each band is acquired as one digital image (in remote sensing, called a 'scene') in a narrow range of wavelengths. The bands can run from the visible red-green-blue (RGB) region, at 0.4 µm to 0.7 µm, into the infrared wavelengths of 0.7 µm to 10 µm or more, which are classified as near infrared (NIR), middle infrared (MIR) and far infrared (FIR, or thermal). The scenes are combined to form a seven-band multispectral image.
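To make the idea of combining scenes concrete, here is a minimal sketch in Python. It assumes each scene is a small greyscale image held as rows of pixel values; the function name `stack_scenes` and the toy 2x2 scenes are my own illustration, not the Yale team's software or data.

```python
from typing import List

Scene = List[List[float]]  # one greyscale image: rows of pixel values


def stack_scenes(scenes: List[Scene]) -> List[List[List[float]]]:
    """Combine per-band scenes into one image whose pixels are spectra.

    The result is indexed as image[row][column][band].
    """
    if not scenes:
        raise ValueError("at least one scene is required")
    height, width = len(scenes[0]), len(scenes[0][0])
    for scene in scenes:
        if len(scene) != height or any(len(row) != width for row in scene):
            raise ValueError("all scenes must share the same dimensions")
    return [
        [[scene[y][x] for scene in scenes] for x in range(width)]
        for y in range(height)
    ]


# Seven hypothetical 2x2 scenes; each holds a constant value (its band
# number) so that the stacking is easy to follow by eye.
scenes = [[[float(band)] * 2 for _ in range(2)] for band in range(7)]
cube = stack_scenes(scenes)
```

After stacking, every pixel carries seven values, one per band, so `cube[0][0]` is the spectrum recorded at the top-left pixel.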

This technology has also assisted in the interpretation of ancient papyri, such as those found at Herculaneum, by imaging the fragments in the infrared range (1000 nm). Often, the text on the documents appears to the naked eye as black ink on black paper. At 1000 nm, the difference in light reflectivity makes the text clearly readable. It has also been used to image the Archimedes palimpsest, by imaging the parchment leaves in wavelength bands from 365-870 nm, and then using advanced digital image processing techniques to reveal the undertext of Archimedes’ work.
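The point about reflectivity can be sketched as a simple calculation: for each band, compare how much light the ink and the substrate reflect, and pick the band where they differ most. The sketch below uses Michelson contrast as one possible measure, which is my choice rather than anything the imaging projects specify, and the reflectance figures are invented for illustration only; they are not measurements from any papyrus.

```python
def michelson_contrast(ink: float, substrate: float) -> float:
    """Contrast between two reflectance values, from 0 (identical) to 1."""
    total = ink + substrate
    return abs(ink - substrate) / total if total else 0.0


# Illustrative reflectance values only (fraction of light reflected),
# keyed by wavelength in nanometres. These numbers are made up.
reflectance_by_band_nm = {
    450: {"ink": 0.04, "substrate": 0.05},
    650: {"ink": 0.05, "substrate": 0.07},
    1000: {"ink": 0.06, "substrate": 0.30},
}

contrast = {
    nm: michelson_contrast(values["ink"], values["substrate"])
    for nm, values in reflectance_by_band_nm.items()
}

# The band in which ink and substrate are easiest to tell apart.
best_band = max(contrast, key=contrast.get)
```

With these toy numbers the visible bands show almost no contrast, while the infrared band separates ink from substrate, which is the effect described above.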

Here you can see the outline of different captures at each level of the spectrum, and the composite image, incorporating all seven captures: 



For the Yale project, ultimately, these images will be presented alongside related manuscripts from other Libraries, like the Huntington Ellesmere Chaucer, using a ‘shared canvas’ for annotation.

The following images show how important the early captures are in the history of a manuscript. This is a manuscript from the Llanstephan collection. The 1941 microfilm shows a fairly legible text – but a recent 2013 digital capture shows text loss, probably due to a conservation incident in the 1950s.



NLW imaging experts carried out ultraviolet imaging to see if some of the text elsewhere in the manuscript could be made legible again, but with very limited success. This is a case where the 1940s microfilm image is the only record of some of the intellectual content in a manuscript, showing the importance of keeping all copies of images of manuscripts taken over the years. Information can be captured by some processes (especially rotographs or even photostats) that is not to be seen in more recent images.

This shows how all types of image capture (not just digital imaging) become part of the biography of a manuscript. Photostats, microfilms and digital captures all tell us new things about a manuscript and what has happened to it over the years. We can’t anticipate what will happen to manuscripts in the future, and we also can’t predict if new technology will give us new ways to read and understand manuscripts that we may have captured digitally: one of the great benefits of digital content is its use for rare and unforeseen purposes, which again is an argument for retaining all historic image captures and making them available to scholars for analysis.
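One way to picture a manuscript's photographic biography is as a dated list of captures, whatever the technique. The following is a minimal sketch in Python, assuming a hypothetical `Capture` record of my own devising (it is not an NLW catalogue schema); the two entries are the Llanstephan captures described above.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Capture:
    """One photographic or digital capture of a manuscript (hypothetical record)."""
    year: int
    technique: str
    note: str = ""


def biography(captures: List[Capture]) -> List[Capture]:
    """Return the captures in chronological order."""
    return sorted(captures, key=lambda capture: capture.year)


def earlier_witnesses(captures: List[Capture], year: int) -> List[Capture]:
    """Captures made before a given year, e.g. before a known text loss."""
    return [capture for capture in biography(captures) if capture.year < year]


# The two captures of the Llanstephan manuscript mentioned in the talk.
llanstephan_captures = [
    Capture(2013, "digital capture", "shows text loss"),
    Capture(1941, "microfilm", "text fairly legible"),
]

timeline = biography(llanstephan_captures)
before_2013 = earlier_witnesses(llanstephan_captures, 2013)
```

Keeping every capture in one structure like this means a scholar can ask which earlier images survive from before any given date, rather than seeing only the most recent digital surrogate.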

Many different methods for digital capture and presentation of manuscripts are now becoming part of a manuscript Library’s documentation and dissemination of manuscripts.  The ease of access to a large body of manuscripts from a collection enables the manuscript scholar to work in different ways – to take a more 'archival approach',  working through large quantities of manuscripts, captured in many ways, and ideally also using the related documentary materials, such as the correspondence between Manly and the Librarians of Wales documented above. 

However, if a Library is to be more than just a digital photocopy service, publishing pretty pictures on its website, there must be direct engagement with scholars who can use these images for analysis and interpretation. Libraries also need scholars to be involved in the dialogue of encouraging the adoption of new technologies – it is the researchers that can advise on candidate manuscripts for this sort of imaging and presentation, and provide annotations and shared texts. This way, all the data that we gather digitally about manuscripts can become integrated into the scholars’ toolkit, part of the digital ecosystem that supports research.

Once manuscripts are available as digital images – especially those that capture different aspects of the image through multispectral and other technologies – a range of methods is available to support scholars who wish to ask new questions, or explore old questions in new ways. For example, similarities and distinctions in hands can be systematically measured and calculated, enabling analysis of the number of scribes, the processes used in creating manuscripts, and so on. It’s also possible to analyse fragments for the purpose of reconstruction, as well as using hyperspectral and UV images to recover text. All this requires access to the full collection of images of a manuscript, including those developed in the early era of photographic reproduction. This calls for a more integrated approach to digital dissemination by libraries, and a focus on 'slow digitization' as opposed to 'mass digitisation': deploying the time and cooperation of manuscript libraries to work slowly and acquire all available photographic documentation of a manuscript, allowing for the establishment of 'layers' of data that add information to an archive. This sort of work requires digitisation in depth rather than mass: it will take more time, but ultimately provide a richer archive for scholars.

One of the great advantages of basing a DH programme in a Library is the ability to build bridges seamlessly, and in a sustainable way over the long term, between the expertise required to explore the potential of digital imaging, scholars who are experts on the manuscripts, and expertise in digital methods that are used across the disciplines as they become familiar from other projects. This sort of collaboration will contribute to the resources available to manuscript scholars in the digital age, which ideally should not just be about the use of static tools and methods, but about fostering a more fluid environment of interdisciplinary co-production. The digital research infrastructure to support this sounds a bit like the ‘manuscript laboratory’ envisaged back in 1939. We can see this vision as prescient or optimistic but, regrettably, it is more likely (given the lack of resources and institutional will for such partnerships) wishful thinking.