My (Kirstyn’s) portion of Team Stainforth’s presentation today will illustrate the origins of the project and the process of turning the Stainforth library catalog manuscript into data, that is, machine-readable and electronically shareable text. Since I am condensing three years of work into a few minutes, I want to use our project blog to emphasize two additional points:
- You have to dig into your data in order to learn what it means and why it is important. The significant labor required to edit and curate your data may feel like hours lost that you could have spent writing articles or reading in your field. In fact, you are doing something very similar to writing an article, and to both close and distant reading in your field. You are making knowledge that you can then share in traditional (print) and digital ways.
- While making and editing your data, it is crucial to have collaborators and a consistent, energetic communication stream to help you see what you cannot see on your own and to learn from your data. You cannot attack a project of this scale with a “single-author” mentality. (Caveat: in many fields, like English, you still need to produce single-author publications from data that required a team to create.) Every day, I learn from email communication with other researchers on our team. These moments range from catching errors we’ve made to discovering new authors or patterns in Stainforth’s acquisition and documentation habits. This morning, our researcher Cayla suggested creating a modern, standardized version of the digital catalog that will be more searchable than the raw transcription version.
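To make the idea of a standardized, more searchable version a little more concrete, here is a minimal sketch in Python of how a normalized form could sit alongside the raw transcription without replacing it. Everything in it is an assumption for illustration: the `CatalogLine` record, the `normalize` and `search` functions, and the sample lines are all made up, and none of it reflects how our actual files are structured.

```python
import unicodedata
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogLine:
    """One transcribed line, located by page and line number (hypothetical layout)."""
    page: int
    line: int
    raw: str  # the transcription exactly as entered, never altered


def normalize(text: str) -> str:
    """Build a search-friendly form: no accents, no punctuation, lowercase."""
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop combining marks so that accented letters match their plain forms.
    no_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Replace anything that is not a letter, digit, or space with a space.
    kept = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in no_marks)
    # Lowercase and collapse runs of whitespace.
    return " ".join(kept.lower().split())


def search(lines: list[CatalogLine], query: str) -> list[CatalogLine]:
    """Return every line whose normalized text contains the normalized query."""
    wanted = normalize(query)
    return [entry for entry in lines if wanted in normalize(entry.raw)]


# Invented sample lines, not real catalog entries.
sample = [
    CatalogLine(page=1, line=1, raw="Example, Anne. Poëms, &c."),
    CatalogLine(page=1, line=2, raw="Sample, Mary. Letters."),
]

for hit in search(sample, "poems"):
    print(hit.page, hit.line, hit.raw)
```

The point of the design is that the raw transcription stays untouched as the record of what the manuscript says, while the normalized form is derived from it on demand and can be regenerated whenever the normalization rules change.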
Here is a list of links that will animate our project for you. Explore our documents and data (read-only):
- Shared Google Drive folder with sub-folders for data, edited content, guidelines, and our sub-projects, like the mapping project. Many aspects of Google Drive make me regret the decision to use it for our files.
- Transcription Editing Guidelines
- We use annotated PDF pages to organize our transcriptions by page number and line number.
- Sources frequently used to help us decipher the ms: