The Civil War: Letters From the Front

Welcome. This is a collaborative undertaking to make available the many letters from Union soldiers that exist in the Widows' Certificates pension files at the US National Archives. For a complete description, please see the "Project Proposal" page. If you would like to contribute, please read the instruction page and contact the administrator for more information.

Project Proposal



Rediscovering the Civil War: Digitally, Collaboratively, Personally
by Carolyn McCarthy
Introduction
            An overabundance of information has long been a fundamental truth about the National Archives and Records Administration (NARA). The seventy-six-year-old institution has responsibility for (so far) two hundred and thirty-four years of United States history. Over the decades, the National Archives has had to expand continually in order to physically accommodate the growing number of records added to the nation’s history every year. In just the past decade, however, the daunting thought of digitization has become central in the minds of many archivists. Digitization is crucial to accessibility, and to upholding NARA’s mission statement of “[ensuring] continuing access to the essential documentation of the rights of American citizens and the actions of their government” in today’s increasingly digital world.[1] Consequently, in recent years the National Archives has steadily undertaken projects to digitize its highest-priority records. Both internally and under various partnerships, NARA is making available many of the records most frequently pulled by researchers, in an effort both to increase accessibility and to decrease the wear and tear of physically handling the aging records.
            I work for the largest digitization project within NARA’s Office of Records Services (NW). The Civil War Conservation Corps (CWCC) is working to digitize 1.28 million widows’ pension files from NARA’s Civil War and Later Pension Files, Widows’ Certificate Series.[2] Starting in 1862 and going up through the Spanish-American War, the “WC” series is one of the richest within the National Archives in terms of containing unique and personal records from one of the most interesting—and most researched—eras in American history. Unlike other often-researched Civil War records such as service records, medical records, and muster rolls, the WC pension records contain personal documents such as marriage certificates, baptismal records, and, most importantly, letters from soldiers on the front. These letters can range from mundane to profound, but each of them signifies a powerful piece of American history.
            In partnership with Footnote.com and FamilySearch (or the Genealogical Society of Utah), the CWCC is steadily digitizing pension records. When the project began in 2007, it was estimated to take ninety years to complete. Three years later, after tripling our volunteer support and continuously working to streamline the process, the current best estimate is fifty-three years. The process is moving as fast as it possibly can, and its speed (or relative lack thereof) is for very good reasons of thoroughness and quality control. Each and every file “prepped” by the CWCC is opened up, dusted off, rid of superfluous blank papers, and put in a specific order. When the project began, volunteers pulled out “selected documents” from the file: the brief, the original application, proof of service, proof of death, proof of marriage, and proof of children. After that, they were asked to put the remaining papers in chronological order. The chronological ordering was soon abandoned. Files can range from 10 pages to 500 pages, and so at a certain point, thoroughness finds itself at odds with practicality. Now, the volunteers only separate out selected documents. The rest of the papers must still be individually looked at, unfolded, and assessed for conservation issues, but they no longer need to be in any order. Our team made this decision based on the assumption that at a certain point, we had to let researchers be researchers. Doing the work of totally organizing the file ahead of time might be well appreciated, but it was making the process of getting the files online painfully slow. After preparation, every file must go through a quality assurance check. The “QA” person (usually me) goes through only the selected documents to ensure everything is there and that the target sheet (the written page on which the indexers base their metadata) is correct. After that, the boxes are sent by the cartful to the conservation lab, before finally making it to the camera scanners. This extremely long process has its pitfalls and bottlenecks, but it is the best practice possible in light of the needs of NARA’s partners and the demands of the researcher.
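As a rough illustration only, the figures quoted in this proposal can be combined in a short Python sketch: 1.28 million files in the series, roughly 69,000 files prepared to date (the count reported just below), and about three years of work. The sketch assumes, purely for illustration, that files continue to be prepared at the average rate achieved so far; this is not necessarily how the fifty-three-year estimate was actually produced, and the function name is hypothetical.

```python
# Back-of-the-envelope projection, assuming a constant preparation rate.
# The three figures come from this proposal; the linear model is an assumption.
TOTAL_FILES = 1_280_000    # size of the WC series
FILES_PREPARED = 69_000    # files prepared for digitization so far (approximate)
YEARS_ELAPSED = 3          # the project began in 2007


def years_remaining(total, done, elapsed_years):
    """Years left if files keep being prepared at the average rate so far."""
    rate_per_year = done / elapsed_years
    return (total - done) / rate_per_year


estimate = years_remaining(TOTAL_FILES, FILES_PREPARED, YEARS_ELAPSED)
print(round(estimate))  # prints 53
```

Under that simple assumption the projection lands close to the current best estimate, which suggests the estimate is sensitive mainly to how many files the volunteers can prepare each year.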
            There are currently 38,000 files available on Footnote.com. At the CWCC, we are about to pass the 69,000 mark in files prepared for digitization. Countless letters from soldiers are now available online, and countless more are waiting to be discovered by CWCC volunteers. However, the Footnote metadata model offers no way to single out files that contain personal letters. Footnote.com adheres to the NARA directive concerning digital collections, which recognized that “a deep level of description at the item level… is not usually accommodated by traditional archival descriptive practices. The functional purpose of metadata often determines the amount of metadata that is needed. Identification and retrieval of digital images may be accomplished on a very small amount of metadata; however, management of and preservation services performed on digital images will require more finely detailed metadata.”[3] Accordingly, Footnote focuses on making the collection available online as quickly as possible, limiting itself to bare bones metadata: soldier’s name, pensioner’s name(s), file number, and military unit information. Beyond that, one can narrow the search by “selected documents” within the file. Although digitized for all the world to see, these letters from soldiers are essentially lost in the digital void—discoverable, but hidden. This is where my project proposal comes in.
Proposal
            I propose to create a digital repository for letters found in the WC pension series. Starting by simply working to collect the letters (which is no easy task), I can foresee the site becoming a place for researchers to go to find sources, exchange ideas, and collaborate with peers they might never have come into contact with before the advent of new media. While Footnote has exclusive rights to its own digital images for five years from their creation date, there is, for now, no obstacle to individuals continuing to pull records and digitize them independently. They are public records, and NARA is obliged to make them available to anyone who asks for them, a responsibility about which NARA is so vigilant that records are occasionally pulled right out from underneath FamilySearch’s cameras to be served to the Central Research Room. Once the entire series is digitized, this situation might change: it is NARA’s hope that the files’ digital availability will enable the physical records to be permanently retired and sent to a remote storage facility. In the meantime, however, the letters are simply waiting to be found. And, thanks to the Footnote project, they are being found. It would be a simple task to give each CWCC volunteer a pad of paper and ask him or her to write a file number down upon coming across a letter. Currently, there is a long list of WC numbers identifying files that contain letters, based on my own pad-of-paper system. Contributors, of course, are encouraged to add to this list, or to update it when they complete a digitization. Whether letters are singled out in this manner or come across by non-CWCC researchers, the important part is that my proposed site gives these letters a place to go. The site will be a place for individuals to share the letters they happen upon. Anyone is welcome to post letters, and likewise anyone is encouraged to help in transcribing the images. By crowd-sourcing the discovery process, the site will be a place for letters to be found more easily, made available more quickly, and (hopefully) used and discussed in a richer collaborative context.
Form & Content
            The project currently exists in the form of a blog. This is merely for reasons of my own inexperience in website building, and to make open collaboration as simple as possible to begin with. However, with potential funding and the involvement of those with the relevant expertise, I envision the project becoming a database much like Footnote.com. The identifying information for the files that contributors are asked to post would become metadata to make the browsing process as user-friendly as possible. An important aspect of the project in its current state is the labels. Beyond the name and unit information contained in the actual post, contributors are invited to “tag” posts with virtually any label they find relevant. Diseases, battles, locations, topics of conversation, and sentiments are all important identifiers that I can foresee researchers being interested in specifically. If every love letter were tagged with the simple word “love,” researchers would be able to narrow their selection accordingly. Because of the endless possibilities of relevant tags, I do not envision these becoming metadata. There would simply be too many categories to make it a useful narrowing-down process. Instead, I propose limiting the actual indexed metadata to military unit identifiers, locations (where the letter was written), and type of letter. Type of letter would divide into two major categories: letters from soldiers and letters announcing a soldier’s death. Within the former category, metadata could narrow down letters by recipient: wives, mothers, friends, etc. In addition to the specifically indexed metadata, researchers would be able to search the tags to get a wider assortment of letters according to their specific interest. Of course, all of these potential innovations depend on continued user collaboration.
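The split between indexed metadata and free-form tags described above could be sketched in Python roughly as follows. This is a minimal illustration, not a design for the eventual database: the class names, field names, and the `search` helper are all hypothetical, and the two sample records are invented placeholders rather than real pension files.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional, Set


class LetterType(Enum):
    FROM_SOLDIER = "letter from a soldier"
    DEATH_NOTICE = "letter announcing a soldier's death"


@dataclass
class Letter:
    # Identifying information contributors are asked to post.
    wc_file_number: str
    soldier_name: str
    unit: str
    # Proposed indexed metadata.
    location: str
    letter_type: LetterType
    recipient: Optional[str] = None  # used only for letters from soldiers
    # Free-form labels: searchable, but deliberately not indexed metadata.
    tags: Set[str] = field(default_factory=set)


def search(letters: List[Letter], unit=None, location=None,
           letter_type=None, recipient=None, tag=None) -> List[Letter]:
    """Return the letters that match every criterion that was supplied."""
    results = []
    for letter in letters:
        if unit is not None and letter.unit != unit:
            continue
        if location is not None and letter.location != location:
            continue
        if letter_type is not None and letter.letter_type != letter_type:
            continue
        if recipient is not None and letter.recipient != recipient:
            continue
        if tag is not None and tag.lower() not in {t.lower() for t in letter.tags}:
            continue
        results.append(letter)
    return results


# Invented placeholder records, not real pension files.
sample = [
    Letter("WC-000001", "John Smith", "70th New York Infantry",
           "Camp A", LetterType.FROM_SOLDIER, "wife", {"love", "food"}),
    Letter("WC-000002", "James Doe", "1st Vermont Heavy Artillery",
           "Camp B", LetterType.DEATH_NOTICE, tags={"fever"}),
]

love_letters = search(sample, letter_type=LetterType.FROM_SOLDIER, tag="love")
print([letter.wc_file_number for letter in love_letters])  # prints ['WC-000001']
```

Keeping the tags as an open set while restricting the indexed fields to a handful of fixed categories mirrors the reasoning above: the fixed fields keep browsing manageable, and the tags remain available for wider searches.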
Collaboration
            By depending on collaboration, this project utilizes one of the biggest potentials of the digital humanities. Ideally, both the legwork of collecting the data (in this case, the file images, the identifying information, the transcription, and the tags) and the analysis would be done and shared by a wide array of people, from the casual historian to the professional academic. This project aspires to prove in practice Lisa Spiro’s vision of digital history: “the archive… [becoming] a dynamic, living repository of current history, a space where researchers and citizens come together.”[4] Standard procedures are articulated to get participants started, but it is my hope that different potentials of new media will be added to the project as it grows, as the technology becomes available, and as researchers continue to learn to better participate in the digital world.
Quality Control
            In the interests of open access and collaboration, the quality control standards of this project will be distinct but limited. To help link letters back to their original locations, I ask for the basics: WC file number, soldier’s name, and military unit information. Although I know firsthand from working on the CWCC project that it is all too easy to get unit information wrong and skew metadata forever (don’t let anyone tell you they were in the 11th Vermont Infantry: every member of that unit was transferred to the 1st Vermont Heavy Artillery, even the dead members!), I also know firsthand that a moderate amount of research reveals the true facts pretty quickly. If a contributor, say, were to annotate that a letter came from John Smith of the 1st Excelsior Brigade, a simple Google search will let anyone know that he was in fact in either the 70th, 72nd, 73rd, or 74th New York Infantry. It is my hope that misnomers like these would be continuously updated and corrected by users. In terms of the quality of the images, I would suggest that anyone taking the time to image a document would likely be doing it for their own purposes, and posting on the site as a secondary motive. A blurred or unreadable image is of no use to anyone, and so it is my belief that sub-standard images would rarely even make their way onto the site. As for documents that are particularly faded, decaying, or otherwise difficult to read, I would suggest that the posters of these letters add the best transcription possible while they still have the original document in front of them, where it is as readable as it will ever be. In this case, the accompanying image would be a valuable illustrative tool, but the content of the letter would have to be experienced in type.
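The kind of user-driven correction described above could, as one possibility, be supported by a small shared table of known unit misnomers. The Python sketch below is only an illustration under that assumption: the table name and the `suggest_units` function are hypothetical, and the two entries are simply the examples given in this section.

```python
# Known misnomers and the designations they should be checked against.
# Only the two examples from this proposal are listed; contributors would extend it.
KNOWN_UNIT_CORRECTIONS = {
    "11th vermont infantry": ["1st Vermont Heavy Artillery"],
    "1st excelsior brigade": [
        "70th New York Infantry",
        "72nd New York Infantry",
        "73rd New York Infantry",
        "74th New York Infantry",
    ],
}


def suggest_units(unit_as_posted):
    """Return candidate designations for a posted unit name.

    Unknown names are returned unchanged, since most postings need no correction.
    """
    # Normalize case and collapse repeated spaces before the lookup.
    key = " ".join(unit_as_posted.lower().split())
    return KNOWN_UNIT_CORRECTIONS.get(key, [unit_as_posted])


print(suggest_units("11th Vermont Infantry"))  # prints ['1st Vermont Heavy Artillery']
```

When more than one candidate comes back, as with the Excelsior Brigade, the posting would still need a person to settle which regiment is correct; the table only flags that a check is warranted.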
Terms of Use and Questions of Ownership
            By nature of being an independent and collaborative project, the site (in its current unfunded state) would be managed and monitored by the site administrator (me), and the images would be accepted for posting based on the assumption that they are owned by the poster, and not copyrighted material (from, for example, Footnote.com’s images). The entire site is protected by an extremely open Creative Commons license, which restricts little more than the reproduction of any part of the site for commercial use. Posting images of letters would give implicit usage rights to the website. Much like the Wikipedia model of ownership, the site would function under a basic terms-of-use agreement requiring that participants agree that their words enter the public domain, and that they pledge not to use or manipulate copyrighted material in their contributions. Everything generated on the site will be by contributors; by posting comments or participating in conversations, users are agreeing to place their words under the site’s license. As for images and transcriptions, it is the express purpose of the site to make these records available. Thus, the site is not interested in contributors who seek to protect their own images of files that someone else could simply go and digitize instead. Scholarship taking place on the website would still belong to the scholars saying the words; the site would merely be a forum in which to say them. If users would like to protect their posts of original material (i.e., analytical scholarship, not simply images of letters or transcriptions), they are encouraged to secure their own license. I would claim no responsibility for the scholarship taking place on or stemming from the use of my project. Any quoting of individuals would have to be appropriately cited back to that individual, but any quoting of the actual letters would have to refer to the National Archives record series, with the site merely listed as the place accessed. Instructions for all of these citation issues are clearly articulated on the site to allay any liability.
Potential for Scholarship
            It is hard to predict the types of scholarship that would benefit from this project, because forms of scholarship are always evolving. A site like mine can appeal to a wide array of interests, from the archivally-minded individual who is simply eager to collect, digitize, organize, and preserve, to the investigative academic looking for a source site that will make research significantly easier. Once the site has a decent number of letters, there is no limit to the types of discussion it could host. A simple letter from a private in the Union army, then, holds vast potential for interpretation: mechanical things like the handwriting, the spelling, and the grammar; content-based analysis of the thoughts and emotions expressed; even bottom-up history drawing out mentions of things such as what kind of food soldiers ate in encampments. This is because the site itself does not purport to be a piece of scholarship; rather, it is a place for research and scholarship to take place. It is in the very nature of a database not to make any argument; rather, as Lev Manovich wrote, “[it is a collection] of individual items, where every item has the same significance as any other.”[5] It is the way the items (in this case, the letters) are used that makes this site an important tool for scholarship. While the collection process is being outsourced to the user under the assumption (or hope) that the researcher would want to participate as fully as possible in the project, the work beyond mere digitization and transcription is up to the scholar. As is the nature of databases, as Manovich also pointed out, it is the task of the scholar to start with, essentially, “a list of items…[and to create] a cause-and-effect trajectory of seemingly unordered items (events).” Ultimately, the usefulness of this project depends on the merit of Manovich’s assumption that “database becomes the center of the creative process in the computer age.”
Further New Media Potential
            Other forms of new media could easily fit into the goal of this project. It does not purport to be a stand-alone piece of scholarship; rather, its aim is to be a place for researchers to go and easily find Civil War letters organized in exactly the ways most useful to them. One important technological advancement I can see being useful to this project is the use of GIS maps. It is a reasonable assumption that researchers would be interested in soldiers writing from specific camps or killed in certain battles, and adding an interactive map would be an easy way to illustrate this. I envision a map much like the Danish folklore map we looked at in class, where researchers can narrow the pinpoints on the map according to the same metadata parameters already in place for simple document searches. Additionally, the transcriptions of letters open up considerable potential for text-mining applications. Any researcher with interests beyond those encompassed by the index, and with the right technical skills, could easily text-mine the site for relevant information.
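To give a sense of what such text-mining might look like at its simplest, the Python sketch below counts chosen terms across a set of transcriptions and lists the files that mention a given word. It is only an illustration: the function names are hypothetical, the transcriptions are assumed to be available as plain text keyed by file number, and the two sample texts are invented placeholders, not real letters.

```python
import re
from collections import Counter


def term_counts(transcriptions, terms):
    """Count how often each term of interest appears across all transcriptions."""
    wanted = {term.lower() for term in terms}
    counts = Counter()
    for text in transcriptions.values():
        # Split the lowercased text into simple word tokens.
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(word for word in words if word in wanted)
    return counts


def files_mentioning(transcriptions, term):
    """Return the sorted file numbers whose transcription contains the term as a whole word."""
    pattern = re.compile(r"\b" + re.escape(term.lower()) + r"\b")
    return sorted(file_number for file_number, text in transcriptions.items()
                  if pattern.search(text.lower()))


# Invented placeholder text keyed by hypothetical file numbers.
sample_transcriptions = {
    "WC-000001": "We had hardtack and coffee again today.",
    "WC-000002": "The coffee is gone and the fever is in camp.",
}

counts = term_counts(sample_transcriptions, ["coffee", "hardtack", "fever"])
print(counts["coffee"], counts["hardtack"], counts["fever"])  # prints 2 1 1
print(files_mentioning(sample_transcriptions, "fever"))       # prints ['WC-000002']
```

A researcher interested in, say, what soldiers ate in camp could start from a word list like this one and then read the matching letters in full, so the counting serves as a finding aid rather than a substitute for interpretation.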
Conclusion
            With the advancements of new media in the practice of the digital humanities come new problems and drawbacks. As with any technological advancement, it boils down to a basic question: does the improvement to industry outweigh the bugs that come along with it? In most cases, the answer is yes. My project does not purport to be a solution to a hopelessly flawed Footnote database model. On the contrary, the Footnote model is a well-maintained and valuable example of the tremendous potential of digital archiving. Rather, like almost every site on the internet, this project attempts to carve a niche subject matter out of a huge repository of source material. The work that would go into the project would quickly be overshadowed by the convenience of use it would provide. It is likely that researchers of WC pension files are digitizing these letters anyway. With a matter of a few clicks, contributors can share those gems of primary source material with the world, and participate more fully in the history community.

Bibliography
Bureau of Pensions, Civil War and Later Pension Files, Widows' Certificate Series, Record Group 15 (Veterans Administration), National Archives Building, Washington, D.C.

“Civil War ‘Widows’ Pensions’.” U.S. National Archives and Records Administration, via Footnote. <http://www.footnote.com/documents/115520748/civil_war_widows_pensions/>

“Danish Folklore Data Nexis.” Danish Folklore: The Evald Tang Kristensen Collection. Timothy Tangherlini, University of California, Los Angeles. <http://projects.cdh.ucla.edu/danishfolklore/bin/index.html>

“Database as a Genre of New Media.” Lev Manovich, AI and Society. <https://docs.google.com/Doc?docid=0ATw6of_TykmmZDM3eGhoZF8xMTRjYzYzbjhmcQ&hl=en&authkey=CMurnJkD>

“Examples of Collaborative Digital Humanities Projects.” Lisa Spiro. Digital Scholarship in the Humanities. 1 June 2009. <http://digitalscholarship.wordpress.com/2009/06/01/examples-of-collaborative-digital-humanities-projects/>

“NARA-iArchives Digitization Agreement.” U.S. National Archives and Records Administration and iArchives, Inc. 2007. <http://www.archives.gov/digitization/pdf/footnote-agreement.pdf>

“Our Mission Statement.” National Archives and Records Administration. <http://archives.gov/about/info/mission.html>

“Ownership of Articles.” Wikipedia. Last modified 6 December 2010. <http://en.wikipedia.org/wiki/Wikipedia:Ownership_of_articles>

 “Technical Guidelines for Digitizing Archival Materials for Electronic Access: Creation of Production Master Files—Raster Images.” Steven Puglia, Jeffrey Reed, and Erin Rhodes. U.S. National Archives and Records Administration. June 2004. <http://archives.gov/preservation/technical/guidelines.pdf>



[1] “Our Mission Statement.” National Archives and Records Administration.
[2] Bureau of Pensions, Civil War and Later Pension Files, Widows' Certificate Series, Record Group 15 (Veterans Administration), National Archives Building, Washington, D.C.
[3] “Technical Guidelines for Digitizing Archival Materials for Electronic Access: Creation of Production Master Files—Raster Images.” Steven Puglia, Jeffrey Reed, and Erin Rhodes. U.S. National Archives and Records Administration. June 2004. Page 6.

[4] “Examples of Collaborative Digital Humanities Projects.” Lisa Spiro. Digital Scholarship in the Humanities (Blog). 1 June 2009.
[5] “Database as a Genre of New Media.” Lev Manovich, AI and Society.