Digital Archival Research Guide

The COVID-19 pandemic has created an opportunity to use new methods and sources for data collection. Digital archival research in particular has become an important part of research during the pandemic, and its broader benefits will keep it an attractive option in the future.

Benefits of Archival Research

  • Cost-Efficient
    • Large quantities of data available at relatively low cost, especially when factoring in travel and other expenses
  • Unbiased Selection
    • Clear-cut collection procedure
  • Over-Inclusive Data
    • Large volumes of data can benefit future research

Potential Drawbacks

  • Imperfect Finding Aids
    • Risk of not finding more nuanced information
  • Time-Intensive Digitizing Process
    • Scanning delays and other issues can hold up delivery of materials
  • Overwhelming Amounts of Data
    • Time must be spent curating and making sense of digital files
  • Incomplete Data
    • Collections can be incomplete

Best Practices

Acquiring and Receiving Archives
  • Contact special collections librarians directly, and be polite and reasonable in your expectations. Archivists are currently overwhelmed with demand for digitization: expect a backlog of at least one month, and be flexible about scanning turnaround times.
  • Consult with archivists on finding aids and secondary sources that can help identify potentially useful records
  • Consider hiring a private researcher and/or trading research assistance with researchers at other institutions
    • Example: Arrange for a graduate student at another institution to scan documents for you while you scan documents at UChicago for them in return.
    • Trading research assistance can also save costs
  • General Tips
    • Consider asking archivists about hand lists and other non-digitized finding aids
    • Think carefully about the research questions being asked; consider why some materials are digitized and others are not
    • Contact colleagues and students in the field; many people are willing to share archives
    • Weigh the benefits of over- and under-inclusivity
Finding Aids and Secondary Sources

General Tips:

  • Make a list of basic databases
  • Determine what parts of the metadata are available for public use
    • Issues can occur when using metadata that is not public, including a possible loss of access
  • Consult lists of commercial vendors to see what other databases exist
    • Commercial vendors can have materials that UChicago does not currently own
    • Ask librarians for a trial period and/or ask your committee about purchasing a new database
  • List of older finding aids: archive.org
Curating Received Archival Files
  • Keep a set of the originals in the order received
  • Determine an organization scheme (combining or separating digital records)
  • Consider using optical character recognition (OCR) when processing data
    • Recommended Software: ABBYY FineReader
    • Free UChicago Library Text Mining Resources
  • Be deliberate and consistent with naming conventions (date of creation, description of object, number in series or sequential order)
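One way to keep names consistent is to generate them rather than type them by hand. The sketch below is only an illustration: the function name `archive_filename`, the order of the parts, the underscore and hyphen separators, and the four-digit sequence padding are all assumptions, not a convention prescribed by this guide. It combines the three elements named above (date of creation, description of object, number in series).

```python
import re
from datetime import date


def archive_filename(created: date, description: str, sequence: int,
                     extension: str = "pdf") -> str:
    """Build a file name of the form YYYY-MM-DD_description_NNNN.ext.

    Hypothetical helper: the pattern is an assumed convention for illustration.
    """
    # Lowercase the description and collapse every run of characters that is
    # not a letter or digit into a single hyphen, so names stay portable.
    slug = re.sub(r"[^a-z0-9]+", "-", description.lower()).strip("-")
    if not slug:
        raise ValueError("description must contain at least one letter or digit")
    if sequence < 0:
        raise ValueError("sequence must not be negative")
    # ISO dates (YYYY-MM-DD) sort chronologically when sorted as plain text;
    # zero-padding does the same for the sequence number.
    ext = extension.lstrip(".").lower()
    return f"{created.isoformat()}_{slug}_{sequence:04d}.{ext}"


if __name__ == "__main__":
    print(archive_filename(date(1962, 10, 22), "Cabinet Memo (draft)", 3))
```

Putting the date first means a plain alphabetical sort of the folder is also a chronological one; if the series order matters more than the date for a given collection, the parts could be rearranged.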
Data Upkeep
  • Be aware of proprietary, licensing, usage, and copyright terms when maintaining received data
  • Consider long-term storage:
    • Be aware that local storage can be redundant
    • Consider the costs of digital storage
    • Conduct fixity checks (verifying that files have not changed since they were stored) and be wary of bit rot (gradual corruption of stored data)
  • Plan for the future: what happens when you leave the University?
    • Who will maintain this in the future? Who will be in charge of future platform migration?
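A fixity check can be as simple as recording a checksum for every file when it is received and recomputing the checksums later. The following is a minimal sketch using only the Python standard library; the function names (`sha256_of`, `build_manifest`, `check_fixity`) and the choice of SHA-256 are assumptions made for illustration, and dedicated preservation tools may be a better fit for large collections.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file under root (by relative path) to its current checksum."""
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def check_fixity(root: Path, manifest: dict[str, str]) -> dict[str, list[str]]:
    """Compare the files under root against a previously saved manifest."""
    current = build_manifest(root)
    return {
        # Files recorded in the manifest that no longer exist.
        "missing": sorted(set(manifest) - set(current)),
        # Files whose contents no longer match the recorded checksum.
        "changed": sorted(
            name for name in manifest
            if name in current and current[name] != manifest[name]
        ),
        # Files that appeared after the manifest was made.
        "unexpected": sorted(set(current) - set(manifest)),
    }
```

In practice the manifest would be saved alongside the collection (for example as a JSON file, stored outside the folder being checked) and `check_fixity` rerun on a schedule and after every copy or platform migration. Any entry under "changed" for a file nobody edited is a sign of corruption and a reason to restore from another copy.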

Links and Other Resources

General Resources

“Archival Research in the Digital Age”

University of Chicago Social Sciences Dialogo article that covers how digitization of archives is transforming the academic research landscape, enabling faculty and students to deliver groundbreaking projects in months rather than years, and debunking prior historical assumptions that were based on incomplete datasets.


“Guide to Archival Research”

American Historical Association guide from the Graduate and Early Career Committee with suggestions on all aspects of planning archival research.


UChicago Library Special Collections

The Hanna Holborn Gray Special Collections Research Center is the principal repository for and steward of the Library’s rare books, manuscripts, University Archives, and the Chicago Jazz Archives. Its mission is to provide primary sources to stimulate, enrich, and support research, teaching, learning, and administration at the University of Chicago. Special Collections makes these resources available to a broad constituency as part of the University’s engagement with the larger community of scholars and independent researchers.


ABBYY FineReader Software

Optical character recognition software to convert documents to editable and easily searchable text.  


UChicago Library Text Mining Resources

Text mining is a research technique using computational analysis to uncover patterns in large text-based data sets. Text and data mining is sometimes permitted according to the Library’s license agreements. This guide is a non-exclusive list of resources where the library has secured rights for text and data mining.

UChicago Library Resources

UChicago Library Experts

Subject matter librarians are available for 1:1 consults to assist researchers, students, and staff with various resources, student success resources, and scholarly communication.


UChicago Library Subject Guides

University of Chicago Library Subject Guides provide a basic overview of the various resources available for specific subject areas. These guides serve as a central location to house subject-specific resources such as related journals, digital archives, and finding aids.


UChicago Library Special Collections

The Hanna Holborn Gray Special Collections Research Center is the principal repository for and steward of the Library’s rare books, manuscripts, University Archives, and the Chicago Jazz Archives. Its mission is to provide primary sources to stimulate, enrich, and support research, teaching, learning, and administration at the University of Chicago. Special Collections makes these resources available to a broad constituency as part of the University’s engagement with the larger community of scholars and independent researchers.


UChicago Library Text Mining Resources

Text mining is a research technique using computational analysis to uncover patterns in large text-based data sets. Text and data mining is sometimes permitted according to the Library’s license agreements. This guide is a non-exclusive list of resources where the library has secured rights for text and data mining.

Recommended Digital Archives

Foreign Relations of the United States (FRUS) series

The Foreign Relations of the United States (FRUS) series presents the official documentary historical record of major U.S. foreign policy decisions and significant diplomatic activity.


Central Intelligence Agency (CIA) CREST/FOIA

Since 2000, CIA has installed and maintained an electronic full-text searchable system named CREST (the CIA Records Search Tool). The CREST system is the publicly accessible repository of the subset of CIA records reviewed under the 25-year program in electronic format (manually reviewed and released records are accessioned directly into the National Archives in their original format).

The FOIA Electronic Reading Room is provided as a public service by the Office of the Chief Information Officer’s Information Management Services.


National Security Archive

The Digital National Security Archive is an invaluable online collection of more than 100,000 declassified records documenting historic U.S. policy decisions.


Wilson Center Digital Archive

The Digital Archive contains once-secret documents from governments all across the globe, uncovering new sources and providing fresh insights into the history of international relations and diplomacy.


Gale Declassified Documents Online

U.S. Declassified Documents Online’s greatest value lies in the wealth of facts and insights that it provides in connection with the political, economic, and social conditions of the United States and other countries. Materials as diverse as State Department political analyses, White House confidential file materials, National Security Council policy statements, CIA intelligence memoranda, and much more offer unique insights into the inner workings of the US government and world events in the twentieth and twenty-first centuries.


Gale Archives Unbound

The Archives Unbound program has published more than 300 titles. The roots of the program are in microfilm, and the collection makes available targeted collections of interest to scholars engaged in serious research.

Particular strengths in the Archives Unbound catalog include U.S. foreign policy; U.S. civil rights; global affairs and colonial studies; and modern history. Broad topic clusters include: African American studies; American Indian studies; Asian studies; British history; Holocaust studies; LGBT studies; Latin American and Caribbean studies; Middle East studies; political science; religious studies; and women’s studies. The Archives Unbound program consists of more than 300,000 documents totaling more than 13 million pages. Individual titles in the collection range between 1,200 and 200,000 pages.


ProQuest History Vault

ProQuest History Vault debuted in 2011 and is continuously growing to include numerous archival collections documenting the most important and widely studied topics in eighteenth- through twentieth-century American history.


HathiTrust Digital Library

Founded in 2008, HathiTrust is a not-for-profit collaborative of academic and research libraries preserving 17+ million digitized items. HathiTrust offers reading access to the fullest extent allowable by U.S. copyright law, computational access to the entire corpus for scholarly research, and other emerging services based on the combined collection. HathiTrust members steward the collection — the largest set of digitized books managed by academic and research libraries — under the aims of scholarly, not corporate, interests.


Church of Latter-Day Saints Family Genealogy Sites