Education and Society Data Archive

What is the Education and Society Data Archive?

The Education and Society Data Archive is a resource for students interested in secondary data analysis. Through the archive's resources, we hope to make courses in quantitative methods more engaging and to stimulate original student research. The creation of this archive was made possible by the generous support of the College’s Curriculum Innovation Fund (CCIF).


The archive has several purposes. One is to provide an accessible place where students can find raw data, navigate external data repositories, and access resources on data analysis, data cleaning, and statistical software programs. The Committee on Education recognizes that writing a serious thesis within a single academic year is a challenge: securing a field site, obtaining IRB approval, and carrying out original data collection (typically with few resources and little research assistance) can be prohibitive for students. But if really interesting data are ready at hand, and if there is some support in gaining access to and navigating secondary data, one can, in principle, write a substantively important thesis within one year.

A second goal is to provide easy access for instructors who wish to use interesting large data sets in their classroom teaching. Instructors can give students assignments in which they practice their statistical skills by answering important substantive questions using national survey data or large experimental field trials. This makes statistical instruction far more interesting by creating opportunities for students to learn to think scientifically and to stay engaged throughout the research process.

Archive Resources
  • Data
    • The archive contains data and syntax files. Data sets have been prepared for various uses, depending on user experience and research project needs. Codebooks have been created to provide information on the available measures within each data set. Additionally, literature and academic articles are included to provide examples of how these data have been used and how they can be used in future research projects.
    • Data sets that are available and have been cleaned:
      • 2008 National Asian American Survey
      • 2016 National Asian American Survey (Pre & Post 2016 Presidential Election)
      • The Early Childhood Longitudinal Study, Birth Cohort (Restricted Data License)
      • The Early Childhood Longitudinal Study of 1998
      • The Early Childhood Longitudinal Study of 2011
      • The High School Longitudinal Study of 2009
    • Data sets that are available but have not been cleaned:
      • The Education Longitudinal Study of 2002
      • Maryland Adolescent Development in Context Study
      • 2003 National Assessment of Adult Literacy
      • 1997 National Longitudinal Survey of Youth
      • The National Educational Longitudinal Study of 1988
  • Computing Resources
    • The archive has various resources for common statistical software programs: Mplus, R, SAS, SPSS, and Stata. These files contain help on installation, access, and practical applications.
  • Data Cleaning Resources
    • The archive includes a guide, Data Cleaning Instructions for Secondary Data Analysis, that introduces students to cleaning and preparing their own data for analysis. The guide outlines common steps in the research process, including removing duplicate or irrelevant information, fixing structural errors, validating data, and transposing data. In addition to general data cleaning steps, the guide contains links to external data repositories, highlights common Stata commands, and defines key research concepts and terminology.
  • Qualitative Studies
    • The archive has begun collecting information on qualitative research studies. The guides cover different qualitative research designs (e.g. interviews, ethnography, visual sociology) and different data analysis techniques, and they link to external qualitative data repositories.
  • Examples of Student Work
    • Current and past student work will be available as examples of the types of research projects that can be carried out using primary and secondary data. These examples include a mix of qualitative and quantitative research designs.
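
As an illustration of the data cleaning steps named under Data Cleaning Resources above (removing duplicate or irrelevant information, fixing structural errors, validating, and transposing), here is a minimal Python sketch. It is not taken from the archive's guide, which highlights Stata commands. The column names, the sample rows, and the valid score range of 0 to 100 are all invented for illustration and do not come from any data set in the archive.

```python
# Hypothetical survey extract: every column name and value here is invented.
RAW_ROWS = [
    {"id": "1", "gender": " Female", "score_w1": "52", "score_w2": "58", "notes": "x"},
    {"id": "1", "gender": " Female", "score_w1": "52", "score_w2": "58", "notes": "x"},
    {"id": "2", "gender": "MALE", "score_w1": "47", "score_w2": "999", "notes": ""},
    {"id": "3", "gender": "female", "score_w1": "61", "score_w2": "66", "notes": ""},
]

KEEP = ["id", "gender", "score_w1", "score_w2"]  # assumed analysis variables
VALID_SCORES = range(0, 101)                     # assumed valid range for scores


def clean(rows):
    """Drop irrelevant columns, normalize labels, remove duplicates, validate."""
    seen = set()
    cleaned = []
    for row in rows:
        # Remove irrelevant information: keep only the analysis variables.
        record = {name: row[name] for name in KEEP}
        # Fix structural errors: inconsistent spacing and capitalization.
        record["gender"] = record["gender"].strip().lower()
        # Remove duplicates: skip a record already seen after normalization.
        key = tuple(record[name] for name in KEEP)
        if key in seen:
            continue
        seen.add(key)
        # Validate: convert scores to integers; out-of-range values become missing.
        for col in ("score_w1", "score_w2"):
            value = int(record[col])
            record[col] = value if value in VALID_SCORES else None
        cleaned.append(record)
    return cleaned


def to_long(rows):
    """Transpose from wide (one row per person) to long (one row per person-wave)."""
    long_rows = []
    for row in rows:
        for wave in (1, 2):
            long_rows.append({
                "id": row["id"],
                "gender": row["gender"],
                "wave": wave,
                "score": row[f"score_w{wave}"],
            })
    return long_rows


cleaned = clean(RAW_ROWS)
long_rows = to_long(cleaned)
```

In this sketch the out-of-range score is recoded as missing rather than dropped, so the respondent's valid wave is retained. Whether that is the right choice depends on the codebook for the data set in question.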


The Education and Society Data Archive is available to University of Chicago students and staff.  It is managed by Dr. Marshall Jean, Assistant Instructional Professor of Sociology in the MAPSS program.  For more information, or to request access to the archive, reach out to Dr. Jean at