NAEP Technical Documentation

Merging Files

Before merging, the files are prepared through the following tasks:

  1. Files are separated by subject area to improve maintenance and efficiency.
  2. The files are restructured, eliminating unused (blank) areas to reduce the size of the files.
  3. When a student did not respond to an item, the missing response is recoded as either "omit" or "not reached."
  4. A school file is created from the school questionnaire file (provided by Contractor 1) and can be associated with a student record in order to report school information for students.
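
The recoding in task 3 can be illustrated with a minimal sketch. The page does not state the rule that separates the two codes, so the sketch assumes a common convention: a missing response that comes before the last answered item in a block is an "omit," and a missing response after it is "not reached." The function name `recode_missing` and the use of `None` for a missing response are hypothetical choices made for illustration.

```python
OMIT = "omit"
NOT_REACHED = "not reached"


def recode_missing(responses):
    """Recode missing responses (None) within one block of items.

    `responses` lists a student's responses in presentation order.
    ASSUMPTION (not stated in the source): missing responses before the
    last answered item are omits; those after it are not reached.
    """
    # Position of the last item the student answered (-1 if none).
    last_answered = -1
    for position, response in enumerate(responses):
        if response is not None:
            last_answered = position

    recoded = []
    for position, response in enumerate(responses):
        if response is not None:
            recoded.append(response)
        elif position < last_answered:
            recoded.append(OMIT)
        else:
            recoded.append(NOT_REACHED)
    return recoded


block = ["A", None, "C", None, None]
print(recode_missing(block))
```

Under this assumed rule, a student who answers nothing in a block has every item coded as not reached; an operational system may treat that case differently.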

After the data files are reorganized, the merging steps below take place:

  1. Final student data are merged with student-level sampling information, including student-level weights, received from the NAEP Sampling and Data Collection (SDC) contractor. Similarly, the school data are merged with the school-level sampling information, including school-level weights, received from the SDC contractor.
  2. The resulting file is then merged with the students with disabilities/English learners (SD/EL) questionnaires and teacher questionnaire data.
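
The two merging steps can be sketched as keyed left merges. This is an illustration under stated assumptions, not the contractors' actual procedure: the helper `left_merge`, the field names (`booklet_id`, `fips`, `school_code`, `student_weight`, `school_weight`, `sd_flag`), and all data values are hypothetical.

```python
def left_merge(base, extra, key):
    """Attach fields from `extra` to each row of `base` that shares `key`.

    Rows in `base` with no match in `extra` are kept unchanged.
    `key` is a tuple of field names, so composite keys are supported.
    """
    index = {tuple(row[k] for k in key): row for row in extra}
    merged = []
    for row in base:
        match = index.get(tuple(row[k] for k in key), {})
        merged.append({**row, **match})
    return merged


# Hypothetical student-level data (all values invented for illustration).
students = [
    {"booklet_id": "1230004567", "item1": 1},
    {"booklet_id": "1230004575", "item1": 0},
]
student_sampling = [
    {"booklet_id": "1230004567", "student_weight": 41.2},
    {"booklet_id": "1230004575", "student_weight": 38.7},
]
sd_el = [{"booklet_id": "1230004575", "sd_flag": 1}]

# Step 1: student data plus student-level sampling information.
step1 = left_merge(students, student_sampling, key=("booklet_id",))
# Step 2: the resulting file plus questionnaire data.
step2 = left_merge(step1, sd_el, key=("booklet_id",))

# School data plus school-level sampling information, on a composite key.
schools = [{"fips": "06", "school_code": "001", "enrollment": 450}]
school_sampling = [{"fips": "06", "school_code": "001", "school_weight": 12.5}]
school_file = left_merge(schools, school_sampling, key=("fips", "school_code"))

print(step2)
print(school_file)
```

A left merge is used so that a student or school record is never dropped when the other file has no matching row.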

The matching criteria used in these steps are:

  • When matching files that involve student-level data, a 10-digit booklet or digital test form identification number is used as the primary matching criterion. For the paper-based assessments administered in NAEP prior to 2017, the first three digits are the booklet number, which is common to every booklet with the same blocks of items. The next six digits are the serial number unique to the booklet a student is given, and the last digit is a check digit. For the digitally based assessments administered in NAEP beginning in 2017, the first nine digits are the serial number unique to the digital test form a student is given, and the last digit is a check digit.

  • The teacher data are linked to the student data by Contractor 1 through three data variables: the Federal Information Processing Standards (FIPS) code, school code, and teacher number within school. Prior to 2002, when NAEP used separate national and state samples, the teacher data could be linked to the student data through four data variables: primary sampling unit (PSU), school code, teacher number within school, and classroom period. Teacher data are added to Contractor 2's student-level database from Contractor 1's student-level teacher file using the 10-digit booklet or digital test form ID.

  • Schools can be uniquely identified through the FIPS (or PSU) codes and a sequential school code. This school ID also exists in the student-level file and thus can be used to link students to schools. Since 2002, the FIPS code has been the matching criterion used for all school data. Prior to 2002, the PSU and school codes were used as the matching criteria for the national school data, and the FIPS code was used for state school data. Because some schools do not return a questionnaire, some of the records in the school file contain only school-identifying information.

  • Whenever new data values (such as composite contextual variables or plausible values) are derived, they are added to the appropriate database files using the same matching procedures described above.
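
The layout of the 10-digit identification number described in the first bullet can be sketched as a small parser. The names `FormId` and `parse_form_id` are hypothetical, and the example ID is invented. The page does not say how the check digit is computed, so the sketch only splits the fields and does not validate the check digit.

```python
from typing import NamedTuple, Optional


class FormId(NamedTuple):
    booklet_number: Optional[str]  # 3 digits for paper booklets, None for digital forms
    serial_number: str             # 6 digits (paper) or 9 digits (digital)
    check_digit: str               # final single digit


def parse_form_id(form_id: str, digital: bool) -> FormId:
    """Split a 10-digit booklet or digital test form ID into its fields.

    digital=False: paper layout (3-digit booklet number, 6-digit serial, check digit).
    digital=True:  digital layout (9-digit serial, check digit).
    """
    if len(form_id) != 10 or not (form_id.isascii() and form_id.isdigit()):
        raise ValueError(f"expected exactly 10 digits, got {form_id!r}")
    if digital:
        return FormId(None, form_id[:9], form_id[9])
    return FormId(form_id[:3], form_id[3:9], form_id[9])


# Invented example ID, parsed under both layouts.
print(parse_form_id("1230004567", digital=False))
print(parse_form_id("1230004567", digital=True))
```

The ID is kept as a string rather than an integer so that leading zeros in the booklet or serial number are preserved when files are matched.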

School and student names are kept secure and are not included in Contractor 2's data files.



Last updated 02 November 2022 (SK)