
NAEP Technical Documentation

Comparisons of Distributions of Linking Samples for NAEP Assessments

When using a linear transformation equation to link assessment scales, it is assumed that the transformation coefficients (i.e., intercepts and slopes) of the transformation equations represent the means and standard deviations of the distributions of scores from the linking samples. The linking samples are defined as the samples from the current and previous assessments used to transform the score scale of the current assessment to the scale of the previous assessments. This assumption can be checked by comparing the score distributions for the samples used to determine NAEP linking.
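As a hedged illustration of this assumption, a linear link that matches the first two moments of two score distributions can be written as below. The symbols are introduced here for illustration and are not NAEP notation: $\mu_x, \sigma_x$ denote the mean and standard deviation of the linking-sample scores on the scale being transformed, and $\mu_y, \sigma_y$ those on the target scale.

```latex
\[
  y = A\,x + B, \qquad
  A = \frac{\sigma_y}{\sigma_x}, \qquad
  B = \mu_y - A\,\mu_x .
\]
```

Under this sketch the slope $A$ carries the ratio of standard deviations and the intercept $B$ aligns the means, which is why comparing the distributions of the linking samples is a natural check on the coefficients.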

For each link between reporting samples across assessment years shown in the linking diagrams, supporting tables give the means and percentiles of the distributions of the samples used to link one assessment scale to another. The difference between these values for the two samples is also provided.

Original estimates are based on previous assessment data and previous assessment scaling results, transformed to the reporting metric using the transformation identified during the analysis of the previous assessment data. Provisional estimates are based on previous assessment data and current assessment scaling results; they are on the theta metric specified by the scaling program. The original and provisional estimates for the linking sample are used to find the transformation constants necessary to place the current assessment scale scores on the scale used to report previous assessment scale scores. The transformed provisional estimates are based on previous assessment data and current assessment scaling results, transformed to the reporting metric using those transformation constants.

The means and selected percentiles for the original estimates and for the transformed provisional estimates are provided. If the transformation constants are correct, the means of the original and transformed provisional estimates should be the same. If the standard deviations represent the variability of the distributions well and the distributions are symmetric, the difference between the two estimates for each percentile should be close to zero. A value not close to zero could indicate a violation of the assumption of a normal distribution that underlies the transformation methodology used here: if the distribution is not normal, the mean and standard deviation are not sufficient to transform it.

Decisions about what counts as close to zero are not based on fixed cut-offs applied across the board. Acceptable values are judged against historical NAEP data for each subject; that is, are these values consistent with what has been seen for this particular subject?
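The comparison described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not NAEP's production procedure: it ignores sampling weights and plausible values, the function names (`linking_constants`, `percentile`, `compare_distributions`) are hypothetical, the percentiles 10, 25, 50, 75, and 90 are an assumed choice of "selected percentiles", and the data are synthetic.

```python
import random
import statistics


def linking_constants(original, provisional):
    # Slope and intercept chosen so the transformed provisional estimates
    # have the same mean and standard deviation as the original estimates.
    slope = statistics.pstdev(original) / statistics.pstdev(provisional)
    intercept = statistics.fmean(original) - slope * statistics.fmean(provisional)
    return slope, intercept


def percentile(values, pct):
    # Linear interpolation between neighbouring order statistics.
    ordered = sorted(values)
    rank = (len(ordered) - 1) * pct / 100
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (rank - lower)


def compare_distributions(original, provisional, pcts=(10, 25, 50, 75, 90)):
    slope, intercept = linking_constants(original, provisional)
    transformed = [slope * theta + intercept for theta in provisional]
    rows = [("mean", statistics.fmean(original), statistics.fmean(transformed))]
    for pct in pcts:
        rows.append((f"p{pct}", percentile(original, pct), percentile(transformed, pct)))
    # Each row: (statistic, original, transformed provisional, difference).
    return [(name, o, t, t - o) for name, o, t in rows]


# Synthetic linking sample (made-up numbers, not NAEP data).
rng = random.Random(0)
provisional = [rng.gauss(0.0, 1.0) for _ in range(2000)]
original = [250.0 + 35.0 * theta + rng.gauss(0.0, 3.0) for theta in provisional]

for name, o, t, diff in compare_distributions(original, provisional):
    print(f"{name:>5}  original={o:8.2f}  transformed={t:8.2f}  diff={diff:6.2f}")
```

Because the constants are built from the means and standard deviations, the mean difference is zero up to rounding by construction; the percentile differences are the informative part of the check, and they grow when the two distributions differ in shape.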

Links to comparisons of distributions of linking samples, NAEP assessments, by subject, year, and grade: Various years, 2000–2018
Subject  Year  Grade 4  Grade 8  Grade 12
Arts 2016 R3
Civics 2018 R3
2014 R3
2010 R3 R3 R3
2006 R3 R3 R3
Economics 2012 R3
Geography 2018 R3
2014 R3
2010 R3 R3 R3
2001 R2 R2 R2
Mathematics 2017 R3 R3
2015 R3 R3 R3
2013 R3 R3 R3
2011 R3 R3
2009 R3 R3 R3
2007 R3 R3
2005 R3 R3
2003 R3 R3
2000 R2 R2 R2
Reading 2017 R3 R3
2015 R3 R3 R3
2013 R3 R3 R3
2011 R3 R3
2009 R3 R3 R3
2007 R3 R3
2005 R3 R3 R3
2003 R3 R3
2002 R3 R3 R3
2000 R2
Science 2015 R3 R3 R3
2011 R3
2009
2005 R3 R3 R3
2000 R2 R2 R2
Technology and engineering literacy (TEL) 2018 R3
2014
U.S. history 2018 R3
2014 R3
2010 R3 R3 R3
2006 R3 R3 R3
2001 R2 R2 R2
Vocabulary 2015 R3 R3 R3
2013 R3
2011 R3 R3
Writing 2011
2007 R3 R3
2002 R3 R3 R3
† Not applicable. 
NOTE: R2 is the non-accommodated reporting sample; R3 is the accommodated reporting sample. If sampled students are classified as students with disabilities (SD) or English learners (EL), and school officials, using NAEP guidelines, determine that they can meaningfully participate in the NAEP assessment with accommodation, those students are included in the NAEP assessment with accommodation along with other sampled students including SD/EL students who do not need accommodations. The R3 sample is more inclusive than the R2 sample type and excludes a smaller proportion of sampled students. The R3 sample is the only reporting sample used in NAEP after 2001. In NAEP, vocabulary, reading vocabulary, and meaning vocabulary refer to the same reporting scale. Because preliminary analyses of students' writing performance in the 2017 NAEP writing assessments at grades 4 and 8 revealed potentially confounding factors in measuring performance, results will not be publicly reported. Some of the NAEP assessments included in this table are linked to previous assessments (prior to 2000) that are not included in the technical documentation on the web.
SOURCE: U.S. Department of Education, Institute of Education Sciences, National Center for Education Statistics, National Assessment of Educational Progress (NAEP), various years, 2000–2018 Assessments.

 

Comparisons of distributions of linking samples, long-term trend assessments, by year, subject, and age: 2004, 2008, and 2012
Year  Subject  Age 9  Age 13  Age 17
2012  Mathematics long-term trend  R3  R3  R3
2012  Reading long-term trend  R3  R3  R3
2008  Mathematics long-term trend  R3  R3  R3
2008  Reading long-term trend  R3  R3  R3
2004  Mathematics long-term trend  R2  R2  R2
2004  Reading long-term trend  R2  R2  R2
NOTE:  R2 is the non-accommodated reporting sample; R3 is the accommodated reporting sample. If sampled students are classified as students with disabilities (SD) or English learners (EL), and school officials, using NAEP guidelines, determine that they can meaningfully participate in the NAEP assessment with accommodation, those students are included in the NAEP assessment with accommodation along with other sampled students including SD/EL students who do not need accommodations. The R3 sample is more inclusive than the R2 sample type and excludes a smaller proportion of sampled students. The R3 sample is the only reporting sample used in NAEP after 2001. The R2 sample was used as the bridge sample type in 2004 bridge studies to examine comparability of scoring based on an assessment sample similar to those used for LTT in 2001 and years prior.
SOURCE: U.S. Department of Education, Institute of Education Sciences, National Center for Education Statistics, National Assessment of Educational Progress (NAEP), 2004, 2008, and 2012 Long-Term Trend Assessments.


Last updated 02 November 2022 (SK)