
NAEP Technical Documentation

Evaluation of State Achievement Data in the Sampling Frame

The purpose of this analysis was to determine whether the public schools selected for the NAEP 2003 samples were representative, in terms of achievement, of the schools on the NAEP sampling frames. Percentiles of the achievement distributions were compared between the frame and sample schools for each public school jurisdiction at grades 4 and 8. The results show that the differences between frame and sample percentiles were not statistically significant, with one exception: grade 8 in New Hampshire, where the two differed only slightly at the 90th percentile.

Achievement Data

The achievement variable used in the analysis was the same variable used in the NAEP sample design to stratify the public school frame. For most jurisdictions, the variable was an achievement score provided by the jurisdiction. For jurisdictions where achievement data were not available, median household income from the 1990 Census was used instead. (The median household income assigned to a school was the 1990 Census value for the five-digit ZIP Code area in which the school was located.) The achievement data consisted of various types of school-specific achievement measures from state assessment programs, and the type of measure available varied by jurisdiction. In some states, for instance, the measure was the average score on a given state assessment. In other states, it was a percentile rank or the percentage of students scoring above a specific score.

During frame development, not every record on the Common Core of Data (CCD) file matched to the achievement data files created for the National Center for Education Statistics (NCES), even in jurisdictions where those data were generally available. For schools that did not match, an attempt was made to impute an achievement score using a hot-deck imputation approach, but in some cases an adequate donor could not be found. Schools that still had missing achievement values after imputation were removed from the NAEP frame and sample data sets used in this analysis.
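The imputation-then-removal step can be illustrated with a minimal sketch. The source does not describe how donors were chosen, so everything here is an assumption made for illustration: the record fields (`cell`, `score`), the idea that donors come from the same imputation cell, and the random choice among donors are all hypothetical.

```python
import random


def hot_deck_impute(schools, rng):
    """Fill missing scores from a donor in the same cell; drop schools with no donor.

    Assumption: each school is a dict with a hypothetical "cell" key (the
    imputation class) and a "score" key that is None when missing.
    """
    # Collect the observed scores available as donors in each cell.
    donors = {}
    for school in schools:
        if school["score"] is not None:
            donors.setdefault(school["cell"], []).append(school["score"])

    kept = []
    for school in schools:
        if school["score"] is None:
            pool = donors.get(school["cell"])
            if not pool:
                # No adequate donor: the school is left out of the analysis file.
                continue
            school = {**school, "score": rng.choice(pool), "imputed": True}
        kept.append(school)
    return kept


schools = [
    {"id": 1, "cell": "A", "score": 50.0},
    {"id": 2, "cell": "A", "score": None},  # donor available in cell A
    {"id": 3, "cell": "B", "score": None},  # no donor in cell B
]
analysis_file = hot_deck_impute(schools, random.Random(0))
```

In this toy input, school 2 receives the only available score in its cell, and school 3 is dropped because its cell has no donor.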

Methodology

To determine whether the achievement distributions of the frame and sample schools differed, quantile estimates were compared at the 10th, 25th, 50th, 75th, and 90th percentile levels for each public school jurisdiction by grade. Frame and sample school estimates were considered statistically different if the frame value fell outside the 95 percent confidence interval of the corresponding sample estimate. The percentile values for the frame schools were calculated by weighting each school by the estimated number of students in the given grade. The percentile estimates for the sample schools were calculated using the school weights, with each school additionally weighted by its measure of size (the estimated number of students in the given grade). The 95 percent confidence intervals for the school sample estimates were calculated in WesVar, software for computing estimates of sampling variance from complex sample surveys (Westat 2000b), using the Woodruff method (Särndal, Swensson, and Wretman 1992) and without a finite population correction factor. A finite population correction is not traditionally used in computing variances for NAEP estimates.
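The weighting and the decision rule described above can be sketched as follows. This is an illustrative sketch, not the WesVar computation: the percentile definition (smallest value whose cumulative weight share reaches the target) is an assumption, the function names are hypothetical, and the confidence interval is taken as a given input because the Woodruff method is not implemented here.

```python
def weighted_percentile(values, weights, p):
    """Smallest value whose cumulative weight share reaches p percent."""
    pairs = sorted(zip(values, weights))
    target = sum(w for _, w in pairs) * p / 100.0
    running = 0.0
    for value, weight in pairs:
        running += weight
        if running >= target:
            return value
    return pairs[-1][0]


def sample_weights(school_weights, measures_of_size):
    """Sample-side weight: school weight times estimated grade enrollment."""
    return [w * m for w, m in zip(school_weights, measures_of_size)]


def differs_from_frame(frame_value, ci_lower, ci_upper):
    """Decision rule: flag a difference when the frame value is outside the CI."""
    return frame_value < ci_lower or frame_value > ci_upper


# Frame side: each school is weighted by its estimated grade enrollment.
frame_scores = [10, 20, 30, 40]
frame_enrollment = [1, 1, 1, 5]
frame_median = weighted_percentile(frame_scores, frame_enrollment, 50)

# Sample side: school weight multiplied by the measure of size.
weights = sample_weights([2.0, 3.0], [100, 50])
```

With the toy enrollments shown, the large fourth school pulls the weighted frame median up to its own score, which is the effect of student-based weighting that the text describes.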

Results

As previously mentioned, sample and frame achievement distributions were determined to be different if at least one of the percentile estimates differed significantly at the 95 percent confidence level. Out of all the jurisdiction and grade comparisons, only one distribution was found to be significantly different: for grade 8 in New Hampshire, the difference between the frame and sample estimates at the 90th percentile was statistically significant. The frame estimate was 54,125, compared to a sample estimate of 53,577 (with a 95 percent confidence interval of 53,211–53,933). Although the difference is significant, it is relatively small (about 1 percent), and the frame estimate falls only barely outside the sample 95 percent confidence interval. In addition, the data for grade 8 in New Hampshire are median income rather than achievement. Although median income is a good proxy for achievement, the two are not perfectly correlated.
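The New Hampshire grade 8 figures can be checked with a few lines of arithmetic. The numbers are those reported above; the variable names are illustrative only.

```python
frame_estimate = 54125
sample_estimate = 53577
ci_lower, ci_upper = 53211, 53933

# Absolute and relative difference between frame and sample estimates.
absolute_difference = frame_estimate - sample_estimate
relative_difference = absolute_difference / frame_estimate

# How far the frame value lies beyond the upper confidence limit.
excess_over_upper_limit = frame_estimate - ci_upper

# The decision rule: the frame value is outside the sample confidence interval.
is_significant = not (ci_lower <= frame_estimate <= ci_upper)
```

The difference of 548 is roughly 1 percent of the frame estimate, and the frame value exceeds the upper confidence limit by 192.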

These results are not surprising. The achievement/median income variable is used as the fourth-level sort variable in the systematic school selection procedure. Although it is a rather low-level sort variable, it nonetheless helps control the representativeness of the sampled schools in terms of achievement.
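The effect of sorting before systematic selection can be illustrated with a simplified sketch. It assumes equal-probability systematic sampling with a fixed start, which is a simplification and not the NAEP selection procedure itself. The first three sort keys are hypothetical placeholders (`k1`, `k2`, `k3`), since the higher-level sort variables are not named in this section.

```python
def systematic_sample(sorted_frame, n, start):
    """Take every (N/n)-th unit from a sorted list, beginning at `start`.

    Assumption: equal-probability selection, with 0 <= start < N/n.
    """
    interval = len(sorted_frame) / n
    return [sorted_frame[int(start + i * interval)] for i in range(n)]


# Toy frame of 20 schools in arbitrary order. The three higher-level sort
# keys are held constant so the fourth key (the score) drives the ordering.
frame = [{"k1": 0, "k2": 0, "k3": 0, "score": s} for s in (7 * i % 20 for i in range(20))]
ordered = sorted(frame, key=lambda r: (r["k1"], r["k2"], r["k3"], r["score"]))

sample = systematic_sample(ordered, n=4, start=1.5)
sample_scores = [r["score"] for r in sample]
```

Because the frame is ordered by score within the higher-level groups, the selected schools are spread evenly across the score range instead of clustering at one end.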

 


Last updated 09 June 2008 (DB)
