NAEP Technical Documentation

Estimation of the Degrees of Freedom

Because of clustering and differential weighting in the sample, the degrees of freedom are less than for a simple random sample of the same size. The degrees of freedom of this t test are defined by a Satterthwaite (Johnson and Rust 1992) approximation as follows:

\[
df = \frac{\left( \sum_{k=1}^{N} S_{A_k}^{2} \right)^{2}}{\sum_{k=1}^{N} \dfrac{S_{A_k}^{4}}{df_{A_k}}}
\]

where N is the number of student groups involved, S_Ak² is the variance estimate for student group k, and the estimate df_Ak is as follows:

\[
df_{A_k} = \left( 3.16 - \frac{2.77}{\sqrt{m}} \right) \frac{\left( \sum_{j=1}^{m} \left( t_{jk} - t_k \right)^{2} \right)^{2}}{\sum_{j=1}^{m} \left( t_{jk} - t_k \right)^{4}}
\]

where m is the number of jackknife replicates (usually 62 in NAEP), t_jk is the jth replicate estimate of the mean for student group k, and t_k is the estimate of the group mean using the overall student group weights and the first plausible value.
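As an illustration, the estimate df_Ak can be computed directly from a group's replicate estimates. The following is a minimal Python sketch of the formula above, not NAEP production code; the function name `group_degrees_of_freedom` and its argument names are hypothetical and chosen only for this example.

```python
import math


def group_degrees_of_freedom(replicate_estimates, full_sample_estimate):
    """Estimate df_Ak for one student group.

    replicate_estimates: the m jackknife replicate estimates t_jk.
    full_sample_estimate: the full-sample estimate t_k.
    (Both names are hypothetical, used only in this sketch.)
    """
    m = len(replicate_estimates)
    if m == 0:
        raise ValueError("at least one replicate estimate is required")

    # Differences (t_jk - t_k) between each replicate and the full-sample estimate.
    differences = [t_jk - full_sample_estimate for t_jk in replicate_estimates]

    sum_of_squares = sum(d ** 2 for d in differences)
    sum_of_fourth_powers = sum(d ** 4 for d in differences)
    if sum_of_fourth_powers == 0:
        raise ValueError("all replicate estimates equal the full-sample estimate")

    # Multiplier (3.16 - 2.77 / sqrt(m)) from the formula above.
    multiplier = 3.16 - 2.77 / math.sqrt(m)
    return multiplier * sum_of_squares ** 2 / sum_of_fourth_powers
```

In NAEP the list would typically hold 62 replicate estimates for the group; the function itself accepts any positive number of replicates.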

The number of degrees of freedom for the variance equals the number of independent pieces of information used to generate the variance. In the case of data from NAEP, the 62 pieces of information are the squared differences (t_jk-t_k)², each supplying at most one degree of freedom (regardless of how many individuals were sampled within primary sampling units, or PSUs). If some of the squared differences (t_jk-t_k)² are much larger than others, the variance estimate for group k is predominantly estimating the sum of these larger components, which dominate the remaining terms. The effective degrees of freedom of S_Ak in this case will be nearer to the number of dominant terms. The estimate df_Ak reflects these relationships.
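The dependence on dominant terms can be seen by isolating the ratio that appears in the formula for df_Ak. The worked cases below are a sketch; the symbols R_k, d_jk, c, and r are introduced here only for illustration and do not appear in the source.

```latex
% Ratio inside df_{A_k}, with d_{jk} = t_{jk} - t_k:
\[
R_k = \frac{\left( \sum_{j=1}^{m} d_{jk}^{2} \right)^{2}}{\sum_{j=1}^{m} d_{jk}^{4}},
\qquad
df_{A_k} = \left( 3.16 - \frac{2.77}{\sqrt{m}} \right) R_k,
\qquad
1 \le R_k \le m .
\]

% All squared differences equal to a constant c > 0 (upper bound):
\[
d_{jk}^{2} = c \ \text{for all } j
\;\Longrightarrow\;
R_k = \frac{(m c)^{2}}{m c^{2}} = m .
\]

% r equally large dominant terms, all other terms negligible:
\[
d_{jk}^{2} \approx c \ \text{for } r \text{ values of } j,\quad
d_{jk}^{2} \approx 0 \ \text{otherwise}
\;\Longrightarrow\;
R_k \approx \frac{(r c)^{2}}{r c^{2}} = r .
\]
```

With a single dominant term (r = 1) the ratio falls to its lower bound of 1, so df_Ak shrinks as the variance estimate becomes concentrated in fewer replicate pairs.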

The two formulas above illustrate that when the estimate df_Ak is small, the degrees of freedom for the t test, df, will also be small. This will tend to be the case when only a few PSU pairs have information about student group differences relevant to a t test. It will also be the case when a few PSU pairs have group differences much larger than other PSU pairs.
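The effect of a small df_Ak on the overall degrees of freedom can be checked numerically. The following minimal Python sketch implements the first formula, taking the per-group variance estimates S_Ak² and the estimates df_Ak as given; the function name `satterthwaite_df` and the input values are hypothetical and are not NAEP data.

```python
def satterthwaite_df(group_variances, group_dfs):
    """Combine per-group values into the degrees of freedom for the t test.

    group_variances: the variance estimates S_Ak^2, one per student group.
    group_dfs: the matching estimates df_Ak.
    (Both names are hypothetical, used only in this sketch.)
    """
    if not group_variances or len(group_variances) != len(group_dfs):
        raise ValueError("need one variance and one df per student group")
    if any(d <= 0 for d in group_dfs):
        raise ValueError("each group df must be positive")

    # Numerator: square of the summed variances.
    numerator = sum(group_variances) ** 2
    # Denominator: sum of S_Ak^4 / df_Ak, where S_Ak^4 = (S_Ak^2)^2.
    denominator = sum(v ** 2 / d for v, d in zip(group_variances, group_dfs))
    return numerator / denominator


# Made-up inputs: two groups with equal variances.
balanced = satterthwaite_df([1.0, 1.0], [10.0, 10.0])
one_small = satterthwaite_df([1.0, 1.0], [2.0, 50.0])
print(balanced, one_small)
```

In this made-up comparison, the second call has a larger sum of group degrees of freedom than the first but yields a smaller df, because the group with df_Ak = 2 dominates the denominator.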

Last updated 11 November 2008 (GF)
