The Condition of Education
Technical Guide

Data Analysis and Interpretation

When estimates are from a sample, caution is warranted when drawing conclusions about the size of one population estimate in comparison to another, or about whether a time series of population estimates is increasing, decreasing, or staying about the same. Although one estimate may be larger than another, a statistical test may find that there is no measurable difference between the two estimates because of the standard error associated with one or both of the estimates. Whether differences in means or percentages are statistically significant can be determined using the standard errors of the estimates.

Readers who wish to compare two sample estimates to see if there is a statistical difference will need to estimate the precision of the difference between the two sample estimates. This would be necessary if one wanted to compare, for example, the mean proficiency scores between groups assessed in the National Assessment of Educational Progress (NAEP). To estimate the precision of the difference between two sample estimates, one must find the standard error of the difference between the two sample estimates ($E_A$ and $E_B$). Expressed mathematically, the difference between the two is $E_A - E_B$. The standard error of the difference ($se_{A-B}$) can be calculated by taking the square root of the sum of the two standard errors associated with each of the two sample estimates ($se_A$ and $se_B$) after each has been squared. This relationship can be expressed as

$$se_{A-B} = \sqrt{se_A^2 + se_B^2}$$

After finding the standard error of the difference, one divides the difference between the two sample estimates by this standard error to determine the "t value," or "t statistic," of the difference between the two estimates. This t statistic measures the precision of the difference between two independent sample estimates. The formula for calculating this ratio is expressed mathematically as

$$t = \frac{E_A - E_B}{se_{A-B}}$$
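
The two formulas above can be sketched in a few lines of Python. This is a minimal illustration, not code from the report: the function names `se_difference` and `t_statistic` are hypothetical, and the sketch assumes the two estimates come from independent samples.

```python
import math


def se_difference(se_a: float, se_b: float) -> float:
    """Standard error of the difference between two independent estimates."""
    # Square each standard error, sum the squares, then take the square root.
    return math.sqrt(se_a ** 2 + se_b ** 2)


def t_statistic(est_a: float, est_b: float, se_a: float, se_b: float) -> float:
    """Difference between the two estimates divided by its standard error."""
    return (est_a - est_b) / se_difference(se_a, se_b)
```

With made-up inputs, `t_statistic(50.0, 47.0, 1.5, 2.0)` divides a difference of 3.0 by a standard error of 2.5, giving a t statistic of about 1.2.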

Assuming a normal distribution, the next step is to compare this t statistic to 1.96, the critical value for deciding at a 95 percent confidence level whether there is a statistical difference between two estimates. A 95 percent confidence level means that if there were no real difference between the two populations and the test were conducted 100 times on different samples, a difference this large between the two sample estimates ($E_A$ and $E_B$) would be expected to arise by chance alone only about 5 times out of 100. Therefore, if the absolute value of the t statistic is greater than 1.96, then there is evidence that a difference exists between the two populations. If it is equal to or less than 1.96, then there is less certainty that the observed difference is a real difference and is not simply due to sampling error. This level of certitude, or significance, is commonly referred to as the ".05 level of (statistical) significance."
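
The decision rule can be checked numerically. The following sketch uses only the Python standard library and assumes the large-sample normal approximation described above; the function names are hypothetical. Note that `critical_value` returns the unrounded value (about 1.95996), of which 1.96 is the usual rounding.

```python
import math
from statistics import NormalDist


def critical_value(alpha: float = 0.05) -> float:
    """Two-tailed critical value of the standard normal distribution."""
    return NormalDist().inv_cdf(1 - alpha / 2)


def two_tailed_p_value(t: float) -> float:
    """Probability of a statistic at least as extreme as |t| under the standard normal."""
    return math.erfc(abs(t) / math.sqrt(2))


def is_significant(t: float, alpha: float = 0.05) -> bool:
    """True when |t| exceeds the two-tailed critical value for alpha."""
    return abs(t) > critical_value(alpha)
```

Because the comparison uses the absolute value, a large negative t statistic (estimate B larger than estimate A) is treated the same way as a large positive one.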

As an example of a comparison between two sample estimates to determine whether there is a statistically significant difference between the two, consider the data on the performance of 12th-grade students in the 1992 and 2005 NAEP reading assessments (see table A-12-1). The average scale score in 1992 was 292 and the average scale score in 2005 was 286. Is the difference of 6 scale points between these two different samples statistically significant? The standard error of each of these estimates is 0.6 (see table S-12-1). Using the formula above, the standard error of the difference is 0.85. The t statistic, that is, the ratio of the estimated difference of 6 scale points to the standard error of the difference, is 7.07. This value is greater than 1.96, the critical value of the t distribution for a .05 level of significance with a large sample. Thus, one can conclude that there was a statistically significant difference in the performance of 12th-graders between 1992 and 2005 in reading and that the reading score for 12th-graders in 2005 was lower than the reading score for 12th-graders in 1992.
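
The arithmetic in this example can be written out step by step. The reported t statistic of 7.07 appears to come from the unrounded standard error; dividing by the rounded value of 0.85 would give about 7.06 instead.

$$se_{A-B} = \sqrt{0.6^2 + 0.6^2} = \sqrt{0.72} \approx 0.85$$

$$t = \frac{E_A - E_B}{se_{A-B}} = \frac{292 - 286}{\sqrt{0.72}} \approx 7.07 > 1.96$$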

For all indicators in The Condition of Education that report estimates based on samples, differences between estimates (including increases or decreases) are stated only when they are statistically significant. To determine whether differences reported are statistically significant, two-tailed t tests at the .05 level are typically used. The t test formula for determining statistical significance is adjusted when the samples being compared are dependent. The t test formula is not adjusted when performing multiple comparisons. When the difference between estimates is not statistically significant, tests of equivalence are often used. An equivalence test determines the probability (generally at the .15 level) that the estimates are statistically equivalent, that is, that they fall within a margin of error in which the two estimates are not substantively different. When the estimates are found to be equivalent, language such as "x" and "y" "were similar" or "about the same" is used; otherwise, the data are described as having "no measurable difference." When the variables to be tested are postulated to form a trend, the relationship may be tested using linear regression, logistic regression, or ANOVA trend analysis instead of a series of t tests. These other methods of analysis test for specific relationships (e.g., linear, quadratic, or cubic) among variables.
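
The text does not give the adjusted formula for dependent samples. One common form, shown here purely as an assumption, follows from the identity Var(A - B) = Var(A) + Var(B) - 2 Cov(A, B): the covariance between the two estimates is subtracted (twice) before taking the square root. The function name and the `covariance` parameter are hypothetical, and the report's actual adjustment may differ.

```python
import math


def se_difference_dependent(se_a: float, se_b: float, covariance: float) -> float:
    """Standard error of A - B when the two estimates covary.

    Assumes Var(A - B) = Var(A) + Var(B) - 2 * Cov(A, B); this is a
    textbook identity, not a formula taken from the report.
    """
    variance = se_a ** 2 + se_b ** 2 - 2 * covariance
    if variance < 0:
        raise ValueError("covariance is inconsistent with the standard errors")
    return math.sqrt(variance)
```

With a covariance of zero this reduces to the independent-samples formula; a positive covariance shrinks the standard error of the difference and so increases the t statistic for the same observed difference.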

A number of considerations influence the ultimate selection of data years featured in The Condition of Education. To make analyses as timely as possible, the latest year of data is shown if it is available during report production. The choice of comparison years is often also based on the need to show the earliest available survey year, as in the case of the NAEP and the international assessment surveys. In the case of surveys with long time frames, such as for enrollment, the decade's beginning year (e.g., 1980 or 1990) often starts the trend line. In the figures and tables of the indicators, intervening years are selected in increments in order to show the general trend. The narrative for the indicators typically compares the most current year's data with those from the initial year and then with those from a more recent period. Where applicable, the narrative may also note years in which the data begin to diverge from previous trends.
