College of Natural Sciences
 

AMOS FAQ #7: Handling non-normal data using AMOS

Question:

I am using AMOS to fit a model to my data. I am concerned that I may have non-normal input data. How can I check the normality of my data and, if necessary, make adjustments for it when using AMOS?

Answer:

This FAQ assumes that you understand the assumptions of structural equation models (SEM) and can specify and test SEMs using AMOS. If not, see our AMOS tutorial. This FAQ also assumes you have read our FAQ on why non-normal input data are a problem for SEMs and the various methods that are used to deal with non-normal data in popular SEM software programs; see General FAQ #33: Handling non-normal data in structural equation modeling (SEM).

There are three steps you can take when you believe your data are not normally distributed and you are using AMOS:

  1. Verify that your variables do not follow a joint multivariate normal distribution
  2. Assess overall model fit using the Bollen-Stine corrected p-value
  3. Use the bootstrap to generate parameter estimates, standard errors of parameter estimates, and significance tests for individual parameters

Each of these steps is explained in more detail below.

Diagnosing non-normality

The first step in dealing with non-normal sample data is to verify that the data are in fact non-normal. Consider the cars.sav example database provided as part of the SPSS program. The database contains the following variables of interest:

If you eliminate the cases with incomplete (i.e., missing) data, the remaining database contains 392 cases. Suppose you plan to fit the following model to the cars database.

Regression Model Diagram

AMOS can assess the univariate skewness and kurtosis of each variable contained in the model, as well as the joint multivariate kurtosis. To request that these statistics be included in the AMOS output, choose:

View/Set
    Analysis Properties

Click the Output tab and then check the Tests for normality and outliers check box. Also check the Standardized estimates and Squared multiple correlations check boxes.

Analysis Properties Window

Run the model by selecting Calculate Estimates from the Model Fit menu. Next, examine the Normality portion of the output. For each observed variable, AMOS reports a minimum value, maximum value, skewness value, critical ratio for skewness, kurtosis value, and critical ratio for kurtosis. Critical ratios that exceed +2.00 or that are smaller than -2.00 indicate statistically significant degrees of non-normality. AMOS also reports the joint multivariate kurtosis value and its associated critical ratio at the bottom of the table in the row labeled Multivariate.

Multivariate Normality Test Output

Practically, very small multivariate kurtosis values (e.g., less than 1.00) are considered negligible while values ranging from one to ten often indicate moderate non-normality. Values that exceed ten indicate severe non-normality. In this example, every variable except Accel departs significantly from normality according to the critical ratio criterion, but the Year variable is clearly the most extremely non-normal variable in the model.
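The quantities in the normality table can be approximated by hand. The sketch below is only an illustration: it assumes the large-sample standard errors sqrt(6/N) for skewness and sqrt(24/N) for kurtosis, which may not match AMOS's computation exactly, and the function names and sample values are hypothetical.

```python
import math

def skew_kurtosis_report(values):
    """Return skewness, excess kurtosis, and their critical ratios."""
    n = len(values)
    mean = sum(values) / n
    # Central moments with divisor n (not n - 1).
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    skew = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2 - 3.0  # excess kurtosis: 0 for a normal variable
    return {
        "skew": skew,
        "skew_cr": skew / math.sqrt(6.0 / n),          # assumed standard error
        "kurtosis": kurtosis,
        "kurtosis_cr": kurtosis / math.sqrt(24.0 / n),  # assumed standard error
    }

def classify_multivariate_kurtosis(value):
    """Apply the rule of thumb quoted above to a multivariate kurtosis value."""
    if value < 1.0:
        return "negligible"
    if value <= 10.0:
        return "moderate"
    return "severe"

# Hypothetical right-skewed sample, for illustration only.
sample = [1, 1, 2, 2, 2, 3, 3, 4, 6, 12]
report = skew_kurtosis_report(sample)
flagged = [name for name in ("skew_cr", "kurtosis_cr") if abs(report[name]) > 2.0]
print(report, flagged)
```

Any critical ratio listed in `flagged` falls outside the -2.00 to +2.00 band described above.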

The Bollen-Stine Bootstrap and associated test of overall model fit

One method to correct for non-normality in the underlying database is to use the Bollen-Stine p-value rather than the usual maximum likelihood-based p-value to assess overall model fit. To obtain the Bollen-Stine test, choose:

View/Set
    Analysis Properties

then select the Bootstrap tab and check the Perform bootstrap and Bollen-Stine bootstrap check boxes. Specify the number of bootstrap samples you would like AMOS to draw for computing the Bollen-Stine p-value. The example shown here features 2000 drawn samples.

Bootstrap tab in Analysis Properties window

The output from the Bollen-Stine bootstrap is broken into three parts. The first section contains diagnostic information. If a solution is not found for a particular bootstrap sample or AMOS is unable to fit the model in a given bootstrap sample due to a singular covariance matrix, AMOS will draw a replacement sample to ensure that the final results are based upon the actual number of usable samples that the user initially requested.

Bootstrap Diagnostic Information

In the bottom portion of the output shown, you can see that no samples were discarded due to inability to find a solution or due to a singular covariance matrix. However, if AMOS discards more than a few samples, you should double check your model specification and re-run the analysis.

AMOS can use more than one minimization method during the bootstrap process, and the diagnostic table has a column for each. According to the AMOS help system, Method 0 converges quickly for easy problems, but is slow for difficult problems. Method 0 is not yet available in AMOS, so the Method 0 column will always contain zero values for all rows. By contrast, Method 1 is a fast and generally reliable algorithm, so AMOS will first perform minimization using Method 1. If Method 1 minimization is too difficult for a particular bootstrap sample, AMOS will switch to Method 2, which is slower than Method 1 but more reliable. Each row of the table corresponds to a number of iterations, and each method's column lists the number of samples for which AMOS arrived at a successful solution in that many iterations. For instance, in 17 of the 2000 bootstrap samples AMOS arrived at a successful solution using Method 1 in seven iterations. By contrast, 167 samples converged in just four iterations when AMOS switched to Method 2. The Total row shows that 1357 of the 2000 bootstrap samples converged successfully using Method 1, whereas the remaining 643 samples converged successfully using Method 2.

The second portion of the output displays the p-value for the hypothesis test of overall model fit.

Bollen-Stine test of overall model fit

Recall that you requested 2000 bootstrap samples from AMOS. In this example, the model fit worse in 240 of the 2000 bootstrap samples than it did in the original sample, so the obtained p-value of overall model fit is 240/2000 = .12. Using a conventional significance level of .05, you would not reject this model; you would conclude that it fits the data adequately.
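The arithmetic behind the Bollen-Stine p-value can be sketched as the share of bootstrap chi-square values that are at least as large as the chi-square obtained for the original sample. This is a simplified illustration: the function name and the stand-in bootstrap values are hypothetical, the Bollen-Stine transformation of the data is not shown, and AMOS may treat ties or failed fits differently.

```python
def bollen_stine_p(bootstrap_chisq, observed_chisq):
    """Share of bootstrap chi-squares at least as large as the observed one."""
    worse = sum(1 for value in bootstrap_chisq if value >= observed_chisq)
    return worse / len(bootstrap_chisq)

# Hypothetical stand-in for the 2000 bootstrap chi-squares: only the count of
# values above the observed statistic (240) is taken from the example.
fake_bootstrap = [12.0] * 240 + [2.5] * 1760
p_value = bollen_stine_p(fake_bootstrap, observed_chisq=10.061)
print(p_value)         # 0.12
print(p_value > 0.05)  # True: do not reject the model at the .05 level
```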

By contrast, consider the normal-theory maximum likelihood chi-square test of model fit. This is the familiar test statistic that you would ordinarily use to assess model fit. Since it assumes joint multivariate normality of the observed variables, and, as you saw above in the diagnostics segment of this FAQ, these variables are clearly non-normally distributed, it is little surprise that this test rejects the null hypothesis of overall model fit: chi-square = 10.061 with 3 DF, p = .018. In this instance, the Bollen-Stine bootstrap enables you to accept a model that you would otherwise reject using the maximum likelihood-based chi-square.
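The normal-theory p-value quoted above can be checked without statistical tables. The sketch below uses the closed-form upper-tail probability of the chi-square distribution for integer degrees of freedom; the function name is illustrative and is not part of AMOS.

```python
import math

def chi_square_p(chisq, df):
    """Upper-tail probability of a chi-square statistic with integer df."""
    half = chisq / 2.0
    if df % 2 == 0:
        # Even df: finite Poisson-type sum.
        total = sum(half ** k / math.factorial(k) for k in range(df // 2))
        return math.exp(-half) * total
    # Odd df: normal-tail term plus a finite sum over half-integer powers.
    total = sum(half ** (k - 0.5) / math.gamma(k + 0.5)
                for k in range(1, (df - 1) // 2 + 1))
    return math.erfc(math.sqrt(half)) + math.exp(-half) * total

print(round(chi_square_p(10.061, 3), 3))  # 0.018
```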

The final segment of the Bollen-Stine bootstrap output illustrates the distribution of the chi-square values obtained for the 2000 bootstrap samples.

Chi-square distributions from bootstrap samples

The most notable features of this output are the mean chi-square value and the general shape of the distribution of chi-square values. Across the 2000 samples, the mean chi-square value of 4.94 is higher than the value expected under joint multivariate normality, which equals the model's degrees of freedom: 3. The bootstrap distribution, rather than the theoretical chi-square distribution, serves as the reference against which the obtained chi-square of 10.061 is compared. Judged against the bootstrap distribution (mean 4.94), the obtained chi-square has a p-value of .120 and is therefore not statistically significant. By contrast, judged against the chi-square distribution expected under joint multivariate normality of the observed variables (mean 3.00), the obtained chi-square has a p-value of .018 and is therefore considered statistically significant at the usual .05 cutoff criterion.

Interestingly, the form of the distribution of chi-square values obtained from the bootstrap replications shows a number of values clustering near the multivariate normal expected value of 3.00, but there are also a substantial number of values that exceed 3.00 and even some values that are in double-figures. The distribution of the chi-square values is decidedly non-normal, but that is not a problem for the Bollen-Stine test statistic.

Bootstrapped parameter estimates and standard errors

After you obtain satisfactory overall model fit, the next questions you are likely to pose are: What path coefficients are statistically significant and what are their values? AMOS provides an array of bootstrapping options to address these questions. Unfortunately, you cannot obtain bootstrap parameter estimates and their associated standard errors at the same time as the Bollen-Stine p-value, so you must return to the Bootstrap tab in the Analysis Properties window.

Bootstrap tab in Analysis Properties window

In this analysis, you deselect the Bollen-Stine bootstrap check box and select the Percentile confidence intervals and Bias-corrected confidence intervals check boxes. Set the number of bootstrap samples at 250, following Nevitt and Hancock (1998), who found little improvement in the quality of bootstrap estimates from larger numbers of bootstrap samples. If you plan to interpret probability values (also known as p-values) as shown below, you should use a larger number of bootstrap samples (e.g., 2000) to ensure stable probability estimates.

The relevant output from the analysis appears below. Bootstrap parameter estimates are computed for each parameter estimate in the model: regression (path) coefficients, variances, covariances, and means and intercepts (if these quantities are estimated). For presentation purposes, only selected output is shown here: the original normal theory maximum likelihood-based covariance estimates and their bootstrap-based counterparts. The actual table is too wide to fit in a single figure, so its output is shown in multiple figures. The first figure, shown immediately below, displays the normal theory maximum likelihood estimates of the covariances of the independent variables in the model.

Normal theory estimates

The initial part of this output contains the familiar Estimate, S.E. (standard error), and C.R. (Critical Ratio, the estimate divided by its standard error) quantities that are computed assuming normal distribution of the observed variables. Notice that each covariance is statistically significant. In particular, take note of the hypothesis test that the WEIGHT with YEAR covariance is equal to zero in the population of cars from which this sample was drawn. The normal theory parameter estimate is -504.570 with an estimated standard error of 229.771. Dividing -504.570 by 229.771 returns a critical ratio of -2.196, which is statistically significant using the conventional .05 cutoff level for statistical significance (at alpha = .05, critical ratios that fall between -1.96 and +1.96 are not statistically significant). The p-value of .028 shown in the table above comes from the normal theory test of this null hypothesis. Next, consider the bootstrap output from the same table:

Bootstrap parameter estimates
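Before turning to the bootstrap columns, the normal-theory test for the WEIGHT with YEAR covariance can be reproduced by hand. In the sketch below, the standard error of 229.771 is an assumption: it is the value that reproduces the reported critical ratio of -2.196 and p-value of .028. The function names are illustrative only.

```python
import math

def critical_ratio(estimate, standard_error):
    """Estimate divided by its standard error."""
    return estimate / standard_error

def two_sided_normal_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))

# WEIGHT with YEAR covariance; the standard error 229.771 is an assumption,
# chosen because it reproduces the reported critical ratio and p-value.
cr = critical_ratio(-504.570, 229.771)
print(round(cr, 3))                      # -2.196
print(round(two_sided_normal_p(cr), 3))  # 0.028
```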

The Bootstrap section of the output contains the mean of the parameter estimates from the multiple bootstrap samples. The difference between the maximum likelihood-based estimate and the bootstrap-based estimate is shown in the Bias column. Large bias values, as is the case here, suggest a substantial discrepancy between the results of the bootstrap analysis and the original normal theory-based analysis.

You can use the bootstrap Mean and SE columns to compute critical ratio values based on the bootstrap results. For example, consider testing the null hypothesis that the covariance between WEIGHT and YEAR in the table shown above is zero. The mean parameter estimate value from the 250 bootstrap samples is -542.551 with an estimated standard error equal to 403.171. Notice that the estimated standard error across the bootstrap samples is almost twice as large as the normal theory standard error. This discrepancy has a profound impact on the significance test for the WEIGHT with YEAR covariance: when you divide the bootstrap parameter estimate by the estimated standard error (-542.551/403.171), the resulting critical ratio, -1.35, is not statistically significant. There is no p-value for this test reported in the AMOS output. Instead, consider referring to the percentile-corrected and bias-corrected hypothesis tests. The same AMOS table contains the percentile-corrected (PC Confidence) and bias-corrected (BC Confidence) confidence intervals and p-values:

Bias-corrected and percentile-corrected p-values and confidence intervals
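The bias and the bootstrap critical ratio discussed above for the WEIGHT with YEAR covariance can be checked with a few lines of arithmetic. This sketch assumes that bias is the bootstrap mean minus the maximum likelihood estimate, and it uses a normal approximation for the critical ratio that AMOS itself does not report.

```python
import math

ml_estimate = -504.570  # normal-theory estimate quoted in the text
boot_mean = -542.551    # mean over the 250 bootstrap samples
boot_se = 403.171       # bootstrap standard error

# Bias taken here as bootstrap mean minus ML estimate (sign convention assumed).
bias = boot_mean - ml_estimate
boot_cr = boot_mean / boot_se

# Normal approximation for the bootstrap critical ratio; AMOS does not print
# this p-value, so treat it only as a rough check.
approx_p = math.erfc(abs(boot_cr) / math.sqrt(2.0))

print(round(bias, 3))       # -37.981
print(round(boot_cr, 2))    # -1.35
print(abs(boot_cr) < 1.96)  # True: not significant at alpha = .05
```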

The results are largely consistent across the two methods, though the substantive conclusions you would draw for the ENGINE with YEAR covariance would differ depending upon the choice of technique: the percentile-corrected confidence interval rejects the null hypothesis that the covariance is zero in the population of cars sampled, yet the bias-corrected p-value indicates a failure to reject the null hypothesis (p = .059). Mooney and Duval (1993, p. 50) note that the various available bootstrap confidence interval techniques can and frequently do perform differently under different circumstances. Therefore, there is no one best method to use in all data analysis situations. A sensible recommendation offered by Mooney and Duval is to report multiple confidence interval types and allow your audience to draw appropriate conclusions from the results.

Notice in this example that the bias-corrected confidence interval does not include zero, yet the p-value of .059 is not statistically significant. This is because the p-values are computed independently of the confidence intervals. If you return to the Bootstrap tab in the Analysis Properties window and change the default 90% confidence intervals to 95%, the upper and lower confidence limits for the covariances will change, though the p-values for both the PC and BC confidence intervals remain unchanged. In this example, with 95% confidence intervals the ENGINE with YEAR covariance confidence interval includes zero.
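The effect of the confidence level on a percentile interval can be seen in a small, self-contained bootstrap. The sketch below is a naive case-resampling bootstrap of a sample covariance on made-up data; it does not reproduce AMOS's percentile or bias-corrected algorithms, and all names and values are hypothetical.

```python
import random

def covariance(xs, ys):
    """Sample covariance with divisor n - 1."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def bootstrap_covariances(xs, ys, n_samples, rng):
    """Resample cases with replacement and re-estimate the covariance."""
    n = len(xs)
    estimates = []
    for _ in range(n_samples):
        rows = [rng.randrange(n) for _ in range(n)]
        estimates.append(covariance([xs[i] for i in rows], [ys[i] for i in rows]))
    return estimates

def percentile_interval(estimates, level):
    """Simple percentile interval: cut equal tails off the sorted estimates."""
    ordered = sorted(estimates)
    cut = round(len(ordered) * (1.0 - level) / 2.0)
    return ordered[cut], ordered[len(ordered) - 1 - cut]

rng = random.Random(2024)  # fixed seed so the sketch is repeatable
# Made-up, weakly related data; not the cars.sav variables.
x = [rng.gauss(0.0, 1.0) for _ in range(100)]
y = [0.2 * xi + rng.gauss(0.0, 1.0) for xi in x]

estimates = bootstrap_covariances(x, y, n_samples=500, rng=rng)
for level in (0.90, 0.95):
    low, high = percentile_interval(estimates, level)
    print(level, round(low, 3), round(high, 3), low <= 0.0 <= high)
```

Because the 95% interval cuts smaller tails than the 90% interval, it is never narrower, which is how an interval can exclude zero at one level and include it at the other.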

AMOS will compute bootstrap test statistics for all requested output, including standardized coefficients, squared multiple correlations (r-square values), and total and indirect effects. This last feature is very useful: even if your data meet the assumption of multivariate normality, you may still want to explore bootstrapping to test the significance of indirect effects, standardized coefficients, or squared multiple correlations.

Cautions

There are several cautionary notes to keep in mind when you use bootstrapping with AMOS. First, AMOS requires that the input database be complete for diagnosing sample data non-normality and for using any of its bootstrap features. In other words, if you have missing data, you must solve the missing data problem before you can use AMOS's non-normality diagnostic and bootstrap features. In this example, the total number of cases in the cars database was 406. Omitting cases with missing values resulted in a database containing 392 cases. Approximately 3.5% of the original cases were lost due to missingness. According to Roth (1994), with case losses of 5% or less, removal of entire cases (listwise data deletion) is a defensible strategy for handling the incomplete data problem. If removing cases results in a loss of data that exceeds 5%, however, other methods for handling missing data may be more appropriate. See General FAQ #25: Handling missing or incomplete data for details on missing data handling methods.
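The case-loss check in this first caution is simple arithmetic, sketched here with a hypothetical helper function.

```python
def listwise_loss(total_cases, complete_cases):
    """Proportion of cases dropped by listwise deletion."""
    return (total_cases - complete_cases) / total_cases

loss = listwise_loss(total_cases=406, complete_cases=392)
print(f"{loss:.2%}")  # 3.45%
print(loss <= 0.05)   # True: within the 5% guideline attributed to Roth (1994)
```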

Second, your sample size should be sufficiently large to ensure trustworthy parameter estimates. Nevitt and Hancock (1998) suggest a minimum sample size of 200 for SEMs that contain latent variables. Finally, the bootstrap method requires the data analyst to set the scale of each latent variable by fixing one of its factor loadings to 1.00 rather than by fixing the factor's variance to 1.00, because under the latter scenario bootstrapped standard error estimates may be artificially inflated by factor loadings switching between positive and negative values across bootstrap samples (Hancock & Nevitt, 1999).
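The inflation caused by sign switching can be illustrated with a toy calculation. The loading values below are invented for illustration and do not come from any AMOS run; the sketch only shows how flipping signs across bootstrap samples enlarges the standard deviation that serves as the bootstrap standard error.

```python
import statistics

# Hypothetical loading estimates from six bootstrap samples.
loadings = [0.78, 0.81, 0.80, 0.83, 0.79, 0.82]

# Same magnitudes, but the sign flips in every other sample, as can happen
# when the factor variance (rather than a loading) is fixed to 1.00.
sign_switched = [value if i % 2 == 0 else -value for i, value in enumerate(loadings)]

stable_se = statistics.stdev(loadings)
inflated_se = statistics.stdev(sign_switched)
print(round(stable_se, 3), round(inflated_se, 3))
```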

References

For more information about non-normal data handling in AMOS, see the following references:

Arbuckle, J., & Wothke, W. (1999). AMOS 4.0 User's Guide. Chicago, IL: Smallwaters Corporation.

Fouladi, R. T. (1998). Covariance structure analysis techniques under conditions of multivariate normality and nonnormality - Modified and bootstrap test statistics. Paper presented at the American Educational Research Association Annual Meeting, April 11-17, 1998, San Diego, CA.

Hancock, G. R., & Nevitt, J. (1999). Bootstrapping and the identification of exogenous latent variables within structural equation models. Structural Equation Modeling, 6(4), 394-399.

Mooney, C. Z., & Duval, R. D. (1993). Bootstrapping: A nonparametric approach to statistical inference. Newbury Park, CA: Sage Publications.

Nevitt, J., & Hancock, G. R. (1998). Relative performance of rescaling and resampling approaches to model chi-square and parameter standard error estimation in structural equation modeling. Paper presented at the American Educational Research Association Annual Meeting, April 11-17, 1998, San Diego, CA.

Roth, P. (1994). Missing data: A conceptual review for applied psychologists. Personnel Psychology, 47, 537-560.

If you have further questions, send E-mail to stats@ssc.utexas.edu.