How can I perform a confirmatory factor analysis using SAS? How does this differ from an ordinary (exploratory) factor analysis?
A confirmatory factor analysis differs from exploratory (ordinary) factor analysis in that you specify the structure of three matrices a priori, that is, before analyzing the data. The three matrices to be specified are 1) the factor loading matrix, 2) the factor intercorrelation matrix, and 3) the unique variance matrix.
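For reference, these three matrices combine in the standard common factor model for the covariance (or correlation) matrix of the observed items. The matrix letters below follow the names PROC CALIS uses in the sample program (_F_, _P_, _U_); the symbol Sigma and the dimension labels p and m are notation assumed here for illustration.

```latex
\Sigma = F P F^{\top} + U
% F: p x m factor loading matrix (p items, m factors)
% P: m x m factor intercorrelation matrix
% U: p x p diagonal matrix of unique (error) variances
```

In a confirmatory analysis, selected elements of F, P, and U are fixed or given hypothesized values before estimation.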
The chief advantage of confirmatory factor analysis is that it allows you to test hypotheses about specific factor structures. Thus, the null hypothesis is the solution you specify. If the dataset you analyze departs significantly from the null hypothesis, you reject the null hypothesis and conclude that the factor structure you propose does not fit the obtained data.
To carry out a confirmatory factor analysis in SAS, use PROC CALIS. An example of a confirmatory factor analysis program is detailed below, in which a six-item questionnaire is analyzed. An oblique two-factor solution is hypothesized. It is hypothesized that items 1 through 3 load primarily on Factor 1 while items 4 through 6 load primarily on Factor 2. The unique (error) variances are assumed to be equal and small.
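As a rough illustration of this hypothesis, the starting values used in the sample program (.80 and .20 for the loadings, .60 for the factor correlation, .10 for each unique variance) correspond to the matrices below. The hat notation for implied values is assumed here, and the arithmetic only shows what these particular values imply for Item 1 under the common factor model; it is not SAS output.

```latex
F = \begin{pmatrix}
.80 & .20 \\ .80 & .20 \\ .80 & .20 \\
.20 & .80 \\ .20 & .80 \\ .20 & .80
\end{pmatrix}, \quad
P = \begin{pmatrix} 1 & .60 \\ .60 & 1 \end{pmatrix}, \quad
U = .10 \, I_6

% Implied variance of Item 1 under \hat{\Sigma} = F P F^{\top} + U:
\hat{\sigma}_{11} = (.80)^2 + (.20)^2 + 2(.80)(.20)(.60) + .10
                  = .64 + .04 + .192 + .10 = .972
```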
If you wanted the loadings of items 4 through 6 on Factor 1 and of items 1 through 3 on Factor 2 to be exactly zero (the usual type of hypothesis specified in a confirmatory factor analysis), you could modify the program shown below to impose those constraints: set the appropriate matrix elements equal to zero rather than to a parameter name with a starting value (e.g., {1,2} = 0).
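A minimal sketch of that modification, assuming the same element syntax as the sample program, is shown here. It is a fragment intended to replace only the MATRIX _F_ statement in the sample program, and it has not been run; the remaining statements would stay as they are.

```sas
/* Hypothetical replacement for the MATRIX _F_ statement in the
   sample program: cross-loadings are fixed at zero instead of
   being named parameters with starting values */
MATRIX _F_
{1,1} = Item1F1 ( .80), {1,2} = 0,
{2,1} = Item2F1 ( .80), {2,2} = 0,
{3,1} = Item3F1 ( .80), {3,2} = 0,
{4,1} = 0, {4,2} = Item4F2 ( .80),
{5,1} = 0, {5,2} = Item5F2 ( .80),
{6,1} = 0, {6,2} = Item6F2 ( .80) ;
```

With this version only six loadings are estimated, since each item loads on a single factor.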
* Begin Sample Program ;
TITLE ' Confirmatory FA for six-item questionnaire';
PROC CALIS METHOD = LSML ALL NOMOD ;
Var Item1-Item6;
/*
The METHOD = LSML option uses final parameter estimates from
unweighted least-squares as initial estimates for maximum-
likelihood. The ALL option requests all optional output. The
NOMOD option tells SAS not to compute the modification indices;
this option saves computation time when the ALL option is used.
*/
FACTOR HEYWOOD N = 2;
/*
N = 2 specifies a two-factor solution.
Option HEYWOOD constrains the diagonal elements of the unique
variance matrix _U_ to be nonnegative.
*/
MATRIX _F_
{1,1} = Item1F1 ( .80), {1,2} = Item1F2 ( .20),
{2,1} = Item2F1 ( .80), {2,2} = Item2F2 ( .20),
{3,1} = Item3F1 ( .80), {3,2} = Item3F2 ( .20),
{4,1} = Item4F1 ( .20), {4,2} = Item4F2 ( .80),
{5,1} = Item5F1 ( .20), {5,2} = Item5F2 ( .80),
{6,1} = Item6F1 ( .20), {6,2} = Item6F2 ( .80) ;
/*
The matrix being defined here is _F_, the factor loading matrix.
A MATRIX statement defines the initial values for the
parameter estimates -- any unspecified entry is set to .5.
Numbers in the braces give the location of the entry.
Parameter estimate names such as "Item1F1" are user supplied.
Numbers in parentheses are starting values for the estimates,
set here to the hypothesized factor loadings.
*/
Matrix _P_ {1, 1} = 1.0, {2, 2} = 1.0, {2, 1} = .60 ;
/*
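The _P_ matrix is the factor intercorrelation matrix. It defaults
to an identity matrix (uncorrelated, orthogonal factors). Setting
{2, 1} to .60 specifies a correlation of .60 between the two
factors, indicating an oblique factor structure.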
*/
Matrix _U_ {1, 1} = Theta1-Theta6 6*.10 ;
/*
Matrix _U_ is the error or uniqueness matrix. Since we are
assuming the same starting value for each of the diagonal elements
in our matrix, we can use a shortcut: the notation n*r generates
n values of r. Here 6*.10 tells SAS to set the initial estimates
of the parameters Theta1 through Theta6 to .10. Because the six
parameter names differ, the final estimates are free to differ.
*/
RUN ;
* End of sample program ;
The assumptions underlying confirmatory factor analysis as well as the interpretation of the output can be exceedingly complex. For more information about the SAS procedures, click on the Help button in the SAS menu bar and scroll to SAS Help and Documentation. For further information about confirmatory factor analysis, speak with a statistical consultant.
If you have further questions, send E-mail to stats@ssc.utexas.edu.