College of Natural Sciences
 
FAQs
This is for IE7 to hold div open

SAS FAQ #23: Summing variables with missing data in SAS

Question:

I have a number of variables I need to add together using SAS. Some of them have missing data. What is the best way for me to add them together, given that I have missing data?

Answer:

There are two general approaches you can take to sum variables in SAS.

1. The direct adding method.
With this method, you compute a new variable as a straight sum of the current variables you wish to add.

Newvar = Oldvar1 + Oldvar2 + Oldvar3 ;

In this example, Newvar is the sum of Oldvar1 + Oldvar2 + Oldvar3. If Oldvar1 or Oldvar2 or Oldvar3 has missing data for a given case, then the value of Newvar for that case would also be missing. In other words, if any of the variables to be summed have missing data, the new variable will also have missing data.

2. The function method.
With the function method, you use the SAS SUM (OF operator to add up a number of variables. The advantage of this method is that the syntax is much less laborious to type, especially for large numbers of variables.

Newvar = SUM (OF Oldvar1-Oldvar3) ;

Unfortunately, with this method any variable to be summed which has a missing value is treated as zero by SAS. This means, for example, that if Oldvar1 has a value of 4 and Oldvar2 is missing and Oldvar3 has a value of 3, the value of Newvar would be 7 when the SUM (OF function is used. By contrast, the value of Newvar would be missing under method (1) described previously (where you add the variables together using a plus sign).

If you have both a large number of variables to sum and missing data, what can you do? One solution (provided by Karl Wuensch over the Internet) is use the NMISS (OF function in conjunction with the SUM (OF function, like so:

IF NMISS(OF Oldvar1-Oldvar3) > 0 then Newvar = . ;
ELSE Newvar = SUM(OF Oldvar1-Oldvar3) ;

This code first calculates the number of missing values across the variables Oldvar1 through Oldvar3. If SAS finds any missing data, it sets the value of Newvar to be missing. Otherwise, the value of Newvar is set to be the sum of the Oldvar1 through Oldvar3 values which have non-missing cases.

For more information, click on the Help button in the SAS menu bar and scroll to SAS Help and Documentation.

If you have further questions, send E-mail to stats@ssc.utexas.edu.