I'm using multiple SAS datasets in the same program, and I need only a few of the variables in the last part of my program. Is there an easy way to keep only the variables I need? Should I use the KEEP or the DROP statement?
One way is to use either the KEEP or the DROP statement in your DATA steps. Which statement you should use depends on how many of the variables in your SAS dataset you wish to keep. If you wish to keep only a few of many variables, use the KEEP statement. If you want to drop only a few variables, use the DROP statement. The syntax of the two statements is very similar (and simple).
The syntax for the KEEP statement is:
KEEP var1 var2 varN ;
Here is an example of using the KEEP statement in a SAS program:
DATA one ;
INPUT ssn age sex $ weight ;
CARDS ;
445768976 23 m 129
487593453 35 f 112
442345213 26 m 198
;
DATA two ;
SET one ;
KEEP age weight ;
RUN ;
Here the SAS user has read some data into dataset ONE. A second DATA step then creates dataset TWO. The KEEP statement ensures that only the variables AGE and WEIGHT are included in dataset TWO; SSN and SEX are left out.
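To confirm which variables ended up in dataset TWO, one option is to inspect it with PROC CONTENTS and PROC PRINT. The following is a minimal sketch, assuming dataset TWO was created by the example above in the same SAS session:

```sas
/* List the variables stored in dataset TWO */
PROC CONTENTS DATA=two ;
RUN ;

/* Print the observations in TWO (only AGE and WEIGHT should be listed as variables) */
PROC PRINT DATA=two ;
RUN ;
```

Running the same two procedures on dataset ONE should show that it still contains all four variables, since the KEEP statement affects only the dataset being created.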
Alternatively, the SAS user could have accomplished the same goal by replacing the KEEP statement with a DROP statement that lists the variables from the first SAS dataset to be dropped from the new SAS dataset:
DROP ssn sex ;
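Placed in the second DATA step of the example above, the DROP version might look like the following sketch (it assumes dataset ONE has already been created as shown earlier):

```sas
DATA two ;
SET one ;
/* Drop SSN and SEX so that AGE and WEIGHT remain in TWO */
DROP ssn sex ;
RUN ;
```

With only four variables in dataset ONE, the two approaches involve the same amount of typing. The choice matters more when the input dataset has many variables and the list to keep (or to drop) is short.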
For more information about the KEEP and DROP statements, click on the Help button in the SAS menu bar and scroll to SAS Help and Documentation.
If you have further questions, send E-mail to stats@ssc.utexas.edu.