College of Natural Sciences
 
FAQs

General FAQ #8: Comments/Codebook in an external data file

Question:

I have an external data file that I would like to read into a statistical software package, preferably SAS or SPSS. I've included a codebook at the top of the data file. How can I tell SAS or SPSS to start reading the data after skipping the first n lines of the data file?

Answer:

SPSS can perform this task with either an Excel or text file.

For Excel, open the external data file. Uncheck the box that says “Read variable names from the first row of data.” In the box labeled “Range”, specify the first and last cell of the Excel spreadsheet to be read.

For example, A5:H35 tells SPSS to begin reading data in the first column, fifth row, continuing to read data by row until reaching the cell in the eighth column, thirty-fifth row. This will eliminate the first four rows from the SPSS dataset.

Unfortunately, variable names cannot be read into SPSS using this method; they must be manually entered in the SPSS dataset.

For a text file, open the external data file. A Text Import Wizard dialog box will appear. Follow the prompts as explained below:

Step 1 – No action is necessary; click Next.

Step 2 – Indicate whether the data are delimited or fixed width, and whether variable names are in the first row of the file. If variable names are in the top row of the text file, SPSS will use these names in the new dataset and still allow you to begin reading data from a specified line.

Step 3 – Designate the line number corresponding to the first case. The “Data preview” box at the bottom of the dialog box shows the first few lines of the dataset as a check that SPSS is reading the data correctly.

Step 4 – Specify the delimiter(s) that separate the values (for example, tab, comma, or space).

Step 5 – Name and format variables.

Step 6 – Optionally, save the file format and the syntax for future use.

In SAS, use the FIRSTOBS= option in the INFILE statement. This option tells SAS the line of the input file at which to begin reading data.

For example, use the following syntax to begin reading data on line 21 of the external data file RAW.DAT located in the TEMP subdirectory of your C: disk drive:

INFILE 'c:\temp\raw.dat' FIRSTOBS = 21 ;
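
The INFILE statement only works inside a DATA step, so a minimal sketch of a complete program may help. It assumes, purely for illustration, that the data lines hold three blank-separated numeric values; the variable names id, age, and score and the dataset name work.raw are hypothetical and should be replaced with those from your own codebook.

```sas
/* Sketch: skip a 20-line codebook and begin reading on line 21.       */
/* Assumptions (not from the original FAQ): values are separated by    */
/* blanks, and there are three numeric variables: id, age, score.      */
DATA work.raw;
    INFILE 'c:\temp\raw.dat' FIRSTOBS=21;
    INPUT id age score;   /* list input: reads blank-separated values */
RUN;

/* Print the first five observations to confirm the codebook was skipped. */
PROC PRINT DATA=work.raw (OBS=5);
RUN;
```

If the first printed observation does not match line 21 of the raw file, check the codebook length and adjust the FIRSTOBS= value accordingly.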

For more information on the INFILE statement in SAS, see the online SAS manual at http://support.sas.com/documentation/onlinedoc/base/index.html. Go to the SAS OnlineDoc under Base SAS 9.1.3 Procedures Guide and click the Index tab. You can then search for INFILE.

If you have more questions, send e-mail to stats@ssc.utexas.edu.