Examples to demonstrate dataset manipulation in SAS
Data is king in statistical analysis. Here are several examples of data input, manipulation, and merging.
Read data from a file
When using SAS, you will almost always want to read data from an existing data file. Doing so avoids copying the data into every program, which wastes disk space, and lets as many SAS scripts as possible share the same set of data. Using data files is therefore both convenient and necessary. The following code demonstrates how to use the INFILE statement in a DATA step.
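data bone;
  infile '../multivar/BONE.DAT';  /* external raw data file */
  input person y1-y4;             /* read an id and four measurements */
  drop person;                    /* the id is not kept in the dataset */
run;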
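Raw files are not always space-delimited. As a minimal sketch, assuming a hypothetical comma-separated file named 'scores.csv' with one header row and the same five columns, the INFILE options DLM=, DSD, and FIRSTOBS= could be used as follows. The file name and the dataset name bone_csv are illustrative assumptions, not part of the original example.

```sas
data bone_csv;                       /* hypothetical dataset name */
  infile 'scores.csv'                /* hypothetical file, assumed comma-separated */
         dlm=',' dsd                 /* comma delimiter; DSD treats consecutive commas as missing values */
         firstobs=2;                 /* skip the assumed header row */
  input person y1-y4;
  drop person;
run;
```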
Generate multiple observations from a record
Multiple observations can be generated from a single record in a data file or an in-stream data block. This is sometimes necessary because different statistical procedures require different data arrangements. If the raw data are entered and organized one way but the analysis needs them another way, this technique is the one to use. The following code demonstrates it.
data set1;
  input DONUT FAT1-FAT6;
  /* write one observation per replicate */
  FAT=FAT1; REP=1; output;
  FAT=FAT2; REP=2; output;
  FAT=FAT3; REP=3; output;
  FAT=FAT4; REP=4; output;
  FAT=FAT5; REP=5; output;
  FAT=FAT6; REP=6; output;
  drop FAT1-FAT6;
  cards;
1 64 72 68 77 56 95
2 78 91 97 82 85 77
3 75 93 78 71 63 76
4 55 66 49 64 70 68
;
run;
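The six repeated assignments can also be written as a loop. The following is a minimal sketch of the same reshaping using an ARRAY and a DO loop; the dataset name set1_loop and the array name fats are chosen here for illustration. It should produce the same 24 observations (4 records times 6 replicates), although the variables may be stored in a different order.

```sas
data set1_loop;                      /* hypothetical dataset name */
  input DONUT FAT1-FAT6;
  array fats{6} FAT1-FAT6;           /* group the six wide columns */
  do REP = 1 to 6;
    FAT = fats{REP};                 /* pick the value for this replicate */
    output;                          /* one observation per replicate */
  end;
  drop FAT1-FAT6;
  cards;
1 64 72 68 77 56 95
2 78 91 97 82 85 77
3 75 93 78 71 63 76
4 55 66 49 64 70 68
;
run;
```

The loop form is easier to maintain when the number of repeated columns changes, since only the array bounds need to be edited.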
Merge multiple datasets into one dataset
When the data needed for an analysis are spread across several SAS datasets, you cannot run the analysis until you combine them into one dataset. There are two general ways to combine datasets. One stacks the datasets on top of each other; the other merges them by matching on a common variable.
a) Stack two datasets that have the same variables (SET matches variables by name).
DATA datasetname;
SET dataset1 dataset2;
RUN;
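As a concrete illustration of stacking, here is a small self-contained sketch. The datasets batch1, batch2, and allbatches and their values are made up for this example.

```sas
data batch1;                         /* hypothetical first dataset */
  input id score;
  cards;
1 10
2 20
;
run;

data batch2;                         /* hypothetical second dataset */
  input id score;
  cards;
3 30
4 40
;
run;

data allbatches;
  set batch1 batch2;                 /* observations of batch1 followed by those of batch2 */
run;

proc print data=allbatches;
run;
```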
b) Merge two datasets by a common index in a DATA step.
Here is a simple example. Two datasets (dads2 and faminc2) that share a common index variable (famid) can be merged with the following DATA step. Both datasets must already be sorted by famid.
DATA dadfam;
  MERGE dads2 faminc2;
  BY famid;
RUN;
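A MERGE with a BY statement expects each input dataset to be sorted (or indexed) by the BY variable. The sketch below assumes that dads2 and faminc2 are simply sorted copies of unsorted datasets named dads and faminc; that relationship is an assumption made here for illustration.

```sas
proc sort data=dads out=dads2;       /* assumed: dads is the unsorted source */
  by famid;
run;

proc sort data=faminc out=faminc2;   /* assumed: faminc is the unsorted source */
  by famid;
run;
```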
However, the two datasets may contain records that do not match. In that case, you can use the IN= dataset option, as in the following DATA step, to record which dataset each observation came from.
DATA merge121;
  MERGE dads(IN=fromdadx) faminc(IN=fromfamx);
  BY famid;
  /* IN= variables are temporary, so copy them to keep them in the output */
  fromdad = fromdadx;
  fromfam = fromfamx;
RUN;
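The IN= flags can also be used to route observations directly. The following sketch, which assumes dads and faminc are already sorted by famid, splits the merged result into matched and unmatched records. The output dataset names matched, dadonly, and famonly are hypothetical.

```sas
data matched dadonly famonly;        /* hypothetical output dataset names */
  merge dads(in=fromdadx) faminc(in=fromfamx);
  by famid;
  if fromdadx and fromfamx then output matched;   /* famid found in both */
  else if fromdadx then output dadonly;           /* famid only in dads */
  else output famonly;                            /* famid only in faminc */
run;
```

Checking the unmatched datasets before analysis is a simple way to spot missing or mistyped index values.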
Many good web resources discuss data management in SAS. Here are some links.
