TechTalk

Rectangularize Public Use Microdata Sample CD-ROM data file

Some SAS users have experienced problems with the SAS program on the CD-ROM. This may be caused by an incompatibility with a particular version or installation of SAS.

This program rectangularizes the PUMS file: it combines each housing unit record and all of its associated person records into a single record type by adding the housing information to each person record. Vacant housing units have no associated person records, so each one is written as a single record with the person variables set to missing. Both record types must initially be in the same file, sorted by Serial Number and then by Record Type. The CD-ROM files are already in this format.

It is not necessary to include every variable.

The program below can be used as a template. Different versions can be created by using only a few variables of interest at one time. The result is a more manageable SAS data set. This can be done by making a copy of the program and removing references to other variables.
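
As an illustration of such a reduced version, the sketch below keeps only a few housing and person variables. The column positions and variable names are taken from the full program; the output data set name pums.small and the choice of variables are assumptions made for the example, and the file paths should be adjusted to your own installation.

```sas
/* Reduced sketch: reads only a handful of variables from each record type.
   The data set name pums.small and the variable selection are illustrative. */
libname pums 'c:\sas';
data pums.small;
infile 'd:\pumsbxde.txt' lrecl=231;
input @1 RecType $ 1. @;          /* trailing @ holds the line for the next input */
if RecType = 'H' then
        do; *(Household record);
        input    @2 SerialNo $ 7.
                @11 State    $ 2.
                @13 PUMA     $ 5.
                @29 HousWgt    4.
                @33 Persons    2.
                @44 Tenure   $ 1.;
        if Persons=0 then output; *Vacant: P-rec vars will be missing;
        end;
else
        do; *(Person record);
        input   @11 Sex      $ 1.
                @15 Age        2.
                @18 PWgt1      4.;
        output;
        end;
/* Retain only the H-record variables so they carry onto each P-record;
   the P-record variables are reset to missing on every iteration. */
retain SerialNo State PUMA HousWgt Persons Tenure;
run;
```

Because the retained list names only the housing variables, a vacant unit is still written with Sex, Age, and PWgt1 missing, just as in the full program.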

SAS code

Download SAS program [SAS, 10KB] for 1990 file

/* creates a comprehensive SAS data set in a rectangular ("flat") format: one
 * record for each person in the sample, containing every variable from both
 * record types (even the allocation flags), plus a record for each vacant
 * housing unit, with the Person-record variables set to "missing". */
libname pums 'c:\sas';
data pums.pumsbxde;
infile 'd:\pumsbxde.txt' lrecl=231;
input @1 RecType $ 1. @;
if RecType = 'H' then 
        do; *(Household record);
        input    @2 SerialNo $ 7.
                 @9 Sample   $ 1.
                @10 Division $ 1.
                @11 State    $ 2.
                @13 PUMA     $ 5.
                @18 AreaType $ 2.
                @20 MSAPMSA  $ 4.
                @24 PSA      $ 3.
                @27 SubSampl $ 2.
                @29 HousWgt    4.
                @33 Persons    2.
                @35 GQType   $ 1.
                @39 Units1   $ 2.
                @41 HUSFlag  $ 1.
                @42 PDSFlag  $ 1.
                @43 Rooms    $ 1.
                @44 Tenure   $ 1.
                @45 Acreage  $ 1.
                @46 CommUse  $ 1.
                @47 Value    $ 2.
                @49 Rent1    $ 2.
                @51 Meals    $ 1.
                @52 Vacancy1 $ 1.
                @53 Vacancy2 $ 1.
                @54 Vacancy3 $ 1.
                @55 Vacancy4 $ 1.
                @56 YrMoved  $ 1.
                @57 Bedrooms $ 1.
                @58 Plumbing $ 1.
                @59 Kitchen  $ 1.
                @60 Telephon $ 1.
                @61 Vehicles $ 1.
                @62 Fuel     $ 1.
                @63 Water    $ 1.
                @64 Sewage   $ 1.
                @65 YrBuilt  $ 1.
                @66 Condo    $ 1.
                @67 OneAcre  $ 1.
                @68 AgSales  $ 1.
                @69 ElecCost   4.
                @73 GasCost    4.
                @77 WatrCost   4.
                @81 FuelCost   4.
                @85 PropTax  $ 2.
                @90 PropIns    4.
                @94 Mortgage $ 1.
                @95 Mortgag3   5.
                @100 TaxIncl  $ 1.
                @101 InsIncl  $ 1.
                @102 Mortgag2 $ 1.
                @103 MortAmt2   5.
                @108 CondoFee   4.
                @112 Moblhome   4.
                @116 RFarm    $ 1.
                @117 RGRent     4.
                @121 RGRAPI   $ 2.
                @124 ROwnrCst   4.
                @129 RNSMOCPI   3.
                @132 RRentUnt $ 1.
                @133 RValUnt  $ 1.
                @134 RFamInc    7.
                @141 RHhInc     7.
                @148 RWrkr89  $ 1.
                @149 RHhLang  $ 1.
                @150 RLingIso $ 1.
                @151 RHhFamTp $ 2.
                @153 RNatAdpt   2.
                @155 RStpChld   2.
                @157 RFamPers   2.
                @159 RRelChld   2.
                @161 RNonRel  $ 1.
                @162 R18Undr  $ 1.
                @163 R60Over  $ 1.
                @164 R65Over  $ 1.
                @165 RSubFam  $ 1.
                @166 AUnits1  $ 1.
                @167 ARooms   $ 1.
                @168 ATenure  $ 1.
                @169 AAcres1  $ 1.
                @170 ACommuse $ 1.
                @171 AValue   $ 1.
                @172 ARent1   $ 1.
                @173 AMeals   $ 1.
                @174 AVacncy2 $ 1.
                @175 AVacncy3 $ 1.
                @176 AVacncy4 $ 1.
                @177 AYrMoved $ 1.
                @178 ABedRoom $ 1.
                @179 APlumbng $ 1.
                @180 AKitchen $ 1.
                @181 APhone   $ 1.
                @182 AVehicle $ 1.
                @183 AFuel    $ 1.
                @184 AWater   $ 1.
                @185 ASewer   $ 1.
                @186 AYrBuilt $ 1.
                @187 ACondo   $ 1.
                @188 AOneAcre $ 1.
                @189 AAgSales $ 1.
                @190 AElecCst $ 1.
                @191 AGasCst  $ 1.
                @192 AWatrCst $ 1.
                @193 AFuelCst $ 1.
                @194 ATaxAmt  $ 1.
                @195 AInsAmt  $ 1.
                @196 AMortg   $ 1.
                @197 AMortg3  $ 1.
                @198 ATaxIncl $ 1.
                @199 AInsIncl $ 1.
                @200 AMortg2  $ 1.
                @201 AMrtAmt2 $ 1.
                @202 ACndoFee $ 1.
                @203 AMoblHme $ 1.;
                if Persons=0 then output; *Vacant: P-rec vars will be missing;
        end;
else
        do; *(Person record);
        input    @2 SerialNo $ 7.
                 @9 Relat1   $ 2.
                @11 Sex      $ 1.
                @12 Race     $ 3.
                @15 Age        2.
                @17 Marital  $ 1.
                @18 PWgt1      4.
                @26 REmplPar $ 3.
                @29 RPOB     $ 2.
                @31 RSpouse  $ 1.
                @32 ROwnChld $ 1.
                @33 RAgeChld $ 1.
                @34 RRelChl2 $ 1.
                @35 Relat2   $ 1.
                @36 SubFam2  $ 1.
                @37 SubFam1  $ 1.
                @38 Hispanic $ 3.
                @41 Poverty  $ 3.
                @44 POB      $ 3.
                @47 Citizen  $ 1.
                @48 Immigr   $ 2.
                @50 School   $ 1.
                @51 YearSch  $ 2.
                @53 Ancstry1 $ 3.
                @56 Ancstry2 $ 3.
                @59 Mobility $ 1.
                @60 MigrStat $ 2.
                @62 MigPUMA  $ 5.
                @67 Lang1    $ 1.
                @68 Lang2    $ 3.
                @71 English  $ 1.
                @72 Military $ 1.
                @73 RVetServ $ 2.
                @75 Sept80   $ 1.
                @76 May75880 $ 1.
                @77 Vietnam  $ 1.
                @78 Feb55    $ 1.
                @79 Korean   $ 1.
                @80 WWII     $ 1.
                @82 OthrServ $ 1.
                @83 YrsServ  $ 2.
                @85 Disabl1  $ 1.
                @86 Disabl2  $ 1.
                @87 MobilLim $ 1.
                @88 PersCare $ 1.
                @89 Fertil   $ 1.
                @91 RLabor   $ 1.
                @92 WorkLwk  $ 1.
                @93 Hours    $ 2.
                @95 POWState $ 2.
                @97 POWPUMA  $ 5.
                @102 Means    $ 2.
                @104 Riders   $ 1.
                @105 Depart   $ 4.
                @109 TravTime $ 2.
                @111 TmpAbsnt $ 1.
                @112 Looking  $ 1.
                @113 Avail    $ 1.
                @114 YearWrk  $ 1.
                @115 Industry $ 3.
                @118 Occup    $ 3.
                @121 Class    $ 1.
                @122 Work89   $ 1.
                @123 Week89   $ 2.
                @125 Hour89   $ 2.
                @127 REarning   6.
                @133 RPIncome   6.
                @139 Income1    6.
                @145 Income2    6.
                @151 Income3    6.
                @157 Income4    6.
                @163 Income5    5.
                @168 Income6    5.
                @173 Income7    5.
                @178 Income8    5.
                @183 AAugment $ 1.
                @184 ARelat1  $ 1.
                @185 ASex     $ 1.
                @186 ARace    $ 1.
                @187 AAge     $ 1.
                @188 AMarital $ 1.
                @189 AHispan  $ 1.
                @190 ABirthPl $ 1.
                @191 ACitizen $ 1.
                @192 AImmigr  $ 1.
                @193 ASchool  $ 1.
                @194 AYearSch $ 1.
                @195 AAncstr1 $ 1.
                @196 AAncstr2 $ 1.
                @197 AMoblty  $ 1.
                @198 AMigStat $ 1.
                @199 ALang1   $ 1.
                @200 ALang2   $ 1.
                @201 AEnglish $ 1.
                @202 AVetS1   $ 1.
                @203 AServPer $ 1.
                @204 AYrsServ $ 1.
                @205 ADisAbl1 $ 1.
                @206 ADisAbl2 $ 1.
                @207 AMoblLim $ 1.
                @208 APerCare $ 1.
                @209 AFERTIL  $ 1.
                @210 ALabor   $ 1.
                @211 AHours   $ 1.
                @212 APOWSt   $ 1.
                @213 AMeans   $ 1.
                @214 ARiders  $ 1.
                @215 ADepart  $ 1.
                @216 ATranTme $ 1.
                @217 ALstWrk  $ 1.
                @218 AIndustr $ 1.
                @219 AOccup   $ 1.
                @220 AClass   $ 1.
                @221 AWork89  $ 1.
                @222 AWks89   $ 1.
                @223 AHour89  $ 1.
                @224 AIncome1 $ 1.
                @225 AIncome2 $ 1.
                @226 AIncome3 $ 1.
                @227 AIncome4 $ 1.
                @228 AIncome5 $ 1.
                @229 AIncome6 $ 1.
                @230 AIncome7 $ 1.
                @231 AIncome8 $ 1.;
                output;
        end;
/* Need Retain statement for H-record variables only: don't want them set to
 * "missing" each time you read a P-record, but do want P-rec vars to be
 * "missing" in records for vacant housing units */
retain SerialNo--AMoblHme;
* You may want to add a Keep or Drop statement to eliminate unneeded variables
  (or just delete them from the above lists) and a Length statement (default=3
  except for income, rent, cost vars, etc.) to make the saved data set smaller;
run;
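
The closing comment in the program suggests a Keep or Drop statement to trim the saved data set. A minimal sketch of one way to do that after the fact, and to check the result, is shown here; the data set name pums.subset and the variable list are assumptions chosen for illustration.

```sas
/* Sketch: copy a subset of variables from the rectangular data set.
   pums.subset and the variables kept are illustrative choices. */
data pums.subset;
set pums.pumsbxde (keep=SerialNo State PUMA Persons Sex Age PWgt1 RPIncome);
run;

/* Check the full data set: variable attributes, then a count by record type.
   RecType is 'H' only on the rows written for vacant housing units. */
proc contents data=pums.pumsbxde;
run;
proc freq data=pums.pumsbxde;
tables RecType;
run;
```

The KEEP= data set option on the SET statement drops the other variables as they are read, so the copy step does not carry the full record through the DATA step.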

Source: U.S. Census Bureau | Administrative and Customer Services Division | (301) 763-7710 |  Last Revised: February 13, 2013