Public Use Microdata Sample (PUMS) files are now available from the American Community Survey (ACS) as well as from the traditional decennial censuses of population and housing (shown below). The American Community Survey is a nationwide survey that replaced the decennial census long form and is a critical element in the Census Bureau's reengineered 2010 census plan. Questions about ACS PUMS should be directed to the Data Analysis and User Education Branch (301-763-9801) in the American Community Survey Office.
ASCII text data file processing
The data files do not contain a header record (a first record or row consisting of field names), nor do they contain field delimiters (commas, tabs, etc.). They therefore cannot be parsed automatically by software; however, SAS™ or SPSS™ code for reading them is available from many state data centers (SDCs). If you do not already have one of these programs or a similar program, you may want to use DataFerret (see below) or our software enhanced disc in place of the original ASCII text data files.
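Because the records have neither a header nor delimiters, every field has to be cut out by column position. The Python sketch below illustrates the idea only. The layout shown (a 1-character record type followed by a 7-character serial number) and the names `EXAMPLE_LAYOUT` and `slice_fields` are assumptions made for this example; take the real start columns and widths from the file documentation.

```python
from typing import Dict, Tuple

# Hypothetical layout for illustration: field name -> (1-based start column, width).
# Real positions must be taken from the PUMS file documentation.
EXAMPLE_LAYOUT: Dict[str, Tuple[int, int]] = {
    "RECTYPE": (1, 1),
    "SERIALNO": (2, 7),
}


def slice_fields(record: str, layout: Dict[str, Tuple[int, int]]) -> Dict[str, str]:
    """Cut fixed-width fields out of one record using 1-based column positions."""
    return {
        name: record[start - 1 : start - 1 + width]
        for name, (start, width) in layout.items()
    }


# A made-up record: type "H", serial number "0001234", then blank filler.
fields = slice_fields("H0001234" + " " * 10, EXAMPLE_LAYOUT)
print(fields)
```

Values are returned as strings so that leading zeros in codes are preserved; convert to numbers only where a field is known to be numeric.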
Each data file contains two record types: housing unit and person. The first character of each record identifies its type. Each housing unit record is immediately followed by all associated person records (there will be zero in the case of a vacant housing unit). The variable SERIALNO (Housing/Group Quarters (GQ) Unit Serial Number) is the only explicit link between the housing unit and person records. SERIALNO is unique within state only. The housing unit and person records are individually weighted (see the hweight and pweight fields respectively) and contain 266 and 314 characters of data respectively. All of the geographic identifier fields (region, division, state, metropolitan statistical area (MSA), Public Use Microdata Area (PUMA) etc.) are contained in the housing unit records only.
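The record ordering described above can be used to attach person records to their housing unit while reading the file. The following Python sketch is an illustration, not an official reader: it assumes the record type character is `H` for housing unit records and `P` for person records, and it relies on file order alone rather than comparing SERIALNO values. The function name `group_records` and the sample records are hypothetical.

```python
from typing import Iterable, Iterator, List, Optional, Tuple


def group_records(lines: Iterable[str]) -> Iterator[Tuple[str, List[str]]]:
    """Yield (housing_record, person_records) pairs in file order.

    Assumes 'H' marks a housing unit record and 'P' a person record.
    """
    housing: Optional[str] = None
    persons: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")  # drop the line ending only, keep blank padding
        if not line:
            continue  # skip empty lines
        rectype = line[0]
        if rectype == "H":
            if housing is not None:
                yield housing, persons  # finish the previous housing unit
            housing, persons = line, []
        elif rectype == "P":
            if housing is None:
                raise ValueError("person record found before any housing unit record")
            persons.append(line)
        else:
            raise ValueError(f"unexpected record type {rectype!r}")
    if housing is not None:
        yield housing, persons  # emit the last housing unit


# Made-up records: unit 2 is vacant and therefore has no person records.
sample = ["H0000001", "P0000001a", "P0000001b", "H0000002", "H0000003", "P0000003a"]
units = list(group_records(sample))
for housing, persons in units:
    print(housing, len(persons))
```

Since the geographic identifiers appear only on the housing unit record, keeping each housing record together with its persons is what lets person-level tabulations be made by area.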
You can extract subsets (selected fields, variables, areas etc.) of the ASCII text data (1990 and 2000 PUMS 1% and 5% samples available) in several output file formats by using the DataFerret data extraction tool. DataFerret is a menu-driven program you download, install, and run that extracts a subset of the PUMS 1% microdata from the internet.
ASCII text data file processing
Processing is the same as for 2000, but the following differences may require minor adjustments. There is no state code in the data files, so you will need to add a state identifier if you are combining multiple state data files. Also, the shorter record type is padded with blanks to equal the length of the longer record type.
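One way to add the state identifier when combining files is to carry it alongside each record instead of altering the record itself, which leaves the documented column positions unchanged. The Python sketch below assumes the caller supplies the state label for each file (for example, derived from the file name); the function name `combine_with_state` and the sample records are hypothetical.

```python
from typing import Dict, Iterable, Iterator, Tuple


def combine_with_state(files: Dict[str, Iterable[str]]) -> Iterator[Tuple[str, str]]:
    """Yield (state, record) pairs from several single-state files.

    Only the line ending is removed, so the blank padding that makes both
    record types the same length is preserved.
    """
    for state, lines in files.items():
        for raw in lines:
            line = raw.rstrip("\r\n")
            if line:
                yield state, line


# Made-up records: the same serial number occurs in both states.
combined = list(
    combine_with_state(
        {
            "TX": ["H0000001  \n", "P0000001a \n"],
            "CA": ["H0000001  \n"],
        }
    )
)
for state, record in combined:
    print(state, repr(record))
```

Pairing the state with each record keeps housing units from different states distinguishable even when their serial numbers coincide.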
ASCII text data files - 5% (A sample), 1% (B sample) (see Document folder for code lists)
Code lists - 5% PUMA [ZIP, 355 KB], 1% PUMA [ZIP, 377 KB]
Software - Integrated Microcomputer Processing System (IMPS)
Revised Texas 5% Data file [EXE, 452 KB]: This file contains the 13,853 records missing from the end of PUMSAXTX.TXT on the February 1995 re-release of the 1990 5% PUMS. These records are included on the subsequent release in December 1995.
Some users reported difficulty with the SAS program on the CD-ROM. This may be caused by an incompatibility with a particular version or installation of SAS. Rectangularizing the data file may help. More information.
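Rectangularizing is taken here to mean converting the hierarchical file into one with a single record per person, where each person record is preceded by a copy of its housing unit record. The Python sketch below shows one way this could be done. It assumes the first character of each record is `H` for a housing unit and `P` for a person, and the function name `rectangularize` and the sample records are hypothetical.

```python
from typing import Iterable, Iterator, Optional


def rectangularize(lines: Iterable[str]) -> Iterator[str]:
    """Yield one combined record per person: housing record + person record.

    Housing units with no person records (vacant units) produce no output.
    """
    housing: Optional[str] = None
    for raw in lines:
        line = raw.rstrip("\r\n")  # keep any blank padding inside the record
        if not line:
            continue
        if line[0] == "H":
            housing = line  # remember the current housing unit
        elif line[0] == "P":
            if housing is None:
                raise ValueError("person record found before any housing unit record")
            yield housing + line
        else:
            raise ValueError(f"unexpected record type {line[0]!r}")


# Made-up 4-character records; unit 2 is vacant.
rows = list(rectangularize(["H1  ", "P1a ", "P1b ", "H2  ", "H3  ", "P3a "]))
for row in rows:
    print(repr(row))
```

Every output record then has the same layout, which is generally easier for statistical packages to read than a file with mixed record types. Note that vacant housing units are dropped in this form, so housing-level tabulations should still be made from the original file.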
Note: Records were not individually weighted until 1990.
File Documentation [11 MB]
Data from earlier (pre-1980) PUMS files are available from the National Archives and Records Administration. Copies of the documentation for these files should be available for browsing at many university libraries.