Sample Design

SIPP uses a complex sample design, which has important implications for the estimation of standard errors. Because the SIPP design is not a simple random sample, the standard errors reported by most off-the-shelf statistical software will underestimate the true standard errors of SIPP estimates (see Chapter 7 of the SIPP Users' Guide). A detailed description of the SIPP sample design and standard error calculations can be found in the third edition of the SIPP Quality Profile (1998).
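
The effect of ignoring the design can be illustrated with a small sketch. The example below is not the Census Bureau's variance estimation procedure; it is a toy comparison, on hypothetical data, between the simple-random-sample formula for the standard error of a mean and a formula that treats equal-sized clusters as the sampled units. The function names and all values are assumptions made for illustration.

```python
import math
from statistics import mean, variance

def srs_standard_error(values):
    """Standard error of the mean, assuming a simple random sample."""
    return math.sqrt(variance(values) / len(values))

def cluster_standard_error(clusters):
    """Standard error of the overall mean when whole clusters are sampled.

    Assumes equal-sized clusters, so the overall mean equals the mean of
    the cluster means and its variance is estimated from the spread of
    those cluster means.
    """
    cluster_means = [mean(c) for c in clusters]
    return math.sqrt(variance(cluster_means) / len(cluster_means))

# Hypothetical data: 4 clusters of 5 households each. Households in the
# same cluster resemble each other, as neighbors often do.
clusters = [
    [10, 11, 10, 12, 11],
    [20, 21, 19, 20, 22],
    [30, 29, 31, 30, 32],
    [40, 41, 39, 42, 40],
]
all_values = [v for c in clusters for v in c]

naive_se = srs_standard_error(all_values)
clustered_se = cluster_standard_error(clusters)

print("SRS formula:     ", round(naive_se, 2))
print("Cluster-aware:   ", round(clustered_se, 2))
```

In this toy data set the cluster-aware standard error is larger than the one from the simple-random-sample formula, because the 20 households carry less independent information than 20 households drawn independently would.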

The Census Bureau employs a two-stage sample design to select the SIPP sample. The two stages are (1) selection of primary sampling units (PSUs) and (2) selection of address units within sample PSUs.


The frame for the SIPP is the Master Address File (MAF), which is maintained by the U.S. Census Bureau and is the source of addresses for the American Community Survey, other demographic surveys, and the decennial census. The MAF is updated using the U.S. Postal Service’s Delivery Sequence File and various automated, clerical, and field operations.

Selection of Primary Sampling Units (PSUs)

PSUs are formed from one or more contiguous counties. PSUs with larger populations are identified as self-representing (SR) PSUs, while the remaining PSUs are identified as non-self-representing (NSR). SR PSUs are included in the SIPP sample with certainty, while NSR PSUs are stratified and selected with probability proportional to their size. During stratification, NSR PSUs within the same state are grouped together to form strata. During PSU selection, two NSR PSUs are selected from each stratum.
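
As a rough illustration of selecting two NSR PSUs from a stratum with probability proportional to size, the sketch below uses systematic selection along the cumulative size totals. The text above does not state which selection algorithm the Census Bureau uses, so this approach is an assumption, and the PSU names, sizes, and the function pps_systematic are hypothetical.

```python
import random

def pps_systematic(psus, n, rng):
    """Select n PSUs with probability proportional to size.

    psus: list of (name, size) pairs for one stratum.
    Assumes no single PSU is larger than total_size / n; in a real design
    such large PSUs would be self-representing and handled separately.
    """
    total = sum(size for _, size in psus)
    interval = total / n
    start = rng.random() * interval          # random start in [0, interval)
    points = [start + i * interval for i in range(n)]

    selected = []
    cumulative = 0.0
    next_point = 0
    for name, size in psus:
        cumulative += size
        # A PSU is selected when a selection point falls inside its range
        # of the cumulative size total.
        while next_point < n and points[next_point] < cumulative:
            selected.append(name)
            next_point += 1
    return selected

# Hypothetical stratum of four NSR PSUs with population sizes.
stratum = [
    ("PSU A", 120_000),
    ("PSU B", 80_000),
    ("PSU C", 60_000),
    ("PSU D", 40_000),
]

chosen = pps_systematic(stratum, n=2, rng=random.Random(2024))
print(chosen)
```

Under this scheme a PSU's chance of selection is its size divided by the sampling interval, so PSU A (120,000 of 300,000) is three times as likely to be chosen as PSU D (40,000).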

Selection of Addresses in Sample PSUs

The universe of addresses within each sample PSU is divided into two strata: one with a higher concentration of low-income households and one with a lower concentration. Addresses are sorted by geographic and demographic variables, and a systematic selection of units is taken from each stratum. A higher sampling rate is used in the stratum with the higher concentration of low-income households, resulting in an oversample of low-income households.
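
The two-stratum systematic selection can be sketched as follows. The sampling intervals (1 in 50 and 1 in 100), the stratum sizes, and the address identifiers are hypothetical values chosen for illustration, not actual SIPP rates, and the sort step is represented only by the lists already being in order.

```python
import random

def systematic_sample(sorted_units, interval, rng):
    """Take every interval-th unit from a sorted list, from a random start."""
    start = rng.randrange(interval)          # random start in 0..interval-1
    return sorted_units[start::interval]

rng = random.Random(7)

# Hypothetical address lists, assumed already sorted by geographic and
# demographic variables within each stratum.
high_concentration = [f"H{i:04d}" for i in range(1000)]
low_concentration = [f"L{i:04d}" for i in range(1000)]

# Higher sampling rate (smaller interval) in the stratum with the higher
# concentration of low-income households.
high_sample = systematic_sample(high_concentration, interval=50, rng=rng)
low_sample = systematic_sample(low_concentration, interval=100, rng=rng)

# A base weight equal to the inverse of the selection probability
# compensates for the unequal sampling rates at the estimation stage.
high_weight = 50
low_weight = 100

print(len(high_sample), len(low_sample))
print(len(high_sample) * high_weight, len(low_sample) * low_weight)
```

Although the two hypothetical strata are the same size, the sample contains twice as many addresses from the high-concentration stratum. Weighting each sampled address by its sampling interval restores each stratum's original size in weighted totals.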
