The SIPP sample universe is the civilian, noninstitutionalized population of the United States. The survey samples housing units from the current Master Address File (MAF), which is maintained by the U.S. Census Bureau and is the source of addresses for the American Community Survey, other demographic surveys, and the decennial census. The MAF is updated using the U.S. Postal Service’s Delivery Sequence File and various automated, clerical, and field operations.
SIPP uses a complex sample design rather than a simple random sample to determine which households are interviewed. Specifically, it uses a two-stage design: (1) the selection of primary sampling units (PSUs), and (2) the selection of address units within sample PSUs. This design has important implications for the estimation of standard errors. Most off-the-shelf statistical software assumes a simple random sample by default, so the standard errors it reports will generally underestimate the true standard errors of SIPP estimates.
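To illustrate why standard errors that assume simple random sampling tend to be too small, the following minimal sketch applies the Kish approximation for the design effect of a clustered sample, deff = 1 + (m - 1) * rho. All inputs (the proportion, sample size, cluster size, and intraclass correlation) are hypothetical and are not SIPP values. The approximation covers clustering only and is not the variance estimation method for SIPP; see the SIPP Users' Guide for that.

```python
import math


def srs_standard_error(p: float, n: int) -> float:
    """Standard error of a proportion under simple random sampling."""
    return math.sqrt(p * (1.0 - p) / n)


def kish_design_effect(avg_cluster_size: float, icc: float) -> float:
    """Kish approximation for clustering: deff = 1 + (m - 1) * rho."""
    return 1.0 + (avg_cluster_size - 1.0) * icc


# Hypothetical inputs chosen for illustration; none are SIPP figures.
p_hat = 0.15   # estimated proportion
n = 10_000     # number of sampled units
m = 20         # assumed average number of sampled units per cluster
rho = 0.05     # assumed intraclass correlation within clusters

naive_se = srs_standard_error(p_hat, n)
deff = kish_design_effect(m, rho)
design_se = naive_se * math.sqrt(deff)  # SE inflated by the design effect

print(f"naive SE: {naive_se:.5f}, deff: {deff:.2f}, adjusted SE: {design_se:.5f}")
```

With these assumed inputs the design effect is 1.95, so the adjusted standard error is roughly 40 percent larger than the naive one.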
PSUs comprise one or more contiguous counties. A single county forms a PSU if its population is 7,500 or more; when the threshold is not met, adjacent counties are combined. PSUs with larger populations are identified as self-representing (SR), while the remaining PSUs are identified as non-self-representing (NSR). Generally, PSUs with 100,000 or more housing units are classified as SR. SR PSUs are in the SIPP sample with certainty, while NSR PSUs are stratified and selected with probability proportionate to size. During stratification, NSR PSUs are grouped according to their similarity on specified poverty measures. Because SIPP uses a state-based sample design, all strata are formed within state boundaries. During PSU selection, two NSR PSUs are selected from each stratum, each with probability proportionate to its size relative to the entire stratum to which it belongs.
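The first-stage logic described above can be sketched as follows. The text does not say which probability-proportionate-to-size procedure is used, so this sketch assumes systematic PPS selection, one common choice. The PSU names and size measures are hypothetical, and the 100,000 housing unit threshold, which the text describes as a general rule, is treated here as a hard cutoff.

```python
import random
from bisect import bisect_right
from itertools import accumulate

# "Generally" 100,000 per the text; treated as a fixed cutoff in this sketch.
SR_HOUSING_UNIT_THRESHOLD = 100_000


def is_self_representing(housing_units: int) -> bool:
    """SR PSUs enter the sample with certainty; the rest are NSR."""
    return housing_units >= SR_HOUSING_UNIT_THRESHOLD


def select_two_pps(stratum: dict[str, int], rng: random.Random) -> list[str]:
    """Select two PSUs from one stratum by systematic PPS (an assumed method).

    `stratum` maps a PSU name to its size measure.
    """
    names = list(stratum)
    sizes = [stratum[name] for name in names]
    interval = sum(sizes) / 2  # two selections per stratum
    if max(sizes) > interval:
        raise ValueError("a PSU larger than the interval could be selected twice")
    cumulative = list(accumulate(sizes))
    start = rng.random() * interval  # random start in [0, interval)
    selected = []
    for point in (start, start + interval):
        # PSU i covers the half-open range [cumulative[i-1], cumulative[i]).
        index = min(bisect_right(cumulative, point), len(names) - 1)
        selected.append(names[index])
    return selected


# Hypothetical NSR stratum: PSU name -> housing units.
nsr_stratum = {"PSU-A": 40_000, "PSU-B": 25_000, "PSU-C": 20_000, "PSU-D": 15_000}
picked = select_two_pps(nsr_stratum, random.Random(0))
print(picked)
```

Under this scheme each PSU's chance of selection is twice its share of the stratum total, so in the hypothetical stratum PSU-A (40 percent of the housing units) is selected with probability 0.8.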
The universe of addresses within each sample PSU is divided into two strata, one with a higher concentration of low-income households and the other with a lower concentration of low-income households. Addresses are sorted by geographic and demographic variables, and a systematic selection of units is taken from each stratum. A higher sampling rate is used in the stratum with the higher concentration of low-income households, which results in an oversample of low-income households.
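A minimal sketch of the second-stage selection is shown below: a systematic sample drawn from each of the two address strata, with a smaller sampling interval (a higher rate) in the stratum with the higher concentration of low-income households. The address identifiers, stratum sizes, and sampling intervals are hypothetical, and the address lists are assumed to be already sorted by the geographic and demographic variables.

```python
import random


def systematic_sample(units: list[str], interval: int, rng: random.Random) -> list[str]:
    """Take every `interval`-th unit from an already sorted list, from a random start."""
    start = rng.randrange(interval)  # random start in 0 .. interval - 1
    return units[start::interval]


rng = random.Random(42)

# Hypothetical sorted address frames for one sample PSU.
high_stratum = [f"H{i:04d}" for i in range(1000)]  # higher low-income concentration
low_stratum = [f"L{i:04d}" for i in range(3000)]   # lower low-income concentration

# Assumed intervals: 1 in 10 for the high stratum, 1 in 20 for the low stratum.
sample_high = systematic_sample(high_stratum, interval=10, rng=rng)
sample_low = systematic_sample(low_stratum, interval=20, rng=rng)

rate_high = len(sample_high) / len(high_stratum)
rate_low = len(sample_low) / len(low_stratum)
print(len(sample_high), len(sample_low), rate_high, rate_low)
```

Because the high-concentration stratum is sampled at twice the rate in this illustration, its addresses make up a larger share of the sample than of the frame, which is the oversampling effect described above.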
Please see the SIPP Users’ Guide specific to your year or panel of analysis for more information on SIPP’s sampling.