Processing SIPP Data
There are two phases to the processing of SIPP data. At the conclusion of each wave of
interviewing, the data collected during that wave are processed, creating the core wave and
topical module files. That is the first phase of processing. Then, at the conclusion of the final
wave of interviews, core data from all waves are linked and a new set of edit and imputation
procedures is applied to the resulting full panel file. That is the second phase of
processing.
Figure 4-1 illustrates the steps that generate the Census Bureau's internal core wave and full
panel files.
Note to Figure 4-1: Most Type Z records in the 1996 Panel were not handled in a separate process.
Phase 1 Summary
There are six steps in the first phase of SIPP data processing:
- As each wave of interviewing is completed, core data collected during the wave are edited
for internal consistency.
- Following data editing, the statistical matching and hot-deck procedures described later in
this chapter are used to impute missing data in the core wave file.
- A public use version of the core wave file is then created from the resulting internal core
wave file. The public use file is the same as the Census Bureau's internal file except that it
has certain information suppressed or topcoded to protect the confidentiality of survey
respondents (see sections on Topcoding and Suppression of Geographic Information, at the
end of this chapter).
- On a separate production track from the core data, data from the topical module
administered with the wave are edited for internal consistency. The extent of data editing
varies across the topical modules, and some topical modules receive almost no editing.
- Next, hot-deck procedures are used to impute missing data in the topical module. The extent
of imputation varies across the topical modules; some topical modules have no missing data
imputed.
- A public use version of the topical module file is created from the resulting internal file. As
with the public use core wave files, the public use topical module files have certain
information suppressed to protect the confidentiality of survey respondents.
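
The hot-deck steps above can be illustrated with a minimal sketch. It assumes a simple random within-cell hot deck: a missing value is replaced by a reported value from a donor record that shares the same characteristics. The procedure the Census Bureau actually applies to SIPP is not specified in this section and is more elaborate. The function name, the field names (`age_group`, `sex`, `earnings`), and the `_imputed` flag are hypothetical and chosen only for illustration.

```python
import random
from collections import defaultdict


def hot_deck_impute(records, field, cell_keys, seed=0):
    """Fill missing `field` values from a random donor in the same cell.

    records   -- list of dicts, one per person (hypothetical layout)
    field     -- name of the item to impute; None marks a missing value
    cell_keys -- names of the fields that define an imputation cell
    """
    rng = random.Random(seed)

    # Collect reported (non-missing) values by imputation cell.
    donors = defaultdict(list)
    for rec in records:
        if rec.get(field) is not None:
            cell = tuple(rec[k] for k in cell_keys)
            donors[cell].append(rec[field])

    imputed = []
    for rec in records:
        new = dict(rec)  # leave the input records untouched
        new[field + "_imputed"] = False
        if new.get(field) is None:
            cell = tuple(new[k] for k in cell_keys)
            pool = donors.get(cell)
            if pool:  # a cell with no donors stays missing
                new[field] = rng.choice(pool)
                new[field + "_imputed"] = True
        imputed.append(new)
    return imputed


people = [
    {"id": 1, "age_group": "25-34", "sex": "F", "earnings": 2100},
    {"id": 2, "age_group": "25-34", "sex": "F", "earnings": None},
    {"id": 3, "age_group": "65+", "sex": "M", "earnings": None},
]
result = hot_deck_impute(people, "earnings", ["age_group", "sex"])
print(result[1]["earnings"], result[1]["earnings_imputed"])  # 2100 True
print(result[2]["earnings"], result[2]["earnings_imputed"])  # None False
```

Carrying an imputation flag alongside each value lets an analyst separate reported from imputed data later, which matters because both kinds of value end up in the same file.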
These steps are repeated at the conclusion of each wave of interviews. Prior to the 1996 Panel,
each wave was processed independently of other waves of data. Thus, when multiple core wave
files from those panels are linked, apparent changes in a respondent's status could be due to
different applications of data edits and imputations to the files being combined (file linkage
is the subject of Chapter 13 of the SIPP Users' Guide).
With the 1996 Panel, the hot-deck procedure was redesigned to rely on historical information
reported in prior waves. In addition, other forms of longitudinal imputation, such as carryover
methods, were adapted.
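
As a rough illustration of what a carryover method does, the sketch below fills a missing value with the most recent value the same person reported in an earlier wave. This is a generic last-value-carried-forward rule written under that assumption; the text does not describe the exact carryover rules used for the 1996 Panel, and the function name and status labels are hypothetical.

```python
def carry_forward(values_by_wave):
    """Carry the last reported value forward into later missing waves.

    values_by_wave -- one person's values ordered by wave; None = missing
    Returns (filled_values, imputed_flags).
    """
    filled, flags = [], []
    last_reported = None
    for value in values_by_wave:
        if value is None and last_reported is not None:
            # Missing now, but an earlier wave has a report: carry it over.
            filled.append(last_reported)
            flags.append(True)
        else:
            # Either reported, or missing with nothing earlier to carry.
            filled.append(value)
            flags.append(False)
            if value is not None:
                last_reported = value
    return filled, flags


status = [None, "employed", None, None, "unemployed"]
filled, flags = carry_forward(status)
print(filled)  # [None, 'employed', 'employed', 'employed', 'unemployed']
print(flags)   # [False, False, True, True, False]
```

Unlike a cross-sectional hot deck, a rule of this kind cannot create a change in status between waves that the respondent never reported, which is the problem described above for independently processed waves.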
Phase 2 Summary
At the conclusion of the panel, the Census Bureau creates a full panel file containing core data
from all waves. There are four steps to this process.
- Core data from all waves are linked. Those data have already been subjected to the Phase 1
edit and imputation procedures.
- A series of longitudinal edits is applied to the full panel file. Unlike the core wave edit
procedures, these edits are designed to create longitudinally consistent records for each
person. Both reported values and values that were imputed during the first phase of
processing are subject to change. Thus, the data in a full panel file may differ from the data in
the core wave files from which the full panel file was constructed.
- A missing wave imputation procedure is then applied. Data are imputed when a sample
member was absent for one or two consecutive waves but was present for the two adjacent
waves. Data for the missing wave(s) are interpolated on the basis of information from the
fourth month of the prior wave and the first month of the subsequent wave. The missing
wave imputation procedure was introduced with the 1991 Panel. Earlier panels were not
subjected to this procedure.
- A public use version of the full panel file is created from the resulting internal file. The
public use file has certain information suppressed to protect the confidentiality of survey
respondents.
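
The eligibility rule for missing wave imputation can be sketched as follows. The first function finds runs of one or two consecutive missing waves that have a wave with data on each side, as described above. The second shows one possible way to interpolate a numeric monthly amount between the fourth month of the prior wave and the first month of the subsequent wave. Linear interpolation is an assumption made here for illustration, since the text does not state the interpolation formula, and all function and parameter names are hypothetical.

```python
def find_imputable_gaps(present, max_gap=2):
    """Return (start, end) wave-index ranges eligible for imputation.

    present -- list of booleans, one per wave, True if the person was present
    A gap qualifies if it spans at most `max_gap` consecutive waves and is
    bounded by a present wave on both sides.
    """
    gaps = []
    i, n = 0, len(present)
    while i < n:
        if present[i]:
            i += 1
            continue
        start = i
        while i < n and not present[i]:
            i += 1
        end = i - 1
        bounded = start > 0 and i < n  # present wave before and after
        if bounded and (end - start + 1) <= max_gap:
            gaps.append((start, end))
    return gaps


def interpolate_months(before, after, n_missing_waves, months_per_wave=4):
    """Linearly interpolate monthly values across the missing wave(s).

    before -- value in month 4 of the prior wave
    after  -- value in month 1 of the subsequent wave
    """
    n_months = n_missing_waves * months_per_wave
    step = (after - before) / (n_months + 1)
    return [before + step * (k + 1) for k in range(n_months)]


# Waves 2 and 4-5 are fillable; the three-wave gap and the trailing gap are not.
present = [True, False, True, False, False, True,
           False, False, False, True, False]
print(find_imputable_gaps(present))       # [(1, 1), (3, 4)]
print(interpolate_months(1000, 1500, 1))  # [1100.0, 1200.0, 1300.0, 1400.0]
```

Indices in the example are zero-based, so `(1, 1)` refers to the second wave. A person who leaves the sample and never returns has no subsequent wave to anchor the interpolation and so is not covered by this rule.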
The balance of Chapter 4 of the SIPP Users' Guide describes in greater detail the full
sequence of data edit and imputation procedures applied to SIPP data files. Most of the
material contained in that chapter is taken from Pennell (1993).
Page Last Modified: May 9, 2006