Survey of Program Dynamics 1997 Bridge survey and Current Population Survey 1997 March supplement: Evaluating selected demographic data

by Gregory Fant, Kenneth Bryson, Loretta Bass, and Barbara Downs

Fertility and Family Statistics Branch
Population Division
Bureau of the Census
U.S. Department of Commerce

Presentation to the American Statistical Association Joint Statistical Meeting
August 8-12, 1999
Baltimore, Maryland
This paper reports the results of research undertaken by employees of the Census Bureau. The views expressed are attributable to the authors, and not to the Census Bureau, U.S. Department of Commerce, or the Federal Government.

Key words: demographic variables, nonparametric statistics, panel survey, Survey of Program Dynamics (SPD), Current Population Survey (CPS)

Abstract

This paper evaluates the quality of selected demographic characteristics from the 1997 Survey of Program Dynamics (SPD) by comparing them with the March Supplement of the 1997 Current Population Survey (CPS). The SPD, a longitudinal sample survey, was designed to evaluate the impact of the 1996 national welfare reform legislation by studying its effect on a panel of survey respondents over a ten-year period (1992-2002) for specific estimates of demographic, social, and household economic characteristics. The purpose of this paper is to determine the usefulness and limits of the SPD for different types of survey analysis. Age, race, educational attainment, marital status, relationship to householder, and geographic region were the basic demographic characteristics selected for this evaluation study. We developed a pair of hypotheses and used them to evaluate the study variables with the Kruskal-Wallis Test (Chi-Square Approximation) (alpha=0.05) for nonparametric data analysis. Our analysis of the selected demographic variables did not produce statistically significant results; that is, we could not establish that either the unweighted or the weighted distributions of the demographic variables studied differed between SPD and CPS.
We therefore conclude that, for the selected demographic variables, the 1997 SPD Bridge survey data are comparable with the 1997 CPS March supplement survey data.

Introduction

This paper evaluates selected demographic characteristics from the 1997 Survey of Program Dynamics (SPD) Bridge survey against the corresponding March Supplement of the 1997 Current Population Survey (CPS), a cross-sectional sample survey. The SPD, a longitudinal sample survey, was designed to evaluate the impact of the 1996 national welfare reform legislation by studying its effect on a panel of survey respondents over a ten-year period (1992-2002) for specific demographic, social, and household economic characteristics.

Our paper has four parts. In the first, we briefly sketch the legislative history and purpose of SPD and compare the 1997 SPD Bridge survey with the 1997 CPS March supplement. The second part outlines the methods used in this study, while part three presents our study results. In the final part, we briefly discuss our results and the implications of this study for the analysis of longitudinal transitions experienced by American families.

Background to the SPD

Public Law 104-193, the Personal Responsibility and Work Opportunity Reconciliation Act of 1996, created a new program entitled "Temporary Assistance for Needy Families" (TANF). The Act changed in several ways the availability of public assistance to those requesting public help:
The Survey of Program Dynamics (SPD) was a result of enabling language in Public Law (P.L.) 104-193. The Act directed the Census Bureau to develop and execute a survey to collect data that would permit researchers to evaluate the impact and effectiveness of changes to public assistance activities through TANF. Legislators supporting this law envisioned research and evaluation projects using SPD data to study the factors contributing to program participation and the long-term impact of welfare reform on the well-being of TANF recipients, their families, and their children. Additional issues such as out-of-wedlock births, welfare dependency, the beginning and end of welfare spells, recidivism, and the status of children were also intended to be monitored with data available from the SPD. This legislation directed the Census Bureau to collect subsequent data from persons who participated in the 1992 and 1993 panels of the Survey of Income and Program Participation (SIPP) on topics concerning changes in program participation, employment, earnings, and measures of adult and child well-being. The data collected in SPD are derived from three sources (Huggins and King 1998):
The 1997 SPD Bridge survey, a modified version of the 1997 CPS March supplement, was intended to update, or bridge, the data collected from the earlier SIPP panels to the upcoming SPD interviews. The 1997 CPS March supplement and the 1997 SPD Bridge survey are different in several fundamental ways. These differences were relevant in developing a context for identifying an appropriate research method and in the discussion of findings for this paper. The salient features of the two surveys are compared in Table 1.

Table 1: Comparison of the 1997 CPS March Supplement and 1997 SPD Bridge Survey
The Census Bureau modified the March CPS survey as a platform to build the SPD questionnaire instrument quickly and to collect essential 1996 data related to the welfare-reform experience. As a result, some content of the SPD questionnaire was not familiar to field interviewers, who had to undergo additional training specific to the SPD instrument. Another point of difference is that the sample for the 1997 SPD Bridge survey included households in the population universe that had participated in the Census Bureau's SIPP for more than two years. These former SIPP household respondents had been informed at the end of their commitment to the original SIPP survey that they would not be recontacted to participate in another Census Bureau survey.

Since the 1997 SPD Bridge survey used a modified format of the 1997 CPS March supplement, a natural question to ask is, "How alike are the data collected through the 1997 SPD Bridge survey and the 1997 CPS March supplement?" To answer this question, we compared results from the two surveys for selected demographic variables. One outcome of this review may be to suggest modifications in either collection design or wording for selected demographic variables in the 1997 SPD data set. Only after SPD survey data have been reviewed and evaluated for validity can they be used confidently and effectively in political and governmental decision-making.

Study Methods

Since the principal purpose of this study was to ascertain the similarity of the survey data sets for selected variables, we were interested in whether the unweighted and weighted survey results compared well between the two surveys.
To address this research aim, our data were evaluated using a pair of hypotheses for the distribution of both the unweighted and weighted data:

Ho: The survey data for each study variable in the 1997 SPD were not different from the survey data from the corresponding study variable in the 1997 CPS March supplement.

Ha: The survey data for each study variable in the 1997 SPD were different from the survey data from the corresponding study variable in the 1997 CPS March supplement.
Based on a comparison of the two surveys (see Table 1), we did not expect the demographic variables studied here to compare very well between the 1997 SPD Bridge survey and the 1997 CPS March supplement. The two surveys were designed and fielded to accomplish different purposes. In addition, we expected significant attrition in the SPD between 1995 and 1997 to affect adversely the representativeness of the remaining sample, especially in comparison with the CPS, which is a national-level, cross-sectional survey.

Our study is a retrospective, posttest research design. Six demographic variables from each survey were selected for further analysis and should be thought of as the units of analysis. Each variable is described in detail at the Census Bureau web site (http://www.census.gov/main/www/glossary.html):

Age. Age classification was based on the age of the person at her/his last birthday.

Race. The population was divided into five groups by race: White; Black; American Indian, Aleut, or Eskimo; Asian or Pacific Islander; and other races.

Educational Attainment. Educational attainment applied only to progress in "regular" school and represents the highest degree or years of school completed.

Marital Status. The marital status at time of interview was categorized into four major categories: "single (never married)," "married," "widowed," and "divorced." The category "married" was further divided into "married, civilian spouse present," "married, Armed Forces spouse present," "married, spouse absent," "married, Armed Forces spouse absent," and "separated."

Household relationship. How each person is related to the householder, that is, the person who owns or rents the housing unit.

Geographic regions. The four Census Bureau regions are Northeast, Midwest, West, and South; each is composed of individual states grouped on an adjacent geographical basis.
Because of the relatively small sample size (38,000 households) of the 1997 SPD Bridge survey, the state is the smallest geographic unit identified. The state was used instead of metropolitan/nonmetropolitan areas of a state because some states had so few units in particular areas that adequate safeguards against disclosure of confidential data could not be assured in compliance with rules established by the Census Bureau's Disclosure Review Board.

The processed data from the 1997 SPD Bridge survey and the 1997 CPS March supplement were extracted from the Census Bureau's online data extraction system, FERRET. This extraction system permitted access to unweighted and weighted data for the selected demographic variables. We delimited our research by studying the 1997 unweighted and weighted data from the two surveys and calculated the difference between each pair of data points, the absolute value of that difference, and Kruskal-Wallis (Chi-Square approximation) test statistics. The statistical procedures in the software package SAS were used to calculate the appropriate test statistic, degrees of freedom, and p-value for the nonparametric Kruskal-Wallis Test (Chi-Square approximation). We report our detailed findings (also see Appendix 1) and the decision to reject or fail to reject the null hypothesis. Using standard rules for testing hypotheses with nonparametric statistical tests, the "Kruskal-Wallis Test (Chi-Square Approximation)" was used to evaluate the hypotheses in terms of the rank value differences (significance level = 0.05, two-tailed; significance level = 0.025, one-tailed).

The research design and statistical test selected raise several concerns that must be acknowledged. The posttest-only research design has an advantage and a few disadvantages (Nachmias and Nachmias 1987).
A major advantage of the posttest-only research design is that it controls for factors that may negatively influence external validity and internal validity, where validity refers to whether or not the researcher actually measured what was intended. External validity addresses the concern of whether a study conceptually measures the demographic variables of interest; internal validity is the practical matter of whether the demographic variables in the survey are being adequately quantified. A disadvantage of this design is that the demographic variables included in the surveys may not be adequate to provide a comprehensive demographic comparison between the two surveys. Another disadvantage is that we do not have a control group as understood in a classical experimental design.
A nonparametric test, the Kruskal-Wallis Test (Chi-Square Approximation), was selected to examine the differences between the categorical data in the two surveys, namely the differences between the distributions generated from the CPS and the SPD (Cody and Smith 1997; Walker 1997; Stokes, Davis, Koch 1995; Mendenhall and Beaver 1991). Although we are sure that some of the variables approximated the conditions required for a normal distribution, the Kruskal-Wallis Test (Chi-Square Approximation), like other nonparametric tests, was appropriate because parameters such as the mean or standard deviation were not discernible for all the demographic data collected in our study. Where the test has df=1 and shows significant findings, the Kruskal-Wallis Test (Chi-Square Approximation) is identical to the normal approximation used for the Wilcoxon Rank-Sum Test (Walker 1997). Mendenhall and Beaver (1991, p. 611) remind the student of the essential assumptions of the Kruskal-Wallis Test (Chi-Square Approximation):

1. All sample sizes are greater than or equal to 5.
2. Ties assume the average of the ranks that they would have occupied if they had not been tied.
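The ranking and tie rule above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the SAS procedure used in the study: the function names and the two sample lists are hypothetical, and the statistic is the standard tie-corrected Kruskal-Wallis H.

```python
from collections import Counter


def average_ranks(values):
    """Rank values 1..N, giving tied values the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(*groups):
    """Return the tie-corrected Kruskal-Wallis H statistic for the groups."""
    pooled = [x for g in groups for x in g]
    n_total = len(pooled)
    ranks = average_ranks(pooled)

    total = 0.0
    start = 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)

    # Tie correction: divide by 1 - sum(t^3 - t) / (N^3 - N),
    # where t is the size of each group of tied values.
    ties = sum(t ** 3 - t for t in Counter(pooled).values())
    correction = 1 - ties / (n_total ** 3 - n_total)
    if correction == 0:
        raise ValueError("all observations are tied; H is undefined")
    return h / correction


# Hypothetical values for two samples of five observations each
# (illustrative only, not the survey data).
sample_spd = [10.0, 20.0, 30.0, 40.0, 50.0]
sample_cps = [11.0, 19.0, 31.0, 41.0, 49.0]
h_stat = kruskal_wallis_h(sample_spd, sample_cps)
```

With two groups, H has one degree of freedom, which matches the DF=1 reported throughout Appendix 1.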
The data for our study were available for frequency, percentage, amount of difference between paired values, and the sign of the known differences. The Kruskal-Wallis Test (Chi-Square Approximation) has been shown to produce findings similar to the parametric analysis-of-variance F-Test for normally distributed data with a measure of central tendency (Mendenhall and Beaver 1991).

Results and Discussion

The Kruskal-Wallis Test (Chi-Square Approximation) scores were calculated from survey data extracted from FERRET (see Appendix 1). The tables in Appendix 1 show the possible response categories for each unweighted and weighted variable on the left side of each table. They also give the frequency for each response category, the calculated percentage difference, and the absolute difference between the distributions, ranked so that 1 denotes the smallest absolute difference between the two distributions. We present aggregate findings in a summary table (see Table 2) and offer a discussion of the study results.
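The per-category quantities described above (percentage, difference, absolute difference, and rank) can be sketched as follows. The counts are hypothetical, the function names are ours, and the sign convention (SPD minus CPS) is an assumption; ties receive consecutive ranks in this simplified sketch.

```python
def percent_distribution(counts):
    """Convert category counts to percentages of the total."""
    total = sum(counts.values())
    return {cat: 100.0 * n / total for cat, n in counts.items()}


def compare_distributions(spd_counts, cps_counts):
    """Return rows of [category, spd_pct, cps_pct, diff, abs_diff, rank]."""
    spd_pct = percent_distribution(spd_counts)
    cps_pct = percent_distribution(cps_counts)
    rows = []
    for cat in spd_counts:
        diff = spd_pct[cat] - cps_pct[cat]  # assumed convention: SPD minus CPS
        rows.append([cat, spd_pct[cat], cps_pct[cat], diff, abs(diff)])
    # Rank 1 denotes the smallest absolute difference.
    for rank, row in enumerate(sorted(rows, key=lambda r: r[4]), start=1):
        row.append(rank)
    return rows


# Hypothetical counts by region (not the survey frequencies).
spd_counts = {"Northeast": 200, "Midwest": 250, "South": 350, "West": 200}
cps_counts = {"Northeast": 195, "Midwest": 230, "South": 350, "West": 225}
rows = compare_distributions(spd_counts, cps_counts)
for cat, spd_p, cps_p, diff, abs_diff, rank in rows:
    print(f"{cat:10s} {spd_p:6.1f} {cps_p:6.1f} {diff:+6.1f} {abs_diff:6.1f} {rank:3d}")
```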
Table 2: Summary table of the comparison of 1997 SPD and 1997 March CPS
(alpha level = 0.05)
As shown in the table above, none of the Chi-Square Approximation p-values were significant, that is, less than or equal to alpha=0.05. Contrary to our earlier expectations, we were unable to reject the null hypothesis that the distributions of survey data for each variable studied from SPD and CPS were not different from each other.
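As a check on the decision rule, the p-value for a chi-square statistic with one degree of freedom can be computed directly, because the upper-tail probability reduces to the complementary error function. The sketch below is ours (the function names are hypothetical) and uses three statistics as reported in Appendix 1; it is not the SAS output.

```python
import math


def chi2_sf_df1(chisq):
    """Upper-tail probability of a chi-square variate with 1 degree of freedom."""
    return math.erfc(math.sqrt(chisq / 2.0))


def reject_null(chisq, alpha=0.05):
    """Apply the decision rule: reject Ho when the p-value is <= alpha."""
    return chi2_sf_df1(chisq) <= alpha


# (CHISQ, reported p-value) pairs taken from Appendix 1.
reported = {
    "A1a Age, unweighted": (0.01602, 0.8993),
    "A2a Race, unweighted": (0.08333, 0.7728),
    "A5b Household relations, weighted": (0.36894, 0.5436),
}
for label, (chisq, p_reported) in reported.items():
    p = chi2_sf_df1(chisq)
    print(f"{label}: p={p:.4f} (reported {p_reported}), reject={reject_null(chisq)}")
```

For comparison, a statistic would need to exceed roughly 3.84 to be significant at alpha=0.05 with one degree of freedom, far above any value in Appendix 1.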
Although we could not find a statistical reason to reject our null hypothesis using the Kruskal-Wallis Test (Chi-Square Approximation), we did find instances where the differences between SPD and CPS variable response categories were large (i.e., more than 2 absolute value units from zero; see Appendix 2), and this may have affected the reported Chi-Square values. We selectively performed SIGMA tests for the CPS March response categories that showed a large difference from the corresponding SPD response categories. After determining the confidence interval for the CPS response category, we examined whether or not the SPD value fell within it. We found, for example, that the CPS confidence interval (CI) for the response category married civilians for the variable marital status (weighted) was 52.4% ± 0.4, while the corresponding SPD value (54.8%) fell outside this interval. We are unsure whether this difference and the others found are noteworthy because the Census Bureau has not established confidence intervals for the SPD variables.

Furthermore, one procedure that may influence the differences between the weighted SPD and CPS distributions is the assignment of a zero weight to persons who entered the original SIPP households after the first interview in 1992 or 1993. For example, Appendix Table A1b shows the change in distribution by age between the SPD and CPS. Clearly, the large difference in the age category under 5 was due to the assignment of a zero weight to persons who entered the SIPP/SPD household after the initial interview.

These findings serve to illustrate two ideas. First, the Kruskal-Wallis Test (Chi-Square Approximation) is a robust procedure that is insensitive to departures from ideal conditions. Second, the demographic makeup of the SPD sample may have changed over time (Huggins and King 1998, p. 10).
This seems to correspond with the expectation stated earlier under Study Methods. Interestingly, the t-test would be used if we changed our research design from a posttest design to one where repeated measurements were collected. The second design would be applicable for the annual administration of the SPD survey instrument. If ideal conditions could be maintained in a repeated-measure design and the t-test employed, then the statistical results reached using the t-test in subsequent years may be similar for the unweighted and weighted demographic variables identified in this project. In reality, however, we cannot expect ideal conditions to be maintained over the lifetime of the survey.

Census Bureau analysts have discussed survey results for the Survey of Income and Program Participation (SIPP), the parent longitudinal survey of SPD (Huggins and King 1998). Two findings from the SIPP experience may affect the SPD demographic data. First, Huggins and King (1998, p. 10) report that as the number of interviews increased, the SIPP lost disproportionately more people if they were in poverty at the last completed interview than if they were not in poverty. Since the SPD sample population was derived from SIPP (see Table 1), additional research may show that this problem will persist over the life of the SPD panel. Second, Huggins and King (1998, p. 10) conclude, "As attrition increases over the life of a panel, differential nonresponse and the effectiveness of the weighting adjustments may interact differently."

A future research topic may include building a model of the conditions necessary for predicting SPD survey attrition. The predictive model might include, in the interest of parsimony, a term for nonresponse, an interaction term, and selected demographic variables, perhaps some of those studied here. Bogen (1996) and Word (1997) discuss findings on the factors that contribute to respondent nonresponse and attrition.
Attention to these findings may yield insight that leads to improvements in both SPD data collection procedures and the quality of SPD demographic survey data.

Federal statistical agencies, like the Bureau of the Census, play a central role in collecting large, national data in the welfare reform policy arena and in several other public policy arenas of interest to national decision-makers (Norwood 1995). These same agencies, similarly, occupy an important role in critiquing the data they collect. It is in this latter role that we have attempted to evaluate the basic demographic characteristics of the 1997 SPD Bridge Survey as to its representativeness of the Nation in 1997. Overall, we find the SPD data quite usable as judged by their similarity with the 1997 March CPS on several demographic indicators, although sample loss over time seems to have produced slightly lower proportions of "never-married" persons and young children. This could be the result of family disruptions and of failure to obtain interviews for these households as they move and thereby become difficult to track. We will continue to monitor the remaining SPD households for changes in the viability of this sample for demographic research.

Works Consulted

Huggins, Vicki and Karen King. "The Survey of Program Dynamics: sample design, weighting and attrition issues." Draft: June 27, 1998. Washington, DC: U.S. Bureau of the Census.

Blalock, Hubert. Social Statistics. New York: McGraw Hill, 1992.

Bogen, Karen. "The effect of questionnaire length on response rates: a review of the literature." 1996 Proceedings of the Section on Survey Research Methods. Alexandria, VA: American Statistical Association, pp. 1020-1025.

Budnick, Lawrence. "Statistics" in Preventive Medicine and Public Health, second edition. Malvern, PA: Harwal Publishing Company, 1992.

Cody, Ronald and Jeffrey Smith. Applied Statistics and the SAS Programming Language, fourth edition. Upper Saddle River, NJ: Prentice Hall, 1997.

Knapp, Rebecca and M. Clinton Miller. Clinical epidemiology and biostatistics. Malvern, PA: Harwal Publishing Company, 1992.

Bureau of Labor Statistics and Bureau of the Census. "Weighting. CPS Annual Demographic Survey March supplement." http://www.bls.census.gov/cps/ads/1995/swgting.htm (17 May 1999).

Mendenhall, William and Robert Beaver. Introduction to Probability and Statistics, eighth edition. Boston: PWS-Kent Publishing Company, 1991.

Nachmias, David and Chava Nachmias. Research methods in the social sciences. New York: St. Martin's Press, 1987.

Norwood, Janet. Organizing to Count: Change in the Federal Statistical System. Washington, DC: The Urban Institute Press, 1995.

SAS Institute. SAS/STAT User's Guide, Version 6, Fourth Edition, Volume 1. Cary, NC: SAS Institute, Inc., 1989.

Stokes, Maura, Charles Davis, and Gary Koch. Categorical Data Analysis using the SAS system. Cary, NC: SAS Institute Inc., 1995.

Walker, Glenn. Common Statistical Methods for Clinical Research with SAS Examples. Cary, NC: SAS Institute Inc., 1997.

Weinberg, Daniel, Vicki Huggins, Robert Kominski, and Charles Nelson. "A Survey of Program Dynamics for evaluating welfare reform." November 18, 1997. Washington, DC: U.S. Bureau of the Census.

Weinberg, Daniel, Pat Doyle, Arthur Jones, Jr., and Stephanie Shipp. "Measuring the impact of welfare reform with the Survey of Program Dynamics." Draft: August 6, 1998. Washington, DC: U.S. Bureau of the Census.

Word, David. "Who responds/who doesn't? Analyzing variation in mail response rates during the 1990 Census." July 1997. Population Division Working Paper No. 19. Washington, DC: U.S. Bureau of the Census.
Appendix 1

Table A1a: Age, unweighted
[Total respondents, persons: SPD, 77,630; CPS, 131,854]
Kruskal-Wallis Test (Chi-Square Approximation) CHISQ=0.01602 DF=1 CHISQ p-value=0.8993
*As the magnitude of difference increases, rank values generally will increase.

Table A1b: Age, weighted
Kruskal-Wallis Test (Chi-Square Approximation) CHISQ=0 DF=1 CHISQ p-value=1.0000
Table A2a: Race, unweighted
[Total respondents, persons: SPD, 77,630; CPS, 131,854]
Kruskal-Wallis Test (Chi-Square Approximation) CHISQ=0.08333 DF=1 CHISQ p-value=0.7728
Table A2b: Race, weighted
Kruskal-Wallis Test (Chi-Square Approximation) CHISQ=0.08333 DF=1 CHISQ p-value=0.7728
Table A3a: Educational attainment, unweighted (persons 15 years and over)
[Total respondents, persons: SPD, 60,761; CPS, 101,229]
Kruskal-Wallis Test (Chi-Square Approximation) CHISQ=0.02877 DF=1 CHISQ p-value=0.8653

Table A3b: Educational attainment, weighted (persons 15 years and over)
Kruskal-Wallis Test (Chi-Square Approximation) CHISQ=0.00142 DF=1 CHISQ p-value=0.9699

Table A4a: Marital status, unweighted (persons 15 years and over)
[Total respondents, persons: SPD, 60,761; CPS, 101,229]
Kruskal-Wallis Test (Chi-Square Approximation) CHISQ=0.00408 DF=1 CHISQ p-value=0.9491
Table A4b: Marital status, weighted (persons 15 years and over)
Kruskal-Wallis Test (Chi-Square Approximation) CHISQ=0.03673 DF=1 CHISQ p-value=0.8480

Table A5a: Household relations, unweighted (persons 15 years and over)
[Total respondents, persons: SPD, 60,761; CPS, 101,229]
Kruskal-Wallis Test (Chi-Square Approximation) CHISQ=0.00751 DF=1 CHISQ p-value=0.9310

Table A5b: Household relations, weighted (persons 15 years and over)
Kruskal-Wallis Test (Chi-Square Approximation) CHISQ=0.36894 DF=1 CHISQ p-value=0.5436
Table A6a: Geographic regions, unweighted
[Total respondents, persons: SPD, 77,630; CPS, 131,854]
Kruskal-Wallis Test (Chi-Square Approximation) CHISQ=0.08333 DF=1 CHISQ p-value=0.7728
Table A6b: Geographic regions, weighted
Kruskal-Wallis Test (Chi-Square Approximation) CHISQ=0 DF=1 CHISQ p-value=1.0000

Appendix 2: Absolute Value and Distance

The absolute value and the standard deviation describe a similar phenomenon: both describe how far a point in question is from a standard reference point. However, the two terms are not the same. Specifically, the absolute value describes how distant a point (x,y) on a graph is from the origin (0). The standard deviation, by contrast, is the square root of the variance and describes how far the values of some phenomenon typically spread around the mean of its normal distribution function. In the figure below, the absolute value is formally defined (www.treasuretroves.com/math/absolutevalue.html):
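The original figure is not available in this text. As a sketch only, the standard textbook definitions of the quantities contrasted above can be written as follows; the symbols x, y, x_i, \bar{x}, n, and s are notation introduced here, and s denotes the sample standard deviation.

```latex
|x| =
\begin{cases}
  x,  & x \ge 0 \\
  -x, & x < 0
\end{cases}
\qquad
d\bigl((x,y),(0,0)\bigr) = \sqrt{x^{2} + y^{2}}
\qquad
s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^{2}}
```

The first two expressions measure the distance of a single point from a fixed origin, while the third summarizes the spread of a whole set of values around their own mean.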

Contact: (dsd.survey.program.dyanmics@census.gov)