Working Paper Number SEHSD-WP1989-13 or SIPP-WP-92
Rameswar P. Chakrabarty
The Survey of Income and Program Participation (SIPP) will undoubtedly become a major source of data on many socioeconomic aspects of the nation's households, families, and individuals. It is therefore anticipated that many researchers in sociology, economics, and public policy will want to use SIPP data in analytical studies that enhance our understanding of complex socioeconomic phenomena. These studies will usually rely on multivariate analysis and modeling, such as multivariate regression analysis, factor analysis, logistic regression analysis, discriminant analysis, and survival analysis and hazards modeling. Standard statistical packages like BMDP, SAS, and SPSS have programs to carry out such analyses. These packages, however, use statistical methods that are strictly applicable only to samples from infinite populations, and thus they are not generally suitable for the analysis of data from complex survey samples. Researchers have nevertheless often used them to analyze survey data, mainly because appropriate statistical packages are not available.
Using a standard statistical package to analyze survey data raises two basic questions. What effects can the sample design have on methods of multivariate analysis? How should such effects be taken into account? In recent years, many researchers have attempted to answer these questions by demonstrating the adverse effects of survey designs on standard statistical methods and by developing procedures that take the survey design into account. General methods of variance estimation for complex surveys, and procedures that may be appropriate for variance computation by users of SIPP micro-data files, will be discussed in another report. This report deals with multivariate analysis of data from complex surveys.
At this time, methods of multivariate analysis that take the survey design into account, and the computer software to implement them, are at an early stage of development. Significant progress seems to have been made only for the analysis of categorical data from sample surveys. A review follows of research on multivariate analysis and modeling, and on the analysis of categorical data, from complex surveys. We also indicate appropriate methods that users of SIPP micro-data files can use for data analysis and modeling.
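The first question raised above, what effect the sample design can have on a standard analysis, can be illustrated with a small simulation. The sketch below is not taken from this report and does not use SIPP data. It assumes a hypothetical one-stage cluster sample with equal-sized clusters, no stratification, and no weights, and it compares the variance of a sample mean computed as a standard package would (treating all observations as independent) with a variance computed from the cluster means. The function names and all parameter values are illustrative assumptions.

```python
import random
import statistics


def simulate_clustered_sample(n_clusters=50, cluster_size=20, seed=0):
    """Hypothetical data: each cluster shares a common random effect,
    so observations within a cluster are positively correlated."""
    rng = random.Random(seed)
    clusters = []
    for _ in range(n_clusters):
        cluster_effect = rng.gauss(0.0, 1.0)  # shared by every unit in the cluster
        clusters.append(
            [cluster_effect + rng.gauss(0.0, 1.0) for _ in range(cluster_size)]
        )
    return clusters


def naive_variance_of_mean(clusters):
    """Variance of the mean if all observations are treated as independent,
    which is the assumption a standard package makes."""
    values = [y for cluster in clusters for y in cluster]
    return statistics.variance(values) / len(values)


def cluster_variance_of_mean(clusters):
    """Variance of the mean based on between-cluster variation.
    Assumes equal cluster sizes and ignores finite population corrections."""
    cluster_means = [statistics.mean(cluster) for cluster in clusters]
    return statistics.variance(cluster_means) / len(cluster_means)


def design_effect(clusters):
    """Ratio of the design-based variance to the naive variance."""
    return cluster_variance_of_mean(clusters) / naive_variance_of_mean(clusters)


if __name__ == "__main__":
    sample = simulate_clustered_sample()
    print("naive variance of mean:  ", naive_variance_of_mean(sample))
    print("cluster variance of mean:", cluster_variance_of_mean(sample))
    print("design effect:           ", design_effect(sample))
```

Under these assumptions the ratio is well above 1, which means the naive standard error understates the sampling variability and any test or confidence interval built on it is too optimistic. A real survey design such as SIPP's also involves stratification and unequal weights, which this sketch deliberately leaves out.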

