U.S. Census Bureau
 Housing Vacancies and Homeownership (CPS/HVS)




Annual Statistics: 2004

APPENDIX B.  SOURCE AND ACCURACY OF ESTIMATES
REDESIGN OF THE CURRENT POPULATION SURVEY/Housing Vacancies and Homeownership
(CPS/HVS)

Methodology Changes in 2004

	Population controls that reflect the results of the 2000 decennial census were used in the CPS/HVS
estimation process for the first time in the first quarter 2003.  This change has a slight effect on 
vacancy and homeownership rates, as described below.  In addition, as a final step in our estimation 
process, the estimates are now controlled to independent housing counts, used for the first time, in 
order to produce a more accurate estimate of housing units.  This should make the CPS/HVS estimates of 
housing units more consistent with those from other Census Bureau housing surveys.  The new housing 
controls affect the count of all housing units in the sense that both occupied and vacant units 
are ratio estimated to the new control total.  Vacancy rates and homeownership rates are not 
affected by this change.  
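
	The reason rates are unaffected can be sketched in a few lines: when occupied and vacant units are multiplied by the same ratio factor, the factor cancels in any rate.  The counts, the factor, and the simplified rate definitions below are illustrative assumptions, not published CPS/HVS figures or the official rate formulas.

```python
def rates(owner_occ, renter_occ, vacant_for_rent):
    """Return (homeownership rate, simplified rental vacancy rate) in percent.

    Simplified definitions assumed for illustration only.
    """
    homeownership = 100.0 * owner_occ / (owner_occ + renter_occ)
    rental_vacancy = 100.0 * vacant_for_rent / (renter_occ + vacant_for_rent)
    return homeownership, rental_vacancy

# Hypothetical weighted counts in thousands (not published figures).
before = {"owner_occ": 72_000.0, "renter_occ": 34_000.0, "vacant_for_rent": 3_500.0}

# Hypothetical ratio adjustment: independent control total / survey total.
factor = 1.02
after = {name: count * factor for name, count in before.items()}

rates_before = rates(**before)
rates_after = rates(**after)
print("before:", rates_before)
print("after: ", rates_after)
```

Every count grows by 2 percent in this sketch, but both rates print the same values before and after the adjustment, apart from floating-point noise.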

The CPS/HVS began computing first-stage factors (used for weighting purposes) based on year-round and 
seasonal counts of housing units from the 2000 decennial census, beginning in the first quarter 2003.  
From 1980 to 2002, the CPS/HVS first-stage factors were based on year-round estimates only.   We believe 
that this improves our counts of year-round and seasonal units.  The shift from 1990-based to 2000-based 
population controls (including the weighting revision) had a very slight effect on vacancy rates and 
homeownership rates.  Research has shown that the new 2000-based controls lowered the rental vacancy rate 
in the first quarter 2002 from 9.14 percent to 9.08 percent, a difference of less than one-tenth of a 
percentage point.  The homeowner vacancy rate was revised from 1.67 percent to 1.65 percent, while the 
homeownership rate was revised from 67.82 percent to 67.81 percent. 

The questions on race on the CPS were modified beginning in the first quarter 2003 to comply with the 
revised standards for federal statistical agencies.  Respondents may now select more than one race.  The 
Hispanic/non-Hispanic origin question continues to be asked separately.




REDESIGN OF THE CURRENT POPULATION SURVEY/HOUSING VACANCY SURVEY (CPS/HVS)

	Major changes related to the Current Population Survey/Housing Vacancy Survey (CPS/HVS) were effective 
beginning with the first quarter 1994 data.  First, a new weighting procedure was implemented based on the 
1990 decennial census.  The 1990-based weighting produces, on average, estimates of the total housing 
inventory that are about 0.1 percent lower than the 1980-based weighting.  Revised data are provided in the 
historical tables for 1993 to show the effect of this change.  Generally, the vacancy rates are only 
minimally affected, while the homeownership rate is about one-half of a percentage point lower with the new 
weighting procedures.

	A second change is that the CPS/HVS has become a totally computerized survey with the implementation of 
the Computer Assisted Survey Information Collection (CASIC).  The CASIC tools consist of state-of-the-art 
computer-assisted modules for data collection and processing.  Although the concepts, definitions, and 
questionnaire items remain the same, the shift to CASIC may affect vacancy rates and homeownership rates.  
We are unable to determine the quantitative effects of the use of CASIC on the vacancy and homeownership 
rates.  Data users should use caution when comparing 1994 and later data with earlier data.

 
SOURCE OF DATA 
 
	The estimates presented in this report are based on data obtained from two surveys conducted by the 
Bureau of the Census. Data concerning vacancy rates and tenure of occupied housing units are from the 
monthly sample of the Current Population Survey/Housing Vacancy Survey (CPS/HVS).  Characteristics of 
occupied housing units are from the American Housing Survey (AHS).  


CPS AND AHS DESIGNS 
 
	Since the inception of the CPS in 1940, the sample has been redesigned several times to upgrade the 
quality and reliability of the data and to meet changing data needs.  Beginning in April 1984, the 
current design was phased in through a series of changes that were completed in July 1985.  Prior 
to the redesign, sample cases were selected from the 1970 census frame. 

	The CPS/HVS sample is spread over 729 sample areas, which represent 1,973 geographic areas in the 
United States.  The metropolitan/nonmetropolitan data shown in this report reflect 1990 census definitions.  
The CPS/HVS began a major geographic redesign in April 1994.  The survey gradually replaced sample cases 
selected from the 1980 census over a 15-month phase-in period with new sample cases drawn from the 1990 
census.  The sample was fully phased in by June 1995.  For the transitional period (1995), we have 
converted the remaining 1980 sample cases to reflect 1990 metropolitan/nonmetropolitan definitions.  For 
1996 data and beyond, the data will reflect 1990 definitions.

	In 1986, vacant seasonal mobile homes were included in the count of vacant seasonal units.  This change 
resulted in a 12 percent increase in the number of vacant seasonal housing units.   


	Beginning in 2002, the size of the CPS/HVS sample increased to approximately 72,000 housing units.  
This expansion was one of the Census Bureau's plans to meet the requirements of the State Children's 
Health Insurance Program (SCHIP) legislation.  Of the 72,000 housing units contained in the CPS/HVS sample, 
approximately 61,200 are eligible for interview each month; of this number, 3,900 occupied units, on the 
average, are visited but interviews are not obtained because occupants are not found at home after repeated 
calls or are unavailable for some other reason.  In addition to the 61,200, there are also about 10,800 
sample units in an average month which are visited but are found to be vacant or otherwise not to be 
interviewed.  About half of the 10,800 are vacant and interviewed for the HVS.   
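
	The sample accounting in the paragraph above can be checked with simple arithmetic.  This is only a sketch: the variable names are invented for illustration, the inputs are the rounded monthly averages quoted in the text, and "about half" is taken as exactly one half.

```python
total_sample = 72_000    # housing units in the monthly CPS/HVS sample
eligible = 61_200        # units eligible for interview each month
noninterviews = 3_900    # occupied units visited but not interviewed
not_eligible = 10_800    # vacant or otherwise not to be interviewed

# The two groups should account for the whole sample.
accounted_for = eligible + not_eligible

interviewed = eligible - noninterviews
noninterview_rate = 100.0 * noninterviews / eligible
vacant_for_hvs = not_eligible / 2   # "about half" treated as exactly half

print(accounted_for == total_sample)   # True
print(interviewed)                     # 57300
print(round(noninterview_rate, 1))     # 6.4
print(vacant_for_hvs)                  # 5400.0
```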

	The CPS estimation procedure for occupied units involves the inflation of the weighted sample results 
to independent estimates of the total civilian noninstitutional population of the United States by age, 
race, sex, and Hispanic/non-Hispanic categories. These independent estimates are based on statistics from 
the decennial censuses of population; statistics on births, deaths, immigration, and emigration; and 
statistics on the strength of the Armed Forces. 
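
	The idea of inflating weighted sample results to independent estimates can be illustrated with a single-pass ratio adjustment by cell.  This is a minimal sketch, not the production CPS procedure, which is more elaborate.  The cells, totals, and function name are hypothetical.

```python
def ratio_adjust(weighted_sample, controls):
    """Return per-cell factors that inflate weighted sample totals to the controls."""
    return {cell: controls[cell] / weighted_sample[cell] for cell in controls}

# Hypothetical cells (age group, sex) with totals in thousands; not CPS figures.
weighted_sample = {
    ("16-44", "male"): 48_000.0,
    ("16-44", "female"): 50_000.0,
    ("45+", "male"): 40_000.0,
    ("45+", "female"): 46_000.0,
}
controls = {
    ("16-44", "male"): 52_000.0,
    ("16-44", "female"): 52_500.0,
    ("45+", "male"): 42_000.0,
    ("45+", "female"): 47_000.0,
}

factors = ratio_adjust(weighted_sample, controls)
adjusted = {cell: weighted_sample[cell] * factors[cell] for cell in controls}

for cell in controls:
    print(cell, round(factors[cell], 4), round(adjusted[cell], 1))
```

After adjustment, each cell total equals its control.  Cells with a larger factor are those where the sample covered the population less completely.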

	The HVS estimation procedure for vacant units is similar to that used for occupied units.  Weighted 
sample results are adjusted at the state level using 2000 census vacant counts.  A second adjustment 
inflates these results based on the CPS coverage of occupied units by geographic areas. 

	Data shown in all tables (except table 2) on vacancy rates and tenure of occupied units for 2004 are 
from the CPS and are averaged over the 12 months of the year.  The data concerning the distribution of 
characteristics for occupied housing units, shown in table 2, are obtained primarily from the AHS national 
sample.  Distributions of characteristics of occupied housing units from the AHS estimates are applied to 
CPS current housing inventory independent estimates to obtain the characteristics of occupied housing units 
used in this report.  The Survey of Construction (SOC) and the Consumer Price Index also are used to 
improve estimates of the rent distribution.
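
	Applying a distribution from one survey to a total from another amounts to proportional allocation.  The sketch below assumes hypothetical AHS-style category counts and a hypothetical CPS-style total; the adjustments based on the SOC and the Consumer Price Index are not modeled.

```python
def allocate(total, distribution_counts):
    """Spread a control total across categories in proportion to survey counts."""
    survey_total = sum(distribution_counts.values())
    return {name: total * count / survey_total
            for name, count in distribution_counts.items()}

# Hypothetical AHS-style counts of occupied units by number of rooms (thousands).
ahs_occupied_by_rooms = {
    "1-3 rooms": 12_000.0,
    "4-5 rooms": 45_000.0,
    "6+ rooms": 49_000.0,
}
# Hypothetical CPS-style independent estimate of all occupied units (thousands).
cps_occupied_total = 109_000.0

estimates = allocate(cps_occupied_total, ahs_occupied_by_rooms)
for name, value in estimates.items():
    print(name, round(value, 1))
```

The category shares come entirely from the first survey, and the overall level comes entirely from the second.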

	The 2001 AHS sample is spread over 394 sample areas comprising 878 counties and independent cities 
with coverage in each of the 50 States and the District of Columbia.  Of the 61,050 housing units, both 
occupied and vacant, contained in the AHS sample, 53,150 were interviewed and 5,650 were classified as 
"Type A noninterviews" for various reasons.  An additional 2,250 units were visited but were not eligible 
to be interviewed for the purposes of AHS.  A detailed description of the AHS sample design and estimation 
procedure can be found in the H-150 report for 2001.  


COMPARABILITY WITH CENSUS OF HOUSING DATA 
	 
	Most of the concepts and definitions are the same for items that appear in both the 1980 and 1990 
censuses and the Housing Vacancy Survey.  However, there is one minor difference in the housing unit 
definition between the CPS/HVS and the 1980 and 1990 decennial censuses.  In the CPS/HVS, living 
arrangements containing five or more persons not related to the person in charge were classified as 
group quarters; for the 1980 and 1990 censuses, the requirement was raised to nine or more persons not 
related to the person in charge.  There were also some differences in what was counted as housing units 
between the earlier censuses and the CPS/HVS.  Descriptions of these differences appear in the 1985 and 
earlier reports of this series. 

	Prior to 1990, there were significant differences between the CPS/HVS and the decennial censuses.  
The 1980 and 1990 decennial censuses included vacant mobile homes as housing units, whereas prior to 
1986 the CPS/HVS did not.  However, beginning in 1986, vacant seasonal mobile homes were counted as 
housing units in the CPS/HVS.  In addition, year-round vacant mobile homes were counted as housing units, 
beginning in 1990 in the CPS/HVS.  Another difference in the housing unit definition between the CPS/HVS 
(prior to 1986) and the 1980 and 1990 censuses was that the CPS/HVS required units to be separate living 
quarters and have direct access or have complete kitchen facilities.  For the 1980 and 1990 decennial 
censuses, the complete kitchen facilities alternative was dropped with direct access required of all units.  
However, beginning in 1990, the CPS/HVS requirement for complete kitchen facilities was dropped with direct 
access required of all units.  Thus, the earlier definitional differences were eliminated.  

	In addition, there are differences between the methodologies used to collect data for the CPS/HVS and 
the censuses.  These differences include interviewing procedures, staff experience and training; differences 
in processing procedures and sample designs; the sampling variability associated with the CPS/HVS and the 
sample data from the census; and the nonsampling errors associated with the CPS/HVS and census data. 

	Research has shown that the CPS/HVS and the 1990 census produced significant differences for vacancy 
characteristics.  The rental vacancy rate from the April 1990 census was 8.5 percent, whereas the CPS/HVS 
reported a rental vacancy rate of 7.2 percent for the first half of 1990.  The April 1990 census had a 
homeowner vacancy rate of 2.1 percent, while the CPS/HVS had a vacancy rate of approximately 1.7 percent 
for the first half of 1990.  For occupied housing, the April 1990 census produced a homeownership rate of 
64.2 percent, while for the first half of 1990 the CPS/HVS produced a rate of 63.9 percent.    These 
differences illustrate that, for these characteristics as well as others, caution should be used when making 
comparisons between the 1990 census and the CPS/HVS.
	
	Further research has shown that the CPS/HVS and the 2000 decennial census produced significant differences 
for vacancy characteristics.  The rental vacancy rate from the April 2000 census was 6.8 percent, whereas the 
CPS/HVS reported a rental vacancy rate of 7.9 percent for the first half of 2000.  The April 2000 census has 
a homeowner vacancy rate of 1.7 percent for the first half of 2000.  For occupied housing, the April 2000 
census produced a homeownership rate of 66.2 percent, while for the first half of 2000, the CPS/HVS produced 
a rate of 67.2 percent.  These differences illustrate that, for these characteristics as well as others, 
caution should be used when making comparisons between the 2000 census and the CPS/HVS.


COMPARABILITY WITH EARLIER DATA 
 
	As stated earlier in this report, beginning in 1994 new weighting procedures based on the 1990 decennial 
census were implemented.  In addition, the survey data collection procedures became totally computerized.  
Caution should be used when comparing current data with unrevised data prior to 1994.

	In 1989, new edit procedures were implemented in the Current Population Survey/Housing Vacancy Survey 
(CPS/HVS).  These new procedures were used to allocate cases that would have been classified as "not 
reported" under previous procedures.  

	In 1990, year-round vacant mobile homes were included for the first time as part of the year-round vacant 
count of housing units.  This change was made to make the composition of the housing unit inventory for the 
CPS/HVS similar to the decennial census and other surveys, which count all mobile homes as housing units when 
occupied or vacant (available for occupancy on the site).  Research has shown that the inclusion of year-round 
vacant mobile homes increases the vacancy rate significantly in some cases.  All of the 1989 data in this 
report have been updated to include year-round vacant mobile homes.  Caution should be used when comparing 
unrevised vacancy data prior to 1990 to data for later years.

	In addition to the above-mentioned design and estimation changes, caution should be used in comparing data 
for 1980 and beyond in this report with data from 1979 and earlier years. Starting in 1980, several changes 
were implemented in the survey to improve the reliability of the data presented.  These included adding a 
supplemental sample, refining the estimation procedures, and changing the source of occupied characteristics 
from the Quarterly Housing Survey (QHS) to the AHS. 

	Although the above-mentioned changes have resulted in more reliable estimates, data for 1980 and later in 
this report are not completely comparable to data for 1979 and previous years, as published in Housing 
Vacancies reports, series H-111.  Furthermore, unrevised data prior to 1990 are not completely comparable to 
1990 data and beyond, due to the inclusion of year-round vacant mobile homes, beginning in 1990.  Thus, 
particular caution should be observed in drawing conclusions about trends that extend from before 1980 to 
1980 and beyond, and also trends from before 1990 to 1990 and later.  For comparative purposes, 1979 data in 
this report have been revised to incorporate all changes made in 1980, and 1989 data have been revised to 
incorporate all changes made in 1990.  Unrevised 1989 and 1979 data are provided to show the magnitude of the 
various changes.  

	The revised 1979 vacancy estimates are higher than the original 1979 estimates.  The increase in vacancy 
rates was not the result of locating additional vacant units, but reflects the increase in sample size and 
refinements in the estimation procedure.  It is safe to assume that prior to the implementation of these 
new procedures (1955 through 1978) HVS produced underestimates of vacant units.  Earlier reports in this 
series give more complete descriptions of the original CPS sample, the QHS sample, and estimation procedures. 



CAUTION IN USING VACANCY RATES FOR CHARACTERISTICS IN TABLE 2 

	Vacancy rates in table 2 are based in part on forecasts of occupied housing units.  These forecasts are 
periodically revised to incorporate more recent data and improved forecasting procedures.  Data for 
2004 and 2003 shown in table 2 are based on the 2001 AHS.

	For the occupied unit forecasts for the monthly rent categories, we update the AHS data quarterly to 
reflect the rise in the cost of renting through the use of the residential rent index, and the latest 
available asking rent data for newly constructed rental units.  
  

CAUTION IN USING SEASONAL VACANT DATA 

	Analysis of seasonal vacant data prior to 1987 has shown that estimates for these characteristics 
were too low by approximately 28 percent.  The estimates beginning in 1987 are adjusted to 
reflect this.  This revision has an effect on other categories (especially the percentage occupied) in 
addition to seasonal vacant units in the distributions shown in table 7. 


ACCURACY OF THE ESTIMATES 
 
	Since the CPS/HVS estimates are based on a sample, they may differ somewhat from the figures that would 
have been obtained if a complete census had been taken using the same questionnaires, instructions, and 
enumerators.  There are two types of errors possible in an estimate based on a sample survey:  sampling and 
nonsampling.  The accuracy of a survey result depends on both types of errors, but the full extent of the 
nonsampling error is unknown.  Consequently, particular care should be exercised in the interpretation of 
figures based on a relatively small number of cases or on small differences between estimates.  The standard 
errors provided for the CPS/HVS estimates primarily indicate the magnitude of the sampling error.  They also 
partially measure the effect of some nonsampling errors in responses and enumeration, but do not measure any 
systematic biases in the data.  (Bias is the difference, averaged over all possible samples, between the 
estimate and the desired value.) 


NONSAMPLING VARIABILITY 
 
	Nonsampling errors can be attributed to many sources, e.g., inability to obtain information about all 
cases in the sample, definitional difficulties, differences in the interpretation of questions, inability or 
unwillingness on the part of respondents to provide correct information, inability to recall information, 
errors made in collection such as recording or coding the data, errors made in processing the data, errors 
made in estimating values for missing data, and failure to represent all units with the sample (undercoverage).  
Undercoverage in the CPS/HVS results from missed housing units and misclassifying housing units. Ratio 
estimation to independent controls, as described previously, partially corrects for the bias due to survey 
undercoverage.  However, biases exist in the estimates to the extent that missed households have different 
characteristics than interviewed households.  
 

SAMPLING VARIABILITY 
 
	The standard errors given in the following tables are primarily measures of sampling variability, 
that is, of the variations that occurred by chance because a sample rather than the entire population 
was surveyed.  The sample estimate and its standard error enable one to construct confidence intervals, 
that is, ranges that would include the average results of all possible samples with a known probability.  For 
example, if all possible samples were selected, each of these being surveyed under essentially the same 
general conditions and using the same sample design, and if an estimate and its standard error were 
calculated from each sample, then approximately 90 percent of the intervals from 1.6 standard errors 
below the estimate to 1.6 standard errors above the estimate would include the average result of all 
possible samples. 
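
	The link between the multiplier and the stated probability can be checked numerically.  The sketch below assumes that the estimate is approximately normally distributed around the average of all possible samples, an assumption this report does not state explicitly.  The function name is hypothetical.

```python
import math

def two_sided_coverage(z):
    """Probability that a normal estimate lies within z standard errors of its mean."""
    return math.erf(z / math.sqrt(2.0))

for z in (1.6, 1.645):
    print(z, round(100.0 * two_sided_coverage(z), 1))
# 1.6 89.0
# 1.645 90.0
```

Under this assumption, a multiplier of 1.6 gives coverage of about 89 percent, which is why the text describes it as approximately 90 percent; 1.645 gives 90 percent almost exactly.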

	The average estimate derived from all possible samples either is or is not contained in any particular 
computed interval.  However, for a particular sample, one can say with specified confidence that the 
average estimate derived from all possible samples is included in the confidence interval. 

	Standard errors may also be used to perform hypothesis testing, a procedure for distinguishing 
between population parameters using sample estimates.  The most common types of hypotheses appearing in 
this report are:  (1) the population parameters are identical, and, (2) the population parameters are 
different.  An example of this would be comparing the vacancy rate in metropolitan areas (MAs) with the 
vacancy rate outside MAs.  Tests may be performed at various levels of significance, where a level of significance 
is the probability of concluding that the characteristics are different when, in fact, they are identical. 

	To perform the most common test, let x and y be sample estimates for two characteristics of interest.  
Let the standard error on the difference x-y be SEDIFF.  If the ratio R = (x-y)/SEDIFF is between -1.6 
and +1.6, no conclusion about the difference between the characteristics is justified at the 0.10 level of 
significance.  If, on the other hand, this ratio is smaller than -1.6 or larger than +1.6, the observed 
difference is significant at the 0.10 level.  In this event, it is a commonly accepted practice to say that 
the characteristics are different. Of course, sometimes this conclusion will be wrong.  When the 
characteristics are, in fact, the same, there is a 10 percent chance of concluding that they are different.  
All statements of comparison in the text have passed a hypothesis test at the 0.10 level of significance or 
better.  This means that, for most differences cited in the text, the estimated difference between 
characteristics is greater than 1.6 times the standard error of the difference. 

	Comparisons of characteristics of vacancies for 1990 (which include year-round vacant mobile homes as 
part of the year-round vacant inventory for the first time) with previous unrevised years reveal significant 
differences in some cases.  Thus caution should be used when comparing current data with previous unrevised 
data prior to 1990.


ILLUSTRATION OF THE USE OF TABLES OF STANDARD ERRORS 
 
	Standard errors are used to: 1) measure the accuracy of the survey estimates, and 2) draw inferences from 
the survey data.  For example, Table B-1 of this report shows that the percent of for-rent units outside MAs 
for 2004 is estimated to be 2.2 percent.  Table B-1 also shows the standard error of this estimate to be 
approximately 0.1 percentage points. Consequently, the 90-percent confidence interval as shown by these data 
is from 2.0 to 2.4; i.e., the interval 2.2 +/- (1.645 x 0.1) percentage points.  Thus, one can say with about 
90-percent confidence that the average percent of for-rent units derived from all possible samples is included 
in this confidence interval.  Statements about differences are made only when the 90-percent confidence interval 
on the estimated difference does not include zero.  The 90-percent confidence intervals are shown in the text 
for selected items.  The standard errors for other figures in this report are given in the tables. In addition 
to sampling error, the figures in this report, both the estimates and their standard errors, are also subject 
to rounding error.   
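
	The interval in this illustration can be reproduced as follows.  This is a sketch only: the function name is hypothetical, and the inputs are the rounded estimate and standard error quoted from Table B-1.

```python
def confidence_interval(estimate, standard_error, z=1.645):
    """Return the (lower, upper) bounds of estimate +/- z * standard_error."""
    half_width = z * standard_error
    return estimate - half_width, estimate + half_width

low, high = confidence_interval(2.2, 0.1)
print(round(low, 1), round(high, 1))   # 2.0 2.4
```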


ILLUSTRATION OF THE COMPUTATION OF THE STANDARD ERROR OF A DIFFERENCE
 
	Table B-1 shows that the rental vacancy rate is 12.5 percent for units in the South and 7.7 percent for 
units in the West.  Thus, the apparent difference between the two rates is 4.8 percentage points.  The 
standard errors of the 12.5 percent and 7.7 percent estimates are both 0.2 percentage points, as shown in 
table B-1.  Therefore, the standard error of the estimated difference of 4.8 percentage points is about 
0.3 percentage points.     
                                                    
                         0.3 = sqrt[(0.2)^2 + (0.2)^2]
                                 
	Consequently, the 90 percent confidence interval for the 4.8 percentage point difference is from 4.3 to 
5.3 percentage points; i.e., the interval 4.8 +/- (1.6 x 0.3) percentage points.  Thus, one can say with 
about 90 percent confidence that this interval includes the actual value that would have been obtained by 
averaging the results from all possible samples of this type.  Because the interval does not include zero, 
we can conclude with 90 percent confidence that the rental vacancy rate in the South is higher than the 
rate in the West. 
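
	The computation in this illustration can be sketched as follows.  The root-sum-of-squares formula is appropriate when the two estimates are approximately uncorrelated, which is assumed here.  The function name is hypothetical, and the inputs are the rounded values quoted from Table B-1.

```python
import math

def se_of_difference(se_x, se_y):
    """Standard error of x - y for (approximately) uncorrelated estimates."""
    return math.sqrt(se_x ** 2 + se_y ** 2)

south, west = 12.5, 7.7          # rental vacancy rates, percent
se_diff = se_of_difference(0.2, 0.2)
se_rounded = round(se_diff, 1)   # the text rounds 0.28 to 0.3

diff = south - west
low = diff - 1.6 * se_rounded
high = diff + 1.6 * se_rounded

print(round(se_diff, 2), round(diff, 1), round(low, 1), round(high, 1))
# 0.28 4.8 4.3 5.3
```

The unrounded standard error is about 0.28.  The interval from 4.3 to 5.3 results from using the rounded value 0.3, as the text does.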


Contact Bob Callis or Linda Cavanaugh at (301)763-3199 or visit ask.census.gov for further information on the Housing Vacancy Survey.

Source: U.S. Census Bureau, Housing and Household Economic Statistics Division
Last Revised: February 17, 2005