Census 2000 Public Use Microdata Sample (PUMS) Disc Product Support

ASCII text data file processing support page

  1. Overview
  2. Disc menu
  3. Creating summary tables
  4. Change estimate weight
  5. Recode variable
  6. Software issues and updates
  7. Using DemoShield and Beyond 20/20 Publication Browser to Access 2000 PUMS [PDF, 2.6 MB]
  8. File documentation [PDF, 4.32 MB] (also contained on disc)

1. OVERVIEW

Data file format

The data files on disc are in Ivation Beyond 20/20 extract format. A Beyond 20/20 extract is a set of files created in the Beyond 20/20 Builder (not on disc) optimized for rapid table creation with the Beyond 20/20 Browser (on disc). This file set includes a dimension (variable) definition file (*.ivd) for each dimension plus one extract file (*.ivx).

Software features

Users can create summary data tables and then export them to various output file formats including Ivation Table (*.ivt), Excel (*.xls), Comma Separated Variable (*.csv), Text (*.txt), HTML (*.htm*), Lotus Worksheet (*.wks) and (*.wk1), dBase (*.dbf) and Aremos TSD (*.tsd).

Users can also output selected fields from the original microdata (detail data in this case) to any of the following output file formats by clicking on the "File", "Save Extract As" menu sequence: Beyond 20/20 extract (*.ivx), dBase (*.dbf), SAS (*.sas), SPSS (*.sps) and Text (*.txt). Weight field(s) must be selected along with data items to produce estimates. There are no record selection options for this feature, so all records will be exported to the output data file.

A code list for any variable can be shown by dragging the variable with the mouse from the variable list on the right to the row, column or upper left part of the white table area and then clicking on the View, Dimension menu sequence.

Microdata files (Beyond 20/20 extracts) on disc

Geography

The 1% CD-ROM disc contains two extracts for the United States (fifty states plus the District of Columbia combined into one data set) and two more extracts for Puerto Rico based on the PUMS 1% sample. The 1% and 5% DVD-ROM disc also includes two extracts for each state and Puerto Rico based on the PUMS 5% sample.

Extract Types

There are two extract types. The first type, the housing unit extract, was created from the original data files' housing unit records (both occupied and vacant) only. The housing unit weight is set as the default weight for this extract.

The second type, the person plus housing unit extract, was created from the original data files' person records, with the associated housing unit record attached to each person record. Only housing unit records for occupied (non-vacant) housing units are included in this extract. Note that because the attached housing unit information repeats for each person in an occupied housing unit, the housing unit weight was zeroed out for all persons except for the householder (defined as p_relate="01"). The person weight is set as the default weight for this extract.
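The effect of zeroing the housing unit weight can be illustrated with a minimal Python sketch. The records, the unit_id field and all weight values below are invented for illustration; only the names p_relate, h_hweight and p_pweight come from the extract.

```python
# Hypothetical person plus housing unit records for two occupied housing units.
# unit_id is an illustrative field name and all weight values are made up.
records = [
    {"unit_id": 1, "p_relate": "01", "p_pweight": 20, "h_hweight": 18},  # householder
    {"unit_id": 1, "p_relate": "02", "p_pweight": 22, "h_hweight": 0},   # weight zeroed
    {"unit_id": 1, "p_relate": "03", "p_pweight": 19, "h_hweight": 0},   # weight zeroed
    {"unit_id": 2, "p_relate": "01", "p_pweight": 25, "h_hweight": 24},  # householder
]

# Summing the person weight estimates the number of people.
person_estimate = sum(r["p_pweight"] for r in records)

# Summing the housing unit weight counts each occupied unit once,
# because only the householder record carries a non-zero value.
household_estimate = sum(r["h_hweight"] for r in records)

print(person_estimate)     # 86
print(household_estimate)  # 42
```

If the housing unit weight were repeated on every person record instead, the three-person unit would contribute its weight three times and the household estimate would be inflated.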

More information

A prefix of h_ or p_ has been added to variable names from the housing unit or person records respectively, so that housing unit variables and person variables are listed separately in the person plus housing unit extract.

Non-weighted sample counts can be obtained by disabling the set weight or selecting this option from a list of available mathematical functions.

About Beyond 20/20

Minimum software and hardware requirements

  • Windows 95, Windows 98, or Windows NT
  • Pentium-class processor recommended
  • At least 16 MB of available RAM after all other system resources are loaded

Beyond 20/20, the Beyond 20/20 slogan, the Beyond 20/20 Builder, the Beyond 20/20 Browser, the Beyond 20/20 Distributor, ChartBrowse and MapBrowse are all trademarks of Ivation Datasystems Inc. Windows is a trademark of Microsoft Corp.


2. DISC MENU

The menu below will appear when the disc is inserted in the disc drive if the drive's autoplay function is on. If this menu doesn't appear, it can be loaded by running launch.exe from the root directory of the disc. Click on the Launch button to start the Beyond 20/20 program and load data. You will then be prompted to select 'United States' or 'Puerto Rico', and then select one of two different extracts.

[Figure: Disc main menu]

[Figure: Select an extract]


3. CREATING SUMMARY TABLES

Example 1 - Create table Sex by State for Universe: Total population from person plus housing unit extract

Summary tables are created in two steps. First, a variable is selected by dragging it from the right panel and dropping it on one of the table areas: row, column, data item (lower right), or the upper left corner (to view one category at a time). Next, the table is populated (filled with values) by clicking on the traffic light (Load data) icon on the toolbar.
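For readers who export the microdata and want to reproduce a table such as Sex by State outside the browser, the calculation can be sketched in Python as a sum of weights within each cell. This is an illustration under stated assumptions, not a description of how Beyond 20/20 works internally: the field names p_sex and p_state are guesses, and the records and weights are invented.

```python
from collections import defaultdict

# Hypothetical person records. p_sex and p_state are illustrative field
# names, and the weight values are invented.
persons = [
    {"p_state": "Ohio", "p_sex": "Male", "p_pweight": 20},
    {"p_state": "Ohio", "p_sex": "Female", "p_pweight": 25},
    {"p_state": "Ohio", "p_sex": "Female", "p_pweight": 15},
    {"p_state": "Utah", "p_sex": "Male", "p_pweight": 30},
]

def weighted_crosstab(rows, row_var, col_var, weight):
    """Sum the weight field within each (row, column) cell."""
    table = defaultdict(int)
    for r in rows:
        table[(r[row_var], r[col_var])] += r[weight]
    return dict(table)

sex_by_state = weighted_crosstab(persons, "p_state", "p_sex", "p_pweight")
for cell, estimate in sorted(sex_by_state.items()):
    print(cell, estimate)
```

Cells with no records simply do not appear in the result, so a complete table would need the full category lists for both variables.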

[Figure: Initial view of Beyond 20/20 person plus housing unit extract]

[Figure: Create summary table example]

[Figure: Summary table results]

Want to create a summary table with more than two variables?

Summary tables with more than two variables can be created by adding additional variables to the row or column areas. This is done by dragging a new variable over a vertical line left or right of an existing row variable or by dragging a new variable over a horizontal line above or below an existing column variable. Note that the cursor will change shape. The mouse button should be released at this time to drop the new variable.

A variable can also be dragged to the upper left corner of the table area. Only one category is displayed at a time when this is done. A different category can be displayed by highlighting the variable name and then clicking on the left and right arrows on the toolbar.

Example 2 - Create table - Average Income by State where the Unit is dollars per year.

In this example, a variable of numeric type is dragged to a different area of the screen in order to perform a mathematical function on it (see the steps below). A numeric variable is indicated graphically by a small green pound sign (#). A prompt then appears to select a mathematical function.
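As a rough guide to what an average by state represents when weights are in use, the following Python sketch computes a weighted mean per state. It assumes the average is the weighted sum of values divided by the sum of weights; the exact calculation performed by the browser is not documented here. The income and p_state field names and all values are hypothetical.

```python
from collections import defaultdict

# Hypothetical person records: income is in dollars per year and the
# field names income and p_state are illustrative only.
persons = [
    {"p_state": "Ohio", "income": 30000, "p_pweight": 20},
    {"p_state": "Ohio", "income": 50000, "p_pweight": 30},
    {"p_state": "Utah", "income": 40000, "p_pweight": 10},
]

def weighted_mean_by(rows, group_var, value_var, weight):
    """Return sum(weight * value) / sum(weight) for each group."""
    totals = defaultdict(float)   # sum of weight * value per group
    weights = defaultdict(float)  # sum of weights per group
    for r in rows:
        totals[r[group_var]] += r[weight] * r[value_var]
        weights[r[group_var]] += r[weight]
    # Groups whose weights sum to zero are skipped to avoid dividing by zero.
    return {g: totals[g] / weights[g] for g in totals if weights[g] > 0}

average_income = weighted_mean_by(persons, "p_state", "income", "p_pweight")
print(average_income["Ohio"])  # 42000.0
print(average_income["Utah"])  # 40000.0
```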

[Figure: Step 1 - Creating table Average Income by State]

[Figure: Step 2 - Select mathematical function]

[Figure: Step 3]

[Figure: Step 4]


4. CHANGE ESTIMATE WEIGHT

Click on the Data - Set Weight Field menu sequence to change the weight currently being used for estimates. The only weight fields are h_hweight and p_pweight; however, all numeric fields will appear in the drop-down list. Make sure Use Weight Field is not checked if you want unweighted sample counts.
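The difference between the two weight fields and an unweighted count can be sketched in Python as follows. The estimate function is a hypothetical helper written for this illustration, and the weight values are invented; passing no weight field plays the role of leaving Use Weight Field unchecked.

```python
def estimate(rows, weight_field=None):
    """Return the sum of the chosen weight field, or the plain record
    count when no weight field is given (unweighted sample count)."""
    if weight_field is None:
        return len(rows)
    return sum(r[weight_field] for r in rows)

# Hypothetical records from a person plus housing unit extract.
rows = [
    {"p_pweight": 20, "h_hweight": 18},  # householder
    {"p_pweight": 22, "h_hweight": 0},   # non-householder, weight zeroed
    {"p_pweight": 25, "h_hweight": 24},  # householder
]

print(estimate(rows, "p_pweight"))  # 67
print(estimate(rows, "h_hweight"))  # 42
print(estimate(rows))               # 3
```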

[Figure: Set the weight field]

[Figure: Select new weight for creating estimates]


5. RECODE VARIABLE

The list of categories for a variable can be changed by following the steps outlined below.

Right click on an existing variable and then select Define Recode as shown below. Next, click Yes to make a copy of the entire extract on another drive*.

The next step is to create new categories that each include one or more pre-existing categories. Please note that the OK button is initially disabled. It is enabled only after every category from the original variable has been included in one of the new categories created for the recoded variable. You may want to create a category named "Other" for the recoded variable that includes all of the remaining categories from the original variable that you are not interested in.

Click the "Add" button after highlighting one or more pre-existing categories and entering a name for the new category. If you click on the "Use As Is" button, no text label will be applied to the new category. Click on the OK button when all of the new categories have been created. The new recoded variable can now be used in place of the original variable.

In the example shown below, a new variable called p_pob_1 has been created with two categories (native-born and foreign-born) instead of the 325 categories in the original variable p_pob1.
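The recode rule, where every original category must be assigned to a new category before the recode is accepted, can be sketched in Python. The build_recode helper and the four place-of-birth categories are hypothetical stand-ins for the 325 real categories; the completeness check mirrors the condition under which the OK button is enabled.

```python
def build_recode(original_categories, groups):
    """Map each original category to a new label.

    Raises ValueError if any original category is left unassigned,
    mirroring the rule that all categories must be covered.
    """
    mapping = {}
    for new_label, members in groups.items():
        for category in members:
            mapping[category] = new_label
    missing = [c for c in original_categories if c not in mapping]
    if missing:
        raise ValueError(f"unassigned categories: {missing}")
    return mapping

# Hypothetical, heavily shortened category list for place of birth.
place_of_birth = ["Ohio", "Utah", "Mexico", "Canada"]

recode = build_recode(
    place_of_birth,
    {
        "Native-born": ["Ohio", "Utah"],
        "Foreign-born": ["Mexico", "Canada"],
    },
)
print(recode["Mexico"])  # Foreign-born
```

Leaving a category out of every group raises an error, which is the situation an "Other" category is meant to avoid.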

* A copy of the extract or part of it must be made because the disc is a read-only medium. It is possible to copy a part of the extract by clicking on File, Save Extract As from the main menu and then selecting variables of interest plus one or both of the weight fields before getting started with the recode variable process.

[Figure: Define recode]

[Figure: Add recode one]

[Figure: Add recode two]

[Figure: Use recoded variable]


6. SOFTWARE ISSUES AND UPDATES

Income total variable category (note added on 10/03/2005)

The variable p_inctot includes a category incorrectly labeled $50,000 to $64,999. This category actually contains the number of people whose total income is $55,000 to $64,999. This can be seen by clicking on Dimension, then "Change Label" from the Beyond 20/20 browser menu twice.

Save extract in SPSS format notes

This option can be extremely slow if you are exporting a large number of variables (dimensions) at one time. No record selection criteria are available, so the number of records exported depends entirely upon the extract selected. You should be able to successfully create SPSS output with about one or two dozen fields from any one of the PUMS 5% single state extracts or the PUMS 1% United States extracts relatively quickly. The actual runtime should be much less than an hour for most extracts if your PC is a relatively recent model. You may want to first try creating a small extract that includes just the variables you need for a specific cross tabulation plus any necessary weight field(s).

"Not enough memory to complete operation" message

The number of data items in a table is the product of the total number of categories for each dimension (variable). Attempting to create a table with a large number of data items can result in a "Not enough memory to complete operation" message. One option is to try creating the table on another PC with more RAM. A second option is to recode one or more variables.
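A quick way to judge table size before loading data is to multiply the category counts, as in the Python sketch below. The 325-category figure is the place of birth variable mentioned in the recode section; the other counts are hypothetical.

```python
import math

def table_cells(category_counts):
    """Number of data items in a table: the product of the category
    counts of all dimensions placed in it."""
    return math.prod(category_counts)

# 325 place of birth categories crossed with a hypothetical
# 52-category and 2-category variable.
original = table_cells([325, 52, 2])

# The same table after recoding place of birth down to 2 categories.
recoded = table_cells([2, 52, 2])

print(original)  # 33800
print(recoded)   # 208
```

Because the counts multiply, recoding a single large variable can shrink the table by orders of magnitude.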


Source: U.S. Census Bureau | Administrative and Customer Services Division | (301) 763-7710 |  Last Revised: May 10, 2013