Download DOS software version 1 (Enter parameters on command line) Note: doesn't run in Windows XP
Download DOS software version 2 (Prompts user to enter parameters)
Background
Some ASCII text data files copied to recordable CD-ROM by the Census Bureau's Tape to CD-ROM service do not contain a carriage return/line feed (0D/0A) character sequence to delimit records. Many Windows programs require that these codes be present to determine when a record ends. Otherwise, they will attempt to read the entire data file as one record. This utility makes a copy of a data file with record terminators inserted. Before using this utility, verify the absence of both of these codes (see the Tape File Math section below for information on one method).
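The copy-with-terminators operation described above can be illustrated with a short Python sketch. This is not the source code of crlf.exe; the function name `insert_crlf` and its parameters are hypothetical, and the sketch assumes a fixed record length taken from the technical documentation.

```python
import io


def insert_crlf(src, dst, record_length):
    """Copy src to dst, writing CR/LF (0D/0A) after every record.

    src and dst are binary file-like objects. record_length is the fixed
    record length in bytes (an assumption: it comes from the file's
    technical documentation). Returns the number of records written.
    A short final record, if any, also receives a terminator.
    """
    count = 0
    while True:
        record = src.read(record_length)
        if not record:
            break
        dst.write(record + b"\r\n")
        count += 1
    return count


if __name__ == "__main__":
    # In-memory demonstration: three 4-byte records with no terminators.
    source = io.BytesIO(b"AAAABBBBCCCC")
    target = io.BytesIO()
    print(insert_crlf(source, target, 4))  # 3
    print(target.getvalue())               # b'AAAA\r\nBBBB\r\nCCCC\r\n'
```

With real files, the same function could be called on objects opened with `open(path, "rb")` and `open(path, "wb")`.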
Tape File Math
This section applies to files with a fixed record length.
Go to a DOS prompt and run the DIR command and then perform the following three calculations.
Viewing the data file with the following DOS command or a stand-alone utility may also be useful. If line breaks appear periodically in the same place, the records are already terminated; do not use crlf.exe.
Type d:\filename.txt | more
where filename.txt is the data file name and d is the disc drive letter.
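As a rough cross-check on the file-size arithmetic for fixed-length records, the following Python sketch tests whether the file size reported by DIR is an exact multiple of the record length alone, or of the record length plus one or two terminator bytes. It is an illustration only, not the page's own three calculations; the function names are hypothetical, and the result can be ambiguous when more than one candidate divides the size evenly.

```python
def terminator_candidates(file_size, record_length):
    """Return the terminator lengths (0, 1, or 2 bytes per record) that
    are consistent with the file size.

    0 means no record terminators, 1 means a single carriage return or
    line feed, 2 means a carriage return/line feed pair. An empty list
    means the size fits none of these layouts.
    """
    return [
        extra
        for extra in (0, 1, 2)
        if file_size % (record_length + extra) == 0
    ]


def count_terminator_bytes(data):
    """Count carriage return (0D) and line feed (0A) bytes in a sample."""
    return data.count(b"\r"), data.count(b"\n")


if __name__ == "__main__":
    # Seven 100-byte records with no terminators occupy 700 bytes.
    print(terminator_candidates(700, 100))         # [0]
    # The same records with a line feed each occupy 707 bytes.
    print(terminator_candidates(707, 100))         # [1]
    print(count_terminator_bytes(b"AAAA\nBBBB\n"))  # (0, 2)
```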
Usage
Reading data files into SAS and SPSS
The following instructions are intended to enable a data file with or without record separators to be read as-is (see footnote 2). They are especially intended for SAS or SPSS users who do not have enough disk space to store a copy of the CD-ROM file.
SAS Users
The Statistical Analysis System (SAS) can read files with unterminated records. Use the following technique:
SPSS Users
SPSS can read files with unterminated records. Use the following command
Example:
Please report your experience using this technique to ask.census.gov
Footnotes
1. Another solution is available in this case. The data file can be compressed (if it isn't already) with PKZIP or a compatible program and then uncompressed with a program that will convert a carriage return (typical of a Macintosh OS) or a line feed character (typical of a UNIX OS) to a carriage return and line feed character sequence. PKZIP for Windows version 4.00 (or a more recent version) can be used following the steps outlined below. This PKZIP shareware can be downloaded from pkware.com. After installing PKZIP, do the following:
Another option is to use a third party DOS utility such as todos.exe. An internet search on the above file name should allow you to access a copy of this software.
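The conversion described in this footnote, turning a lone carriage return or a lone line feed into a carriage return and line feed pair, can also be sketched in Python. This is an illustration of the idea, not the behavior of PKZIP or todos.exe; the function name `to_crlf` is hypothetical, and the sketch assumes the data fits in memory.

```python
import re


def to_crlf(data):
    """Convert every line ending in a bytes object to CR/LF.

    Existing CR/LF pairs are matched first so they are left as a single
    terminator rather than being doubled.
    """
    return re.sub(rb"\r\n|\r|\n", b"\r\n", data)


if __name__ == "__main__":
    mixed = b"mac\runix\ndos\r\nend"
    print(to_crlf(mixed))  # b'mac\r\nunix\r\ndos\r\nend'
```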
2. Many Census Bureau data files, such as the Current Population Survey (CPS), contain a line feed character only at the end of each record. In that case, add one to the record length given in the technical documentation whenever the record length is explicitly declared with the lrecl parameter (see Reading data files into SAS and SPSS above). Add two to the value of the lrecl parameter if a carriage return and line feed character sequence is present, or leave out the lrecl parameter altogether.
A shift from record to record can occur when a carriage return and/or line feed character is present at the end of each record and the lrecl parameter is too small. Check for one- or two-character shifts in your datasets from record to record and adjust the lrecl parameter accordingly.
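The record-to-record shift can be reproduced with a small Python sketch. It is not SAS or SPSS syntax; the function `split_records` and its parameters are hypothetical. The step between record starts plays the role of the lrecl value: the documented record length plus the number of terminator bytes.

```python
def split_records(data, record_length, terminator_length=0):
    """Split fixed-length records out of a bytes object.

    The step between record starts is record_length plus
    terminator_length (0, 1, or 2), mirroring an lrecl value adjusted
    for terminators. Each returned record excludes the terminator.
    """
    stride = record_length + terminator_length
    return [
        data[start:start + record_length]
        for start in range(0, len(data), stride)
    ]


if __name__ == "__main__":
    data = b"AAAA\nBBBB\n"  # 4-byte records, each ending in a line feed

    # Step too small (terminator ignored): records drift by one byte each.
    print(split_records(data, 4, 0))  # [b'AAAA', b'\nBBB', b'B\n']

    # Step adjusted by one for the line feed: records line up.
    print(split_records(data, 4, 1))  # [b'AAAA', b'BBBB']
```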