TechTalk

Delimiting records in ASCII text data files: Carriage Return/Line Feed Program

DOS software version 1 [EXE, 43 KB] (Enter parameters on command line) Note: doesn't run in Windows XP

DOS software version 2 [EXE, 39 KB] (Prompts user to enter parameters)

Background

Some ASCII text data files copied to recordable CD-ROM by the Census Bureau's Tape to CD-ROM service do not contain a carriage return/line feed (0D/0A) character sequence to delimit records. Many Windows programs require that these codes be present to determine when a record ends. Otherwise, they will attempt to read the entire data file as one record. This utility makes a copy of a data file with record terminators inserted. Before using this utility, verify the absence of both of these codes (see the Tape File Math section below).

Tape File Math

This section applies to files with a fixed record length.

    Go to a DOS prompt, run the DIR command to get the file's size in bytes, and then perform the following three calculations.

  • DOS byte size / record length (from documentation or record layout) - If this expression divides cleanly (has no remainder), there are no carriage return (CR) or line feed (LF) characters added to the end of each record. The DOS program from this page should be used only in this case.
  • DOS byte size / (record length + 1) - If this expression divides cleanly (has no remainder), there is one control character (a CR or an LF character) added to the end of each record (see footnote 1).
  • DOS byte size / (record length + 2) - If this expression divides cleanly (has no remainder), there are two control characters (usually a CR followed by an LF) added to the end of each record.
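
The three divisions above can also be run as a short script. The following is a minimal Python sketch, not part of the Census Bureau utility; the function name `terminator_bytes` and the sample sizes are hypothetical. It returns every terminator width (0, 1, or 2 bytes) that divides the file size cleanly.

```python
def terminator_bytes(file_size: int, record_length: int) -> list[int]:
    """Return each terminator width (0, 1, or 2 bytes) for which
    file_size divides cleanly by record_length + width."""
    return [
        extra
        for extra in (0, 1, 2)
        if file_size % (record_length + extra) == 0
    ]


# Hypothetical file: 1,000 records of 80 bytes with no terminators.
print(terminator_bytes(80_000, 80))  # [0]

# For a real file, the size reported by DIR can be obtained with
# os.path.getsize(path) from the standard library.
```

More than one width can divide cleanly by coincidence (for example, a file of 6,480 bytes with 80-byte records passes both the first and second test), so it is worth confirming the result by viewing the file.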

Viewing the data file with the following DOS command or a stand-alone utility may also be useful. If line breaks appear periodically in the same place, the records are already delimited; do not use crlf.exe.

Type d:\filename.txt | more

where filename.txt is the data file name and d is the disc drive letter.
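
As an alternative to inspecting the file by eye, the positions of any CR or LF bytes can be listed directly. The following is a minimal Python sketch under the assumption that the file is plain ASCII text; the function name `break_offsets` and the sample data are hypothetical. Evenly spaced offsets suggest that the records are already delimited.

```python
def break_offsets(data: bytes, limit: int = 5) -> list[int]:
    """Return the offsets of the first `limit` CR (0x0D) or LF (0x0A) bytes."""
    offsets = []
    for position, byte in enumerate(data):
        if byte in (0x0D, 0x0A):
            offsets.append(position)
            if len(offsets) == limit:
                break
    return offsets


# Hypothetical sample: 4-byte records, each followed by CR/LF.
sample = b"AAAA\r\nBBBB\r\nCCCC\r\n"
print(break_offsets(sample))  # [4, 5, 10, 11, 16]
```

For a real file, reading only the first few kilobytes in binary mode, for example `open(path, "rb").read(4096)`, is usually enough. An empty result means no CR or LF bytes were found in the portion examined.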

Usage

  • CRLF.EXE input-name output-name input-length [TRIM]
  • Include three parameters on the command line: input-name, output-name, and input-length. The optional TRIM parameter removes trailing blanks from each record.
  • The drive and complete path can be included in the parameters input-name and output-name.
  • The input file (if not compressed) can be read directly from the CD-ROM by the CRLF program.

Examples
  • CRLF input.txt output.txt 80 trim
  • CRLF input.txt output.txt 80
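
To make the effect of these parameters concrete, the following Python sketch mimics the behavior described above. It is not the source of CRLF.EXE: it assumes that input-length is the fixed record length in bytes, and how the real program treats a short final record is not documented here. The function name `add_crlf` is hypothetical.

```python
def add_crlf(data: bytes, record_length: int, trim: bool = False) -> bytes:
    """Split `data` into fixed-length records and end each with CR/LF.

    With trim=True, trailing blanks (spaces) are removed from each record
    before the terminator is added.
    """
    if record_length <= 0:
        raise ValueError("record_length must be positive")
    out = bytearray()
    for start in range(0, len(data), record_length):
        record = data[start:start + record_length]
        if trim:
            record = record.rstrip(b" ")  # strip trailing spaces only
        out += record + b"\r\n"
    return bytes(out)


# Hypothetical data: two 4-byte records padded with blanks.
print(add_crlf(b"AB  CD  ", 4))             # b'AB  \r\nCD  \r\n'
print(add_crlf(b"AB  CD  ", 4, trim=True))  # b'AB\r\nCD\r\n'
```

This version holds the whole file in memory, which is acceptable for a sketch; a large data file would be better processed one record at a time.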

Reading data files into SAS and SPSS

The following instructions are intended to enable a data file with or without record separators to be read as-is (see footnote 2). They are especially intended for SAS or SPSS users who do not have enough disk space to store a copy of the CD-ROM file.

SAS Users

The Statistical Analysis System (SAS) can read files with unterminated records. Use the following technique:

  • data;
  • infile 'input-file-name' lrecl=record-size recfm=F;
  • input ...;
  • run;

SPSS Users

SPSS can read files with unterminated records. Use the following command, subcommand, and option:

  • FILE HANDLE command
  • MODE subcommand
  • IMAGE option

Example:

file handle sipp /name='e:/sipp93w9.per' /mode=image /lrecl=1460.

Please report your experience using this technique to ask.census.gov


Footnotes

1 Another optional solution is available in this case. The data file can be compressed (if it isn't already) with PKZIP or a compatible program and then uncompressed with a program that will convert a lone carriage return (typical of a Macintosh OS) or a lone line feed character (typical of a UNIX OS) to a carriage return and line feed character sequence. PKZIP for Windows version 4.00 (or a more recent version) can be used following the steps outlined below. This PKZIP shareware can be downloaded from pkware.com. After installing PKZIP, do the following:

  • Compress the data file.
  • Select the Extract option on the tool bar
  • Select the Options button at the bottom of the Extract page
  • Under the Miscellaneous section, select "DOS - convert to CR/LF"

Another option is to use a third party DOS utility such as todos.exe. An internet search on the above file name should allow you to access a copy of this software.
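
For readers who would rather script the conversion, the same result can be sketched in a few lines of Python. This is an illustration only, not a description of how PKZIP or todos.exe work internally; the function name `to_crlf` is hypothetical.

```python
import re

# Match an existing CR/LF pair first so that it is left unchanged,
# then a lone CR or a lone LF.
_LINE_END = re.compile(rb"\r\n|\r|\n")


def to_crlf(data: bytes) -> bytes:
    """Convert lone CR or lone LF record terminators to CR/LF."""
    return _LINE_END.sub(b"\r\n", data)


# Hypothetical data mixing all three terminator styles.
print(to_crlf(b"a\rb\nc\r\n"))  # b'a\r\nb\r\nc\r\n'
```

The file should be opened in binary mode (`"rb"` for reading, `"wb"` for writing) so that Python does not translate line endings on its own.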

2 Many Census Bureau data files, such as the Current Population Survey (CPS), contain only a line feed character at the end of each record. In that case, add one to the record length given in the technical documentation whenever the record length is explicitly declared with the lrecl parameter (see Reading data files into SAS and SPSS above). If a carriage return and line feed character sequence is present, add two to the value of the lrecl parameter or leave out the lrecl parameter altogether.

A shift from record to record can occur when a carriage return and/or line feed character is present at the end of each record and the lrecl parameter is too small. Check for one- or two-character shifts in your datasets from record to record and adjust the lrecl parameter accordingly.
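
The shift is easy to reproduce on a small scale. The following Python sketch is a simplified model of fixed-length reading, not a description of SAS or SPSS internals; the function name `read_fixed` and the sample data are hypothetical.

```python
def read_fixed(data: bytes, lrecl: int) -> list[bytes]:
    """Cut `data` into consecutive chunks of `lrecl` bytes."""
    return [data[i:i + lrecl] for i in range(0, len(data), lrecl)]


# Hypothetical data: 4-byte records, each ending in a line feed.
data = b"AAAA\nBBBB\nCCCC\n"

# Record length too small by one: every record after the first is shifted.
print(read_fixed(data, 4))  # [b'AAAA', b'\nBBB', b'B\nCC', b'CC\n']

# Record length increased by one to cover the line feed: records line up.
print(read_fixed(data, 5))  # [b'AAAA\n', b'BBBB\n', b'CCCC\n']
```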


Source: U.S. Census Bureau | Administrative and Customer Services Division | (301) 763-7710 |  Last Revised: February 13, 2013