Citations for Restricted-Use Data

As part of its Open Census initiative, the Census Bureau provides guidance on how to cite public data, tools, technical documents, and research. The Federal Statistical Research Data Centers (FSRDC) program has access to a wide range of restricted-use data, that is, data provisioned only for use in specific, approved projects. Although restricted-use data are not publicly available, accurate citations help others identify which data were used, find more information about the underlying data, and learn how to request access. This webpage describes the suggested format for citing restricted-use Census Bureau datasets and provides a list of suggested citations for the more than 300 Census Bureau datasets listed on the Standard Application Process (SAP) website, a single portal to discover and apply for access to confidential data assets in the federal statistical system (FSS).

When using restricted-use Census Bureau data for research conducted in an FSRDC, please refer to the downloadable list of recommended citations for specific restricted-use datasets. Please note that the SAP website has the definitive list of restricted-use Census Bureau datasets available to potential FSRDC researchers. Datasets are added to and removed from the SAP portal as their availability status changes, so the citations spreadsheet may be inconsistent with the SAP website between spreadsheet updates. If you cannot find a dataset in the recommended citations list, please consult your FSRDC Administrator.

Researchers can request a list of recommended citations from their FSRDC Administrator, which will include key details about each dataset made available to their project. Researchers are strongly encouraged to cite all underlying datasets used in research, including restricted-use datasets.

As a reminder, please be sure to add the appropriate disclosure avoidance review disclaimer to all cleared research findings in circulated written materials (slides, posters, tables, papers, articles, dissertations, etc.). Please consult the disclosure release email you received for the disclosure avoidance review disclaimer. For questions regarding preparing and delivering these items, contact your FSRDC Administrator.

The Census Bureau uses The Gregg Reference Manual (Eleventh Edition) by William A. Sabin (New York: McGraw-Hill, 2011) for citation formats. The restricted-use dataset citation examples shown below and in the suggested dataset citation list follow this style. However, the focus of this guidance is not on any specific citation style but on the components a data citation for restricted-use data should include. Researchers submitting papers to journals may need to adapt the suggested citations to follow journal style guides.

Data citations are only part of the data documentation process. Researchers frequently cite other articles or working papers that describe data creation, data linkages, and/or data preparation. Citing such papers does not replace the need to also cite the underlying datasets.

The following list includes the basic recommended citation formats for Census Bureau restricted-use datasets, followed by definitions, explanations, and examples.

Dataset Source, Full Dataset Name (SAP [Number]) [Restricted-use data], Vintage <SAP URL>.

Dataset Source, Full Dataset Name (SAP [Number]) [Restricted-use data], Vintage (Version Number) <SAP URL>.

Dataset Source, Full Dataset Name [Commercial Data] (SAP [Number]) [Restricted-use data], Vintage <SAP URL>.
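The three formats above differ only in which optional components are present, so they can be assembled from the same parts. The following Python sketch is illustrative only: the function name `format_citation` and its parameter names are hypothetical and not part of any Census Bureau tool, and it assumes the caller supplies the vintage as an already formatted string.

```python
from typing import Optional


def format_citation(
    source: str,
    name: str,
    vintage: str,
    sap_number: Optional[int] = None,
    url: Optional[str] = None,
    version: Optional[str] = None,
    commercial: bool = False,
) -> str:
    """Assemble a restricted-use data citation from its components.

    Hypothetical helper; the component order follows the formats listed above.
    """
    parts = [f"{source}, {name}"]
    if commercial:
        # Third-party commercial data are flagged right after the dataset name.
        parts.append("[Commercial Data]")
    if sap_number is not None:
        parts.append(f"(SAP {sap_number})")
    parts.append("[Restricted-use data],")
    parts.append(vintage)
    if version:
        # Only some datasets have a version number.
        parts.append(f"({version})")
    if url:
        parts.append(f"<{url}>")
    return " ".join(parts) + "."


print(
    format_citation(
        "U.S. Census Bureau",
        "American Community Survey",
        "2023",
        sap_number=691,
        url="https://www.researchdatagov.org/product/691",
    )
)
```

Leaving `sap_number` and `url` unset drops those components, which may suit datasets that are not listed on the SAP website.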

 


  • Dataset Source: Generally, this is the agency, state, or commercial entity that produced the dataset. However, if restricted-use data from a Census Bureau statistical product (e.g., a survey) also has one or more sponsors, include all agencies, listing the sponsor(s) first and then the Census Bureau. For example, for the American Housing Survey, use “U.S. Department of Housing and Urban Development and U.S. Census Bureau” as the Dataset Source. For restricted-use data that have been processed by the Census Bureau (e.g., by attaching linking variables such as MAFID or PIK), list the source first and then add the U.S. Census Bureau.
  • If the dataset was provided by a third-party commercial enterprise, then add “[Commercial Data]” after the Full Dataset Name.
  • SAP Number: This is the number assigned to the dataset on the SAP website. Some restricted-use datasets do not have an SAP number, for example, researcher-provided data subject to restrictions spelled out in an agreement with an external entity. In that case, the researcher should follow that entity’s citation policy, if available. Otherwise, researchers should cite those data following the same general principles laid out on this page and omit the SAP number and SAP URL.
  • Vintage: This is a list of the years of the dataset used by researchers.
    • The string “YYYY” in the recommended citations is a placeholder where researchers should insert the vintage year(s) they used.
    • If multiple vintage years are used, express them as a range only if the years are consecutive. For example, use “2017-2023” for the consecutive vintages 2017, 2018, 2019, 2020, 2021, 2022, and 2023.
    • If multiple vintage years are used that are not consecutive, the individual years should be listed out. Examples are “2018, 2019, 2021, 2022” or “2000 and 2010”.
    • If the dataset has different versions, add the version number. For example, until recently, the Longitudinal Employer-Household Dynamics (LEHD) datasets had version years (i.e., 2008, 2011, 2014), which are not listed on the SAP website but should be added to the LEHD citations if they were used by researchers. (Note that not all restricted-use datasets have a version number.)
  • Use a URL whenever possible. The ideal URL points to a webpage that is persistent and includes a description of the data, links to technical documentation, and information on how to request access to the restricted-use data. Often this will be the dataset’s landing page on the SAP website. If no appropriate URL can be found, the URL can be omitted.
  • For data derived from multiple sources, only a citation to the unified file is needed.
  • Use the dataset name as referenced in the body of the research paper. If unsure, use the dataset name listed on the SAP website, if available. 
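The vintage rules above (a range for consecutive years, an explicit list otherwise) can be sketched in a few lines of Python. This is a minimal illustration, not an official tool: the function name `format_vintage` is hypothetical, and the sketch always uses the comma-separated form for non-consecutive years, although the guidance also accepts forms such as “2000 and 2010”.

```python
from typing import Iterable


def format_vintage(years: Iterable[int]) -> str:
    """Format vintage years: a range if consecutive, otherwise a list.

    Hypothetical helper illustrating the vintage rules described above.
    """
    ys = sorted(set(years))  # drop duplicates and order chronologically
    if not ys:
        raise ValueError("at least one vintage year is required")
    if len(ys) == 1:
        return str(ys[0])
    # Years are consecutive when every neighbouring pair differs by exactly 1.
    consecutive = all(b - a == 1 for a, b in zip(ys, ys[1:]))
    if consecutive:
        return f"{ys[0]}-{ys[-1]}"
    return ", ".join(str(y) for y in ys)


print(format_vintage(range(2017, 2024)))
print(format_vintage([2018, 2019, 2021, 2022]))
```

Under these assumptions, the first call yields a single range and the second lists each year individually because 2020 is missing.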

Examples:

U.S. Census Bureau, American Community Survey (SAP 691) [Restricted Data], 2023 <https://www.researchdatagov.org/product/691>.  

National Center for Science and Engineering Statistics and U.S. Census Bureau, Research and Development Surveys (SAP 793) [Restricted Data], 2023 <https://www.researchdatagov.org/product/793>.

U.S. Census Bureau, Foreign Trade Data – Import (SAP 693) [Restricted Data], 2019, 2021, 2022, and 2023 <https://www.researchdatagov.org/product/693>.

U.S. Census Bureau, Longitudinal Employer-Household Dynamics (LEHD) [Restricted Data], (Version: 2014 Snapshot) <https://www.researchdatagov.org/product/667>.

Department of Housing and Urban Development and U.S. Census Bureau, HUD PIC and TRACS (SAP 782) [Restricted Data], 2015-2019 <https://www.researchdatagov.org/product/782>.

Cotality, CoreLogic Buildings [Commercial Data] (SAP 815) [Restricted Data], YYYY <https://www.researchdatagov.org/product/815>.


Insert the following language in the research document (either in the main body of the document or as a footnote/endnote), “The Census Bureau’s Data-Use Agreement with the data provider prohibits revealing of the name of the data provider and/or the product name.”


A data availability statement or data access statement is a brief statement in a published article that informs readers where the data used in the study can be found and how they can be accessed. If data availability statements are required, researchers should check journal-specific guidelines and also consult their FSRDC Administrator. External guidance on creating data availability statements for restricted-use data can be found at:


Suggested Dataset Citations

The downloadable list provides suggested citations for the more than 300 Census Bureau datasets listed on the Standard Application Process (SAP) website, a single portal to discover and apply for access to confidential data assets in the federal statistical system (FSS).

 

Page Last Revised - September 2, 2025