Disclosure Limitation Review

Guidelines for Maintaining Respondent Privacy and Anonymity

Researchers who qualify for access to restricted data from the Health and Retirement Study (HRS) are contractually obligated to maintain respondent anonymity. Disclosure limitation review is the method by which HRS prevents disclosure of confidential information, reduces the likelihood of respondent re-identification, provides useful data to researchers, and ensures that the results of the review process are acceptable to both the researcher and the provider(s) of the restricted data.

Methods Used to Protect Confidentiality in HRS Data Products

  • All HRS public and restricted files are directly or indirectly based on sample survey methodology
  • Public file variables containing indirect identifiers such as industry, occupation, and geographic information have been collapsed
  • Microdata files derived from SSA administrative data (e.g., Earnings, Benefits, and SSI records) have been subjected to rounding and top-coding in accordance with the governing Memorandum of Understanding
  • Direct respondent identifiers such as name, address, SSN, Medicare/Medicaid identifier, place of birth, etc. have been removed from all public microdata products, and limitations have been placed on access to geographic detail information
  • Respondent-level data items related to sample design, such as segment and line, are not distributed to the public

Protecting Confidentiality During Analysis

  • Researchers should only publish statistical summary values (frequency tabulations, magnitude tabulations, means, variances, regression coefficients, and correlation coefficients) that do not permit the identification of any individual person, family, household, employer, or benefit provider
  • File(s) that result from any merge process which includes restricted data input should be treated as restricted
  • Researchers should not publish the results of any analysis that can potentially identify respondents, either directly or inferentially
  • Researchers are prohibited from publishing results that identify geographic areas below the level of Census Region/Division
  • When producing tabulations for distribution, the following guidelines should be employed:
    • Magnitude Data: Ensure that no cells/strata with n < 3 are produced
    • Frequency Data: Apply a marginal threshold of n >= 5 and cell threshold of n >= 3 to all tabulations
    • Minimum values, maximum values, and standard deviations (SD) may not be reported
  • For CMS data users, when producing tabulations for distribution, the following guidelines should instead be employed:
    • Magnitude & Frequency Data: Ensure that no cells/strata with n < 10 are produced
  • Certain types of cross-category merges (e.g., State-level geographic data with Social Security administrative data) are not allowed under traditional restricted licensing agreements. Geographic information may be used in conjunction with files derived from Social Security administrative data only after (1) executing a MiCDA Data Enclave data use agreement and (2) obtaining written permission from the HRS Project Director
  • Analysis results containing merged area data based on geographic information may be reported if there is no direct identification of geographic areas, or if geographic areas are reported using the same grouping characteristics as public files. When using geocodes to link respondent information to area data, make sure that respondent privacy is not inadvertently compromised by reporting unique area data values (e.g., including census tracts with unusual environmental characteristics in data analysis reports)
  • Researchers may wish to recode or collapse certain high visibility variables such as Cause of Death or Medical Condition before reporting analysis results using such variables
  • All published research resulting from restricted data analysis should be reviewed according to the terms of the Agreement For Use of Restricted Data From the Health and Retirement Study
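
The tabulation thresholds above lend themselves to an automated pre-check before a table is circulated. The following is a minimal Python sketch, not an HRS tool: the function name `check_frequency_table`, the list-of-rows table layout, and the decision to flag empty cells as below threshold are all assumptions made for illustration. For CMS data the guidelines state only a cell/stratum minimum of 10, so the sketch applies 10 to the marginals as well, which is a conservative assumption.

```python
# Thresholds taken from the guidelines above; the names are illustrative.
CELL_MIN = 3       # minimum count per cell
MARGINAL_MIN = 5   # minimum count per row or column total
CMS_MIN = 10       # minimum count per cell/stratum for CMS data users


def check_frequency_table(table, cms=False):
    """Return a list of threshold problems for a table of counts.

    `table` is a list of rows, each a list of non-negative integers.
    An empty list means every cell and marginal meets its threshold.
    Assumption: a count of zero is flagged like any other small cell.
    """
    cell_min = CMS_MIN if cms else CELL_MIN
    marginal_min = CMS_MIN if cms else MARGINAL_MIN
    problems = []

    for i, row in enumerate(table):
        for j, n in enumerate(row):
            if n < cell_min:
                problems.append(f"cell ({i}, {j}) has n = {n} < {cell_min}")
        if sum(row) < marginal_min:
            problems.append(f"row {i} total {sum(row)} < {marginal_min}")

    # zip(*table) transposes the rows so each tuple is one column.
    for j, col in enumerate(zip(*table)):
        if sum(col) < marginal_min:
            problems.append(f"column {j} total {sum(col)} < {marginal_min}")

    return problems


if __name__ == "__main__":
    # Synthetic counts, not HRS data.
    for problem in check_frequency_table([[12, 4], [3, 1]]):
        print(problem)
```

A check like this only catches threshold violations. It does not replace review under the terms of the restricted data agreement.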

Data Export Rules (VDI)

  • VDI users may export only statistical summary information (frequency tabulations, magnitude tabulations, means, variances, regression coefficients, and correlation coefficients) that does not permit the identification of any individual person, family, household, employer, or benefit provider
  • Export of microdata files or of analysis output containing information at the respondent level is not allowed
  • Tabulations may be exported, but are subject to the following rules:
    • Magnitude Data: Ensure that no cells/strata with N < 3 are produced
    • Frequency Data: Apply a marginal threshold of N >= 5 and cell threshold of N >= 3 to all tabulations
  • Users may not export any analysis output that could identify respondents, sampling information, or geographic areas below the level of Census Region/Division, either directly or inferentially. Analysis results containing merged area data based on geographic information may only be exported if there is no direct identification of geographic areas or if geographic areas are reported using the same grouping characteristics as public files
  • High visibility variables such as certain Cause of Death and Medical Condition codes must be recoded or collapsed before being exported
  • All analysis output is subject to disclosure review by HRS staff members who have ultimate authority over whether a given set of analysis results may be exported
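
For magnitude tabulations, the N < 3 rule can be applied when the summary is built, so that undersized strata never reach the output. The sketch below is one hedged way to do that in plain Python. The function name `summarize_strata`, the `(stratum, value)` record layout, and the sample values are hypothetical and synthetic, not drawn from any HRS file.

```python
from collections import defaultdict
from statistics import mean

MIN_STRATUM_N = 3  # from the magnitude-data rule above


def summarize_strata(records, min_n=MIN_STRATUM_N):
    """Group (stratum, value) pairs and report means for large-enough strata.

    Returns a pair:
      publishable: {stratum: (n, mean)} for strata with n >= min_n
      too_small:   sorted list of strata that fall below min_n and would
                   need to be collapsed or dropped before export
    """
    groups = defaultdict(list)
    for label, value in records:
        groups[label].append(value)

    publishable = {}
    too_small = []
    for label, values in groups.items():
        if len(values) >= min_n:
            publishable[label] = (len(values), mean(values))
        else:
            too_small.append(label)
    return publishable, sorted(too_small)


if __name__ == "__main__":
    # Synthetic example values.
    sample = [("A", 10.0), ("A", 20.0), ("A", 30.0), ("B", 5.0), ("B", 7.0)]
    ok, small = summarize_strata(sample)
    print(ok)     # strata that meet the threshold
    print(small)  # strata to collapse or drop
```

Returning the undersized strata separately, rather than silently discarding them, makes it easier to decide whether to collapse categories. HRS reviewers still decide whether the resulting output may be exported.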

Export Procedures

  1. All materials proposed for export from the enclave are subject to disclosure review.
  2. Review of intermediate results should be done in the MiCDA Enclave. Export requests should include only presentation- or publication-ready files.
  3. The review procedure will be completed within 5 business days, if possible.
  4. Present analysis results in presentation-ready form as a .pdf or similar file; each submission should total fewer than 50 pages.
  5. Users should place the file to be reviewed in a folder labeled “Export-MM-DD-YYYY” on their U: drive (or shared folder in the case of multiple user projects).
  6. Users should email the completed request form, HRS Data Disclosure Rules and Checklist, to the HRS reviewers at hrsrdadisclosure@umich.edu.
  7. The reviewers vet the export file(s) using the rules outlined in the HRS Data Disclosure Rules and Checklist document. Once the process is complete, the researcher will receive a response via email.
  8. When the request is approved, the reviewer will copy the output to the researcher’s SFTP folder.
  9. The researcher can then connect to the SFTP folder and download the reviewed documents.
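
The “Export-MM-DD-YYYY” folder label in step 5 can be generated rather than typed, which avoids transposed day and month fields. This is a small illustrative sketch. The helper name `export_folder_name` is hypothetical, and it assumes zero-padded month and day with a four-digit year, as the label pattern suggests.

```python
from datetime import date


def export_folder_name(day):
    """Build the review folder label for the given date.

    Assumes zero-padded month and day and a four-digit year,
    following the "Export-MM-DD-YYYY" pattern in step 5.
    """
    return f"Export-{day:%m-%d-%Y}"


if __name__ == "__main__":
    print(export_folder_name(date(2024, 3, 7)))  # Export-03-07-2024
    print(export_folder_name(date.today()))
```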

Import Procedures

  1. You may request import of statistical code and non-HRS public datasets required for your analysis in the enclave.
  2. You may request import of restricted datasets required for your analysis with the permission of your data supplier. You must provide the DUA indicating approval of the merge with HRS data and of storage in the SDE.
  3. You may request import of other supporting documents. You must include a description of each file and a justification of need.
  4. All materials proposed for import to the enclave are subject to disclosure review.
  5. The review procedure will be completed within 5 business days, if possible.
  6. Users should place the file(s) for review in a folder labeled “Import-MM-DD-YYYY” in their SFTP space.
  7. Users should email the completed request form, HRS DISCLOSURE REVIEW CHECKLIST-IMPORT, to the HRS reviewers at hrsrdadisclosure@umich.edu.
  8. The reviewers vet the import file(s) and import the requested files to the user’s destination folder.
  9. Once the process is complete, the researcher will receive a response via email.
  10. File(s) will be available at next login.
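
Before submitting a request, it can help to confirm that a review folder label is well formed. The sketch below validates a label and extracts its date. The function name `parse_review_folder` is hypothetical, and the pattern assumes a two-digit month, a two-digit day, and a four-digit year for both the Export and Import labels.

```python
import re
from datetime import datetime

# Assumed label shape: "Export-MM-DD-YYYY" or "Import-MM-DD-YYYY".
LABEL_PATTERN = re.compile(r"^(Export|Import)-(\d{2}-\d{2}-\d{4})$")


def parse_review_folder(name):
    """Return (kind, date) for a well-formed folder label, else None."""
    match = LABEL_PATTERN.match(name)
    if match is None:
        return None
    try:
        day = datetime.strptime(match.group(2), "%m-%d-%Y").date()
    except ValueError:
        # Digits matched the pattern but are not a real calendar date.
        return None
    return match.group(1), day


if __name__ == "__main__":
    print(parse_review_folder("Import-03-07-2024"))
    print(parse_review_folder("Import-13-07-2024"))  # invalid month
```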