Sensitive Research Data

This guide walks through how researchers can get support for sensitive or restricted-access data at UNCW.

What is Sensitive or Restricted-Access Data?

Sensitive data are data that can be used to identify an individual, species, object, or location that introduces a risk of discrimination, harm, or unwanted attention.

- Australian Research Data Commons

Data can include direct identifiers, such as social security numbers, that identify an individual on their own. Data can also include indirect identifiers, such as gender and occupation, that are not sensitive in themselves but could be combined to identify an individual.
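As a minimal sketch of how indirect identifiers combine, the Python snippet below counts how many rows share each combination of values and reports the combinations held by exactly one row. The records, the column names, and the `unique_combinations` helper are all hypothetical and exist only for illustration.

```python
from collections import Counter

# Hypothetical toy records; no direct identifiers are present.
records = [
    {"gender": "F", "occupation": "nurse", "zip": "28403"},
    {"gender": "F", "occupation": "nurse", "zip": "28403"},
    {"gender": "M", "occupation": "nurse", "zip": "28403"},
    {"gender": "F", "occupation": "pilot", "zip": "28403"},
]


def unique_combinations(rows, columns):
    """Return the value combinations that occur in exactly one row."""
    counts = Counter(tuple(row[c] for c in columns) for row in rows)
    return [combo for combo, n in counts.items() if n == 1]


# Gender and occupation together single out two of the four people,
# even though neither column is sensitive on its own.
singled_out = unique_combinations(records, ["gender", "occupation"])
print(singled_out)
```

A combination that appears only once points to a single participant, so those rows are the ones most at risk of re-identification if the dataset were released as-is.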

You may hear sensitive data called microdata, a term that refers to the level of detail of the dataset. Rather than representing groups at an aggregated level (such as demographic classes by county), each row could hold the data from an individual participant or from a single zip code. Data offered at this rawer level is more sensitive because individuals become easier to identify.
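The difference between microdata and aggregated data can be sketched in a few lines of Python. The example below rolls hypothetical participant-level rows up into counts per group and blanks out any group smaller than a minimum size. The data, the `aggregate` helper, and the threshold of 3 are assumptions for illustration; a data owner/vendor will normally set its own rules.

```python
from collections import Counter

# Hypothetical microdata: one row per participant.
microdata = [
    {"county": "New Hanover", "age_group": "18-34"},
    {"county": "New Hanover", "age_group": "18-34"},
    {"county": "New Hanover", "age_group": "18-34"},
    {"county": "New Hanover", "age_group": "65+"},
    {"county": "Pender", "age_group": "18-34"},
    {"county": "Pender", "age_group": "18-34"},
]

MIN_CELL_SIZE = 3  # assumed threshold, not a UNCW or vendor requirement


def aggregate(rows, group_columns, min_cell_size=MIN_CELL_SIZE):
    """Count rows per group; replace counts below the threshold with None."""
    counts = Counter(tuple(row[c] for c in group_columns) for row in rows)
    return {
        group: (n if n >= min_cell_size else None)
        for group, n in counts.items()
    }


table = aggregate(microdata, ["county", "age_group"])
for group, count in table.items():
    print(group, "suppressed" if count is None else count)
```

Groups with very few members are suppressed because a count of one or two can still point to specific people, even in an aggregated table.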

By law, HIPAA regulations list 18 identifying data variables that must be specially protected when included in HIPAA-qualifying datasets (such as medical records you request from a hospital, not medical details you receive from individuals). These identifying variables can still be a good guide for gauging the sensitivity of datasets that aren't HIPAA regulated.
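One way to use the HIPAA identifier list as a guide is to screen a dataset's column names for likely matches. The Python sketch below does this with simple keyword hints. The keywords, the `IDENTIFIER_HINTS` mapping, and the `flag_columns` helper are assumptions made for illustration: they cover only some of the 18 categories, and a name-based check cannot replace a manual review of the data.

```python
# Keyword hints loosely based on some HIPAA identifier categories.
# The keywords are illustrative assumptions and are not exhaustive.
# "ip_address" is listed before "address" so it is matched first.
IDENTIFIER_HINTS = {
    "ip_address": "IP addresses",
    "name": "names",
    "address": "geographic subdivisions smaller than a state",
    "zip": "geographic subdivisions smaller than a state",
    "birth": "dates related to an individual",
    "phone": "telephone numbers",
    "email": "email addresses",
    "ssn": "social security numbers",
    "mrn": "medical record numbers",
}


def flag_columns(column_names):
    """Map each suspicious column name to the category it may fall under."""
    flagged = {}
    for column in column_names:
        lowered = column.lower()
        for keyword, category in IDENTIFIER_HINTS.items():
            if keyword in lowered:
                flagged[column] = category
                break  # the first matching hint is enough to flag a column
    return flagged


# Hypothetical column names from a survey dataset.
columns = ["participant_name", "ZIP_code", "date_of_birth",
           "blood_pressure", "visit_count"]
flagged = flag_columns(columns)
for column, category in flagged.items():
    print(f"{column}: review as possible '{category}'")
```

Columns that are not flagged may still be identifying in combination, so treat the result as a starting point for review rather than a clearance.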

Sensitive data is most often associated with human subjects data but can certainly occur with other types of data. For example, sensitive ecological data could include identifying details about vulnerable species that could become at risk if found, and business data could include commercial information that reveals company trade secrets.

So what is Restricted-Access data?

Though it's safest to release data without the sensitive details included, having access to sensitive data can be very valuable or even necessary, depending on your research question. A data owner can publicly post a record of the sensitive data's existence, hide the actual sensitive dataset(s), and provide an avenue for researchers to contact them to request the data.

Restricted-Access data is data that can be provided to researchers for their research upon a satisfactory request and if the researcher agrees to certain data security conditions. These requests are approved if the data owner agrees that the sensitive data will indeed be valuable for the researcher's particular research use case, and if the researcher commits to handling the data properly in secure infrastructure settings so it won't accidentally be used for unintended purposes or accessed by unapproved users.

Sometimes data is restricted-access because it is proprietary rather than because it contains sensitive identifiers. This data tends to be consolidated by a data vendor or company that has made the data convenient for research use and does not intend it to be publicly available for free reuse. The data vendor will provide access to a researcher upon request, stipulate security conditions for proper access and handling, and typically charge a fee.

Researchers who collect their own sensitive data can share their own data in a restricted-access way in a data repository as well. Depending on the type of participant involved in the data, reusing restricted-access data rather than collecting your own sensitive data from those participants may be better for the participant. 

How To Get Access to Restricted-Access Data?

Be aware that getting access to restricted data can sometimes take months and might cost money, so give yourself time to procure it as well as to digest how to use it once you've got access to it!

1. Identify the dataset you are interested in

This step is about checking for various dataset conditions that meet your research need, such as time frame of data collection, variables, sample, geographic level, and data collection method.

 

Contact Lynnee Argabright (from the library) to get help finding a dataset and submitting the initial request for access. We can also look into whether there are any public access or alternative datasets that you could use instead.

2. Include details about the dataset in your grant application (if you are requesting sponsored research)

This step would include describing the dataset and how you'd manage it in the Data Management Plan and including it in your direct costs grant budget--for labor and any possible fees to procure the dataset.

Contact Dana Bell (from SPARC) to get help considering how to incorporate it into your grant and for a university compliance review.

Contact Lynnee Argabright (from the library) to get help adding it into your Data Management Plan.

3. Draft the Data Use Agreement and Data Security Plan.

This step will include agreeing to a plan with the data owner/vendor about how you will keep this dataset physically and digitally safe and secure. This will include details on topics such as:

  • data storage location
  • computer workstation location
  • backup of data files
  • computer security update patches
  • firewalls and network security
  • encryption and password protection

Contact Zerek Olson (from Risk Assessment in Campus ITS) to review the Data Use Agreement contract so he can check or negotiate the data owner/vendor's expectations about UNCW's responsibilities for the dataset. He will also help you come up with a Data Security Plan that outlines your computational security setup for handling the dataset, meets your working needs, and complies with the data owner/vendor's requirements. You may be interested in utilizing the Secure Data Room to access and handle your data. See more details about the Secure Data Room at UNCW.

Contact Dana Bell (from SPARC) to sign the Data Use Agreement on your behalf. Restricted-access data typically will require an institutional authority to sign the Data Use Agreement rather than the actual researcher who will be working with the dataset. 

4. Include details about the dataset in your IRB application.

This step is required by the data owner/vendor to see that you have human subjects ethical protections in place. The Data Use Agreement will ask for the approved IRB number. For timing, be mindful of the monthly deadlines by which IRB applications must be submitted for review at IRB Full Board meetings.

Contact the Research Integrity Office for questions about submitting an IRB application or suggested methods for securing human subjects protection for this type of data. See more about the Research Integrity Office.

5. Submit an application to use special UNCW research infrastructure (for example, if you are planning to use the Secure Data Room)

This step will grant you access to infrastructure supported by UNCW that meets your storage, processing, and security needs, as already identified in your Data Security Plan. For example, you may be interested in requesting access to the Secure Data Room located in the library to work on your data. This space provides physical and digital security that meets NIST and ISO information security compliance standards. See more details about the Secure Data Room at UNCW. Alternatively, you may be interested in secure cloud infrastructure. You would need to submit a TAC ticket to get access to any special infrastructure. The process to request access to these spaces takes a few days at most.

Contact Lynnee Argabright (from the library) for questions about the Secure Data Room.

Contact Chris Jones (from Research Computing in Campus ITS) for questions about UNCW-supported cloud infrastructure.

6. Receive the dataset and begin work

Depending on the data owner/vendor, you may be sent the dataset by a specific secure method, such as a mailed CD containing the data or an FTP link for download. However you receive it, make sure that you download the data in a manner that complies with your Data Security Plan so that the data is located somewhere secure. For example, if you agreed not to work with your data at home or on your personal computer, wait until you are on university premises and use a secure computer to download the data from the link sent to you.

When you are finished, the data owner/vendor may require you to submit your final output to them so they can conduct a disclosure risk assessment to verify that the output you plan to use from the data is not re-identifiable.

Contact Lynnee Argabright (from the library) for guidance on how to understand the dataset using the data owner/vendor's documentation, how to utilize software for data analysis, how to deidentify data, and how to recognize what you're allowed to report or graphically visualize from your final results. If you are interested in data ethics training, she can also recommend resources for you.

7. Destroy the dataset

Depending on the stipulations of the data owner/vendor, you will need to discard the dataset after you are finished using it. You may be asked to digitally wipe the hard drive or mail back the CD that the raw original data came on. You may be asked to get a notarized signature stating that you destroyed the data.

Contact Lynnee Argabright (from the library) if you have questions about how to securely get rid of the raw data and what to do with any derivative data (such as paper notes or analysis files).

8. Cite the data source in your publication

The data owner/vendor may have specific language for how the use of the data should be acknowledged in any dissemination you make.

Contact Lynnee Argabright (from the library) for help creating a data citation.

How to Properly Handle Sensitive Data?

In your research design, identify risks and formulate countermeasures. Consider:

  • confidentiality
  • integrity
  • transparency
  • unlinkability
  • availability
  • intervenability

During data collection:

  • Communicate the purpose of the research, and its security and privacy measures, to participants before they sign the informed consent form.
  • Act with data minimization in mind: Only generate and use data that is relevant for the purpose of your research.
  • Use a computer with an encrypted hard drive, and encrypt your sensitive data. Use safe and secure file storage and sharing.
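Data minimization from the list above can be illustrated with a small Python sketch that keeps only the fields named in an analysis plan and drops everything else at the point of collection. The field names and the `minimize` helper are hypothetical.

```python
# Hypothetical set of fields that the research question actually needs.
NEEDED_FIELDS = {"age_group", "county", "survey_score"}


def minimize(record, needed_fields=NEEDED_FIELDS):
    """Return a copy of the record containing only the needed fields."""
    return {key: value for key, value in record.items() if key in needed_fields}


# Hypothetical raw response that includes fields the study does not need.
raw_response = {
    "full_name": "Example Person",
    "email": "person@example.org",
    "age_group": "35-49",
    "county": "New Hanover",
    "survey_score": 17,
}

stored = minimize(raw_response)
print(stored)
```

Fields that are never stored cannot be leaked later, which is why minimization works best when applied before the data is saved rather than afterwards.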

During data processing and analysis:

  • Anonymize/pseudonymize the data and work with the de-identified data. Use differential privacy.
  • Don't leave printouts on the printer or desk, don't use public wifi, and don't work where others can easily watch your screen or hear you talk.
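The two techniques named in the list above, pseudonymization and differential privacy, can be sketched with the Python standard library. This is an illustration under stated assumptions, not a vetted implementation: the `pseudonymize` and `noisy_count` helpers, the 16-character pseudonym length, and the epsilon value are all hypothetical choices, and Python's `random` module is not designed for production-grade privacy guarantees.

```python
import hashlib
import hmac
import random
import secrets


def pseudonymize(identifier, key):
    """Map an identifier to a stable pseudonym using keyed HMAC-SHA256."""
    digest = hmac.new(key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]  # truncated for readability (assumption)


def noisy_count(true_count, epsilon, rng=random):
    """Laplace mechanism for a counting query, which has sensitivity 1."""
    scale = 1.0 / epsilon
    # The difference of two exponential draws follows a Laplace distribution
    # centered on zero with the given scale.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return true_count + noise


# Keep the key separate from the data; anyone holding it can re-link
# pseudonyms to known identifiers.
key = secrets.token_bytes(32)

print(pseudonymize("participant-001", key))
print(noisy_count(true_count=42, epsilon=1.0))
```

The same identifier always maps to the same pseudonym under one key, so records can still be linked across files. Pseudonymized data remains sensitive because the indirect identifiers are untouched. A smaller epsilon adds more noise and gives stronger privacy at the cost of accuracy.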

After analysis:

  • Consider sharing your data if de-identified data sharing is allowed. Curate the data with documentation and format the datasets in an organized and clear manner. Deposit the data in a data repository.
  • Destroy your data if de-identified data sharing is not allowed.