Quick Guide to Data De-Identification
De-identification involves the removal of personally identifying information in order to protect personal privacy.
In terms of health information, data is considered de-identified under the Health Insurance Portability and Accountability Act (HIPAA) Privacy Rule when a number of specified data elements are removed.
What is De-Identified Data?
Under the HIPAA Privacy Rule, data is de-identified when either of the following is true:
● All 18 HIPAA-specified direct and indirect identifiers have been removed (Safe Harbor method).
● A qualified expert has determined that the risk of re-identification is very small (Expert Determination method).
Example
Interviewer: When was the first time you heard about the [Organization]?
Interviewee: Ten years ago. I was a student at [University] and one of my professors told us about [Organization] and their work.
What is PHI?
Protected Health Information (PHI) is individually identifiable health information, including demographic information, that is created or received by a covered entity and that relates to any of the following: the past, present, or future physical or mental health of an individual; the provision of healthcare to an individual; or the past, present, or future payment for the provision of healthcare to an individual.
The presence of at least one of 18 HIPAA-designated direct and indirect identifiers in a data set makes the whole data set Protected Health Information.
1. Name
2. Social Security numbers
3. Telephone numbers
4. Addresses and all geographic information smaller than a state
5. All elements of dates (except year), including dates of birth, admission, discharge, and death; and all ages over 89
6. Fax numbers
7. E-mail addresses
8. Medical record numbers
9. Health Plan Beneficiary numbers
10. Account numbers
11. Certificate/license numbers
12. Vehicle identifiers and serial numbers, including license plate numbers
13. Device identifiers and serial numbers
14. Web Universal Resource Locators (URLs)
15. Internet Protocol (IP) addresses
16. Biometric identifiers, including finger and voice prints
17. Full face photographic images and comparable images
18. Any other unique identifying number, characteristic, or code, including any code or other means of record identification that is derived from PHI. These must be removed for the data to be considered de-identified under the Safe Harbor method.
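Some of the identifiers above follow recognisable patterns, so a first-pass scan can be sketched in Python. This is a minimal illustration, not a compliance tool: the names `PATTERNS`, `find_identifiers`, `generalize_date`, and `cap_age` are hypothetical, the regular expressions assume common US formats and will miss many variants, and the "90+" label for ages over 89 is an assumed way of grouping those ages. Names, addresses, and other free-text identifiers still need human review.

```python
import re
from datetime import date

# Illustrative patterns only (assumed formats); real data needs broader
# coverage and manual checking.
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"(?<!\d)(?:\(\d{3}\)\s?|\d{3}[-.\s])\d{3}[-.\s]\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "url": re.compile(r"\bhttps?://[^\s<>]+"),
    "ip": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}


def find_identifiers(text):
    """Return (category, matched_text) pairs for every pattern hit in text."""
    hits = []
    for category, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((category, match.group()))
    return hits


def generalize_date(d):
    """Keep only the year of a date (item 5 allows the year to remain)."""
    return str(d.year)


def cap_age(age):
    """Group all ages over 89 into one category (label '90+' is an assumption)."""
    return "90+" if age > 89 else str(age)


sample = "Call 555-123-4567 or mail jane.doe@example.org, SSN 123-45-6789."
for category, value in find_identifiers(sample):
    print(category, value)
print(generalize_date(date(1950, 3, 14)), cap_age(93))
```

A scan like this only flags candidates for review. Because a single remaining identifier makes the whole data set PHI, an empty result should not be read as proof that the data is de-identified.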
My research involves the use of PHI. What steps do I take?
Entities covered by HIPAA may share a limited data set for research purposes permitted by the Privacy Rule on one condition: all recipients must be bound by a data use agreement with the originator of the data.
If you are a researcher seeking to access, obtain, or use PHI from a HIPAA covered entity for research purposes, then you may require a signed authorization for that use from the patient/participant, or otherwise justify an exception from that requirement.
In either case, you will be required to have an IRB-approved protocol.
Tips for de-identifying participants
- Plan or apply editing at the time of transcription. The exception is longitudinal studies: de-identify once data collection is complete, so that linkages between records can still be made
- Avoid blanking out: use pseudonyms or replacements
- Avoid over-anonymising: removing or aggregating information in text can distort data and make them unusable, unreliable or misleading
- Be consistent within the research team and throughout the project
- Show replacements, e.g. with [brackets]
- Keep a log of all replacements, aggregations or removals made, and keep it separate from the de-identified data files
- A text anonymisation helper tool can help you find disclosive information to remove or pseudonymise in text files
- An MS Word macro can find and highlight numbers and words starting with capital letters in text, which are often disclosive, e.g. as names, companies, birth dates, addresses, educational institutions and countries
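Several of these tips (bracketed replacements, a replacement log, and flagging numbers and capitalised words) can be sketched together in Python. This is an illustrative sketch only and not the helper tool or Word macro mentioned above: the function names `pseudonymise` and `flag_candidates`, and the sample names in the usage lines, are hypothetical.

```python
import re
from collections import Counter


def pseudonymise(text, replacements):
    """Swap each disclosive term for a [bracketed] label and log every change.

    `replacements` maps an original term to its label,
    e.g. {"Riverton University": "University"}.
    Returns the edited text and a log of {original term: times replaced}.
    """
    if not replacements:
        return text, {}
    log = Counter()
    # Longest terms first so a short term cannot match inside a longer one.
    terms = sorted(replacements, key=len, reverse=True)
    pattern = re.compile(
        r"(?<!\w)(?:" + "|".join(re.escape(t) for t in terms) + r")(?!\w)"
    )

    def swap(match):
        original = match.group()
        log[original] += 1
        return f"[{replacements[original]}]"

    return pattern.sub(swap, text), dict(log)


def flag_candidates(text):
    """List numbers and words starting with a capital letter for manual review."""
    return re.findall(r"\b(?:[A-Z]\w*|\d+(?:[.,/-]\d+)*)", text)


transcript = ("I was a student at Riverton University and Dr. Okafor "
              "told us about Helping Hands.")
clean, log = pseudonymise(transcript, {
    "Riverton University": "University",
    "Dr. Okafor": "Professor",
    "Helping Hands": "Organization",
})
print(clean)
print(log)
print(flag_candidates("Maria moved to Leeds in 1998."))
```

Using one replacement map for the whole project keeps labels consistent across the team. The returned log contains the original terms, so it should be stored apart from the de-identified files. `flag_candidates` also flags ordinary sentence-initial words, so its output is a review list rather than a list of confirmed identifiers.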
Regulatory Resources
Health Insurance Portability and Accountability Act of 1996 (HIPAA), Pub. L. No. 104-191, § 264 (1996), codified at 42 U.S.C. § 1320d; Standards for Privacy of Individually Identifiable Health Information, 45 C.F.R. § 160 (2002), 45 C.F.R. § 164 subpts. A, E (2002).