The prevalent practice of using patients' medical records for chart review research, without consent from patients, does not engender much criticism when compared to biospecimen research, where worries surrounding genomic privacy inspired proposed revisions to the Federal Policy for Protection of Human Subjects, or the Common Rule, to require consent for such research.1 With the proliferation of electronic medical records (EMRs), an immense amount of patient data is now available for researchers to analyze and search for insights into factors that might influence and predict health outcomes.2 Further, these factors may be found in the genome, the encoded genetic representation of a person. As gene sequencing evolves from an expensive tool used by researchers to a more affordable and routine clinical screening test used in direct patient care, more patients are likely to have their genomes fully digitized, immeasurably growing the already impressive accumulation of electronic health data currently housed in EMRs.3
Because large volumes of data are relatively easy to acquire, increasing numbers of genomic researchers will seek out EMRs as a low-cost source of population-wide genome data, thereby making patients unwitting subjects of genomic study. In this way, EMR-based records research will pose genetic privacy risks analogous to those of biospecimen research, yet current federal regulations still allow researchers to call gene sequence data de-identified, removing such data from the protection of the Health Insurance Portability and Accountability Act (HIPAA) Privacy Rule4 and the Common Rule.5 Therefore, this chart review research will likely happen without patients' knowledge, much less their consent, in a research environment where the privacy risks are regularly downplayed and security practices can be uncertain.3 For patients' privacy rights and expectations and for the research community, the ethical implications loom large.
Regulators have permitted the nonconsensual use of medical records in chart review research on the basis of one essential assumption: stripping common identifiers such as names, Social Security numbers, and addresses from the data essentially removes the risk of harm. Once called anonymization, this method is now called de-identification, a term that describes a reduced probability of identification rather than a guarantee of anonymity.
While de-identification may protect other forms of health information by minimizing the risk of re-identification, a person's DNA sequence is a unique combination of coding for physical traits that could create a partially or fully identifying profile just from the gene sequence itself.6 Further, as geneticists learn more about the genome, the sensitivity of any particular genome and the re-identification risk from the gene information will grow. Similarly, as private and public genomic databases (or reference databases) continue to proliferate,7 the frequently discussed method of re-identification—comparing anonymous sequences with reference databases—will be an increasing risk.
Advocates of the current regulatory regime downplay the risk of re-identification, contending that the risk is overstated or that, even if re-identifying genetic information is possible, the incentive to do so is weaker than for other information stores such as financial data because genomic information is less meaningful, valuable, and damaging and thus less attractive as a target of attack.8 On the other side are scientists who demonstrate re-identification risks by repeatedly re-identifying purportedly anonymous participants in research databases.9-11 Further, genomic information is indeed sensitive and potentially stigmatizing because the genome embeds knowledge about future disease risk, familial relationships and shared susceptibilities, and an uncertain amount of yet-to-be-discovered information about the relationship between genes and health.12 Certainly, not everyone's genome will constitute a “future diary”13 or contain sensitive and revealing information; still, many genomes will, and few if any of the individuals for whom that is true will know it beforehand.
Genomic information is particularly, if not uniquely, sensitive health information with a high potential for re-identification, and some have maintained that it can never truly be anonymized.14 When reference genomic databases of identifiable data are burgeoning and forensic scientists are studying how to decrease the time it takes to sequence DNA onsite for use as a biometric identifier,15 the contention that a whole or partial gene sequence is not an identifier that could put connected demographic or health information at risk is becoming increasingly tenuous.16 Consequently, federal regulations should no longer allow HIPAA-covered entities to regard large genomic datasets as de-identified health information.
Irrespective of the risk of re-identification or whether genomic information should be considered an identifier in its own right, a significant question remains at the core of this problem: Is it ethically permissible for the research and provider communities to continue to ignore the mounting evidence from numerous studies that patients want and expect to be informed about, and to control, whether and how their genomic and health information is used in research?17,18 It is easy to see why researchers value records-based research containing genomic data and the benefits that can accrue from these studies. Yet in the absence of a policy or practice change that promotes disclosure, the general public will one day realize that sensitive information that was expected to be confidential and protected is in fact used widely in research that the subjects neither consented to nor were aware of.
To avoid the ethical consequences and the subsequent loss of trust in research and science if the public finds out about widespread genomic research without patient consent, the research community should reform current standards and norms surrounding records-based genomic research. Patients should be given meaningful notice of medical records review research occurring at organizations. This notice could take the form of an electronic database listing studies, investigators, and their contact information, thus allowing patients and institutions to hold investigators accountable for data security and privacy.
When offering genetic diagnostic testing, providers should inform patients that records-based research is a common and important avenue for discovering new associations between genes and health and that the resulting knowledge improves the quality and cost-effectiveness of care. However, providers should also make clear that acquiring consent for records-based research is not possible in every instance. For example, a provider cannot retroactively follow a patient's preferences if the patient's genomic information is no longer under the control of the provider's institution.
If the research community stops asserting that it is reasonable to consider genomic data to be anonymous, de-identified, or not readily identifiable,19 then the protections of the Privacy Rule and Common Rule would apply, requiring researchers to acquire patients' consent or an institutional review board (IRB) waiver of consent to proceed with research.20 Currently, records-based research that uses identifiable information is subject to these consent requirements, and IRBs often waive consent on the grounds that contacting thousands of patients to obtain consent is impracticable and that differences between those who consent and those who do not would bias the study and undermine its validity.21 With the advent of EMR patient portals, where providers and patients exchange secure and encrypted messages and information, communicating with patients is much easier. Institutions use these portals to allow patients to respond to satisfaction surveys, indicate preferences for care, and input demographic data.22 The portal could be used to educate patients about genomic research and to give patients a way to provide broad consent or to opt out. Because the portal and medical record are housed in the same system, researchers looking at medical records could easily identify and exclude the records of patients who did not consent to medical records review research, thus giving patients more control over the use of their data and the ability to consent to records-based research.
Records-based research is not going anywhere, but the time has come for the research community to stop treating large genomic datasets as de-identified information. Patients expect this information to be kept in confidence, and when it is revealed that these genomic data are commonly studied, at some risk, in records-based research, the fallout will be immense. To avoid this outcome, transparency and disclosure should improve in the research environment, and providers should give patients notice before releasing clinical genomic data for research, allowing, where possible, some degree of choice and control.
© Academic Division of Ochsner Clinic Foundation