A modern privacy and big data framework must address the use of health-related data as well as seemingly innocuous data on behaviors, consumer preferences, voting, employment history, credit rating, and social media use. Biometrics, health records, and new technologies like facial recognition raise new ethical issues that require expanded frameworks. Privacy is vulnerable for two reasons: certain technologies are so new that laws and best practices do not yet address them, and stored data can be hacked.

Risks of Abuse of the Data: Marketing and Discrimination

In the marketing arena, corporate free speech has propelled the commercial value of deidentified information, including physician prescribing practices; data sets correlating disease predictors with seemingly unrelated things (like what television station a person watches or where they shop); and genetic, physical, and psychological health records. If a data set indicates that a person listens to NPR, watches MSNBC, and shops at Whole Foods, we can guess they voted for Joe Biden. What is unexpected is that big data is also being used to predict that person’s pharmaceutical needs. The risks of reidentification, inappropriate gathering of data, and abuse (disease creep or upselling) should be weighed against the benefits. For much of the use, the benefit runs to pharmaceutical sales and marketing departments, not to the people whose data was collected. For large health data sets, the benefit runs both to public health initiatives and to individual care. The questions of which uses of the data are ethical, whether the data should be available for all uses, and whether the people whose data it is should have any say in the matter (by creating the regulations or opting to keep their data from being collected) are the backbone of the framework.

Large data sets are used to predict the future health of people with certain characteristics. They can also prejudice employers and lenders against groups of people who have certain traits, engage in certain behaviors, or simply share certain habits and preferences. For example, data correlates both bike shop customers and people who vote in midterm elections with good health. (Zarsky) While that seems interesting from a public health perspective, in the hands of an employer, a lender, or someone marketing products to the unhealthy, the data can be used unethically. It is concerning that people and groups likely already face discrimination based on big data, some of which seemingly has nothing to do with actual health records.

Large searchable databases open the possibility of reidentification (linking a person to their deidentified health data), which can lead to discriminatory treatment based on genetics or medical history. The Americans with Disabilities Act (ADA) does not address the use of predictive data.

In facial recognition, special issues include whether the patient knows and understands how an image can be used to identify genetic disorders, whether the algorithms are accurate and how human error can affect the use of images, and whether facial images and templates can be deidentified. The Genetic Information Nondiscrimination Act (GINA) does not cover facial recognition because it is not categorized as a specifically genetic test. (See Martinez-Martin.)

Black-box Medicine and Data in Clinical Care

Uncertainty about whether a causal relationship underlies the patterns found in large data sets can create ethical dilemmas. Black-box medicine pulls together huge datasets to predict what medicines might work, primarily based on patterns rather than an understanding of disease mechanisms. (Price) The patterns can be based on genetic and biological data, can be beneficial to health, and can lead to a hypothesis that is later tested. That is, research eventually may answer the why and confirm causality. There are vast benefits to black-box medicine, but it is unclear how to evaluate the duty of the doctor to a patient for whom the recommendation fails or what to do when the recommendation the algorithm produces violates the known standard of care. Does black-box data equate to the best medical judgment of the doctor or replace it? The algorithm’s choice does not corroborate a care choice (it does not supplement expertise); it supplants expertise. The level of risk (the side effect profile and the risk of forgoing the standard of care for the condition) is an important consideration. Risk 1: Following the computer-generated advice. Risk 2: Forgoing the standard treatment. Risk 3: Who is responsible if the outcome is bad (negligence, malpractice, no consequences)? Risk 4: Can the developers be held responsible? Risk 5: The regulations do not keep up with the technology.

Data-Generating Patents

Data-generating patents require a new ethical approach as well. Intellectual property rights are expanding. Data-generating patents can preclude other ways of obtaining, collecting, or generating the same type of data. The data generated is protected as a trade secret. The patents provide a windfall of market share in the data market, which is not the market of the technology or biomedical device patented. (For example, if a patented search engine or social media outlet collects data from millions of people, trade secret law protects the actual data; when a way of testing for BRCA is patented this way, the company has access to all of the data of those tested.) In Association for Molecular Pathology v. Myriad Genetics, Inc., 569 U.S. 576 (2013), Myriad’s patents were upheld in part: the Court distinguished naturally occurring DNA sequences (gDNA), which are unpatentable, from cDNA, which is synthetic and therefore patentable. Myriad continues to use trade secret law to protect the database of patient information. (Simon and Sichelman) Trade secret law does not have an end date, so the ability to create a monopoly, to raise a barrier to entry for competing businesses, or to use big data as an advantage in marketing and producing other products is great. The market control can inhibit innovation and access to data for the public good or public health, hurting consumers and the public.

What Does Privacy Mean?

Does modern privacy mean that nothing is actually private, yet many things remain protected from government interference? There may be a paradigm shift in privacy law. While many regulations speak to cybersecurity, hacking is increasingly sophisticated. While slander and libel cover statements that are not true, the argument that something is private may face limitations. It is not unusual for private information to be leaked, stolen, held for ransom, or made public.

The FBI tracks data wrongdoers, recognizing that hacking a health database can lead to extortion and can influence politics and careers.

Big Picture

Anne Zimmerman copyright Creative Commons CC-BY-NC

The framework requires weighing the value of privacy, the possibility of data-based “precision” medicine, and the benefits of data-based public health approaches against several risks: reidentification (violation of privacy; individual discrimination by employers, insurers, and lenders; public embarrassment; extortion; or other crimes against individuals); discrimination against groups based on correlations; a system of shifted blame rather than malpractice claims; limits to innovation caused by patents, including data-generating patents; and hacking (access to large data storage, financial risks, large data sets used by bad actors, and national security risks).

Some systems could help: blockchain transparency (with less ability to identify individuals), edge computing (which keeps data closer to home than the cloud does), and continual cybersecurity software upgrades. Keeping ahead of hackers is always going to be difficult, making all data collection risky. The framework has to weigh the option of not collecting or storing the data at all, sacrificing the benefits of pooled data. Regulating cybersecurity, data storage, searchable public or private large datasets, the companies creating black-box algorithms, and hospital and doctor use of data not proven to show causal relationships is a large job, but it is the necessary ethical response to the fast-moving discoveries changing the delivery of care and the storage of data. Moving beyond the four principles allows bioethicists to harness the sophisticated language of big data, view the broad discriminatory and criminal potential without the constraints of “justice,” and avoid phrases more conducive to the doctor-patient relationship.

J. Zhang, B. Chen, Y. Zhao, X. Cheng and F. Hu, “Data Security and Privacy-Preserving in Edge Computing Paradigm: Survey and Open Issues,” in IEEE Access, vol. 6, pp. 18209-18237, 2018, doi: 10.1109/ACCESS.2018.2820162 (Privacy is outsourced opening vulnerability to security breaches; the four layer architecture can present data security and privacy challenges.)

Emily Mullin, “Bad Actors Getting Your Health Data Is the FBI’s Latest Worry,” leapsmag, February 25, 2019.

Martinez-Martin N. What Are Important Ethical Implications of Using Facial Recognition Technology in Health Care? AMA J Ethics. 2019 Feb 1;21(2):E180-187.

Mulligan, J., Esq, & VonderHaar, M., Esq. (2016). Health Hackers: Questioning the Sufficiency of Remedies When Medical Information is Compromised. The Health Lawyer, 29(1), 29-37.

Loi, M., Christen, M., Kleine, N., & Weber, K. (2019). Cybersecurity in health – disentangling value tensions. Journal of Information, Communication & Ethics in Society, 17(2), 229-245. “5.4 Prioritizing non-maleficence and (privacy-related) autonomy at the expense of beneficence and autonomy. Consider a system of medical health records optimized to promote privacy and safety. The most extreme form of this would be a system minimizing data collection, data sharing, communication and networking. Such a system may be able to avoid privacy breaches and impersonation and denial of service attacks, thus avoiding device malfunctions. It would be responsive to the principle of non-maleficence and also of autonomy, i.e. it protects privacy, which is crucial for autonomy.

Such a design, however, could not be used for providing data intensive services, which may involve a sacrifice in quality and/or cost-effectiveness. This is contrary to the principle of beneficence. In the context of implantable medical devices, maximizing privacy and safety leads to sacrificing certain aspects of usability (e.g. no wireless monitoring) with implications on autonomy.” (The article applies the four principles, stretching autonomy to be the principle relevant to privacy. Autonomy as self-direction and even liberty does not, to me, perfectly grasp the gravity or the type of control over data that one wants, nor does it weigh the benefits of added choice of care or possible autonomy-like benefits from improved information.)

Perakslis, E. D. (2014). Cybersecurity in health care. The New England Journal of Medicine, 371(5), 395-397.

Zimmerman, A. (2020). Marketing madness: The disingenuous use of free speech by big data and big pharma to the detriment of medical data privacy. Voices in Bioethics, 6.

Brenda M. Simon, Ted Sichelman, “Data-Generating Patents,” Northwestern University Law Review, Vol 111, Issue 2 (2017).

Kathleen Charlebois, Nicole Palmour, Bartha Maria Knoppers, “The Adoption of Cloud Computing in the Field of Genomics Research: The Influence of Ethical and Legal Issues,” October 18, 2016. PLoS ONE 11(10): e0164347. (Empirical research identifying ethical issues and finding a balance between data sharing and privacy.)

Sharona Hoffman, “Big Data’s New Discrimination Threats,” Big Data, Health Law, and Bioethics, Chapter 6, edited by I. Glenn Cohen, Holly Fernandez Lynch, Effy Vayena, Urs Gasser, Cambridge University Press, 2018.

W. Nicholson Price, III, “Medical Malpractice and Black-Box Medicine,” Big Data, Health Law, and Bioethics, Chapter 20.

Tal Zarsky, “Correlation versus Causation in Health-Related Big Data Analysis” Big Data, Health Law, and Bioethics, Chapter 3.

Ashish M. Bakshi, “Gene patents at the Supreme Court: Association for Molecular Pathology v. Myriad Genetics,” Journal of Law and the Biosciences, Volume 1, Issue 2, June 2014, Pages 183–89.

Singer, Natasha, “The government protects our food and our cars. Why not our data?” New York Times, Nov. 2, 2019.

Beckerman, Michael, “Americans will pay a price for state privacy laws,” New York Times, Oct. 14, 2019.

Whitney, Jake, “Big (Brother) Pharma,” New Republic, August 29, 2006.

Kaplan, Bonnie, “Selling Health Data: De-identification, Privacy, and Speech,” ISPS Bioethics Working Paper, Yale Interdisciplinary Center for Bioethics, Oct. 7, 2014. (Notes the Court’s failure to evaluate Sorrell as a constitutional privacy case. Kaplan says that “the State deciding which speech is permitted and which data users are favored over others is detrimental to both personal freedom and the marketplace of ideas.”)

Rothstein, Mark A. “Is deidentification sufficient to protect health privacy in research?” American Journal of Bioethics, Vol. 10,9 (2010): 3-11. doi:10.1080/15265161.2010.494215

Booth, Katie, “The all-or-nothing approach to data privacy: Sorrell v. IMS Health, Citizens United, and the future of online data privacy legislation,” JOLT Digest, Harvard Law School, Aug. 7, 2011. (Arguing there is a role for corporate responsibility.)