Big Data: A Modern Privacy Framework

A modern privacy and big data framework must address the use of both health-related data and seemingly innocuous data on behaviors, consumer preferences, voting, employment history, credit rating, and social media use. Biometrics, health records, and new technologies like facial recognition bring new ethical issues that require expanded frameworks. Privacy is vulnerable both because certain technologies are too new to be addressed by laws and best practices and because data systems remain hackable.

Risks of Abuse of the Data: Marketing and Discrimination

In the marketing arena, corporate free speech propelled the commercial value of deidentified information, including physician prescribing practices, data sets correlating disease predictors with heretofore seemingly unrelated things (like what television station a person watches or where they shop), and genetic, physical, and psychological health records. If a data set indicates a person listens to NPR, watches MSNBC, and shops at Whole Foods, we can guess they voted for Joe Biden. What is unexpected is that big data is also being used to predict their pharmaceutical needs. The risk of reidentification, inappropriate gathering of data, or abuse (disease creep or upselling) should be weighed against the benefits. For much of the use, the benefit runs to pharmaceutical sales and marketing departments, not to the people whose data was collected. For large health data sets, the benefit runs both to public health initiatives and to individual care. The questions of which uses of the data are ethical, whether the data should be available for all uses, and whether the people whose data it is should have any say in the matter (by creating the regulations or opting to keep their data from being collected) are the backbone of the framework.

Large data sets are used to predict the future health of people with certain characteristics. They can also prejudice employers and lenders against groups of people based on certain traits, certain behaviors, or simply their habits and preferences. For example, data correlates both bike shop customers and people who vote in midterm elections with good health (Zarsky). While that seems interesting from a public health perspective, in the hands of an employer, lender, or someone marketing products to the unhealthy, the data can be used unethically. The likelihood that people and groups already face discrimination based on big data, some of which seemingly has nothing to do with actual health records, is concerning.

Large searchable databases open the possibility of reidentification (identifying a person with the deidentified health data) leading to possible discriminatory treatment based on genetics or medical history. The ADA does not address the use of predictive data.
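
To make the reidentification risk concrete, here is a minimal sketch of how a deidentified health table could be linked to a named public list through shared fields. Everything in it is an assumption made for illustration: the records are invented, and the choice of zip code, birth year, and sex as linking fields, along with the names QUASI_IDENTIFIERS, link_key, and reidentify, is hypothetical rather than drawn from any real data set or tool.

```python
# Hypothetical, invented records for illustration only.
deidentified_health = [
    {"zip": "10001", "birth_year": 1980, "sex": "F", "diagnosis": "asthma"},
    {"zip": "10001", "birth_year": 1975, "sex": "M", "diagnosis": "diabetes"},
    {"zip": "10002", "birth_year": 1980, "sex": "F", "diagnosis": "depression"},
    {"zip": "10002", "birth_year": 1980, "sex": "F", "diagnosis": "hypertension"},
]

# A named list that happens to share the same fields (assumed, not a real source).
public_list = [
    {"name": "Person A", "zip": "10001", "birth_year": 1980, "sex": "F"},
    {"name": "Person B", "zip": "10002", "birth_year": 1980, "sex": "F"},
]

# Fields assumed to appear in both data sets.
QUASI_IDENTIFIERS = ("zip", "birth_year", "sex")


def link_key(record):
    """Build the combination of shared fields used to link the two data sets."""
    return tuple(record[field] for field in QUASI_IDENTIFIERS)


def reidentify(health_rows, named_rows):
    """Return {name: diagnosis} for each named person matching exactly one health row."""
    matches = {}
    for person in named_rows:
        candidates = [row for row in health_rows if link_key(row) == link_key(person)]
        if len(candidates) == 1:  # a unique match ties a name to a diagnosis
            matches[person["name"]] = candidates[0]["diagnosis"]
    return matches


result = reidentify(deidentified_health, public_list)
print(result)  # {'Person A': 'asthma'}
```

In this toy data, Person A is exposed because only one health row shares her combination of fields, while Person B is not, because two rows share hers. The sketch suggests why the size and searchability of a database matter: the more fields that can be matched, the more likely a combination is unique.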

In facial recognition, special issues include whether the patient knows and understands how an image can be used to identify genetic disorders, whether the algorithms are accurate and what human error can affect the use of images, and whether facial images and templates can be deidentified. The Genetic Information Nondiscrimination Act (GINA) does not cover facial recognition, as it is not categorized as a specifically genetic test. (See Martinez-Martin.)

Black-box Medicine and Data in Clinical Care

The lack of certainty about whether a causal relationship exists when large data sets provide information can create ethical dilemmas. Black-box medicine pulls together huge datasets to predict what medicines might work, primarily based on patterns rather than an understanding of disease mechanisms. (Price) The patterns can be based on genetic and biological data, can be beneficial to health, and can lead to a hypothesis that is later tested. That is, research eventually may answer the why and confirm causality. There are vast benefits to black-box medicine, but it is unclear how to evaluate the duty of the doctor to a patient for whom the recommendation fails or what to do when the recommendation the algorithm produces violates the known standard of care. Does black-box data equate to the best medical judgment of the doctor or replace it? The algorithm choice does not corroborate a care choice (it does not supplement expertise); it supplants expertise. The level of risk (the side effect profile and the risk of forgoing the standard of care based on the condition) is an important consideration.

Risk 1: Following the computer-generated advice. Risk 2: Forgoing the standard treatment. Risk 3: Who is responsible if it turns out badly (negligence, malpractice, no consequences)? Risk 4: Can the developers be held responsible? Risk 5: The regulations do not keep up with the technology.

Data-Generating Patents

Data-generating patents require a new ethical approach as well. Intellectual property rights are expanding. Data-generating patents can preclude other ways of obtaining, collecting, or generating the same type of data. The data generated is protected as a trade secret. The patents provide a windfall of market share in the data market, which is not the market of the technology or biomedical device patented. (For example, if a patented search engine or social media outlet collects data from millions of people, trade secret law protects the actual data; when a way of testing for BRCA is patented this way, the company has access to all of the data of those tested.) In Association for Molecular Pathology v. Myriad Genetics, Inc., 569 U.S. 576 (2013), Myriad’s patents were upheld only in part: the Court distinguished naturally occurring DNA sequences (gDNA), which are unpatentable, from synthetic cDNA, which is patentable. Myriad continues to use trade secret law to protect the database of patient information. (Simon and Sichelman) Trade secret law does not have an end date, so the ability to create a monopoly, to raise a barrier to entry for competing businesses, or to use big data as an advantage in marketing and producing other products is great. The market control can inhibit innovation and access to data for the public good or public health, hurting consumers and the public.

What Does Privacy Mean?

Does modern privacy mean that nothing is actually private, yet many things remain protected from government interference? There may be a paradigm shift in privacy law. While many regulations speak to cybersecurity, hacking is increasingly sophisticated. While slander and libel cover things that are not true, the argument that something is private may face limitations. It is not unusual for private information to be leaked, stolen, held for ransom, and made public.

The FBI follows data wrongdoers, recognizing that hacking a health database can lead to extortion and can influence politics and careers.

Big Picture

Anne Zimmerman copyright Creative Commons CC-BY-NC

The framework requires weighing the value of privacy, the possibility of data-based “precision” medicine, and the benefits of data-based public health approaches against the risks: reidentification (violation of privacy; individual discrimination by employers, insurers, and lenders; public embarrassment; extortion; or other crimes against individuals), discrimination against groups based on correlations, a system of passed blame rather than malpractice claims, limits to innovation based on patents and data-generating patents, and hacking (access to large data storage, financial risks, large data sets used by bad actors, national security).

Some systems could help: blockchain transparency (with less ability to identify individuals), edge computing (which keeps data closer to home than the cloud does), and ever-improving cybersecurity software. Keeping ahead of hackers will always be difficult, making all data collection risky. The framework has to weigh the option of not collecting or storing the data at all, sacrificing the benefits of pooled data. Regulating cybersecurity, data storage, searchable public or private large datasets, the companies creating black-box algorithms, and hospital and doctor use of data not proven to show causal relationships is a large job, but it is the necessary ethical response to the fast-moving discoveries changing the delivery of care and the storage of data. Moving beyond the four principles allows bioethicists to harness the sophisticated language of big data, view the broad discriminatory and criminal potential without the constraints of “justice”, and avoid phrases more conducive to the doctor-patient relationship.

Take the Quiz

Created on December 23, 2021

Big Data, Tech Ethics, and Privacy

A short quiz on big data

1 / 15

Some issues caused by big data include

Discrimination

Too much marketing

Privacy risk

Unfairness in profits

All of the above

2 / 15

When a new invention or technology collects data and the patent-holder of the technology has rights to the data produced by its use, that entity holds a

trade secret

data-generating patent

trademark

plant patent

3 / 15

Black-box medicine refers to the use of big data in treating disease. Usually diagnosis or treatment decisions using big data rely on

doctor's expertise

causation

clinical examination

correlation

4 / 15

Ways to deidentify data for HIPAA compliance include

expert determination and a small risk of reidentification

remove 18 identifiers

either of the above

5 / 15

In Olmstead v. United States (1928) which Supreme Court justice described a "right to be let alone"?

Antonin Scalia

Louis Brandeis

Benjamin Cardozo

Sandra Day O'Connor

6 / 15

Which of the following laws are relevant to student medical records held by a school?

HIPAA

FTCA

FERPA

FCRA

CCPA

7 / 15

Privacy as confidentiality differs from the right to be left alone or to make personal decisions without government intrusion. HIPAA covers which type of privacy?

confidentiality

freedom from government intrusion

both

neither

8 / 15

Ethical surveillance should require

probable cause and a warrant

informed consent

9 / 15

Algorithms outperform clinicians

True

False

10 / 15

Facial recognition technology can be used to determine

emotional state

genetic data

patient identification

weight and body mass index

all of the above

11 / 15

Most data privacy laws and resolutions are based on principles including all of the following except

portability of data

benefit of data use in scientific research

accessibility to data and ability to correct data

facilitate international flow of data

protect privacy

ensure fair compensation for data

12 / 15

Most people expect that their data can be sold for profit by medical institutions that collected it.

True

False

13 / 15

Informed consent offers a person some agency over their person, medical procedures, and personal data. Yet informed consent does not protect

people from surveillance

national security from cyberthreats

the right to be compensated for data

the right to control deidentified data

all of the above

14 / 15

Technology has changed the work environment and caused job loss. Which of the following ethical considerations is the least supported by the evidence?

jobs have an intrinsic and instrumental value

workplaces are a source of community

progress is an intrinsic and instrumental good

all unemployment is the fault of the person who did not have a strong enough work ethic

15 / 15

Principles and goals of tech ethics and ethical AI include which of the following

transparency

fairness

social responsibility

incorporating diversity and bias prevention

privacy and protection from threats

minimizing harm

all of the above

J. Zhang, B. Chen, Y. Zhao, X. Cheng and F. Hu, “Data Security and Privacy-Preserving in Edge Computing Paradigm: Survey and Open Issues,” in IEEE Access, vol. 6, pp. 18209-18237, 2018, doi: 10.1109/ACCESS.2018.2820162 (Privacy is outsourced opening vulnerability to security breaches; the four layer architecture can present data security and privacy challenges.)

Emily Mullin, “Bad Actors Getting Your Health Data Is the FBI’s Latest Worry,” leapsmag, February 25, 2019.

Martinez-Martin N. What Are Important Ethical Implications of Using Facial Recognition Technology in Health Care? AMA J Ethics. 2019 Feb 1;21(2):E180-187.

Mulligan, J., Esq, & VonderHaar, M., Esq. (2016). Health Hackers: Questioning the Sufficiency of Remedies When Medical Information is Compromised. The Health Lawyer, 29(1), 29-37.

Loi, M., Christen, M., Kleine, N., & Weber, K. (2019). Cybersecurity in health – disentangling value tensions. Journal of Information, Communication & Ethics in Society, 17(2), 229-245. “5.4 Prioritizing non-maleficence and (privacy-related) autonomy at the expense of beneficence and autonomy Consider a system of medical health records optimized to promote privacy and safety. The most extreme form of this would be a system minimizing data collection, data sharing, communication and networking. Such a system may be able to avoid privacy breaches and impersonation and denial of service attacks, thus avoiding device malfunctions. It would be responsive to the principle of non-maleficence and also of autonomy, i.e. it protects privacy, which is crucial for autonomy.

Such a design, however, could not be used for providing data intensive services, which may involve a sacrifice in quality and/or cost-effectiveness. This is contrary to the principle of beneficence. In the context of implantable medical devices, maximizing privacy and safety leads to sacrificing certain aspects of usability (e.g. no wireless monitoring) with implications on autonomy.” (The article applies the four principles, stretching autonomy to be the principle relevant to privacy. Autonomy as self-direction and even liberty does not, to me, perfectly grasp the gravity or the type of control over data that one wants, nor does it weigh the benefits of added choice of care or possible autonomy-like benefits from improved information.)

Perakslis, E. D. (2014). Cybersecurity in health care. The New England Journal of Medicine, 371(5), 395-397.

Zimmerman, A. (2020). Marketing madness: The disingenuous use of free speech by big data and big pharma to the detriment of medical data privacy. Voices in Bioethics, 6.

Brenda M. Simon, Ted Sichelman, “Data-Generating Patents,” Northwestern University Law Review, Vol 111, Issue 2 (2017).

Kathleen Charlebois, Nicole Palmour, Bartha Maria Knoppers, “The Adoption of Cloud Computing in the Field of Genomics Research: The Influence of Ethical and Legal Issues,” PLoS ONE 11(10): e0164347, October 18, 2016. (Empirical research identifying ethical issues and finding a balance between data sharing and privacy.)

Sharona Hoffman, “Big Data’s New Discrimination Threats,” Big Data, Health Law, and Bioethics, Chapter 6, edited by I. Glenn Cohen, Holly Fernandez Lynch, Effy Vayena, Urs Gasser, Cambridge University Press, 2018.

W. Nicholson Price, III, “Medical Malpractice and Black-Box Medicine,” Big Data, Health Law, and Bioethics, Chapter 20.

Tal Zarsky, “Correlation versus Causation in Health-Related Big Data Analysis” Big Data, Health Law, and Bioethics, Chapter 3.

Ashish M. Bakshi, “Gene patents at the Supreme Court: Association for Molecular Pathology v. Myriad Genetics,” Journal of Law and the Biosciences, Volume 1, Issue 2, June 2014, Pages 183–89.

Singer, Natasha, “The government protects our food and our cars. Why not our data?” New York Times, Nov. 2, 2019.

Beckerman, Michael, “Americans will pay a price for state privacy laws,” New York Times, Oct. 14, 2019.

Whitney, Jake, “Big (Brother) Pharma,” New Republic, August 29, 2006.

Kaplan, Bonnie, “Selling Health Data: De-identification, Privacy, and Speech,” ISPS Bioethics Working Paper, Yale Interdisciplinary Center for Bioethics, Oct. 7, 2014. (Notes the Court’s failure to evaluate Sorrell as a constitutional privacy case. Kaplan says that “the State deciding which speech is permitted and which data users are favored over others is detrimental to both personal freedom and the marketplace of ideas.”)

Rothstein, Mark A. “Is deidentification sufficient to protect health privacy in research?” American Journal of Bioethics, Vol. 10,9 (2010): 3-11. doi:10.1080/15265161.2010.494215

Booth, Katie, “The all-or-nothing approach to data privacy: Sorrell v. IMS Health, Citizens United, and the future of online data privacy legislation,” JOLT Digest Harvard Law School, Aug. 7 2011. (Arguing there is a role for corporate responsibility.)
