Imagine opening the door to a 19th-century pharmacy and finding “Coca Wine,” which promised to relieve “fatigue of mind and body.” That sounds nice, doesn’t it? Who wouldn’t want more? Most days I could probably use something like that, if only it weren’t actually cocaine mixed into wine. This wasn’t a fringe product; it was mainstream, endorsed by the likes of Thomas Edison and even Pope Leo XIII. Freud prescribed cocaine for depression and even for his own migraines. Cocaine wasn’t just in tonics; it was in toothache drops, dandruff remedies, and yes, the early recipes of a certain famous cola.
I often marvel at the first spinal anesthetic, performed in 1898 with cocaine. Sticking a needle into someone’s spinal fluid and injecting cocaine seems like a terrible idea, but it worked extremely well and allowed for surgeries that previously would have been impossible.
Dr. Bier, spinal anesthesia pioneer
The turn of the century brought a growing awareness of cocaine’s addictive potential and its social consequences. What followed was a gradual tightening of regulations, culminating in the Harrison Narcotics Tax Act of 1914, which marked the beginning of cocaine’s transition from a household remedy to a tightly controlled substance (today, Schedule II in the United States).
The Parallels with Patient Data Regulation
Fast forward to the digital age, in which data is bought and sold to feed AI models’ insatiable appetite. Data drawn from patient records holds the key to predictive analytics, personalized medicine, and beyond. Yet, much as with cocaine in the 1800s, the need for clear regulation has not yet been widely recognized.
The regulatory framework governing patient data in the United States is a patchwork quilt of state and federal regulations, with HIPAA as the major legal touchstone. However, HIPAA’s focus on privacy and security doesn’t address a critical issue: patients, the very subjects of this data, do not own their medical records.
This sets the United States apart from many other nations grappling with digital privacy. The European Union’s GDPR (General Data Protection Regulation), for instance, gives individuals considerably more control over their personal data. In the US, healthcare providers and tech companies navigate a complex web of consent forms and data-sharing agreements, often leaving patients in the dark about where and how their information is used.
Historical context
Historically, patients were seen as the owners of the information contained within their medical records, with healthcare providers owning the physical records themselves. This worked while paper records were the primary source of medical information. However, the advent of electronic health records (EHRs) has significantly muddied these waters. Currently, there is no federal law directly addressing the ownership of medical records. New Hampshire stands out as the only state that explicitly recognizes patients as the owners of their medical information.
Graphic from Who Owns Medical Records: 50 State Comparison | Health Information & the Law
The Dichotomy of Data Ownership Philosophies
The debate surrounding patient data ownership is polarized between two predominant philosophies: the Privatization Postulate (PP) and the Communization Postulate (CP). The former advocates for private ownership as a means to ensure individual control over one’s data, privacy, and property, while the latter argues for treating individual-level health data as a common good, emphasizing the benefits of open science. Most of the papers I saw argued that this dichotomy is not a helpful framework and usually gets in the way of more productive conversations. The reality is that there needs to be some middle ground between the two extremes, but technological advances have challenged fundamental assumptions about the risks of sharing data.
Currently, almost all healthcare data is de-identified and sold to huge data brokers, which we’ll discuss in more depth next week. But I will point out now that neither the private nor the communal data philosophy says anything about “selling a bunch of information to the highest bidder,” which is what is actually happening. Somehow we didn’t decide on either ideal and ended up with a worse middle ground.
How should we inform patients about their data?
Let’s (very generously) assume that data brokers actually wanted patients to understand how their data might be used in the future. Studies of healthcare consent forms tell us that about half of American adults read below a 6th-grade level, and the same share are unable to understand and synthesize information in a complicated document. A tool designed to test understanding of consent forms found that all 25 hospitals studied did poorly.
Recent conference proceedings from Harvard and partners about healthcare AI “strongly supported the use of ‘opt-out’ — that is, the default is for patient data to be included — rather than ‘opt-in,’ which could lead to further disparities in data collection because of historical mistrust of some populations toward medical science.” I agree with that to some extent. Many medical databases already woefully underrepresent and don’t benefit many populations, and more data from those populations will improve AI’s possible benefits.
However, I think this oversimplifies the very real and understandable trust issue of many patient populations that are disadvantaged or historically mistreated by the medical system. Do we really want to be less than completely clear about where their information might be going, who might be buying it, and who might be profiting from it? “Trust us, we know what’s best” has hardly been true in the past for most of these patient populations. That seems like a recipe for further alienation from the healthcare system to me.
Does anyone actually read and understand these forms?
Just as people didn’t really know what they were agreeing to when they drank coca wine, patients sign consent forms all the time, some of which may contain hidden provisions about how their health data might be used.
The Agency for Healthcare Research and Quality publishes a sample consent form with no mention of reselling data:
The sentence stating that the “Practice will have to send my medical record information to my insurance company” fails to mention that the insurance company will then likely sell that information to a data broker.
No one really knows who owns the data
A survey published in the Journal of Medical Internet Research reveals a divided perception of healthcare data ownership: 53% of patients believe they own their data, while 43% of providers think ownership rests with the providers themselves. Another 20% of both groups say they have no idea. Even after writing this post, I still don’t know who exactly owns my data or my kids’ data. Is it a data broker? Is it Epic? Is it me? Is it the health system? Is it all of them, in various levels of detail and de-identification?
Our responsibility as a profession
As the ability to re-identify patients improves, these distinctions about who has the data, and what patients understand about where their data goes, become more relevant. We have an obligation to be knowledgeable stewards of the private information patients entrust to us, which is a standard I don’t think we’re meeting as a field right now. Claiming that it’s too complicated to figure out is a disservice to our patients. I hope we can advocate for a better understanding of the effects of data aggregation on patients to determine whether the status quo is acceptable, just as some enterprising physicians took it upon themselves to determine the effects of Coca Wine.