Are deidentified healthcare records truly anonymous? A recent article in Nature Communications showed how easily certain deidentified healthcare data can be combined with publicly available demographic data to reidentify individual patients. By examining just 15 consumer characteristics, the researchers were able to reidentify a patient with 99.98% accuracy. Using just four characteristics, they could reidentify a patient with 79.4% accuracy.
Under HIPAA, organizations can use deidentified medical records for a wide variety of purposes, including academic research, population health studies, marketing and more. While there is tremendous value in examining anonymous health data across large populations, it is vital to make sure that individual patient information stays private.
Implications for Healthcare Marketers
In the healthcare industry, the ability to combine healthcare data with non-healthcare data is a powerful new tool that allows researchers, analysts and marketers to uncover new insights into patient behavior, create audience profiles and perform predictive analytics. Technically, it is not difficult to connect deidentified health data to consumer or demographic data: the datasets are joined on a shared match key. Several companies have dipped their toes in the water by combining health information with basic demographic information. This approach provides some value, but it is limited. The real utility comes from combining full datasets.
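To make the match-key idea concrete, here is a minimal Python sketch. It is an illustration only, not any vendor's actual method: the `match_key` function, the fields it hashes and the sample rows are all hypothetical, and real matching pipelines use more careful normalization and key protection.

```python
import hashlib


def match_key(first, last, dob, zip_code):
    """Hypothetical match key: SHA-256 over normalized identifying fields."""
    raw = "|".join(s.strip().lower() for s in (first, last, dob, zip_code))
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


# Deidentified health rows carry the key but no name (sample data is made up).
health = [
    {"key": match_key("Ana", "Diaz", "1980-02-01", "10001"), "condition": "asthma"},
    {"key": match_key("Bo", "Lee", "1975-07-19", "94110"), "condition": "diabetes"},
]

# Consumer rows keyed the same way; normalization absorbs case and spacing.
consumer = [
    {"key": match_key("ana ", "DIAZ", "1980-02-01", "10001"), "income_band": "B"},
    {"key": match_key("Cy", "Roe", "1990-11-30", "60601"), "income_band": "A"},
]


def join_on_key(left, right):
    """Inner join of two lists of dicts on the shared 'key' field."""
    index = {row["key"]: row for row in right}
    return [{**row, **index[row["key"]]} for row in left if row["key"] in index]


combined = join_on_key(health, consumer)
```

Only the record present in both datasets survives the join, and it now carries health and consumer attributes side by side, which is exactly what makes the combination both useful and sensitive.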
While these data combinations can drive significant value in healthcare, protecting privacy while creating them is key. As the Nature Communications article shows, appending even a few variables to health data creates a real risk of reidentification.
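A toy calculation shows why each appended variable matters. The sketch below is not the method used in the Nature Communications study; it simply counts how many records in a small, made-up dataset become unique as attributes are added. The `unique_fraction` helper and the sample records are assumptions for illustration.

```python
from collections import Counter


def unique_fraction(records, attributes):
    """Share of records whose combination of the given attributes is unique."""
    combos = [tuple(r[a] for a in attributes) for r in records]
    counts = Counter(combos)
    return sum(1 for c in combos if counts[c] == 1) / len(combos)


# Made-up records; none of these values come from a real dataset.
records = [
    {"zip": "10001", "birth_year": 1980, "sex": "F"},
    {"zip": "10001", "birth_year": 1980, "sex": "M"},
    {"zip": "10001", "birth_year": 1975, "sex": "F"},
    {"zip": "94110", "birth_year": 1980, "sex": "F"},
    {"zip": "94110", "birth_year": 1980, "sex": "F"},
    {"zip": "94110", "birth_year": 1975, "sex": "M"},
]

by_zip = unique_fraction(records, ["zip"])
by_zip_year = unique_fraction(records, ["zip", "birth_year"])
by_all = unique_fraction(records, ["zip", "birth_year", "sex"])
```

In this toy data no record is unique by ZIP code alone, a third are unique once birth year is added, and two thirds are unique with all three attributes. A record that is unique on its attributes can be singled out by anyone who holds the same attributes alongside a name.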
Distributed Analytics
Crossix has innovated in this area with a privacy-safe way to combine these types of data using a distributed (or federated) data approach. With this technique, the sensitive healthcare data stays in its original location, behind privacy firewalls.
Crossix SafeMine™ technology allows us to use this distributed approach to perform analyses in segregated, secure environments behind privacy firewalls. After the data is matched and combined, Crossix uses a privacy-by-design approach, leveraging technology to control outputs and ensure that only population-level, certified deidentified data is extracted from the system and used in analytics. Even when health data is combined with non-health data, no patient-level data ever leaves the secure environment of the covered entity. Other approaches, in which “deidentified” patient-level health data is combined with non-health data in a data aggregator’s environment, inherently introduce real risks of patient reidentification, with agreements not to reidentify as the only protection.
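One common way to enforce population-level outputs is a minimum cell size: the secure environment returns only grouped counts and suppresses any group that is too small. The sketch below illustrates that general idea and is not a description of how SafeMine works. The function name, the threshold of 3 and the sample rows are assumptions, and real thresholds are set by policy and are typically far larger.

```python
from collections import Counter

MIN_CELL_SIZE = 3  # illustrative only; real thresholds are set by policy


def population_level_counts(rows, group_by, min_cell_size=MIN_CELL_SIZE):
    """Return grouped counts, dropping any group smaller than the threshold.

    Patient-level rows stay inside this function; only aggregates come out.
    """
    counts = Counter(tuple(row[g] for g in group_by) for row in rows)
    return {group: n for group, n in counts.items() if n >= min_cell_size}


# Made-up patient-level rows that would live behind the privacy firewall.
rows = (
    [{"region": "NE", "exposed_to_ad": True}] * 3
    + [{"region": "NE", "exposed_to_ad": False}] * 1
    + [{"region": "W", "exposed_to_ad": True}] * 4
    + [{"region": "W", "exposed_to_ad": False}] * 3
)

report = population_level_counts(rows, ["region", "exposed_to_ad"])
```

The single unexposed patient in the NE region never appears in `report`, because a cell of one would effectively be patient-level data.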
In addition to preventing patient reidentification by never appending other data to deidentified health data available outside the distributed data network, SafeMine protects patient privacy by using query scope encryption and hashing techniques that produce different outputs for every query made to the system. Most crucially, this ensures that patient IDs are never persistent across queries, so the outputs of different queries cannot be directly compared. By contrast, a centralized deidentified patient database inherently uses persistent patient IDs, which makes it possible to link patients across the results of multiple queries. This greatly increases the risk of reidentification in the event of a data breach or the actions of a nefarious actor.
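The effect of non-persistent IDs can be sketched with a keyed hash whose key is generated fresh for each query. This is an assumption about one way to get the property described above, not SafeMine's actual mechanism. The `run_query` function and the patient IDs are hypothetical.

```python
import hashlib
import hmac
import secrets


def run_query(patient_ids):
    """Pseudonymize IDs with a key that exists only for this one query."""
    query_key = secrets.token_bytes(32)  # fresh random key, never stored
    return [
        hmac.new(query_key, pid.encode("utf-8"), hashlib.sha256).hexdigest()
        for pid in patient_ids
    ]


ids = ["patient-001", "patient-002", "patient-001"]
first = run_query(ids)
second = run_query(ids)
```

Within a single result, the same patient maps to the same pseudonym, so counting and deduplication still work. Across `first` and `second` the pseudonyms differ, so two query outputs cannot be linked by ID.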
It’s incredibly important that brands view privacy as the number one consideration when evaluating marketing measurement and analytics partners. There’s so much positive value to extract from the proliferation of data, and all players in the ecosystem should be committed to policing against privacy risks. If a privacy breach occurs, the scrutiny is not just going to fall upon the vendor. It will fall upon the pharma brands using the data and the industry at large, which is why we all have a vested interest in ensuring responsible data use. Most importantly, we must honor reasonable expectations of privacy and maintain patient trust.