TrueVault recommends performing data de-identification and re-identification client-side (within your web or mobile app logic). This leads to two flows:
De-Identification
- Data is entered into your application by the user.
- Your application (e.g. JavaScript code in a browser or Swift code on an iOS device) splits that data into what is identifying (names, phone numbers, etc.) and what is not identifying (other health data).
- The identifying information is sent directly to TrueVault from the client application through the TrueVault API. Ideally the end-user is authenticated as a TrueVault user, so the API permissions can be granularly defined. The API returns an opaque identifier that can be associated with the de-identified data and stored on your server.
- The de-identified health data (along with the TrueVault id) is sent to your server. Because your server never handles identifying data together with health data, it doesn't fall under the purview of HIPAA regulations.
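The de-identification steps above can be sketched roughly as follows. This is a minimal illustration, not the TrueVault SDK: the `IdentityVault` and `AppServer` interfaces, the `vaultId` property, and the list of identifying field names are all hypothetical stand-ins for the real TrueVault API call and your own server endpoint.

```typescript
// ASSUMPTION: which fields count as identifying depends on your data model.
const IDENTIFYING_FIELDS = new Set(["name", "phone", "email"]);

type Fields = Record<string, unknown>;

interface SplitResult {
  identifying: Fields;
  deidentified: Fields;
}

// Split one user-entered record into identifying and non-identifying parts.
function splitRecord(data: Fields): SplitResult {
  const identifying: Fields = {};
  const deidentified: Fields = {};
  for (const key of Object.keys(data)) {
    if (IDENTIFYING_FIELDS.has(key)) {
      identifying[key] = data[key];
    } else {
      deidentified[key] = data[key];
    }
  }
  return { identifying, deidentified };
}

// HYPOTHETICAL wrapper around the TrueVault API: stores identifying data
// and resolves to the opaque id returned by the API.
interface IdentityVault {
  store(identifying: Fields): Promise<string>;
}

// HYPOTHETICAL wrapper around your own server's endpoint.
interface AppServer {
  save(record: Fields): Promise<void>;
}

// Runs entirely on the client: identifying data goes to the vault,
// and only de-identified data plus the opaque id goes to your server.
async function deidentifyAndSubmit(
  data: Fields,
  vault: IdentityVault,
  server: AppServer
): Promise<string> {
  const { identifying, deidentified } = splitRecord(data);
  const vaultId = await vault.store(identifying);
  await server.save({ ...deidentified, vaultId });
  return vaultId;
}
```

The point of keeping `splitRecord` separate is that the identifying fields never appear in the payload passed to `server.save`.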
Re-Identification
Let's say your end-users want to search for data based on criteria stored on your server (e.g. blood pressure) but they want to see identifying data in the result set. This re-identification should happen on the client. The flow looks something like this:
- Client device makes a request to your server to retrieve information matching the query.
- Client device pulls TrueVault ids out of the server results. These ids are opaque and not identifying, but they can be used to retrieve identifying data from the TrueVault API (if properly authenticated).
- Client device requests identifying data from TrueVault API corresponding to the TrueVault ids returned by your server. Ideally the end-user is authenticated as a TrueVault user, so the API permissions can be granularly defined.
- Client device merges the datasets so the UI shows a re-identified view of the data.
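The re-identification steps above might look something like the sketch below. As before, this is an illustration under stated assumptions: `IdentityLookup` is a hypothetical wrapper around the TrueVault API, and the `vaultId` property name on server rows is invented for the example.

```typescript
type IdentityRecord = Record<string, unknown>;

// A row returned by your server: de-identified data plus the opaque id.
// ASSUMPTION: the id is stored under a property named `vaultId`.
interface ServerRow {
  vaultId: string;
  [field: string]: unknown;
}

// HYPOTHETICAL wrapper around the TrueVault API: fetches identifying
// records for a batch of opaque ids, keyed by id.
interface IdentityLookup {
  fetchMany(ids: string[]): Promise<Map<string, IdentityRecord>>;
}

// Pull the distinct opaque ids out of the server results.
function extractVaultIds(rows: ServerRow[]): string[] {
  return Array.from(new Set(rows.map((row) => row.vaultId)));
}

// Merge identifying fields into each row. Rows whose id has no matching
// identity record are passed through unchanged.
function mergeResults(
  rows: ServerRow[],
  identities: Map<string, IdentityRecord>
): Record<string, unknown>[] {
  return rows.map((row) => ({
    ...row,
    ...(identities.get(row.vaultId) ?? {}),
  }));
}

// Runs entirely on the client, after your server has answered the query.
async function reidentify(
  rows: ServerRow[],
  vault: IdentityLookup
): Promise<Record<string, unknown>[]> {
  const identities = await vault.fetchMany(extractVaultIds(rows));
  return mergeResults(rows, identities);
}
```

Deduplicating the ids before the lookup keeps the request small when many rows belong to the same person.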
Compliance
Following these steps, you can create a seamless user experience. There's no reason the end-user should know that you're de-identifying and re-identifying data.
This process is designed to keep your server infrastructure outside the scope of HIPAA regulations, so you can host this de-identified data on non-compliant infrastructure if you choose. This is all above-board with HHS, which even publishes specific guidance on it (https://www.hhs.gov/hipaa/for-professionals/privacy/special-topics/de-identification/index.html).
By not storing Identified PHI on your servers, you can skip the physical and technical safeguards that apply to server-side infrastructure. Please understand that you don't escape the administrative requirements of HIPAA entirely. For example, in the system described above there are still human beings looking at Identified PHI. Those people must be trained, and their access must be limited to the minimum needed to perform their jobs. All of this work is a hard requirement if you're making an application that must be HIPAA compliant, whether you use TrueVault or not. What TrueVault saves you is the work of building and running secure, compliant infrastructure: storing data, maintaining audit logs, keeping secure backups, providing high availability, controlling access, managing users, and lots more.