Customer-Owned S3 Buckets, Regulatory Compliance (HIPAA), and Open Source Software

cwheeler · August 2019

I want to tell everyone about a new feature that I am really excited about. When we first started writing the helpdesk button software, we knew that we would be dealing with a lot of very sensitive data from computers all over the country (little did we know at the time that it would actually be all over the world). That’s why, very early on, we put a lot of thought into security and compliance. Because our own MSP manages mostly medical practices, we reached out to a HIPAA (Health Insurance Portability and Accountability Act) compliance attorney before we ever showed this software to anyone. We knew that we needed certain very specific features to comply with the strictest data security laws. Features like per-user access restrictions, audit logging, and consent-based data transmission were built in from the start.

At our MSP we mostly deal with HIPAA, but similar laws cover sectors other than healthcare and take similar approaches. The financial sector in the US, for example, has SOX (Sarbanes-Oxley Act). The point is that we expected our customers to need regulatory compliance when using our software, because the screenshots it captures may contain very sensitive information. We therefore set out to design a system that can address these regulatory hurdles in an all-encompassing way.

It may come as no surprise that our platform runs on Amazon Web Services (AWS). The data that gets submitted to create the reports we make for you is stored in AWS’s Simple Storage Service (S3). Up until now, all the data for all the reports has been stored in our own S3 bucket. We knew from the beginning that we didn’t like that idea. We feel like we shouldn’t have to be responsible for all this sensitive data. So we set out to build in the ability for that data to be stored in customer-controlled S3 buckets. That part was easy, but we felt it wasn’t really enough. AWS supports sharing buckets across accounts, and we could have just had a simple logic switch in our existing code that says “if (accountID == X) { use S3 bucket Y }”. But if we have access to your bucket, even if it’s on your own account, we still have access to that data.

In our minds, the ideal system would be one in which nobody but the customer has access to their data. Even we ourselves would be unable to access any of the data, even if we wanted to. Even if our systems were completely compromised and an attacker had root access to everything, all of the customers’ data would still be safe.

I have spent a lot of my time working with cryptocurrency over the years, and I have written a lot of code around Bitcoin. This “Trust Nobody” and “In Cryptography We Trust” mindset is deeply ingrained. I knew that there was a way to pull this off, but it would be tricky.

What we came up with is the S3_gatekeeper.

The S3 Gatekeeper is a piece of code that now sits at the heart of v0.4 of the helpdeskbuttons software. The job of the gatekeeper is to cryptographically verify every request that passes through it. It is the sole means by which data is sent to the S3 buckets and by which data leaves the S3 buckets. We took some pages out of Bitcoin’s playbook and implemented multisig. Each transaction (either a GetObject or a PutObject) requires two digital signatures. One of the signatures is generated by us, on our servers. The other one is generated by the gatekeeper, which sits on the AWS account owned by the customer. We decide whether to sign the request based on the authentication to our website. The gatekeeper decides based on a user-configurable ACL. The ACL supports IP-based whitelisting and blacklisting.
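To make the two-signature idea concrete, here is a minimal sketch in Python. It is not the gatekeeper's actual code. It assumes HMAC-SHA256 as a standard-library stand-in for real digital signatures (a real deployment would use asymmetric keys so that neither side holds the other's secret), and every name in it (VENDOR_KEY, GATEKEEPER_KEY, gatekeeper_sign, is_authorized, the request fields) is hypothetical. It also assumes that the blacklist wins over the whitelist and that an empty whitelist allows any address that is not blacklisted.

```python
import hashlib
import hmac
import ipaddress
import json

# Hypothetical demo keys. HMAC is only a stand-in for digital signatures here.
VENDOR_KEY = b"vendor-demo-key"          # held by the vendor's servers
GATEKEEPER_KEY = b"gatekeeper-demo-key"  # held only in the customer's AWS account


def canonical(request: dict) -> bytes:
    """Serialize a request deterministically so both parties sign the same bytes."""
    return json.dumps(request, sort_keys=True, separators=(",", ":")).encode()


def sign(key: bytes, request: dict) -> str:
    """Return a hex signature over the canonical form of the request."""
    return hmac.new(key, canonical(request), hashlib.sha256).hexdigest()


def ip_allowed(ip: str, whitelist: list[str], blacklist: list[str]) -> bool:
    """Apply the ACL: a blacklist match always rejects; an empty whitelist
    is treated as "allow anything not blacklisted" (an assumption)."""
    addr = ipaddress.ip_address(ip)
    if any(addr in ipaddress.ip_network(net) for net in blacklist):
        return False
    return not whitelist or any(addr in ipaddress.ip_network(net) for net in whitelist)


def gatekeeper_sign(request: dict, client_ip: str,
                    whitelist: list[str], blacklist: list[str]):
    """Return the gatekeeper's signature, or None if the operation or ACL rejects."""
    if request.get("operation") not in {"GetObject", "PutObject"}:
        return None
    if not ip_allowed(client_ip, whitelist, blacklist):
        return None
    return sign(GATEKEEPER_KEY, request)


def is_authorized(request: dict, vendor_sig, gatekeeper_sig) -> bool:
    """A transaction goes through only when BOTH signatures verify."""
    if vendor_sig is None or gatekeeper_sig is None:
        return False
    return (hmac.compare_digest(vendor_sig, sign(VENDOR_KEY, request))
            and hmac.compare_digest(gatekeeper_sig, sign(GATEKEEPER_KEY, request)))


if __name__ == "__main__":
    req = {"operation": "GetObject", "bucket": "customer-bucket", "key": "reports/1.json"}
    vendor_sig = sign(VENDOR_KEY, req)  # the vendor signs after website authentication
    gk_sig = gatekeeper_sign(req, "203.0.113.7", ["203.0.113.0/24"], [])
    print(is_authorized(req, vendor_sig, gk_sig))
```

The property this is meant to illustrate is that neither signature is enough on its own: a request signed only by the vendor is rejected, and so is one whose client IP fails the customer's ACL, because the gatekeeper never produces the second signature.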

The report page previously loaded all of the content from our S3 bucket server-side and rendered the page based on that content. We could have switched to just pulling that content from the gatekeeper, but then your data would have needed to flow through our servers to get rendered. That would just not do; we should never access your data. So we redesigned the report page so that all of the communication with the gatekeeper is done client-side: the content is fetched with JavaScript, and JavaScript then renders the page.

The end result is that you can blacklist even OUR servers’ IPs and everything continues to function as it should. Moreover, every transaction that takes place on the gatekeeper is put into a searchable audit log database that the customer has full control of in their AWS account.
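As an illustration of what a searchable audit log can look like, here is a small sketch using Python's built-in sqlite3 module. The post does not say which database the gatekeeper uses or which fields it records, so the table layout, the column names, and the helper functions below are all assumptions made for the example.

```python
import sqlite3
from datetime import datetime, timezone


def open_audit_log(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a hypothetical audit log table."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS audit_log (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               logged_at TEXT NOT NULL,
               operation TEXT NOT NULL,
               object_key TEXT NOT NULL,
               client_ip TEXT NOT NULL,
               allowed INTEGER NOT NULL
           )"""
    )
    return conn


def record(conn, operation: str, object_key: str, client_ip: str, allowed: bool) -> None:
    """Append one row per transaction, whether it was allowed or rejected."""
    conn.execute(
        "INSERT INTO audit_log (logged_at, operation, object_key, client_ip, allowed) "
        "VALUES (?, ?, ?, ?, ?)",
        (datetime.now(timezone.utc).isoformat(), operation, object_key,
         client_ip, int(allowed)),
    )
    conn.commit()


def search(conn, client_ip=None, allowed=None):
    """Filter the log by client IP and/or outcome, oldest entry first."""
    clauses, params = [], []
    if client_ip is not None:
        clauses.append("client_ip = ?")
        params.append(client_ip)
    if allowed is not None:
        clauses.append("allowed = ?")
        params.append(int(allowed))
    where = (" WHERE " + " AND ".join(clauses)) if clauses else ""
    query = ("SELECT operation, object_key, client_ip, allowed FROM audit_log"
             + where + " ORDER BY id")
    return conn.execute(query, params).fetchall()


if __name__ == "__main__":
    log = open_audit_log()
    record(log, "GetObject", "reports/1.json", "203.0.113.7", True)
    record(log, "PutObject", "reports/2.json", "198.51.100.1", False)
    for row in search(log, allowed=False):
        print(row)
```

Recording rejected requests alongside allowed ones is a design choice in this sketch; it lets the customer see attempted access as well as successful access.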

We have open-sourced the gatekeeper codebase on our GitHub page because we want to encourage peer review of this vital piece of security software, and we want our customers to know without a doubt that there are no loopholes and that their data is as safe as it should be. We also think the gatekeeper concept can help other products from other companies keep customer data safe, and help them meet regulatory compliance requirements around the world.

The last piece of the HIPAA puzzle is the BAA (Business Associate Agreement). We are happy to announce that we are now willing to enter into a BAA between us, helpdeskbuttons.com LLC, and you. Furthermore, once you have your own AWS bucket set up, Amazon will enter into a BAA with you too. For instructions on how to do that, see this video.

An interesting thing about this is that using our software checks the compliance box even though your ticket system may not be compliant. It’s part of the reason why we don’t just dump all this data into your ticket system. Instead, we chose to add a link to our site, where you can view the sensitive information. This is because we’re not sure your ticket system can really be trusted with all that data. Did your ticket system vendor sign a BAA with you? Maybe not, we don’t know; but we will.

If you are interested in moving to your own bucket, or entering into a BAA with us, please contact support.

Chris Wheeler, CTO, HelpdeskButtons