Security and privacy are a big concern for us at Colloq, and we take them seriously. We want Colloq to be a safe and secure platform, and we are taking concrete measures to make sure it is and stays that way. One of them is an advanced password validity check that follows a recommendation from NIST [1] and is still rarely seen on the web. This post explains it in a little more detail.
Handling Account Security
When offering an online service there are two risks to user accounts: Firstly, the service itself can be compromised by an attacker. Secondly, a user’s password could be obtained by an attacker and then be used to target our platform by injecting malicious content or abusing the account.
Once you create an account on Colloq, we require you to verify your account via the email address provided. This is a common and relatively reliable way to prevent random signups and to ensure that a human initiated the signup.
Until you have verified your email address, your account exists, but you won’t gain all the privileges of a registered and verified user.
The Password Problem
One of many challenges for an online service is to make sure users don’t use weak passwords. While this should be ensured, a user should still be able to choose and provide a password that’s easy to remember.
With Colloq, we decided to require a minimum of 8 characters for passwords. Shorter passwords won’t be accepted by our servers. However, we recommend a longer password of at least 12 characters, ideally containing special characters, upper- and lowercase letters, and numbers. It’s also good practice to avoid using a single word that exists in a dictionary, because such passwords are relatively easy to crack. For more information about how to choose a good password, there are a couple of blog posts like this one here.
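The length rule above can be sketched in a few lines. This is only an illustration, not Colloq's actual validation code: the function name `check_password_length` and the three result labels are assumptions made for this example, and only the two thresholds (8 and 12) come from the text.

```python
MIN_LENGTH = 8           # hard minimum stated in the post
RECOMMENDED_LENGTH = 12  # recommended minimum stated in the post


def check_password_length(password: str) -> str:
    """Classify a password by length only (hypothetical helper).

    Returns "rejected" below the hard minimum, "accepted" between the
    minimum and the recommendation, and "recommended" from 12 characters up.
    """
    length = len(password)  # counts Unicode code points, not bytes
    if length < MIN_LENGTH:
        return "rejected"
    if length < RECOMMENDED_LENGTH:
        return "accepted"
    return "recommended"
```

A length check like this is only the first gate. It says nothing about whether the password is a dictionary word or has been leaked before.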
The second issue is that most users today still don’t use a password manager, but instead have one or a few “master” passwords which they use for multiple services. The main problem with this approach is that if only one of those services is hacked, all other services with the same password are potentially affected as well. Once an attacker has one password, they only need to find out which other services the user might use it for.
Checking For Leaked Passwords
This year, NIST announced new rules for governmental services regarding password security. They’re pretty interesting because they mention one specific measure to improve users’ safety:
When processing requests to establish and change memorized secrets, verifiers SHALL compare the prospective secrets against a list that contains values known to be commonly-used, expected, or compromised. For example, the list MAY include, but is not limited to:
- Passwords obtained from previous breach corpuses.
- Dictionary words.
- Repetitive or sequential characters (e.g. ‘aaaaaa’, ‘1234abcd’).
- Context-specific words, such as the name of the service, the username, and derivatives thereof.
If the chosen secret is found in the list, the CSP or verifier SHALL advise the subscriber that they need to select a different secret, SHALL provide the reason for rejection, and SHALL require the subscriber to choose a different value.
Source: NIST Guidelines
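Two of the list items in the quote, repetitive or sequential characters and context-specific words, can be checked without any external dataset. The following is a minimal sketch under stated assumptions: all function names, the run length of 4, and the wording of the rejection reasons are invented for illustration and are neither prescribed by NIST nor taken from Colloq's code.

```python
from typing import Optional, Sequence


def is_repetitive(secret: str) -> bool:
    """True when the secret is one character repeated, e.g. 'aaaaaa'."""
    return len(secret) > 1 and len(set(secret)) == 1


def has_sequential_run(secret: str, run_length: int = 4) -> bool:
    """True when `run_length` neighbouring characters ascend by one
    code point each, e.g. '1234' or 'abcd'. The threshold is an assumption."""
    streak = 1
    for previous, current in zip(secret, secret[1:]):
        streak = streak + 1 if ord(current) - ord(previous) == 1 else 1
        if streak >= run_length:
            return True
    return False


def contains_context_word(secret: str, context_words: Sequence[str]) -> bool:
    """True when the secret contains the service name, username, or similar."""
    lowered = secret.lower()
    return any(word.lower() in lowered for word in context_words if word)


def rejection_reason(secret: str, context_words: Sequence[str]) -> Optional[str]:
    """Return a human-readable reason for rejection, or None if no rule matched."""
    if is_repetitive(secret):
        return "repetitive characters"
    if has_sequential_run(secret):
        return "sequential characters"
    if contains_context_word(secret, context_words):
        return "contains a context-specific word"
    return None
```

Returning the reason rather than a plain boolean mirrors the requirement in the quote that the verifier provide the reason for rejection.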
In essence, this means that governmental services should check whether a password provided by the user has been leaked in a known data breach. In such a case, the service warns the user and requires them to choose a different password.
We’re not required to do this check by law but in our opinion it makes a lot of sense to perform this check for any online service. If a user’s password appears in a public data leak, an attacker can use the information and abuse the account for their purposes. While it’s not the fault of the service, we want to proactively protect our users and the content on our platform.
How We Check for Leaked Passwords
When you set a password on Colloq, we compute its SHA-1 hash and compare it to the massive dataset of the HIBP project to see if the hash exists there. If we find a match, we’ll require you to either choose a different password if you don’t have one yet (fresh sign-up), or to change the password if you already have a verified account.
Note: We don’t store this simple SHA-1 hash of your account password. It is only used temporarily for the leak lookup. Your account password is hashed and stored securely using our programming language’s default password functions.
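To make the lookup step concrete, here is a minimal sketch of hashing a candidate password with SHA-1 and testing it against a set of known leaked hashes. The function names and the in-memory set are assumptions for illustration, as is the choice of uppercase hexadecimal for the digest. The real check runs against a database, and the hash is discarded after the lookup.

```python
import hashlib
from typing import Set


def sha1_hex(password: str) -> str:
    """Return the SHA-1 digest of the password as uppercase hex.

    This hash is only suitable for the leak lookup, never for storing
    the account password itself.
    """
    return hashlib.sha1(password.encode("utf-8")).hexdigest().upper()


def appears_in_leak(password: str, leaked_hashes: Set[str]) -> bool:
    """True when the password's SHA-1 hash is in the leaked set (hypothetical helper)."""
    return sha1_hex(password) in leaked_hashes


# Tiny stand-in for the leaked-hash dataset: the SHA-1 of "password".
example_leaks = {"5BAA61E4C9B93F3F0682250B6CF8331B7EE68FD8"}
```

With this stand-in set, `appears_in_leak("password", example_leaks)` is true, and a sign-up with that password would be asked to pick a different one.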
We don’t want to share sensitive data such as password hashes with a third party, so we manually cloned the dataset and set it up on our own Postgres database server to do the required check. This lets us decide on our own database structure and on how we perform the required queries.
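A self-hosted lookup of this kind boils down to a single indexed existence query. The sketch below uses Python's built-in `sqlite3` module with an in-memory database purely as a stand-in for Postgres, so that it runs anywhere. The table name `leaked_hashes`, the column name `sha1`, and the choice to store the raw 20-byte digest are all assumptions, not the schema actually used.

```python
import hashlib
import sqlite3


def create_store() -> sqlite3.Connection:
    """Create an in-memory stand-in for the leaked-hash database."""
    connection = sqlite3.connect(":memory:")
    # The primary key gives an index, so lookups do not scan the table.
    connection.execute(
        "CREATE TABLE leaked_hashes (sha1 BLOB PRIMARY KEY) WITHOUT ROWID"
    )
    return connection


def add_leaked_password(connection: sqlite3.Connection, password: str) -> None:
    """Insert a password's SHA-1 digest; duplicates are silently ignored."""
    digest = hashlib.sha1(password.encode("utf-8")).digest()
    connection.execute(
        "INSERT OR IGNORE INTO leaked_hashes (sha1) VALUES (?)", (digest,)
    )


def hash_is_in_store(connection: sqlite3.Connection, password: str) -> bool:
    """Return True when the password's SHA-1 digest exists in the table."""
    digest = hashlib.sha1(password.encode("utf-8")).digest()
    row = connection.execute(
        "SELECT 1 FROM leaked_hashes WHERE sha1 = ? LIMIT 1", (digest,)
    ).fetchone()
    return row is not None
```

Storing the binary digest instead of the 40-character hex string roughly halves the space needed per entry, which matters at hundreds of millions of rows. Whether that trade-off was made here is not stated in the post.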
Importing 43GB of Data…
The initial data import was pretty costly, but running the check is impressively fast. The dataset is an extremely large collection of text files containing leaked password hashes that are neither sorted nor deduplicated.
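Because the source files are unsorted and contain duplicates, an import has to normalize and deduplicate the hashes at some point. The following sketch shows the idea on a small scale. It assumes one hex-encoded SHA-1 hash per line, and the function name is hypothetical. Keeping every hash in a Python set will not scale to the full dataset, where a unique index in the database or an external sort would be the more realistic tool.

```python
import re
from typing import Iterable, Iterator

# A SHA-1 digest is 160 bits, i.e. exactly 40 hexadecimal characters.
SHA1_HEX = re.compile(r"[0-9A-F]{40}")


def normalized_hashes(lines: Iterable[str]) -> Iterator[str]:
    """Yield each valid SHA-1 hex hash once, uppercased.

    Blank lines, malformed lines, and repeated hashes are skipped.
    """
    seen = set()
    for line in lines:
        candidate = line.strip().upper()
        if not SHA1_HEX.fullmatch(candidate):
            continue  # not a well-formed SHA-1 hex digest
        if candidate in seen:
            continue  # duplicate, possibly from another source file
        seen.add(candidate)
        yield candidate
```

Because the function is a generator, it can be fed directly from an open file and its output streamed into a bulk insert without holding the raw lines in memory.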
Creating an initial import that works within our environment required some tests and trial runs. We started creating the dump on a local iMac, but the space requirements made us realize that we needed a better, more flexible approach that anyone on the team could reproduce. The flexible pricing of Digital Ocean allowed us to boot up a more powerful machine, create the dump, and destroy the machine after we had successfully copied the data to a backup server. Within about a day we were able to import the database, and we now have a separate, ready-to-use Postgres database with all the leaked password hashes available to our application.
The Performance of Querying 320 Million Password Hashes
Checking for a leaked password is actually quite fast. The query adds roughly 8 ms to the average request response time on our production servers, which are relatively small boxes with 1 GB of memory and 2 dedicated vCPUs.
So, Is It Worth It?
You might wonder if it’s worth going through all of this effort for a platform like ours. While importing the data definitely is a challenge, running the service is nearly a no-brainer.
To see why it was important enough for us to implement a system like this, let’s look at what happens if a user account is hacked. For the user, it doesn’t matter who’s responsible for the hack: whether the service itself was breached or the account was compromised because they re-used a leaked password, the result is the same. And for us, it’s certainly no fun to hear that a user’s account has been hacked. As a platform provider you always need to check whether someone gained access to private data or did other harm (e.g. adding unwanted content to the service). Either way, it comes back to bite you, and we certainly can’t blame our users for it.
Being able to limit the ways in which accounts can be hacked, or in which data on our platform can be manipulated in a harmful way, is reason enough for us to implement such a solution. We are aware that this can cause some inconvenience when choosing a password, but ultimately we care about the safety of our service and our users, and hence consider this step worth it.
[1] National Institute of Standards and Technology