Unmasking Masks | Circadian

You can't go long in the hacking world without coming across masking. Masking, much like dictionary attacks, makes the infeasible entirely possible.

Once again, the clever alignment of computing power with already-known patterns of user behaviour has the power to create a very serious headache for the corporate whose data has just gone AWOL.

A re-hash
We talked about keyspace before, but a quick refresh: a one-character password (not recommended!) occupies a space of 95 (the number of printable characters that our single-character password could be); a two-character password occupies a space of 95 x 95, and so on. After N characters the space is 95^N, which becomes very big very quickly (and big spaces take longer to search, even for computers).
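To make the refresh concrete, here is a minimal Python sketch of the keyspace arithmetic. It assumes the 95 refers to the printable ASCII characters (letters, digits, punctuation and the space); the function name `keyspace` is purely illustrative.

```python
import string

# Assumption: the "95" is printable ASCII, i.e. 52 letters + 10 digits
# + 32 punctuation marks + the space character.
ALPHABET = string.ascii_letters + string.digits + string.punctuation + " "
ALPHABET_SIZE = len(ALPHABET)


def keyspace(length: int, alphabet_size: int = ALPHABET_SIZE) -> int:
    """Number of distinct passwords of exactly `length` characters."""
    return alphabet_size ** length


if __name__ == "__main__":
    for n in (1, 2, 6, 8):
        print(f"{n} characters: {keyspace(n):,} possibilities")
```

Each extra character multiplies the space by another 95, which is why the numbers run away so quickly.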

Timing is everything: search wisely
The key idea behind masking is to marshal our search space in accordance with our beliefs about users' actual password choices. Instead of 'bruting', where we attack the key-space (i.e. simulate passwords) indiscriminately, we tailor (and/or order) our search towards those sectors most likely to yield success. In other words, better to generate candidates such as 'myship123' than 'i92v@5G'.

This way we generate results faster. And time is a precious commodity in any game of cat and mouse.

Hashcat explains all of this brilliantly here.

Alphabet Soup
So, the general approach is that we consider each character in our password, reducing each sequentially to its base alphabet, either L (lower), U (upper), D (digit), or S (symbol), with a final catch-all, A (anything).

In other words, entries such as 'myship123' reduce to LLLLLLDDD. As you can already imagine, this L+D+ pattern (where the + means 1 or more) is very popular with users. The LDDLSDU pattern (representing the password 'i92v@5G'), less so.
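The reduction is mechanical, so it can be sketched in a few lines of Python. This is only an illustration: the function name `to_mask` is made up, and treating the space as a symbol and any non-ASCII character as A are assumptions rather than anything fixed by the scheme.

```python
import string


def to_mask(password: str) -> str:
    """Reduce each character to its base alphabet: L, U, D, S or A."""
    mask = []
    for ch in password:
        if ch in string.ascii_lowercase:
            mask.append("L")
        elif ch in string.ascii_uppercase:
            mask.append("U")
        elif ch in string.digits:
            mask.append("D")
        elif ch in string.punctuation or ch == " ":
            mask.append("S")  # assumption: space counts as a symbol
        else:
            mask.append("A")  # assumption: anything else falls to the catch-all
    return "".join(mask)


if __name__ == "__main__":
    for candidate in ("myship123", "i92v@5G"):
        print(candidate, "->", to_mask(candidate))
```

Running it over the two examples is a handy way to check a mask by hand, character by character.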

I'll have an L please. Followed by a D, followed by a U. No S's thank you.
Let's use some hard data to get an idea of what this all might mean in practice. We empirically know that most passwords are between 6 and 8 characters long, so let's not include any masks longer than 8. We also know that a typical mask distribution looks something like this:

[Figure: typical distribution of passwords across mask buckets]

So, the first 66 buckets contribute the vast majority of all passwords, and they do so in strongly decreasing importance. We might then decide, for speed, that we only want to look at the first 15 buckets, thereby allowing us to search a keyspace that is likely to pick up c.60% of all potential passwords.
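How would you arrive at such buckets yourself? One way, sketched below in Python, is to reduce every password in a leaked list to its mask, count the masks, and keep a running total of coverage. The six passwords here are made up purely to make the example run; they are not real breach data, and `to_mask` and `mask_coverage` are illustrative names.

```python
import string
from collections import Counter


def to_mask(password: str) -> str:
    """Map each character to L, U, D, S, or A (catch-all)."""
    classes = (
        (string.ascii_lowercase, "L"),
        (string.ascii_uppercase, "U"),
        (string.digits, "D"),
        (string.punctuation + " ", "S"),
    )
    out = []
    for ch in password:
        out.append(next((tag for chars, tag in classes if ch in chars), "A"))
    return "".join(out)


def mask_coverage(passwords):
    """Return (mask, count, cumulative share) rows, most popular mask first."""
    counts = Counter(to_mask(p) for p in passwords)
    total = sum(counts.values())
    rows, running = [], 0
    for mask, n in counts.most_common():
        running += n
        rows.append((mask, n, running / total))
    return rows


if __name__ == "__main__":
    # Hypothetical sample, invented for illustration only.
    sample = ["monkey", "dragon", "123456", "letmein", "qwerty12", "Passw0rd"]
    for bucket, (mask, n, share) in enumerate(mask_coverage(sample), start=1):
        print(f"Bucket {bucket}: {mask:<10} count={n} cumulative={share:.0%}")
```

Cutting the list off at the first 15 rows is then just a slice, and the cumulative column tells you what share of passwords you are giving up by doing so.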

My view might be that Bucket 1 is LLLLLL, Bucket 2 LLLLLLL and Bucket 3 LLLLLLLL (i.e. 6, 7 or 8 lowercase), Bucket 4 DDDDDD (6 digits), etc. I used the term "might be" at the start of the preceding sentence only because that is my data-informed view today, but that view may (read: will) change as more data (read: data breaches) come to light and we spot changes in users' passwording behaviour. That data-informed view may also change depending upon region (Chinese users tend to prefer digits, in particular 8) and the provenance of the dataset (e.g. expect to see 'eBay' more often in an eBay breach).

We now limit our search to those keyspaces only, remembering for example that searching Bucket 1 is much simpler than searching all 6-character passwords (26^6 possibilities rather than 95^6). The same goes for Buckets 2 and 3.

Bucket 4 of course - and digit passwords are very popular - occupies only about 1/10th of the space per character (10 possibilities vs 95).
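The size of any one bucket is just the product of its per-character alphabet sizes, which the Python sketch below computes. It also translates a mask into Hashcat's built-in charset placeholders (?l, ?u, ?d, ?s, ?a). The function names are illustrative, and counting S as 33 characters (32 punctuation marks plus the space, as in Hashcat's ?s) is an assumption about how the 95 is split.

```python
# Per-character alphabet sizes. Assumption: S is 32 punctuation marks plus
# the space, so that L + U + D + S adds up to the full 95.
SIZES = {"L": 26, "U": 26, "D": 10, "S": 33, "A": 95}

# Hashcat's built-in charset placeholders for each class.
HASHCAT = {"L": "?l", "U": "?u", "D": "?d", "S": "?s", "A": "?a"}


def mask_keyspace(mask: str) -> int:
    """Number of candidate passwords that a single mask covers."""
    size = 1
    for symbol in mask:
        size *= SIZES[symbol]
    return size


def to_hashcat(mask: str) -> str:
    """Translate an L/U/D/S/A mask into Hashcat mask syntax."""
    return "".join(HASHCAT[symbol] for symbol in mask)


if __name__ == "__main__":
    for mask in ("LLLLLL", "LLLLLLL", "LLLLLLLL", "DDDDDD", "AAAAAA"):
        print(f"{mask:<9} {to_hashcat(mask):<17} {mask_keyspace(mask):,}")
```

Comparing the LLLLLL and DDDDDD rows with the AAAAAA row shows how much smaller a single bucket is than the full six-character space.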

Efficiency Savings
Add up all those efficiency savings and in an eight-character search you reduce a problem that occupies an entropy space of 52.6 bits (that is, log2(95^8)) to 38.3 bits.

As ever, the gap between the two (14.3 bits) should be thought of as a factor, or efficiency scaling, and equates to around x20,000 (2^14.3).
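The arithmetic is easy to reproduce. In the Python sketch below, the 38.3 figure is simply taken as given from the text (it depends on which buckets were chosen, which is not recomputed here); only the 52.6 and the resulting scaling factor are derived.

```python
import math

# Full eight-character space over 95 printable characters, in bits.
full_bits = math.log2(95 ** 8)

# Masked search space in bits: taken as given from the text, not derived.
masked_bits = 38.3

gap_bits = full_bits - masked_bits
scaling = 2 ** gap_bits

if __name__ == "__main__":
    print(f"full space:   {full_bits:.1f} bits")
    print(f"masked space: {masked_bits:.1f} bits")
    print(f"gap:          {gap_bits:.1f} bits")
    print(f"scaling:      roughly x{scaling:,.0f}")
```

Because entropy is a base-2 logarithm, a difference in bits turns back into a multiplicative factor by raising 2 to that difference.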

Corporate Impact: business as usual or management clearout?
It always makes sense to put such numbers in context: with cracking speeds of 100 million guesses per second (easily achievable these days), that efficiency scaling is the difference between the hacker cracking over half your database in under an hour versus more than two years.
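For anyone who wants to check the hour-versus-years claim, here is a rough Python sketch. It assumes a flat 100 million guesses per second and an exhaustive sweep of each space, and it reuses the 38.3-bit figure quoted earlier as given; the constant names are illustrative.

```python
GUESSES_PER_SECOND = 100_000_000  # the cracking speed assumed in the text
SECONDS_PER_YEAR = 365.25 * 24 * 3600

# Masked search: 38.3 bits, figure taken as given from the text.
masked_seconds = 2 ** 38.3 / GUESSES_PER_SECOND

# Unrestricted search: every eight-character password over 95 characters.
full_seconds = 95 ** 8 / GUESSES_PER_SECOND

if __name__ == "__main__":
    print(f"masked search: {masked_seconds / 60:.0f} minutes")
    print(f"full search:   {full_seconds / SECONDS_PER_YEAR:.1f} years")
```

Real cracking rates vary enormously with the hashing algorithm and hardware, so treat the outputs as an order-of-magnitude comparison rather than a forecast.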

For the corporate that has just lost control of its data, that's the difference between containment of the problem (with interim password resets, regulator and media handling, etc.) and a hugely damaging leak.

Further Reading
If you would like to read more on some of the theory behind the ideas, there's a nice paper, "Passwords: Divided they Stand, Combined they Fall", here.

Brute Force or nimble Mask?