Protecting Against ‘Natural’ Cybersecurity Erosion
Every child who’s ever played a board game understands that the act of rolling dice yields an unpredictable result. In fact, that’s why children’s board games use dice in the first place: to ensure a random outcome in which (from a macro point of view, at least) each result is about equally likely every time the die is thrown.
Consider for a moment what would happen if someone replaced the dice used in one of those board games with weighted dice: say, dice that were 10 percent more likely to come up “6” than any other number. Would you notice? The realistic answer is probably not. You’d probably need hundreds of rolls before anything would seem fishy about the outcomes, and you’d need thousands of rolls before you could prove it.
Because the outcome is expected to be uncertain, a subtle shift like that is almost impossible to spot at a glance: a biased playing field looks just like a level one.
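The “thousands of rolls” estimate can be sanity-checked with a rough power calculation. What follows is a minimal sketch, not a rigorous analysis: it assumes “10 percent more likely” means a six is 1.1 times as likely as each other face, uses a one-sided z-test on the share of sixes with a normal approximation, and picks a 5 percent significance level and 80 percent power. The function name and every threshold here are my own illustrative choices.

```python
from math import ceil, sqrt
from statistics import NormalDist


def rolls_needed(p_fair: float, p_biased: float,
                 alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate number of rolls a one-sided z-test on the share of sixes
    needs to tell p_biased apart from p_fair (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # critical value of the test
    z_power = NormalDist().inv_cdf(power)      # quantile for the desired power
    spread = (z_alpha * sqrt(p_fair * (1 - p_fair))
              + z_power * sqrt(p_biased * (1 - p_biased)))
    return ceil((spread / (p_biased - p_fair)) ** 2)


# Assumed reading of the weighting: a six is 1.1x as likely as each other face,
# so the six probabilities are proportional to 1, 1, 1, 1, 1, 1.1.
P_FAIR = 1 / 6
P_BIASED = 1.1 / 6.1

n_rolls = rolls_needed(P_FAIR, P_BIASED)
print(f"Approximate rolls needed: {n_rolls}")
```

Under these assumptions the answer comes out at a few thousand rolls, which fits the intuition: the weighted die shows a six only about 1.4 percentage points more often than a fair one, and a gap that small hides easily in ordinary randomness.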
This is true in security too. Security outcomes are not always entirely deterministic or directly causal. That means, for example, that you could do everything right and still get hacked — or you could do nothing right and, through sheer luck, avoid it.
The business of security, then, lies in increasing the odds of the desirable outcomes while decreasing the odds of undesirable ones. It’s more like playing poker than following a recipe.
There are two ramifications of this. The first is the truism that every practitioner learns early on — that security return on investment is difficult to calculate.
The second and more subtle implication is that a slow, non-obvious unbalancing of the odds is particularly dangerous. It’s difficult to spot, difficult to correct, and can undermine your efforts without your being any the wiser. Unless you’ve planned for that kind of drift and baked in mechanisms to monitor for it, you probably won’t see it, let alone be able to correct for it.
Slow Erosion
Now, if a gradual decrease in the efficacy of security controls and countermeasures sounds far-fetched to you, I’d argue there are actually a number of ways that efficacy can erode slowly over time.
Consider first that allocation of staff isn’t static and that team members aren’t fungible. This means that a reduction in staff can cause a given tool or control to have fewer touchpoints, in turn decreasing the tool’s utility in your program. It also means that a reallocation of responsibilities can reduce effectiveness when one engineer is less skilled or less experienced than another.
Likewise, changes in technology itself can impact effectiveness. Remember the impact that moving to virtualization had on intrusion detection system deployments a few years back? In that case, a technology change (virtualization) decreased the ability of an existing control (IDS) to perform as expected.
This happens routinely and is currently an issue as we adopt machine learning, increase use of cloud services, move to serverless computing, and adopt containers.
There’s also a natural erosion that’s part and parcel of human nature. Consider budget allocation. An organization that hasn’t been victimized by a breach might look to shave dollars off technology spending — or fail to invest in a manner that keeps pace with expanding technology.
Its management might conclude that since reductions in prior years had no observable adverse effect, the system should be able to bear more cuts. Because the overall outcome is probability-based, that conclusion might appear to be right for years, even though the organization might gradually be increasing the probability of something catastrophic occurring.
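A little arithmetic shows why “nothing bad happened after the last cuts” is such weak evidence. This is a minimal sketch with made-up numbers: the annual breach probabilities are purely hypothetical, and it assumes each year is independent of the others, which real environments won’t strictly satisfy.

```python
def prob_no_incident(annual_probability: float, years: int) -> float:
    """Chance of seeing zero incidents across `years` independent years."""
    return (1 - annual_probability) ** years


# Hypothetical annual breach probabilities, before and after budget cuts.
BEFORE_CUTS = 0.05
AFTER_CUTS = 0.10
YEARS = 3

quiet_before = prob_no_incident(BEFORE_CUTS, YEARS)
quiet_after = prob_no_incident(AFTER_CUTS, YEARS)

print(f"P(no incident in {YEARS} years) before cuts: {quiet_before:.3f}")
print(f"P(no incident in {YEARS} years) after cuts:  {quiet_after:.3f}")
```

With these illustrative figures, doubling the annual risk still leaves roughly a 73 percent chance of three quiet years (versus about 86 percent before the cuts). A clean record is the most likely observation either way, so it says very little about whether the cuts were safe.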
Planning Around Erosion
The overall point here is that these shifts are to be expected over time. However, anticipating shifts — and building in instrumentation to know about them — separates the best programs from the merely adequate. So how can we build this level of understanding and future-proofing into our programs?
To begin with, there is no shortage of risk models and measurement approaches, systems security engineering capability models (e.g., NIST SP 800-160 and ISO/IEC 21827), maturity models, and the like. The one thing they all have in common is that they establish some mechanism for measuring how specific controls within a system affect the organization as a whole.
The lens you pick — risk, efficiency/cost, capability, etc. — is up to you, but at a minimum the approach should be able to give you information frequently enough to understand how well specific elements perform in a manner that lets you evaluate your program over time.
There are two sub-components here: first, the value each control provides to the overall program; and second, the degree to which changes to a given control affect how well it performs.
The first set of data is basically risk management: building out an understanding of what each control is worth to your program overall. If you’ve followed a risk management model to select controls in the first place, chances are you have the data already.
If you haven’t, a risk-management exercise (when done in a systematic way) can give you this perspective. Essentially, the goal is to understand the role of a given control in supporting your risk/operational program. Will some of this be educated guesswork? Sure. But establishing a working model at a macro level (that can be improved or honed down the road) means that micro changes to individual controls can be put in context.
The second part is building out instrumentation for each of the supporting controls, such that you can understand how changes, whether positive or negative, affect that control’s performance.
As you might imagine, the way you measure each control will be different, but systematically asking the question, “How do I know this control is working?” — and building in ways to measure the answer — should be part of any robust security metrics effort.
This lets you understand the overall role and intent of the control against the broader program backdrop, which in turn means that changes to it can be contextualized in light of what you ultimately are trying to accomplish.
Having a metrics program that doesn’t provide the ability to do this is like having a jetliner cockpit that’s missing the altimeter. It’s missing one of the most important pieces of data — from a program management perspective, at least.
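To make “How do I know this control is working?” concrete, here is a minimal sketch of one way to watch a single control for slow drift. Everything in it is an assumption for illustration: the metric (a weekly detection rate from periodic test injections), the window sizes, the tolerance, and the function names are hypothetical, and a real program would pick measures that suit each control.

```python
from statistics import mean
from typing import Sequence


def drift_from_baseline(samples: Sequence[float], baseline_size: int,
                        recent_size: int) -> float:
    """Return the recent-window mean minus the baseline-window mean.
    A negative value means the control is performing worse than at baseline."""
    if len(samples) < baseline_size + recent_size:
        raise ValueError("not enough samples for both windows")
    baseline = mean(samples[:baseline_size])   # earliest measurements
    recent = mean(samples[-recent_size:])      # latest measurements
    return recent - baseline


def is_eroding(samples: Sequence[float], baseline_size: int = 4,
               recent_size: int = 4, tolerance: float = 0.05) -> bool:
    """Flag the control when the recent mean has dropped more than `tolerance`."""
    return drift_from_baseline(samples, baseline_size, recent_size) < -tolerance


# Hypothetical weekly detection rates from test injections against one control.
# No single week drops by more than 0.02, so no individual reading looks alarming.
weekly_detection_rate = [0.92, 0.93, 0.91, 0.92, 0.91, 0.90,
                         0.89, 0.88, 0.86, 0.85, 0.84, 0.83]

print("Drift:", round(drift_from_baseline(weekly_detection_rate, 4, 4), 3))
print("Eroding:", is_eroding(weekly_detection_rate))
```

The design choice here is to compare against a fixed baseline rather than the previous reading. Week-over-week comparisons are exactly what slow erosion slips past, while a baseline comparison accumulates the small losses until they cross a threshold you chose in advance.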
The point is, if you’re not looking at risk systematically, one strong argument for why you should do so is the natural, gradual erosion of control effectiveness that can occur once a given control is implemented. If you’re not already doing this, now might be a good time to start.