The Shortfalls of Mean Time Metrics in Cybersecurity

Security teams at mid-sized organizations constantly face the question, “What does success look like?” At ActZero, a data-driven approach to cybersecurity means grappling daily with how to measure, evaluate, and validate the work they do on behalf of their customers.

Like most, they initially turned toward the standard metrics used in cybersecurity, built around a “Mean Time to X” (MTTX) formula, where X indicates a specific milestone in the attack lifecycle. Typical milestones include Detect, Alert, Respond, Recover, and, when necessary, Remediate.

However, as they started to operationalize their AI and machine-learning approach, they realized that “speed” measures alone weren’t telling the whole story. More importantly, measuring only speed is less meaningful in an industry where machine-driven alerts and responses happen in fractions of a second.

So, instead of focusing solely on the old MTTX formula, they borrowed a long-standing idea from another time-sensitive industry: video streaming. Leading streaming platforms like Netflix, YouTube, and Amazon care about two core principles: speed and signal quality. Simply put: when streaming a video, it should arrive reliably within a certain time (Speed), and your video should look great when it does (Quality). Let’s face it: who cares if the video stream carrying your team’s game shows up on your screen fast if you can’t see them score the goal!

This speed and quality concept applies squarely to cybersecurity alerts as well: it’s critical that alerts arrive reliably within a certain time (Speed), and that those alerts aren’t wrong (Quality). In cybersecurity, it doesn’t matter how quickly you alert on a detection that is wrong (or worse, how quickly you get buried by “wrong” detections).

So as they took a step back to assess how they could improve their measurement of success, they borrowed a simple yet powerful measure from their video streaming counterparts: Signal-to-Noise Ratio (SNR). SNR is the ratio of the amount of desired information received (“signal”) to the amount of undesired information received (“noise”). Success is then measured by a high signal with minimal noise, while still meeting specific TTX targets. It’s important to note the lack of “mean” here, but more on that later.
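As a rough illustration of the ratio described above, the sketch below computes SNR from two alert counts. It assumes that “signal” means alerts confirmed as accurate after triage and “noise” means alerts confirmed as inaccurate; the function name and the sample counts are hypothetical and are not taken from ActZero.

```python
def signal_to_noise(true_positives: int, false_positives: int) -> float:
    """Ratio of accurate detections (signal) to inaccurate alerts (noise)."""
    if true_positives < 0 or false_positives < 0:
        raise ValueError("counts must be non-negative")
    if true_positives == 0:
        return 0.0  # no signal at all, whatever the noise level
    if false_positives == 0:
        return float("inf")  # signal with no noise
    return true_positives / false_positives


# Hypothetical counts for one reporting period.
snr = signal_to_noise(true_positives=45, false_positives=15)
print(f"SNR = {snr:.1f}")  # SNR = 3.0
```

A higher value means analysts spend more of their time on real detections. On its own the ratio says nothing about speed, which is why it is paired with TTX targets.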

To better understand how considering SNR as well will serve your SOC better, let’s walk through three key shortcomings of Mean Time metrics. By understanding SNR for cybersecurity, you’ll be better equipped to assess security providers in a market with a rapidly growing number of AI-driven solutions, and you’ll have a better sense of what makes for a quality detection (rather than a fast but inaccurate one).

1 Outliers influence mean times

Means are averages and, therefore, can smooth volatile data values and hide important trends. A mean TTX says nothing about how often you actually hit it: a handful of slow incidents can pull the mean far above the typical response, while most responses beat it comfortably. Therefore, when they discuss time metrics at ActZero, they always attach a “total percentage n” (TPn) to show what percentage of the time a given target is met. When they say a TTX of 5 seconds at TP99, they’re really saying that 99 out of 100 times, they hit a TTX of 5 seconds or better. This total percentage helps you understand how likely it is that your incident will be an actual “outlier” and cost you days of remediation and potential downtime.
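The gap between a mean and a percentage-based target is easy to see with a small sketch. The response times below are invented for illustration, and the nearest-rank percentile used here is only one of several common percentile definitions; it is an assumption, not necessarily how ActZero computes its TP figures.

```python
import math
import statistics


def percentile_nearest_rank(values, pct):
    """Smallest sample such that at least pct% of all samples are <= it."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


# Hypothetical response times in seconds: 98 fast responses, two slow outliers.
response_times = [2.0] * 98 + [600.0, 3600.0]

mean_ttx = statistics.mean(response_times)
tp98 = percentile_nearest_rank(response_times, 98)
tp99 = percentile_nearest_rank(response_times, 99)

print(f"mean TTX: {mean_ttx:.2f} s")  # mean TTX: 43.96 s
print(f"TP98:     {tp98:.2f} s")      # TP98:     2.00 s
print(f"TP99:     {tp99:.2f} s")      # TP99:     600.00 s
```

In this made-up data set, the mean of roughly 44 seconds describes neither the typical response (2 seconds) nor the outliers (10 minutes and an hour), whereas the percentile figures state exactly how often a given target is met.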

2 Mean times = legacy metric

As a measurement standard, mean times are a legacy paradigm brought over from call centers many eons ago. Over the years, cybersecurity leaders adopted similar metrics because IT departments were familiar with them.

In today’s reality, mean times don’t map directly to the type of work we do in cybersecurity, and we can’t entirely generalize them to be meaningful indicators across the attack lifecycle. While these averages might convey speed relative to specific parts of the attack lifecycle, they don’t provide any actionable information other than potentially telling you to hurry up. In the best-case scenario, MTTX becomes a vanity metric that looks great on an executive dashboard but provides little actual business intelligence.

3 Signal-to-noise ratio measures quality detections

The fastest MTTX is not worth anything if it measures the creation of an inaccurate alert. We want mean time metrics to tell us about actual alerts, or true positives, and not be skewed by bad data.
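To make the skew concrete, here is a minimal sketch that computes a mean time to alert twice: once over every alert and once over true positives only. The Alert record, its field names, and the timings are hypothetical, and the sketch assumes each alert has already been labeled during triage.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Alert:
    seconds_to_alert: float
    true_positive: bool  # label assigned after triage (assumed to exist)


# Hypothetical alerts: three instant but wrong, two slower but correct.
alerts = [
    Alert(0.5, False),
    Alert(0.4, False),
    Alert(0.6, False),
    Alert(30.0, True),
    Alert(50.0, True),
]

mttx_all = mean(a.seconds_to_alert for a in alerts)
mttx_true = mean(a.seconds_to_alert for a in alerts if a.true_positive)

print(f"MTTX over all alerts:      {mttx_all:.1f} s")   # 16.3 s
print(f"MTTX over true positives:  {mttx_true:.1f} s")  # 40.0 s
```

Here the fast but inaccurate alerts make the headline number look more than twice as good as the time it actually took to raise a real detection.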

So, you might be thinking, “how does an untuned MTTX tell you about the quality of work your security provider does, or how safe it makes your systems?” And you would be correct in questioning that, as it doesn’t.

If you truly want to understand the efficacy of your security provider, you have to understand (1) the breadth of coverage and (2) the quality of detections. The speed vs. quality challenge is why ActZero thinks (and measures success) in terms of SNR rather than mean times.

For security providers or those running a SOC in-house, it’s the signal of quality detections relative to the mass of benign events and other noise that will enable you to understand your SNR and use it to drive operational efficiency. And, when it comes time for that quarterly executive update, you will be able to tell a much stronger and more valuable story about your cybersecurity efforts than MTTX on a dashboard ever could.

Action item: Look at how many quality detections your cybersecurity provider raises relative to the number of inaccurate alerts to understand the real measure of how successful they are at keeping your systems safe.

How ActZero is helping customers like you

There are better measures than MTTX to evaluate cybersecurity efficacy. They recommend thinking in terms of signal-to-noise to better measure the quality and breadth of detections made by your security provider. New metrics like signal-to-noise will be crucial as cybersecurity solutions are empowered through AI and machine learning to react at machine speed.

To explore their thinking on this more deeply, check out their white paper in collaboration with TechTarget, “Contextualizing Mean Time Metrics to Improve Evaluation of Cybersecurity Vendors.”

Note — This article is contributed and written by Jerry Heinz, VP of Engineering at ActZero.ai. He is an industry veteran with over 22 years of experience in product design and engineering. As the VP of Engineering at ActZero, Jerry drives the company’s Research and Development efforts in its evolution as the industry’s leading Managed Detection and Response service provider.

ActZero.ai is a cybersecurity startup that makes small- and mid-size businesses more secure by empowering teams to cover more ground with fewer internal resources. Our intelligent managed detection and response service provides 24/7 monitoring, protection, and response support that goes well beyond other third-party software solutions. Our teams of data scientists leverage cutting-edge technologies like AI and ML to scale resources, identify vulnerabilities and eliminate more threats in less time. We actively partner with our customers to drive security engineering, increase internal efficiencies and effectiveness and, ultimately, build a mature cybersecurity posture. Whether shoring up an existing security strategy or serving as the primary line of defense, ActZero enables business growth by empowering customers to cover more ground. For more information, visit https://actzero.ai