It is a common theme of mine, but one that bears repeating: we collect and disseminate all manner of data, but not much of it is data that count. In the March 29, 2021 edition of The New Yorker, mathematician Hannah Fry reviews two new books on data and statistics. The article is titled “What Really Counts: When it comes to people—and policy—numbers are both powerful and perilous” and is available on The New Yorker’s website under the title “What Data Can’t Do.”
The books, by Deborah Stone and Tim Harford, are well worth reading for those who want to delve more deeply into the subject. Here, I will comment only on the review article, which makes several very important points. One quote is: “Numbers can be at their most dangerous when they are used to control things rather than to understand them.” [emphasis added] This distinction is particularly important when it comes to artificial intelligence (AI) and machine learning (ML) and their application to cybersecurity.
For cybersecurity, it is the difference between intrusion detection systems (IDS) and intrusion prevention systems (IPS). An intrusion detection system lets you know that an attack may be in progress, but does nothing to stop it. An intrusion prevention system, on the other hand, actively blocks a suspected attack, which can prevent further damage but can also disrupt legitimate activity if the IPS is mistaken, that is, if the alarm is a false positive.
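To make the distinction concrete, here is a minimal Python sketch. It is a toy of my own, not how any real IDS or IPS product works: the `Event` record, the anomaly `score`, the `THRESHOLD` value, and the `handle` function are all hypothetical names invented for illustration. The same numbers feed both modes; only the "ips" mode uses them to control what gets through.

```python
from dataclasses import dataclass


@dataclass
class Event:
    source: str      # where the traffic came from
    score: float     # hypothetical anomaly score between 0 and 1
    malicious: bool  # ground truth, known only because this is a toy example


THRESHOLD = 0.7  # assumed cutoff; a real system would have to tune this


def handle(events, mode):
    """Process events in 'ids' mode (alert only) or 'ips' mode (alert and block)."""
    if mode not in ("ids", "ips"):
        raise ValueError("mode must be 'ids' or 'ips'")
    alerts, blocked, delivered = [], [], []
    for event in events:
        flagged = event.score >= THRESHOLD
        if flagged:
            alerts.append(event.source)   # both modes tell a human
        if flagged and mode == "ips":
            blocked.append(event.source)  # only the IPS acts on the number
        else:
            delivered.append(event.source)
    return {"alerts": alerts, "blocked": blocked, "delivered": delivered}


events = [
    Event("10.0.0.5", 0.95, True),   # a real attack
    Event("10.0.0.8", 0.80, False),  # legitimate but unusual: a false positive
    Event("10.0.0.9", 0.10, False),  # ordinary traffic
]

ids_result = handle(events, "ids")
ips_result = handle(events, "ips")

# Legitimate sources that the IPS blocked by mistake
wrongly_blocked = [
    e.source for e in events
    if e.source in ips_result["blocked"] and not e.malicious
]
```

In this sketch both modes raise the same two alerts. In "ids" mode all three events are still delivered and a person decides what to do; in "ips" mode the attack from 10.0.0.5 is stopped, but so is the legitimate traffic from 10.0.0.8.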
For AI, the difference is even starker. With road vehicles, for example, you can have a lane departure warning system, which warns you that you are “over the line” but leaves it up to you to decide what to do. However, with a lane departure control system, the vehicle takes over automatically and “corrects” your position on the road, whether appropriate or not.
In Fry’s article, there is a very apt quote, as follows:
“To count well, we need humility to know what can’t and shouldn’t be counted.”
This is in contrast to the hubris displayed by some cybersecurity (and other) professionals. And it supports questioning the oft-used statement that you can only manage what you can measure. We have previously discussed the questionable nature of that assertion, and now we have even greater reason for doubt.
The bottom line is that we should be circumspect when it comes to using data to further our understanding of situations and we need to be confident that actions we take are based on significant—not just convenient—data.