The simple answer is … no! But how can that be? Surely if we were to assemble every scrap of available data about our systems and networks and their use, we should be able to find the veritable needles in the haystacks, given the right tools and sufficient time.
This is clearly an underlying presumption in the upcoming report by the Big Data Working Group of the Cloud Security Alliance (CSA). I was offered the privilege of reviewing the current draft of the report “Big Data Analytics for Security Intelligence,” which should be made public shortly.
A personal disclosure about the CSA … I strongly encouraged Jim Reavis to move ahead with the concept of such a group when he was chatting with colleagues to determine its potential viability. I have also participated in several of the CSA’s working groups and contributed to several reports. The Alliance has grown from strength to strength as security remains center stage for organizations contemplating cloud services.
Back to the report … It is an excellent depiction of how the analysis of big data can be used to significantly improve our knowledge and understanding of security events, particularly anomalous behaviors. However, as I read through the report, I became increasingly convinced that, even if one were able to effectively gather and analyze all relevant big data, there would still be a big something missing. The missing piece is the data that applications and systems never create in the first place, and that is therefore unavailable for any analysis, however sophisticated.
I pointed to the situation where more useful data are often more difficult and expensive to collect and analyze in my article, “Accounting for Value and Uncertainty in Security Metrics,” ISACA Information Systems Control Journal (November 2008), which won the 2009 ISACA Michael P. Cangemi Best Book/Best Paper Award.
I further made a plea for generating and making available useful and timely data in “Creating Data from Applications for Detecting Stealth Attacks,” STSC CrossTalk: The Journal of Defense Software Engineering (September/October 2011). There is a vital need to anticipate security data requirements and ensure that they are included in software system design and development so that appropriate instrumentation is written into applications ahead of time.
While many (myself included) are rightly impressed by the major advances in recent years in the analysis of both structured and unstructured big data, there are many clear examples where such data and analyses have fallen short. One telling indicator: data breaches reportedly go undiscovered for the better part of a year on average. And even once a breach is discovered, organizations are often at a loss to determine when the attacks took place and what data were accessed, even when they bring in computer forensics experts, which they do after the fact rather than proactively.
So what should be done? For a start, information security professionals must be included in teams determining requirements for software systems and they must insist on having necessary security data generated by the system (it helps an awful lot to have senior management sponsorship and support for this). They then must arrange for the collection and analysis of the data and install alerting systems that notify them when unusual behaviors occur. This can be a substantial task and one that is likely to increase the costs and times of development lifecycles as well as increase the burden on operations staff. However, until and unless such instrumentation is built into software systems and the collected data carefully and completely analyzed, we stand little chance of catching the bad guys in the act or of being able to determine what they actually did when we eventually discover that our systems and data have been violated.
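To make the prescription above concrete, here is a minimal sketch of the kind of instrumentation it describes: the application itself emits security-relevant events as they occur, and a simple monitor flags unusual behavior (here, a burst of failed logins). All names (`SecurityEventLog`, `failed_login_alerts`, the threshold and window values) are illustrative assumptions, not part of any real product or the CSA report.

```python
from collections import defaultdict
from datetime import datetime, timedelta

class SecurityEventLog:
    """Collects security-relevant events emitted by the application itself."""
    def __init__(self):
        self.events = []

    def record(self, user, action, when):
        # Called from within application code at security-relevant points,
        # e.g. authentication, privilege changes, sensitive data access.
        self.events.append({"user": user, "action": action, "time": when})

def failed_login_alerts(log, threshold=3, window=timedelta(minutes=5)):
    """Return users with `threshold` or more failed logins inside a sliding window."""
    per_user = defaultdict(list)
    for e in log.events:
        if e["action"] == "login_failed":
            per_user[e["user"]].append(e["time"])
    alerts = []
    for user, times in per_user.items():
        times.sort()
        for i in range(len(times)):
            # Count failures that fall within the window starting at times[i].
            in_window = [t for t in times[i:] if t - times[i] <= window]
            if len(in_window) >= threshold:
                alerts.append(user)
                break
    return alerts

# Usage: the application records events as they happen ...
log = SecurityEventLog()
t0 = datetime(2013, 1, 1, 12, 0)
for m in range(3):
    log.record("mallory", "login_failed", t0 + timedelta(minutes=m))
log.record("alice", "login_failed", t0)

print(failed_login_alerts(log))  # ['mallory']
```

The point of the sketch is not the toy detection rule but where the data comes from: because `record` is called by the application itself, the events exist by design rather than being reconstructed later, which is precisely the gap the article argues big data analytics alone cannot fill.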