On Thursday, May 6, 2010, the Dow Jones index experienced a 1000-plus-point fall, followed by a rapid recovery of some 700 points. This event shocked traders, regulators, and the public alike. It came as a big surprise to many how a drop in stock prices (possibly a result of a data-input error, or an options trade, or … what?) could possibly trigger a domino effect on the many markets that now constitute the trading “ecosystem.” Days later the NYSE, NASD, regulators and others were still trying to figure out what exactly happened, and why. The good news was that the markets did in fact rebound … fortunately the bungee cord didn’t break.
As an aside, I, probably along with many others, had thought of the bungee-jump analogy on seeing graphs of the price movements and, sure enough, the term was used in an article by Binyamin Appelbaum in The New York Times of May 9, 2010, under the title “Thursday’s Stock Free Fall May Prompt New Rules.” Quite by coincidence, a reporter by the name of Jacob Bunge contributed to several articles on the bungee-jump market in The Wall Street Journal, although I didn’t see those articles use that particular term.
Back to the main theme … It appears that the downward spike in the market could well have been due to a number of “systemic” idiosyncrasies, to put it mildly. Why am I not surprised? As with the subprime fiasco, major disruptions have occurred in areas so complex that only immense high-speed computers can deal with them … sometimes with catastrophic results. Part of the problem is that, while very bright humans designed, developed and tested these systems, they did so in relative isolation. True, the creators of these systems had to ensure that each component system “talked” to the others, but apparently no one had grasped that the somewhat understandable parts added up to an incomprehensible whole. There were reportedly prior rumblings about inconsistencies among the various stock exchanges, but it appears that no one in authority insisted on creating a stable overall computer-based marketplace.
Which brings me to my main point … In my earlier columns of January 11, February 16 and February 22, 2010, I pushed the concept of negative testing, which I choose to call “functional security testing.” This is a form of testing that verifies that systems do not do what they are not intended to do. It is not enough to verify that systems function as they are supposed to. One must also make sure that such systems do not behave in inconsistent or dangerous ways when subjected to particular inputs and that, if forced into failure mode, they fail gracefully. I wrote about this topic in an article, “Investing in Software Resiliency,” published in the September/October 2009 issue of STSC CrossTalk: The Journal of Defense Software Engineering.
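To make the idea concrete, here is a minimal sketch of what a negative test might look like in practice. The order-entry function and its bounds are entirely hypothetical, invented for illustration; the point is that each test case is an input the system must reject, gracefully, rather than act upon.

```python
# Negative ("functional security") testing sketch: verify that a component
# refuses inputs it was never meant to accept, and fails gracefully.
# submit_order() and all its limits are hypothetical examples.

def submit_order(symbol: str, quantity: int, price: float) -> dict:
    """Accept a limit order only if every field is within sane bounds."""
    if not (symbol.isalpha() and 1 <= len(symbol) <= 5):
        raise ValueError("invalid symbol")
    if not (0 < quantity <= 1_000_000):
        raise ValueError("quantity out of range")
    if not (0.01 <= price <= 100_000.0):
        raise ValueError("price out of range")
    return {"symbol": symbol.upper(), "quantity": quantity, "price": price}

def negative_tests() -> list:
    """Run inputs the system must NOT accept; return any that slip through."""
    bad_inputs = [
        ("IBM", -100, 50.0),    # negative quantity
        ("IBM", 10**9, 50.0),   # "fat finger": quantity far too large
        ("IBM", 100, 0.0),      # zero price
        ("", 100, 50.0),        # empty symbol
    ]
    failures = []
    for case in bad_inputs:
        try:
            submit_order(*case)
            failures.append(case)   # accepted: a negative-test failure
        except ValueError:
            pass                    # rejected gracefully: the desired outcome
    return failures
```

An empty list returned by `negative_tests()` means every malformed input was rejected; anything in the list is precisely the kind of “system does what it is not intended to do” behavior that positive testing alone would never surface.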
Recently I had a chat with Chris Wysopal of Veracode on the subject of functional security testing. It was at the May 7, 2010 CSO Breakfast Meeting, an outstanding forum created and run by Bill Sieglein. Chris rightly pointed out that such testing can be a huge undertaking, with a virtually unbounded number of test cases. I agree. However, I suggest that, if testing is limited to first-order cases (that is, what happens when a single input, rather than a sequence of inputs, is entered) and if statistical sampling methods are used, then it might become somewhat tractable. I have certainly found such an approach to be beneficial, even though it is not all-encompassing.
Now, this type of negative testing might be doable for systems under the control of a single entity, but what do you do about a complex of systems that span many entities? Well, there are several possible approaches. If, as in the case of U.S. stock markets, they are ultimately under the jurisdiction of the same government regulator, such as the SEC, then that regulator can mandate broad-based testing across all participating market operations. Otherwise, industry groups, consortia or coalitions might be able to coordinate such testing among their membership.
Of course, such multi-entity testing exhibits orders-of-magnitude greater complexity than testing done by a single organization. I bemoaned the increasing complexity of computer systems as far back as April/May 1994 in my regular Technology column in the short-lived, though excellent, magazine Securities Industry Management. The title of the piece was “The Death of K.I.S.S.” Today, not only are our systems and networks much more complex than they were 16 years ago, but the security technologies we use to protect them are similarly complex.
So, what is the answer for determining system behavior across many organizations? One approach is to develop macro-level simulation models of the various marketplaces. Such models can be used to determine the impact of higher-level scenarios, such as the one that occurred on May 6, 2010, rather than working at the level of individual programs. I personally helped initiate a project to create such a model for financial transactions. I am hopeful that models such as this one will provide insights that both anticipate and explain unforeseen behaviors in response to major events.
One cannot expect organizations and their regulators to understand and control complexities developing in financial services and other sectors without the aid of sophisticated methods, such as advanced testing tools and simulation models. Yes, we can opt to remain in reactive mode, suffer the damaging consequences of catastrophic events, and then try to come up with solutions after the fact. How much better it would be if we could understand the impact of various disasters before they actually happen and modify our systems in advance so as to avoid the pain.