# 6 Theories of Probability and 6 Reasons Why They Matter to ISRA

While probably everyone would agree that information security risk analysis (ISRA) is shot through with appeals to probability, very few non-academic discussions of ISRA provide any rigorous analysis of what “probability” means. (See Alberts and Dorofee 2003 for a notable exception.) In this blog post, I will try to fill that gap with a brief overview of the basic interpretations of probability. I will then explain why these distinctions matter to ISRA.

## 6 Theories of Probability

It may come as a surprise to some that there are actually a variety of interpretations or theories of probability. All of them can be divided into two groups: objective and non-objective.

### Objective Theories of Probability

Objective theories of probability define probability values in a way that makes them independent of opinion. For example, consider what it means to talk about the probability of a compromise of a web server. According to objective theories of probability, there is only one correct probability value (or range of values), period, and it doesn’t matter what you or I or anyone else thinks. There are several types of objective probability, including the classical theory, the frequency theory, the logical theory, and the propensity theory.

The classical theory of probability is the oldest theory of probability and one of the best known. It treats all possible outcomes as equally probable. Thus, given a fair, standard six-sided die, the probability that any one side will land face up is 1/6.

The aleatory or frequency theory of probability is arguably the best-known theory of probability; it is also an empirical theory that views probability as a feature of the natural world. Anyone who has taken a mathematics course on statistics or probability theory has learned frequency probability. The frequency theory of probability defines probability as the frequency with which an outcome appears in a long series of similar events (Gillies 2000, p. 1). Note that what the mathematician calls “frequency,” the information security professional has traditionally called the “rate of occurrence.” (For example, the auto insurance industry computes the “rate of occurrence” of theft for your vehicle, based on historical data about thefts in your location, the number of thefts involving your type of car, your claim history, and so forth.) To apply the frequency interpretation of probability, one must have enough data to arrive at a statistically valid conclusion about the frequency of the event in question.
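As a minimal sketch of the frequency idea, the following Python snippet estimates a probability as the relative frequency of an outcome in a series of events. The function name and the server records are hypothetical, made up purely for illustration; a real series would need to be far longer to support a statistically valid conclusion.

```python
def relative_frequency(events, outcome):
    """Return the fraction of observed events equal to `outcome`."""
    if not events:
        raise ValueError("need at least one observed event")
    return sum(1 for e in events if e == outcome) / len(events)


# Hypothetical record of 20 comparable servers over one year:
# True means the server was compromised, False means it was not.
observed = [True] * 3 + [False] * 17

estimate = relative_frequency(observed, True)
print(estimate)  # 0.15
```

With only 20 observations the estimate is rough at best, which is exactly the data requirement the frequency interpretation imposes.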

The logical theory of probability defines probability in terms of a logical relation between evidence and a hypothesis. In the words of Gillies, the logical theory “identifies probability with degree of rational belief” (Gillies 2000, p. 1). If the logical theory of probability is true, probability is always conditional upon evidence. The economist John Maynard Keynes is often credited with the first rigorous statement of logical probability. At first glance, the logical theory of probability might look redundant with the personal theory of probability (described below), since both refer to “degree of belief.” For the same reason, one might wonder why the logical theory is not considered a subjective theory. The answer is that the personal theory is based upon degree of belief, whereas the logical theory is based upon the degree of rational belief.

The propensity theory of probability was developed by the philosopher Karl Popper. In the words of Ian Hacking, the propensity theory of probability defines probability in terms of “the tendency, disposition, or propensity of some chance setup” (see Hacking 2001, p. 145). Popper’s motivation for creating the propensity theory was to provide a way to objectively assign a probability value to a singular event. (Popper was concerned with quantum mechanics, but the worry is equally applicable to certain types of events within information security.) The propensity theory allows us to assign probabilities to singular events, that is, events for which there is no series of events that could form the basis for a frequency probability.

### Non-Objective Theories of Probability

Non-objective theories of probability define probability values according to the beliefs of individuals or groups of individuals. According to non-objective theories of probability, probability values represent degrees of belief. Since different people can have different degrees of belief about the same thing, it follows that if a non-objective theory of probability is true, then different people can assign different personal probability values to the same event.

There are two main types of non-objective theories of probability. The first is called the epistemic, subjective, belief-type, or personal theory of probability. The personal probability of a statement is a measure of the probability that a statement is true, given some stock of knowledge. In other words, personal probability measures a person’s degree of belief in a statement. The personal probability of a statement can vary from person to person and from time to time (based upon what knowledge a given person had at a given time) (see Skyrms 2000, p. 23). For example, the personal probability that a factory worker, Joe, will get a pay raise might be different for Joe than it is for Joe’s supervisor, due to differences in their knowledge.

The second approach is the intersubjective theory of probability. That theory defines probability in terms of a social group’s degree of belief in a statement (Gillies 2000, pp. 1-2). If some group of individuals, perhaps a team of information security consultants applying Thomas Peltier’s Facilitated Risk Analysis Process (see Peltier 2010), reaches a consensus regarding the probability of a statement, then that value constitutes the team’s intersubjective probability for that statement.

Like the other types of probability values, non-objective probabilities represent numerical values. Although these values are rarely known precisely, they are real numerical values that must obey the axioms of the probability calculus. Thus, while the values may be subjective, they cannot be completely arbitrary.
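To illustrate what “obeying the axioms” means in practice, here is a minimal sketch that checks a set of degrees of belief over mutually exclusive, exhaustive outcomes: each value must lie between 0 and 1, and together they must sum to 1. The function name `is_coherent` and the sample estimates are assumptions introduced for this example only.

```python
import math


def is_coherent(estimates, tol=1e-9):
    """Check degrees of belief over mutually exclusive, exhaustive outcomes.

    Each value must lie in [0, 1] and the values must sum to 1.
    """
    values = list(estimates.values())
    in_range = all(0.0 <= v <= 1.0 for v in values)
    sums_to_one = math.isclose(sum(values), 1.0, abs_tol=tol)
    return in_range and sums_to_one


# Hypothetical personal estimates for one server over the coming year.
coherent = {"compromised": 0.25, "not compromised": 0.75}
incoherent = {"compromised": 0.25, "not compromised": 0.9}

print(is_coherent(coherent))    # True
print(is_coherent(incoherent))  # False
```

Two people can hold different coherent estimates, but an estimate like the second one is ruled out no matter whose belief it is.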

### Example: Probability = 0.5

Consider the following problem: There is a fair coin that is about to be tossed. What is the probability of it landing heads? 

Classical Theory of Probability: There are two possible outcomes of a coin toss: heads and tails. Since each outcome is equally probable, the probability of heads is 0.5.

Frequency Theory of Probability: Given repeated throws of the coin, the actual observed relative frequency of the coin landing heads is the same as the actual observed relative frequency of the coin landing tails. Therefore, the probability of heads is 0.5.

Logical Theory of Probability: The logical relation between the evidence and the hypothesis that heads will land, and the relation between the evidence and the hypothesis of tails, is the same. So the probabilities are equal, and the rational degree of belief that the coin will land heads is 0.5.

Propensity Theory of Probability: The setup of the coin toss is arranged so that the propensity for the coin to land heads is the same as it is for tails. So the probability of heads is 0.5.

Personal Theory of Probability: My degree of belief that the coin will land heads is 0.5.

Intersubjective Theory of Probability: Our degree of belief that the coin will land heads is 0.5.

### Example: Probability = 0

Consider the following problem: What is the probability that a man is married, conditional upon him being a bachelor?

Classical Theory of Probability: A “married bachelor” is not a possible outcome. Therefore, the probability that a man is married, conditional upon him being a bachelor, is 0.

Frequency Theory of Probability: Given a long sequence of observed bachelors, there have been no observed instances of a married bachelor. Therefore, the probability that a man is married, conditional upon him being a bachelor, is 0.

Logical Theory of Probability: The concept of a “married bachelor” is a contradiction in terms. Therefore, the probability that a man is married, conditional upon him being a bachelor, is 0.

Propensity Theory of Probability: There is no tendency among bachelors to be married. Therefore, the probability that a man is married, conditional upon him being a bachelor, is 0.

Personal Theory of Probability: My degree of belief that a man is married, conditional upon him being a bachelor, is 0.

Intersubjective Theory of Probability: Our degree of belief that a man is married, conditional upon him being a bachelor, is 0.

### Example: Probability = 100%

Consider the following problem: What is the probability that a man is unmarried, conditional upon him being a bachelor?

Classical Theory of Probability: There is only 1 type of bachelor: an unmarried one. Therefore, the probability that a man is unmarried, conditional upon him being a bachelor, is 100%.

Frequency Theory of Probability: Given a long sequence of observed bachelors, all of them have been unmarried. Therefore, the probability of an unmarried bachelor is 100%.

Logical Theory of Probability: The concept of an “unmarried bachelor” is a tautology. Therefore, the probability of an unmarried bachelor is 100%.

Propensity Theory of Probability: There is a universal tendency among bachelors to be unmarried. Therefore, the probability of an unmarried bachelor is 100%.

Personal Theory of Probability: My degree of belief that there is an unmarried bachelor is 100%.

Intersubjective Theory of Probability: Our degree of belief that there is an unmarried bachelor is 100%.

## Why These Theories Matter to ISRA

Why does this philosophical distinction between objective and non-objective theories of probability matter to security specialists?

(1) The theories provide much-needed clarification of the meaning of “probability.” At risk of stating the obvious, one reason these distinctions matter is that they get at the heart of what we mean by “probability.” For example, consider a system administrator who has not applied the latest security patches to his servers as quickly as the security engineer would like. Let Pr(C | ~P) represent the probability of system compromise, conditional upon not having applied the patches. The system administrator’s estimate of Pr(C | ~P) may well be lower than the security engineer’s estimate. According to the personal theory of probability, each individual is simply measuring their individual degree of belief in C conditional upon not P. Thus, the two individuals do not strictly disagree with each other.

If this seems counterintuitive, imagine two people talking about their favorite flavor of ice cream. Suppose that both of them use a personal theory of preference to describe their favorite flavor. The first person says that chocolate is their favorite flavor, while the second person prefers vanilla. The statement, “I, person A, prefer chocolate,” and the statement, “I, person B, prefer vanilla,” do not contradict each other. According to the personal theory of preference, there could only be a disagreement if the first person said, “I, person A, prefer chocolate,” and then the second person replied, “No, person A, you prefer vanilla.”

A similar situation applies to our earlier example involving the security engineer and the system administrator. Consider the following statements.

(a) System administrator: “I estimate the value of Pr(C | ~P) is 0.1.”

(b) Security engineer: “I estimate the value of Pr(C | ~P) is 0.25.”

If both the system administrator and security engineer are using the personal interpretation, we can use the definition of personal probabilities to show why these two statements are not literally in contradiction.

(a’) System administrator: “My degree of belief that C will occur, conditional upon ~P, is 0.1.”

(b’) Security engineer: “My degree of belief that C will occur, conditional upon ~P, is 0.25.”

Since each person is merely describing their own degree of belief, (a’) and (b’) are not contradictory.

(2) Different ISRA methodologies use different probability theories. Most quantitative ISRA methodologies rely on a frequency theory of probability, while qualitative methodologies tend to rely on a non-objective theory of probability. While I have not discussed the arguments for and against each of the interpretations of probability, suffice it to say there is an entire literature devoted to the subject (see Gillies 2000 for a comprehensive discussion). A genuine criticism of a probability theory is automatically a criticism of any ISRA methodology that relies upon it.

(3) The theories suggest audit or test procedures for validating probability estimates. The theories clarify what sort of justification is needed for a probability value in an ISRA. For example, the frequency approach requires data about what sorts of events happen in the long run. Thus, if an auditor has been asked to determine whether an organization has performed an ISRA, one test the auditor could perform would be to request empirical evidence about the series of events used to calculate the frequency probability. If that evidence is available, the auditor could then perform a further test by double-checking the math used to derive the frequency probability.

Similarly, when testing personal or intersubjective probability estimates, the auditor could request evidence that the person(s) providing those estimates have been sufficiently calibrated so that, say, their 90% confidence intervals contain the true value about 90% of the time. If the estimator(s) have not been calibrated, then the auditor could reject those portions of the ISRA based upon the uncalibrated personal or intersubjective probability estimates.
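One way an auditor might test calibration is to compare an estimator’s past interval estimates against what actually happened. The sketch below assumes the estimator gave ten intervals at a stated 90% confidence level and that the true values later became known; the function name `hit_rate` and all of the numbers are hypothetical.

```python
def hit_rate(intervals, actuals):
    """Fraction of actual values that fall inside the stated intervals."""
    if len(intervals) != len(actuals) or not intervals:
        raise ValueError("need matching, non-empty lists")
    hits = sum(1 for (low, high), x in zip(intervals, actuals) if low <= x <= high)
    return hits / len(intervals)


# Hypothetical calibration exercise: ten stated 90% intervals (low, high)
# and the true values that were later observed.
intervals = [(10, 50), (0, 5), (100, 300), (1, 3), (20, 40),
             (5, 15), (0, 1), (30, 90), (2, 8), (50, 60)]
actuals = [30, 2, 250, 2, 45, 10, 0.5, 60, 5, 55]

rate = hit_rate(intervals, actuals)
print(rate)  # 0.9
```

A hit rate close to the stated confidence level is some evidence of calibration, although ten questions is far too few to be conclusive; a rate well below it suggests overconfidence.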

(4) The theories clarify the scope of different interpretations of probability. Again, the frequency approach defines “probability” in the context of a long sequence of events; it is controversial whether the frequency interpretation can even be applied to a single event, such as this server, this application, this vulnerability. Indeed, the problem of the single-case probability was one of the reasons, if not the reason, why Popper originally developed the propensity theory.

(5) The theories provide insight into a common worry about ISRA. When critics of risk-based security talk about the “lack of actuarial data” for calculating the Annual Loss Expectancy (ALE) for information security incidents, they are what Ian Hacking calls “frequency dogmatists”: they implicitly presuppose that the frequency theory is the one and only way to understand probability (Hacking 2001, p. 140). As the above discussion should make clear, however, the frequency theory is not the only game in town.
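For readers unfamiliar with ALE, it is conventionally computed as the single loss expectancy (SLE) multiplied by the annualized rate of occurrence (ARO). The sketch below uses that textbook formula with hypothetical figures to show that the arithmetic is the same whether the ARO comes from historical frequency data or from an expert’s estimate; only the justification for the input differs.

```python
def annual_loss_expectancy(single_loss_expectancy, annual_rate_of_occurrence):
    """ALE = SLE x ARO, the conventional textbook formula."""
    return single_loss_expectancy * annual_rate_of_occurrence


# Hypothetical figure: a loss of $40,000 per incident.
sle = 40_000

# ARO derived from (hypothetical) historical data: 3 incidents in 12 years.
aro_frequency = 3 / 12
# ARO taken from a (hypothetical) calibrated expert's estimate.
aro_expert = 0.5

print(annual_loss_expectancy(sle, aro_frequency))  # 10000.0
print(annual_loss_expectancy(sle, aro_expert))     # 20000.0
```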

(6) The theories can be used in a complementary fashion. There is no need to adopt a “one-size-fits-all” approach to interpretations of probability; one can instead take an eclectic approach and use different interpretations in different contexts (see Hacking 2001, pp. 140-141; Gillies 2000, pp. 180-205). For example, one approach would be to use the frequency interpretation where feasible, and fall back on the personal or intersubjective interpretation (with calibrated experts) where it is not.

## References

Alberts, Christopher and Audrey Dorofee. Managing Information Security Risks: The OCTAVE Approach (Boston: Pearson, 2003).

Gillies, Donald. Philosophical Theories of Probability (New York: Routledge, 2000).

Hacking, Ian. An Introduction to Probability and Inductive Logic (New York: Cambridge University Press, 2001).

Peltier, Thomas R. Information Security Risk Analysis (third ed., Boca Raton, Florida: Auerbach, 2010).

Skyrms, Brian. Choice & Chance: An Introduction to Inductive Logic (4th ed., Belmont: Wadsworth, 2000).