Data Classification: Begin With Your Personally Identifiable Information

Let’s face it: data classification, despite being an information security “best practice,” is an expensive, time-consuming, labor-intensive task. For organizations supporting hundreds or thousands of applications and databases, the job of identifying all data elements and classifying each into categories of varying sensitivity is daunting and, perhaps, impossible. However, the plethora of privacy-related mandates issued by federal and state governments has, probably unintentionally, eased this burden by giving the classification effort a clear focus and point of departure. Specifically, if your organization processes or stores health-related or financial information that can be associated with specific individuals, you must begin by identifying Personally Identifiable Information and classifying it at the highest level of sensitivity.

The Privacy Act of 1974 required that the resources of information security professionals be deployed to ensure the privacy of information concerning individuals when that information is maintained by federal agencies in the United States. Beginning in the late 1990s, however, the relationship between information security and privacy broadened to include data stored or transmitted by private businesses, as federal and state governments began to implement an assortment of legislative initiatives intended to prevent unauthorized access to, or modification of, personally identifiable information.

The Gramm-Leach-Bliley Act, enacted in November 1999, required financial services institutions to establish security controls to ensure that the “nonpublic personal information” of consumers is not disclosed to unauthorized third parties. The Act, and its accompanying regulatory guidance, did not provide an unambiguous list of the data elements that constitute “nonpublic personal information.” Rather, these data were described as any information “provided by a consumer to a financial institution, or resulting from a transaction with the consumer or any service performed for the consumer, or otherwise obtained by the financial institution.” The objective of this component of the legislation was to prevent an unauthorized individual from obtaining a consumer’s personal information and then using it to conduct financial transactions (“identity theft”). Gramm-Leach-Bliley also requires financial institutions to establish written information security programs that describe the technical and administrative controls used to safeguard the privacy of consumer nonpublic personal information.

The Health Insurance Portability and Accountability Act (HIPAA), which was fully implemented in July 2006, requires health insurance companies, clearinghouses, and individual medical providers to safeguard health information if that information can be associated or identified with a specific individual. The central intent of this legislation, as with Gramm-Leach-Bliley, is to protect information against unauthorized disclosure. HIPAA, however, was not primarily enacted as a safeguard against identity theft; rather, the Act sought to prevent the possible misuse of confidential medical information by unauthorized third parties.

HIPAA includes both a “Privacy Rule” and a “Security Rule.” The Privacy Rule specifies the kinds of data that must be protected and the conditions under which confidentiality is required. The Security Rule, which complements the Privacy Rule, applies to information that is electronically stored or transmitted. It requires that information security controls, such as access control mechanisms and encryption technology, be applied to individually identifiable health information. Several data elements (including patient name, telephone number, fax number, Social Security number, medical record number, and email address) are considered “identifiers” of medical information. Thus, medical data associated with these identifiers are considered highly sensitive and must receive appropriate security controls.

California’s SB 1386, which took effect in July 2003, requires any company that conducts business with a state resident to inform the resident if his or her “personal data” have been, or may have been, purposefully or accidentally disclosed to an unauthorized third party. The statute provides a very clear definition of the types of data that, if disclosed, trigger the customer notification requirement: “personal information” is an individual’s name in combination with one or more of the following data elements when either the name or the data element is not encrypted or otherwise rendered unreadable or unusable:

  • Social Security number;
  • Driver’s license or state identification card number; or
  • Account number or credit or debit card number in combination with any required security code, access code or password that would permit access to an individual’s financial account.
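The definition above is precise enough to express as a rule. The following Python sketch is only an illustration of the wording as paraphrased in this article, not a reading of the statute itself. The DisclosedRecord type, its field names, and the triggers_notification function are hypothetical, and the sketch assumes that a financial account always requires a security code, access code, or password.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DisclosedRecord:
    """Hypothetical description of one record exposed in a breach."""
    name: Optional[str] = None
    ssn: Optional[str] = None
    drivers_license_or_state_id: Optional[str] = None
    account_or_card_number: Optional[str] = None
    account_access_code: Optional[str] = None  # security code, access code, or password
    name_encrypted: bool = False
    elements_encrypted: bool = False


def triggers_notification(record: DisclosedRecord) -> bool:
    """Apply the article's paraphrase of the SB 1386 definition."""
    if not record.name:
        return False  # the definition always starts from a name

    # Assumption: an account or card number counts only when it is
    # accompanied by the code that would permit access to the account.
    has_financial = bool(record.account_or_card_number and record.account_access_code)
    has_element = bool(
        record.ssn or record.drivers_license_or_state_id or has_financial
    )
    if not has_element:
        return False

    # "when either the name or the data element is not encrypted":
    # only a record in which both are encrypted falls outside the rule.
    return not (record.name_encrypted and record.elements_encrypted)
```

For example, a record holding a name and an unencrypted Social Security number would trigger notification under this sketch, while a name paired with a card number but no access code would not.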

The purpose of this legislation is to provide customers with timely notification that sensitive personal information may have been compromised and that appropriate action should be taken (e.g., notifying credit card companies) to prevent identity theft. As of April 2007, 33 states and one municipality (New York City) had enacted “data breach notification” laws; most are similar to the California legislation, but many states have added provisions that are not duplicated elsewhere.

Information security is explicitly involved with the enforcement of these laws because customers must be notified only if protected information is disclosed in an unencrypted form. Thus, information security professionals are obligated to encrypt sensitive information to ensure that a potential or actual data breach will not damage a company’s reputation or result in costly litigation.

The continuing proliferation of privacy-related legislation has created awkward terrain for information security professionals to navigate. Gramm-Leach-Bliley, HIPAA, and the state “data breach notification” laws mandate that an assortment of data elements be protected against unauthorized disclosure. The enabling legislation occasionally provides very specific examples of protected data; state laws, for example, usually categorize a customer name and associated driver’s license or state identification card number as “personal information.” However, Gramm-Leach-Bliley and HIPAA rely on more ambiguous definitions of personally identifiable information.

For information security professionals, the most prudent way to comply with legislative mandates is to ensure that all data associated with individual customers or employees receive the highest level of security control. To implement this control, the first and most critical task of risk management must occur: data classification.

Privacy-related information (data elements that, either singly or in combination with other data, pertain to an individual) must be classified in the most secure category. Organizations may designate this category as “Top Secret,” “Personally Identifiable Information,” “Restricted,” “Confidential,” “Sensitive,” or by a similarly descriptive label. Many compliance officers prefer that classifications be self-explanatory; thus, the terms “Nonpublic Personal Information” or “Personally Identifiable Information” may be preferable as the classification that describes privacy-related data. However, the specific name selected is less significant than the establishment of a category that explicitly includes data associated with individual persons. In addition, a written policy, explaining the various kinds of data included within the classification and accompanied by easily comprehended examples, must be developed and published for internal organizational use. Information Security will contribute to the formulation of this policy, although the actual document may be authored by the legal department, compliance, or a similar control function. The policy should be accessible to all employees.
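As a rough illustration, a classification scheme and the policy’s examples can be captured in a form that systems and reviewers can both consult. The Python sketch below is an assumption throughout: the category labels other than “Personally Identifiable Information,” the element names, and the classify and is_most_sensitive helpers are hypothetical and would be replaced by whatever the organization’s written policy actually defines.

```python
from enum import IntEnum
from typing import Optional


class Classification(IntEnum):
    """Hypothetical labels; a higher value means more sensitive."""
    PUBLIC = 1
    INTERNAL = 2
    PERSONALLY_IDENTIFIABLE_INFORMATION = 3


# Hypothetical examples of the kind a written policy might list.
POLICY_EXAMPLES = {
    "customer_name": Classification.PERSONALLY_IDENTIFIABLE_INFORMATION,
    "social_security_number": Classification.PERSONALLY_IDENTIFIABLE_INFORMATION,
    "medical_record_number": Classification.PERSONALLY_IDENTIFIABLE_INFORMATION,
    "account_number": Classification.PERSONALLY_IDENTIFIABLE_INFORMATION,
    "press_release_text": Classification.PUBLIC,
}


def classify(element_name: str) -> Optional[Classification]:
    """Return the policy's label, or None if the element is not yet classified."""
    return POLICY_EXAMPLES.get(element_name.strip().lower())


def is_most_sensitive(element_name: str) -> bool:
    """True when the element falls in the most secure category."""
    return classify(element_name) is max(Classification)
```

Returning None for unlisted elements, rather than a default label, keeps the sketch consistent with a classification effort that starts with privacy-related data and leaves everything else for later.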

The task of identifying privacy-related data, which also includes documenting the electronic locations of these data and the methods by which they are communicated within the organization and to external entities, is best performed by business stakeholders and the technologists who support the systems and applications those stakeholders use. However, information security, in its role as enforcer of mandated privacy regulations, is not merely a passive observer of the identification and classification processes. In fact, information security personnel must ensure that classification occurs in a manner that permits technical security controls (e.g., encryption, secure architecture design, access control mechanisms) to perform their intended functions properly.

Although the data classification effort is concerned with the identification of information associated with “natural persons” (legalese for “individual human beings”), information security professionals rarely implement controls intended to protect specific data elements. As a practical matter, it is applications, databases, files, email messages, or physical media (such as USB drives or magnetic tapes) that are encrypted, not just the Social Security or customer account numbers they contain. Thus, information security personnel must be able to identify not only the kinds of data that are classified as personally identifiable information, but also the broader technical contexts within which the information resides. If, for example, an organization frequently transmits email messages containing its customers’ nonpublic personal information, then information security must implement an appropriate email encryption solution that is acceptable to business units within the organization and to the customers.
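One way to picture the step from data elements to technical contexts is an inventory in which each container (a database, file, mailbox, or piece of media) is treated as sensitive whenever it holds any personally identifiable element. The Python sketch below assumes such an inventory already exists; the Container type, the element names, and the containers_to_encrypt helper are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Set

# Hypothetical element names that the classification policy marks as
# personally identifiable information.
PII_ELEMENTS = {"customer_name", "social_security_number", "account_number"}


@dataclass
class Container:
    """A place where data lives or travels: database, file, email, media."""
    name: str
    kind: str
    elements: Set[str] = field(default_factory=set)


def holds_pii(container: Container) -> bool:
    """A container is sensitive if any element it holds is classified as PII."""
    return bool(container.elements & PII_ELEMENTS)


def containers_to_encrypt(inventory: List[Container]) -> List[str]:
    """List the containers, not the individual elements, that need the control."""
    return [c.name for c in inventory if holds_pii(c)]


inventory = [
    Container("crm_db", "database", {"customer_name", "account_number", "region"}),
    Container("marketing_share", "file", {"campaign_name", "region"}),
    Container("statement_emails", "email", {"customer_name", "account_number"}),
]
print(containers_to_encrypt(inventory))
```

Here the whole CRM database and the statement email channel are selected for encryption, even though each also carries elements that are not sensitive on their own.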

To address the privacy-related mandates, however, it is not necessary to perform an enterprise-wide data classification effort. It is necessary only to identify those data elements that pertain to individual customers or employees. Although the scope of this work should not be minimized, it is far less formidable than a full classification of all available data. In other words: begin with your personally identifiable information.
