To recap briefly, we have identified and analyzed all our primary sources of user data and the system and service providers who consume those data. We have funding, developers, and a project plan to follow. We understand our provisioning process, have identified or built a directory of user attributes, and have generated policies and procedures to initialize and terminate users.
In short, we are finally ready to build our IDM system. The key to our success is crafting a web service that generates identities and binds each user to one. Because we are going to some pains to create and assign these identities, and because your associates may over time be located in different geographies, be sourced by different systems, or play different roles, there is significant benefit, from both operational and compliance perspectives, in being able to reconstruct their history with the organization. A single identity is your only hope of accomplishing this aggregation.
As you design your service, identify the attributes that are most distinctive about each associate. In certain instances, a government-issued identifier, like the Social Security Number (SSN-USA) or Social Insurance Number (SIN-Canada), may be available. These numbers are useful because they are unique identifiers assigned, managed, and maintained by a trusted source. Unfortunately, given the proliferation of and sensitivity to identity theft, use of such identifiers is severely constrained, even in those jurisdictions where they are available. However, as mentioned in a previous post, one of the advantages of building your own robust identity system is that you can both protect the identifier and prevent its proliferation elsewhere in your organization. Strong access controls ensure that the more sensitive attributes are available only on a business need-to-know basis.
Be aware though, that even a government identifier is no panacea. In my former experience building an enterprise identity system, our analysis of source systems identified multiple instances of SSNs and SINs with the same values requiring that we include a country code field to distinguish them. Our North America HR system was not so prescient and was forced to create “fake” identifiers for our Canadian staff with the same values as an American counterpart, leading to unfortunate technological gymnastics at tax reporting time.
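The country-code lesson above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the names `GovernmentId` and `make_government_id` are hypothetical, and the normalization (upper-casing the country code, keeping only digits) is one reasonable choice rather than what our system actually did.

```python
from typing import NamedTuple


class GovernmentId(NamedTuple):
    """A government identifier qualified by the issuing country (hypothetical type)."""
    country_code: str
    number: str


def make_government_id(country_code: str, number: str) -> GovernmentId:
    # Keep digits only, so "123-456-789" and "123 456 789" compare equal.
    digits = "".join(ch for ch in number if ch.isdigit())
    return GovernmentId(country_code.strip().upper(), digits)


# The same nine digits issued in two countries remain distinct keys.
us_ssn = make_government_id("us", "123-456-789")
ca_sin = make_government_id("CA", "123 456 789")
```

Because the country code is part of the key, `us_ssn` and `ca_sin` never collide, and no "fake" identifiers are needed.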
So, use a government identifier where available. But for all the other instances, particularly in a global organization or where the Privacy Office has run completely amuck, you will need more attributes. Within reason, more is better because each additional attribute, assuming it comes from a trusted and maintained source, reduces the number of false positives your service might return. Examples of other useful attributes are:
- First Name
- Middle Name (Middle Initial is much less useful)
- Last Name
- Suffix (Jr., Sr. etc.)
- Date of Birth (year is helpful, but access to it could be limited by privacy concerns; month and day are a must, though)
- Maiden Name
- Gender (sometimes also limited but very useful particularly where name is not an evident indicator)
- Previous company employment (only useful if historical data are accessible from your source systems and mergers over time haven’t completely obscured formerly independent components of the organization)
You may have identified others, but ensure that they are valid, maintained, and enduring. For example, college of attendance or current address could change over time, though place of birth would not; however, ensuring data integrity for the latter can be a challenge (e.g., I could be recorded as born in Dorchester, Massachusetts or in Boston; the former is part of the latter and is often used by residents as their city name). Once you have gathered your attributes, determine their level of criticality. Other than date of birth, most attributes can, at least theoretically, change. Assign a weighting to each attribute that reflects how distinctive and how stable it is.
Each source system, either in batch or transactionally (we used both approaches simultaneously, depending on the system’s capabilities), sends an agreed-upon data set for each new record. Your web service retrieves the records and compares them to your data store of existing identities. Ideally you have leveraged your most complete and accurate directory for this, or have used multiple vetted sources to build your own. For each attribute matched, assign the record the point value of the attribute type (e.g., 10 for government identifier and date of birth, 9 for last name). Once all the comparisons have been made, compare the total to the score the record would earn if every attribute matched perfectly. Set a baseline for a definite match, recognizing that not all attributes will agree because of typing errors or missing fields. If using a transactional approach, return the existing organizational identifier for that person as XML and provide some way (we used a dialog box) for the user doing the data entry to review and accept the proposed match.
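The scoring described above can be sketched as follows. Treat this as a minimal illustration, not our production logic: only the weights of 10 for government identifier and date of birth and 9 for last name come from the text, while the remaining weights, the 0.8 baseline, the attribute names, and the exact-match comparison after trimming and case folding are all assumptions.

```python
from typing import Dict, Optional

Record = Dict[str, Optional[str]]

# Point value per attribute type. The first three values come from the text;
# the rest are assumed for illustration.
WEIGHTS: Dict[str, int] = {
    "government_id": 10,
    "date_of_birth": 10,
    "last_name": 9,
    "first_name": 7,   # assumed
    "middle_name": 4,  # assumed
}

DEFINITE_MATCH = 0.8  # assumed baseline, as a fraction of the perfect score


def normalize(value: Optional[str]) -> Optional[str]:
    """Trim and case-fold a value; treat blank fields as missing."""
    if value is None:
        return None
    cleaned = value.strip().casefold()
    return cleaned or None


def match_score(incoming: Record, existing: Record,
                weights: Dict[str, int] = WEIGHTS) -> float:
    """Return points earned divided by the score of a perfect match."""
    earned = 0
    for attribute, weight in weights.items():
        left = normalize(incoming.get(attribute))
        right = normalize(existing.get(attribute))
        # A missing field on either side earns no points.
        if left is not None and left == right:
            earned += weight
    return earned / sum(weights.values())


def is_definite_match(incoming: Record, existing: Record,
                      threshold: float = DEFINITE_MATCH) -> bool:
    return match_score(incoming, existing) >= threshold
```

Expressing the score as a fraction of the perfect score keeps the baseline meaningful if you later add or reweight attributes. A real service would likely add fuzzy comparison for names so that a single typing error does not forfeit the whole attribute.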
Give the user an “opt-out” option in case their review suggests the proposed identity is not for the record in question. An opt-out choice should also generate an alert to identity program staff that there may be an issue with the matching protocol. In addition, consider having a capability to flag records of deceased staff and VIPs so that their records are never part of a result set.
If there are several potential matches, rank them by score, display at least the top three to five results to the data entry user, and allow them to pick one or none. In this case, opting out should prompt the data entry user to create a new ID, thereby initiating a new call to the identity store that returns the next incremental identifier. When no matches exist, a new identifier is automatically returned.
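Here is a rough sketch of that decision flow, assuming candidates arrive as `(identifier, score)` pairs. The names `IdentityStore`, `top_candidates`, and `resolve`, the 0.5 minimum score, and the starting identifier are all hypothetical; the `choose` callback stands in for whatever dialog the data entry user sees.

```python
import itertools
from typing import Callable, List, Optional, Tuple

Candidate = Tuple[int, float]  # (existing identifier, match score)


class IdentityStore:
    """Hands out the next incremental identifier (hypothetical, in-memory)."""

    def __init__(self, start: int = 1000) -> None:
        self._counter = itertools.count(start)

    def next_identifier(self) -> int:
        return next(self._counter)


def top_candidates(scored: List[Candidate], minimum: float = 0.5,
                   limit: int = 5) -> List[Candidate]:
    """Drop weak matches, then return the best few, highest score first."""
    eligible = [pair for pair in scored if pair[1] >= minimum]
    eligible.sort(key=lambda pair: pair[1], reverse=True)
    return eligible[:limit]


def resolve(scored: List[Candidate], store: IdentityStore,
            choose: Callable[[List[Candidate]], Optional[int]]) -> int:
    """Return the accepted existing identifier or a newly issued one."""
    candidates = top_candidates(scored)
    if not candidates:
        # No matches exist: a new identifier is returned automatically.
        return store.next_identifier()
    picked = choose(candidates)  # the user picks one identifier, or None
    if picked is None:
        # Opt-out: the user rejected every proposal, so issue a new ID.
        return store.next_identifier()
    if picked not in {identifier for identifier, _ in candidates}:
        raise ValueError("choice must be one of the displayed candidates")
    return picked
```

In a real service the opt-out branch is also where you would raise the alert to identity program staff.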
Batch processes work similarly, with two additions. First, you need exception reporting that identifies the various types of matches. Second, you probably need a holding area to ensure that source system records are not updated until proposed matches have been accepted and new identity creation has been initiated. We actually found the transactional approach more manageable, assuming the source system was able to consume the XML. Those systems also need a way to ensure that their newly created records cannot migrate to downstream systems until identities are assigned. That constraint ensures full compliance with a managed identifier for all.
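One way to picture the holding area and exception report is the in-memory sketch below. It is an assumption-laden illustration: the class and method names are hypothetical, a real implementation would persist to a staging table, and the two report categories are just the simplest split I could choose.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Optional


class Status(Enum):
    PENDING = "pending"    # awaiting review or new identity creation
    ASSIGNED = "assigned"  # identity confirmed; safe to release


@dataclass
class HeldRecord:
    source_key: str
    proposed_identity: Optional[int] = None
    identity: Optional[int] = None
    status: Status = Status.PENDING


class HoldingArea:
    """Keeps batch records out of downstream systems until identities exist."""

    def __init__(self) -> None:
        self._records: Dict[str, HeldRecord] = {}

    def hold(self, source_key: str,
             proposed_identity: Optional[int] = None) -> None:
        self._records[source_key] = HeldRecord(source_key, proposed_identity)

    def assign(self, source_key: str, identity: int) -> None:
        record = self._records[source_key]
        record.identity = identity
        record.status = Status.ASSIGNED

    def releasable(self) -> List[HeldRecord]:
        """Only records with an assigned identity may move downstream."""
        return [r for r in self._records.values()
                if r.status is Status.ASSIGNED]

    def exception_report(self) -> Dict[str, List[str]]:
        """Group pending records by the type of follow-up they need."""
        report: Dict[str, List[str]] = {"proposed_match": [], "no_match": []}
        for r in self._records.values():
            if r.status is Status.PENDING:
                kind = ("proposed_match" if r.proposed_identity is not None
                        else "no_match")
                report[kind].append(r.source_key)
        return report
```

The point of the design is that `releasable()` is the only path out of the holding area, so a record without an assigned identity cannot reach a downstream system.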
In future columns on provisioning, governance, contractor management, and access control reviews, I will describe how best to leverage the unique identifier throughout the organization. By building our own identity management system, we were able to change for the better the operational and security culture of a complex global organization. I hope you experience the same benefit. Please feel free to share your stories.