Information Governance: Get Data Classification Right First

As featured in Compliance Week – with Dirk Anderson, Managing Director, Coalfire

Data classification is one of the most crucial elements of an effective information governance process—yet it’s also one that many companies fail to implement well. In its simplest terms, data classification is the process of categorizing data based on its level of sensitivity. When done properly, the classification of data helps a company determine the most appropriate level of safeguards and controls that need to be in place.

Although many companies skip it in practice, data classification is fundamentally the first step of any security or information risk-management program. Companies should not waste time and resources deploying firewalls and other information security controls for data that doesn't need protection, yet they very often do. Data classification begins with answering the following questions:

  • What data, unstructured and structured, does your company have?
  • Where does the data reside?
  • What data is the company trying to protect?
  • What are the potential risks associated with each data set from a confidentiality, integrity, and availability perspective?
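
The answers to these questions can be recorded in a simple data inventory. The following is a minimal Python sketch, not a prescribed format: the `DataAsset` record, its field names, the three-level `Impact` scale, the worst-case rating rule, and the sample entries are all illustrative assumptions.

```python
from dataclasses import dataclass
from enum import IntEnum


class Impact(IntEnum):
    """Hypothetical three-level scale for rating potential harm."""
    LOW = 1
    MODERATE = 2
    HIGH = 3


@dataclass
class DataAsset:
    """One inventory entry; all field names are illustrative."""
    name: str                  # what data the company has
    structured: bool           # structured (database) vs. unstructured (documents)
    location: str              # where the data resides
    confidentiality: Impact    # harm if the data is disclosed
    integrity: Impact          # harm if the data is altered without authorization
    availability: Impact       # harm if the data cannot be reached when needed

    def highest_impact(self) -> Impact:
        # Assumption: rate the asset by its worst-case security objective.
        return max(self.confidentiality, self.integrity, self.availability)


# Made-up sample entries.
inventory = [
    DataAsset("press releases", False, "public website",
              Impact.LOW, Impact.MODERATE, Impact.LOW),
    DataAsset("background check results", False, "HR file share",
              Impact.HIGH, Impact.MODERATE, Impact.LOW),
]

# List the assets that need the most attention first.
for asset in sorted(inventory, key=DataAsset.highest_impact, reverse=True):
    print(f"{asset.name}: {asset.highest_impact().name}")
```

Rating integrity and availability alongside confidentiality keeps all three security objectives visible for each data set, rather than confidentiality alone.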

Take a step back, look at your business processes, and identify what information is used by those business processes. If your company does background checks, for example, do you store sensitive information about employees? Does HR collect data that may contain protected health information?

The way to get real clarity on data is to initiate a discussion that involves stakeholders from across your organization: legal, compliance, human resources, IT, and the various business units. It’s a collaborative effort. To determine which information requires the most safeguards, companies should consider the security objectives they want to meet. Most companies make the mistake of thinking only about the confidentiality of information. The integrity and availability of that information, however, are just as critical, if not more so.

Considered in that light, companies find that it's not only important to place restrictions on sensitive data, but also to guard against unauthorized changes or destruction of data. On the other end of the spectrum, it's equally important to ensure that the right data is easily accessible to authorized individuals when they need it.

The next step is to classify data into one of four categories:

  • Restricted: Requires the highest level of security controls. Examples include proprietary information and data protected by state or federal privacy rules and regulations.
  • Confidential: Information to which only specific groups of employees are allowed access. Examples include marketing plans, intellectual property, and employee lists.
  • Internal use: Information that pertains to employees only. Examples may include employment policies, social media policies, and acceptable use policies.
  • Public: Information that has no sensitivity attached to it and is likely to result in little or no risk if disclosed, altered, or destroyed, such as press releases.
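
One way to make the four categories usable by tools is to encode them as an ordered type. The sketch below is illustrative only: the `Classification` name, the numeric values, and the `needs_at_least` helper are assumptions, and ranking Confidential above Internal use follows the order of the list above rather than any stated rule.

```python
from enum import IntEnum


class Classification(IntEnum):
    """The four categories, ordered from least to most sensitive (assumed order)."""
    PUBLIC = 1
    INTERNAL_USE = 2
    CONFIDENTIAL = 3
    RESTRICTED = 4


# Example assignments drawn from the category descriptions above.
EXAMPLES = {
    "press release": Classification.PUBLIC,
    "acceptable use policy": Classification.INTERNAL_USE,
    "marketing plan": Classification.CONFIDENTIAL,
    "data protected by privacy regulations": Classification.RESTRICTED,
}


def needs_at_least(item: str, level: Classification) -> bool:
    """Return True if the item is classified at or above the given level."""
    return EXAMPLES[item] >= level


print(needs_at_least("marketing plan", Classification.INTERNAL_USE))  # True
print(needs_at_least("press release", Classification.CONFIDENTIAL))   # False
```

An ordered type lets a control be stated once as "applies to Confidential and above" instead of being repeated for each category.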

It’s recommended that companies take the data classification process one step further by tagging the information itself. For example, a label of “internal use only” offers little insight into what type of data it is and what specific controls apply. A more effective measure is to additionally tag the data with a specific type, such as “employment policies,” so that the controls for both the classification and the data type apply. Also, try not to be “too refined” by defining too many data categories; a minimalistic view is probably better. If you have more than ten categories of information, it's probably worth taking a step back and asking whether you need to be that precise.
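
As a rough illustration of tagging, the sketch below combines the controls implied by a classification with those implied by a data-type tag. The two lookup tables, the control names, and the `controls_for` function are hypothetical placeholders, not a standard control catalog.

```python
# Hypothetical control sets; the control names are placeholders.
CONTROLS_BY_CLASSIFICATION = {
    "internal use": {"employee login required"},
    "confidential": {"employee login required", "group-based access"},
}
CONTROLS_BY_TYPE = {
    "employment policies": {"HR approval before changes"},
    "marketing plans": {"no external sharing before launch"},
}


def controls_for(classification: str, data_type: str) -> set[str]:
    """Union of the controls implied by the classification and the type tag."""
    by_class = CONTROLS_BY_CLASSIFICATION.get(classification, set())
    by_type = CONTROLS_BY_TYPE.get(data_type, set())
    return by_class | by_type


print(sorted(controls_for("internal use", "employment policies")))
```

Keeping the two lookups separate means a new data type can be added without touching the classification scheme, which helps keep the number of categories small.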

Once you’ve identified your most sensitive and valuable information, data stewards should be appointed to oversee the lifecycle of that information. Data stewards will vary from organization to organization. Ultimately, they should be individuals who interact with the information day to day and are most familiar with it. The chief privacy officer, for example, may be the steward for sensitive data, which in some instances may be defined by privacy rules and regulations. Each data steward is then responsible for understanding where the data is located, how it is being used, who has access to it, and how long it is being retained. Data stewards are also responsible for clarifying how that data should be handled and what the ramifications will be (to the company and the employee) if it is not handled appropriately. How data should be protected and managed must be communicated as part of your acceptable use policy.

Another mistake companies make is treating data classification as a static process. Organizations should be aware that the classification of a data set may change over its lifecycle. It's important for data stewards to re-evaluate the classification of information on a regular basis, based on changes to regulations and contractual obligations, as well as changes in the use of the data or its value to the company. A common example is a public company's earnings statement, which might be confidential until the date of the earnings announcement, at which time it becomes public.
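
The earnings-statement example can be sketched as a record that carries a scheduled reclassification. This is an illustration under stated assumptions: the `ClassifiedItem` type, its `reclassify_on` and `reclassify_to` fields, and the announcement date are all made up for the example.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class ClassifiedItem:
    name: str
    classification: str
    # Hypothetical fields: when the classification changes, and to what.
    reclassify_on: Optional[date] = None
    reclassify_to: Optional[str] = None

    def classification_on(self, day: date) -> str:
        """Return the classification in effect on the given day."""
        if self.reclassify_on is not None and day >= self.reclassify_on:
            return self.reclassify_to or self.classification
        return self.classification


earnings = ClassifiedItem(
    name="quarterly earnings statement",
    classification="confidential",
    reclassify_on=date(2030, 1, 15),  # made-up announcement date
    reclassify_to="public",
)

print(earnings.classification_on(date(2030, 1, 14)))  # confidential
print(earnings.classification_on(date(2030, 1, 15)))  # public
```

A scheduled date only covers changes that are known in advance. Changes driven by new regulations, contracts, or shifts in how the data is used still depend on the data steward's periodic review.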

Data classification is a process that needs support from the top, because data stewards need the authority to decide how to fully implement the data classification program and to ensure it is integrated into the company's business practices. We recommend training employees to understand the meaning of each classification and what safeguards need to be in place. Ask yourself how you are communicating those objectives to employees so they are clear about the company's expectations for handling data in specific ways. Let them know what their specific responsibilities are in the lifecycle of that data. The most important thing is to get the data classification process going and worry about ironing out the kinks later. Even if full data classification is not achieved immediately, it's the end result that matters most.