Application Decommissioning for Information Security, Step 5: Archive and Manage the Content

Every organization has applications that are no longer needed. The question is how to decommission these dormant applications and their associated content repositories.

Previous posts in this series covered Steps 1 through 4, which are listed at the end of this post.

Now we’re on to Step 5 of application decommissioning for information security: archiving and managing the content.

Archived content must still be accessed and managed.

Archiving content involves moving that content to a new location where it can still be accessed and managed, but where the costs and risks of keeping the content are reduced.

Successfully managing archived content requires:

  • Limiting access to authorized users
  • Supporting the ability to search and retrieve information, and
  • Addressing compliance and regulatory requirements (including purging content that’s no longer operationally needed or is past its legal life).

Key capabilities to look for in content archiving tools.

Note that different archiving tools address each of these needs in different ways. Here are some key capabilities to look for:

  • Data analytics
  • Search
  • Tiered storage
  • Retention management
  • Litigation hold

1. You need data analytics.

Data Analytics. Analyzing the content stored in archives can provide powerful insights into the large data sets they hold. This includes understanding which data is sensitive, which is ROT (redundant, obsolete, and transitory), and which is stale or orphaned.

There are two types of functions to consider:

  • Text analytics uses both “wrapper” information (that is, metadata about the file, such as date created and date last accessed) and regular-expression pattern matching against the content itself.
  • Auto-classification uses machine learning and natural language processing to classify unstructured data into buckets, e.g. invoices versus contracts, or performance reviews versus job descriptions.
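
As a rough illustration of the text-analytics side, here is a minimal Python sketch that combines file metadata (date last accessed) with regular-expression matching to flag content as sensitive or stale. The patterns, the three-year staleness threshold, and the names `ArchivedFile` and `analyze` are assumptions made up for this example, not features of any particular archiving product.

```python
import re
from dataclasses import dataclass
from datetime import date

# Hypothetical patterns for illustration only; real rules need tuning and validation.
SENSITIVE_PATTERNS = {
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
}

# Assumed threshold: content untouched for three years counts as stale.
STALE_AFTER_DAYS = 3 * 365


@dataclass
class ArchivedFile:
    name: str
    last_accessed: date  # "wrapper" metadata
    text: str            # extracted content


def analyze(item: ArchivedFile, today: date) -> dict:
    """Flag a file using metadata (staleness) and regex matches (sensitivity)."""
    matches = sorted(
        label for label, pattern in SENSITIVE_PATTERNS.items()
        if pattern.search(item.text)
    )
    return {
        "name": item.name,
        "sensitive": matches,
        "stale": (today - item.last_accessed).days > STALE_AFTER_DAYS,
    }


report = analyze(
    ArchivedFile("claim_0412.txt", date(2015, 6, 1), "Claimant SSN 123-45-6789"),
    today=date(2024, 1, 1),
)
print(report)
```

Auto-classification would replace the fixed patterns with a trained model, which is beyond what a short sketch can show.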

2. Archives must be searchable.

Search. The ability to search archives for specific content is a basic requirement for archiving solutions. But some systems support only basic search of database records or the metadata that's defined for unstructured documents.

Sophisticated solutions, on the other hand, support full-text search and searching for regular expression patterns. They also use dictionary lookups to recognize synonyms. In the insurance industry, for example, a search for "car" should also return content that mentions "vehicle" or "auto."
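
To make the synonym idea concrete, the following Python sketch expands a search term through a small dictionary before running a whole-word, case-insensitive match. The synonym table, the sample documents, and the function names are hypothetical; a production search engine would rely on an index rather than scanning text like this.

```python
import re

# Hypothetical synonym dictionary, using the insurance example from the text.
VEHICLE_TERMS = {"car", "vehicle", "auto"}
SYNONYMS = {term: VEHICLE_TERMS for term in VEHICLE_TERMS}


def expand(term: str) -> set[str]:
    """Return the term plus any known synonyms (dictionary lookup)."""
    term = term.lower()
    return SYNONYMS.get(term, {term})


def search(documents: dict[str, str], term: str) -> list[str]:
    """Return the IDs of documents containing the term or one of its synonyms."""
    alternatives = "|".join(re.escape(word) for word in sorted(expand(term)))
    pattern = re.compile(rf"\b(?:{alternatives})\b", re.IGNORECASE)
    return sorted(doc_id for doc_id, text in documents.items() if pattern.search(text))


docs = {
    "policy-1": "Coverage applies to the insured vehicle.",
    "policy-2": "Homeowner policy for a single-family dwelling.",
    "claim-7": "Auto damage reported after collision.",
}
hits = search(docs, "car")
print(hits)
```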

3. Tiered storage should match the level of performance needed.

Tiered Storage. The ability to select the right storage technology and environment is essential to successful archiving. Because slower storage technologies cost less, you can match archived content to the tier of storage that provides the level of performance needed, but no more.

Solutions should be able to assign content to storage tiers based on rules such as date last accessed and whether the data is sensitive.
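
A rule of this kind might look like the Python sketch below. The tier names, the 90-day and 365-day thresholds, and the choice to require encryption for sensitive data are all assumptions for the sake of the example; actual tiers and rules depend on the storage platform and on your own policies.

```python
from datetime import date
from typing import NamedTuple


class Placement(NamedTuple):
    tier: str
    encrypted: bool


# Assumed thresholds (days since last access), ordered fastest tier first.
TIER_RULES = [(90, "hot"), (365, "warm")]
DEFAULT_TIER = "cold"  # slowest, cheapest tier for everything older


def place(last_accessed: date, sensitive: bool, today: date) -> Placement:
    """Pick the cheapest tier that still fits how recently the content was used."""
    age_days = (today - last_accessed).days
    tier = next((name for limit, name in TIER_RULES if age_days <= limit), DEFAULT_TIER)
    # Assumption for this sketch: sensitive content must sit on encrypted storage.
    return Placement(tier=tier, encrypted=sensitive)


today = date(2024, 1, 1)
print(place(date(2023, 12, 1), sensitive=False, today=today))
print(place(date(2020, 1, 1), sensitive=True, today=today))
```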

4. There needs to be a retention management program.

Retention Management. The larger the volume of content retained, the higher the cost to manage it and the higher the risk of data theft.

To lower costs and risks, content should be deleted as soon as it is no longer required to support operations or meet regulatory requirements. This is as true for data from decommissioned applications as it is for data from active apps.

The best archiving solutions include the ability to define and apply retention policies that assign different retention periods to different types of records. These then kick off workflows that delete records whose retention period has expired.
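
Here is a minimal Python sketch of how such a policy could be evaluated. The record types and retention periods are invented for illustration and are not legal guidance; in practice the schedule comes from your records management and legal teams, and an expired record would enter a review workflow rather than being deleted outright.

```python
from datetime import date

# Hypothetical retention schedule (years); real values come from legal and records teams.
RETENTION_YEARS = {"invoice": 7, "contract": 10, "job_description": 2}


def add_years(start: date, years: int) -> date:
    """Add whole years to a date, mapping Feb 29 to Feb 28 in non-leap years."""
    try:
        return start.replace(year=start.year + years)
    except ValueError:
        return start.replace(year=start.year + years, day=28)


def is_expired(record_type: str, created: date, today: date) -> bool:
    """True once the retention period for this record type has run out."""
    return today >= add_years(created, RETENTION_YEARS[record_type])


records = [
    ("inv-001", "invoice", date(2015, 3, 1)),
    ("con-002", "contract", date(2015, 3, 1)),
]
today = date(2024, 1, 1)
to_review = [rid for rid, rtype, created in records if is_expired(rtype, created, today)]
print(to_review)
```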

5. You need to manage records needed for litigation.

Litigation Hold. When an organization receives requests for documents associated with litigation, an archive solution must be able to lock down and preserve the records requested. It also should provide audit capabilities to prove that the holds were applied properly.

An archive solution should include the ability to put data on legal hold and to identify the duration and reason for the hold. It also should be able to release that data when the hold is lifted.
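
The Python sketch below shows one simple way to model this: a register that records the reason for each hold, blocks deletion while a hold is active, and writes a timestamped audit entry when a hold is placed or released (the duration of a hold can be derived from those two timestamps). The `HoldRegister` class and its method names are hypothetical, and a real system would persist the log in tamper-evident storage.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class HoldRegister:
    holds: dict = field(default_factory=dict)      # record_id -> reason for the hold
    audit_log: list = field(default_factory=list)  # (timestamp, action, record_id, reason)

    def _log(self, action: str, record_id: str, reason: str) -> None:
        timestamp = datetime.now(timezone.utc).isoformat()
        self.audit_log.append((timestamp, action, record_id, reason))

    def place(self, record_id: str, reason: str) -> None:
        """Put a record on legal hold and record why."""
        self.holds[record_id] = reason
        self._log("hold_placed", record_id, reason)

    def release(self, record_id: str) -> None:
        """Lift the hold; raises KeyError if the record was not on hold."""
        reason = self.holds.pop(record_id)
        self._log("hold_released", record_id, reason)

    def can_delete(self, record_id: str) -> bool:
        """Retention workflows should check this before deleting anything."""
        return record_id not in self.holds


register = HoldRegister()
register.place("inv-001", "Example matter 2024-01")
print(register.can_delete("inv-001"))  # False while the hold is active
register.release("inv-001")
print(register.can_delete("inv-001"))  # True once the hold is lifted
```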

Archiving tools also can be used to purge data.

In addition to archiving information, you should purge information that’s past its legal and operational life. Doing so not only leads to savings when it comes to storage, but also reduces legal and information security risks.

Any tool that can archive data typically can also purge it. Tools should include audit trails and decision workflows to obtain reviews and approvals from data owners before data is deleted.

The final post in this series, Step 6, focuses on retiring applications that are no longer needed.

The Doculabs Application Decommissioning Blog Series

Step 1: Getting Your Stakeholders on Board

Step 2: Identify and Prioritize Systems to Retire

Step 3: Defining the Archiving Plan

Step 4: Cataloging Data Within the Target Application

Step 5: Archive and Manage the Content

Step 6: Retire Applications

The CISO's Six-Step Guide to Managing Application Risk

Rich Medina
Joe Shepley
I’m VP and Practice Lead, focusing on developing Doculabs’ InfoSec practice and its applications in a wide range of industries.