How to Create a Metadata Framework

Many of our clients are busy planning their move to the cloud: to next-generation content services, ECM platforms, or Office 365. Often there is executive pressure on the IT organization to move quickly, and this can lead to moving content to the new platform without taking the time to plan a proper information architecture, content analysis, or migration plan.

The most important part of the information architecture is the metadata framework. It’s critical to getting the maximum benefit from your ECM investment. A metadata framework is simply a small collection of three types of metadata that are used together to make managing and finding content more efficient. Building one is an essential step that’s often overlooked. If your migrated content isn’t tagged appropriately and accurately, users will have a hard time finding it, if they can find it at all.

[Revised and updated on May 11, 2021. Originally published on April 4, 2019.]

Benefits of a Metadata Framework

The four benefits of a good metadata framework are findability, reuse, user experience, and manageability.

  • Improved findability. Search results are dramatically improved by using search filters and facets. Content can also be delivered dynamically, based on the user.
  • Improved reuse. Documents can be accessed, shared, and reused across processes and lines of business; document duplication should be dramatically reduced.
  • Improved user experience. For most departments, users will assign a single tag and all metadata will be automatically applied.
  • Improved manageability. Records and security will be improved because every document will have the correct metadata to enforce compliance.

Need help with your metadata strategy? Or other governance issues? Click here for a conversation about how we can help.

Building Blocks of a Metadata Framework

The process of building a metadata framework remains the same regardless of industry or size. It’s important to approach metadata modularly, and to avoid overcomplicating the process. Simplicity and flexibility are essential.

Simply put, your framework should include a set of global metadata (applicable to all information across the enterprise), supplemented by a set of local metadata (applicable to departments, functions, or processes). The third component is Document Type, which is itself a type of metadata, but I like to think of document types as separate from Global and Local. Document types have a specific function in the framework, and as we’ll discuss shortly, they are a separate part of the information-gathering process when developing your metadata framework. Together, this power trio enables dramatic improvements in the ways you manage and deliver content in your organization. More importantly, the end-user experience is simplified: finding information is faster and more reliable, and users can be more confident that the correct document (and version) has been retrieved. Hasn’t this always been the dream of effective content management?


Furthermore, the global and local metadata (including document types) that are collected as part of your framework can be used to create simplified navigation, sorting, and those checkbox filters that are so commonplace in our online shopping experience today.

Here’s a recap of the three major components of your framework:

Global metadata

Used to ensure efficient document management. Tagging documents with global metadata helps ensure proper document security, enables content lifecycle management, and can improve regulatory compliance.

Local metadata

Specific to each business area, local tags help users find the documents they need by enabling advanced navigation and search capabilities.

Document types

These are a special variety of global metadata and it’s useful to treat them separately. They are business-friendly names used to describe the documents created, stored, and used during business processes. They can also be used to allow multiple, related metadata values to be automatically applied when a user uploads a document.
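As a rough illustration of that last point, the sketch below shows how a document type could carry a bundle of default metadata values that are applied when a user picks the type at upload. The document type and field names come from the examples in this article; the values (such as the record series code `FIN-100`) and the function name `apply_document_type` are hypothetical and are not taken from any particular ECM product.

```python
# Hypothetical defaults: each document type maps to metadata values
# that are filled in automatically when the user selects the type.
DOCUMENT_TYPE_DEFAULTS = {
    "Audit Report": {
        "Department Name": "Internal Audit",
        "Record Series Code": "AUD-200",
        "Security Classification": "Confidential",
    },
    "Purchase Order": {
        "Department Name": "Procurement",
        "Record Series Code": "FIN-100",
        "Security Classification": "Internal",
    },
}


def apply_document_type(document_type, user_metadata=None):
    """Return the full metadata for an upload, given one user-chosen tag."""
    if document_type not in DOCUMENT_TYPE_DEFAULTS:
        raise ValueError(f"Unknown document type: {document_type}")
    metadata = {"Document Type": document_type}
    metadata.update(DOCUMENT_TYPE_DEFAULTS[document_type])
    # Values the user supplies explicitly take precedence over defaults.
    metadata.update(user_metadata or {})
    return metadata


tagged = apply_document_type("Purchase Order", {"Vendor": "Acme Supply"})
print(tagged)
```

The user chooses a single tag ("Purchase Order") and optionally adds a local value; the remaining fields are filled in from the defaults.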

Steps to Create a Metadata Framework

Most of the information you need for a metadata framework will come directly from talking with business users. They know their own processes, content, and organizational principles. If your stakeholders are currently storing information in network shared drives, they’ll likely have great ideas about how they’d like to “improve the current, complicated folder structure,” which is your cue to introduce them to a better way: folderless content management.

1. Define Standard Global Metadata Fields

Global metadata are data that are assigned to all content, regardless of who generates it, where it is stored, or what location, division, or country it is associated with. Expect to define 8 to 10 global fields. The good news is that these values rarely differ from company to company, so you may be able to use the list of examples below as your starting point. Generally, IT, Legal, and RIM (Records and Information Management) or Information Governance have the final say about these terms, but as an IT leader or manager it will be your job to assess whether the metadata fields can be used in production systems.

Examples of Global Metadata

  • Subject Matter
  • Record Series Code
  • Security Classification
  • Regulatory Area
  • Department Name
  • Location Description
  • Company (If your organization has more than one operating company)
  • Document Type (more about this metadata field, below)

In the Global metadata set, we also include 8 to 10 system-generated metadata values, such as Date Created, File Format, and other values that are not user-selectable. Therefore, you don’t need to manually assign values for these metadata fields – they’ll be provided by the content management solution you’re using.
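One way to picture the split between user-assigned and system-generated global fields is a small schema with a completeness check. This is only a sketch: the field names are the examples from this article, while the `GlobalField` class and the `missing_global_fields` helper are hypothetical names introduced for illustration. Company is treated as always required here, although in practice it applies only to organizations with more than one operating company.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GlobalField:
    name: str
    system_generated: bool  # True if the platform fills the value in


GLOBAL_FIELDS = [
    GlobalField("Subject Matter", False),
    GlobalField("Record Series Code", False),
    GlobalField("Security Classification", False),
    GlobalField("Regulatory Area", False),
    GlobalField("Department Name", False),
    GlobalField("Location Description", False),
    GlobalField("Company", False),
    GlobalField("Document Type", False),
    GlobalField("Date Created", True),
    GlobalField("File Format", True),
]


def missing_global_fields(document_metadata):
    """List user-assigned global fields that have no value yet."""
    return [
        field.name
        for field in GLOBAL_FIELDS
        if not field.system_generated and not document_metadata.get(field.name)
    ]


doc = {"Document Type": "Audit Report", "Department Name": "Internal Audit"}
still_needed = missing_global_fields(doc)
print(still_needed)
```

System-generated fields never appear in the result, because users are not expected to supply them.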

You can see examples of system-generated metadata in the Global Metadata Example list:

[Image: Global Metadata Example list]


2. Define Standard Local Metadata Fields

Meet with representatives from each department, and conduct short, one-hour interactive sessions with frequent breaks. Attendees should be selected because they have a high-level knowledge of their group’s processes and are familiar with the work products that their group or team produces. The sessions are meant to be an exchange of knowledge—you are helping them find better ways to work, starting with the organization of their content (usually on shared network drives).

Questions to ask during these sessions include:

  1. What are your most important business processes that require the use of documents?
  2. For each major document-centric process they describe, what are their most important document types?
  3. How is their information organized? (Examples: by year, by subject, by process, by person). This is essential – it’s local metadata that you can later use for creating sortable columns and search result “refiners.”
  4. What key fields or terms would help them the most in searching for documents? (Examples: searching by document type, year, subject, last-modified date, entity). This is another way to ask the previous question. The answers you receive to these two questions will often be similar and can reveal the “best” terms.
  5. Where is their information currently stored? (Examples: on shared drives, on my hard drive, in email). This is not required for the framework but is helpful if you have specific, near-term migration plans for the group.

Examples of Document Types

  • Audit Report
  • Purchase Order
  • Regulatory Filing
  • Project Plan
  • Meeting Minutes
  • Asset Tag

Examples of Local Metadata

  • Project Type
  • Project Level
  • Project Name
  • Work Order Number
  • Project ID
  • Equipment Type
  • Vendor

3. Compile Your Findings Into a Spreadsheet

Use this spreadsheet to gain buy-in and approval from stakeholders, and ultimately to set up your content repositories with metadata during migration. Your results should look something like this example of local metadata, compiled after two one-hour sessions with a public utility company’s Environmental Health and Services department:

[Image: example spreadsheet of local metadata]
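If you would rather generate the spreadsheet than type it by hand, a plain CSV file is enough. The sketch below uses Python’s standard `csv` module; the column layout and the rows are invented for illustration (they are not the utility company’s actual results), and `findings_to_csv` is a hypothetical helper name.

```python
import csv
import io

COLUMNS = ["Department", "Document Type", "Local Metadata Field", "Example Value"]

# Illustrative findings only; the values are made up for this sketch.
findings = [
    {"Department": "Environmental Health", "Document Type": "Audit Report",
     "Local Metadata Field": "Project Name", "Example Value": "Substation Upgrade"},
    {"Department": "Environmental Health", "Document Type": "Regulatory Filing",
     "Local Metadata Field": "Equipment Type", "Example Value": "Transformer"},
    {"Department": "Environmental Health", "Document Type": "Project Plan",
     "Local Metadata Field": "Vendor", "Example Value": "Acme Supply"},
]


def findings_to_csv(rows):
    """Serialize interview findings as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = findings_to_csv(findings)
print(csv_text)
```

The resulting text can be saved with a `.csv` extension and opened in any spreadsheet tool for stakeholder review.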


How the Framework Components Connect and Relate to Each Other

By now you may be wondering how these three metadata components work together. Remember, global metadata is assigned to every document in the repository (and ideally in all repositories in all places in the company). These are fields that are useful to everyone in the organization. They enable users to search and sort any group of documents by high-level attributes like author or company division. Fields like Record Series Code allow your Records Management team to retain and dispose of content as required by law. Your IT or Security groups use global fields like Information Security Classification to manage sensitive content.

At the business unit or department level, the metadata is specific to the local community of workers—hence local metadata. Local metadata allows them to share common organizing principles through agreed-upon metadata fields and values. These in turn can be applied as column headers (in SharePoint, for example) or as search result refiners. Last, the Document Type field is not the format of the file; rather it is the way that the local users describe their work product.
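Search result refiners are essentially counts of metadata values across a set of documents, plus a filter on the values the user selects. The following sketch shows the idea in plain Python; it does not use SharePoint or any other product’s API, and the documents, values, and function names (`refiner_counts`, `refine`) are invented for illustration.

```python
from collections import Counter

# Made-up documents tagged with a mix of global and local metadata.
documents = [
    {"Title": "Substation upgrade plan", "Document Type": "Project Plan",
     "Project Type": "Capital", "Vendor": "Acme Supply"},
    {"Title": "Line inspection plan", "Document Type": "Project Plan",
     "Project Type": "Maintenance", "Vendor": "Acme Supply"},
    {"Title": "Q1 safety audit", "Document Type": "Audit Report",
     "Project Type": "Maintenance"},
]


def refiner_counts(docs, field):
    """Count how many documents carry each value of a metadata field."""
    return Counter(doc[field] for doc in docs if field in doc)


def refine(docs, selected):
    """Keep only documents that match every selected field/value pair."""
    return [
        doc for doc in docs
        if all(doc.get(field) == value for field, value in selected.items())
    ]


print(refiner_counts(documents, "Document Type"))
matches = refine(documents, {"Document Type": "Project Plan", "Project Type": "Capital"})
print([doc["Title"] for doc in matches])
```

Documents that lack a field, such as the audit report with no Vendor, are simply left out of that field’s counts, which is one reason consistent tagging matters.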

Your framework will evolve over time. While global metadata is the least likely to change, expect local metadata to need adjustment periodically as your business evolves.

Your Next Steps

  • Consider conducting a simple project to define the content taxonomy for the content that will reside in the new content system. Focus on the minimum set of document types and metadata fields you need in the target system.
  • Where possible, align the ECM taxonomy with other structured data standards and initiatives to ensure consistency (e.g., agreement on “policy” vs “contract” or “customer” vs “member”).
  • You should likely use the new taxonomy on a go-forward basis. Don’t re-index historical content in other, older systems unless they are being upgraded, enhanced, or migrated. But you should encourage standardization of content metadata naming conventions in other (non-target) content systems (e.g., Salesforce, Office 365, Box).

We hope you’ve found this overview of metadata useful. If you need help with your metadata strategy, or any other ECM/content services-related strategy and advice, contact us.



Rich Medina
Doculabs Vision Team
Our blogs are a group effort, from writing to editing to brainstorming topics. We collaborate to provide you with our best thinking that will help you use technology to improve how your organization operates. The Doculabs blogging team is Richard Medina, James Watson, Marty Pavlik, and Tom Roberts.