Cloudflare Docs

Metadata

Page-level metadata - content type, associated products, last updated, word count - lets you take a broader, more strategic view of your content.

It helps you answer questions like the following:

  • As a writer:
    • Am I missing something obvious in the content strategy?
    • What are some pages I should be updating right now?
    • How does X tutorial compare with all tutorials? Is it getting more traffic than the baseline?
  • As a manager:
    • Are we over- or underinvesting in a specific product area? Or a specific content type?
    • How does the traffic to this set of products compare to another?
    • How can I communicate broader trends to my stakeholders?

You cannot answer these questions without some level of rollup reporting, which you can only get through metadata.
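As a rough illustration of what rollup reporting over metadata can look like, the sketch below groups pages by one metadata field and summarizes each group. The `PageMeta` shape, the `rollupBy` helper, and the sample rows are all hypothetical and invented for this example; they are not taken from Cloudflare's reporting tooling.

```typescript
// Hypothetical shape for one page's metadata; field names are illustrative.
interface PageMeta {
  product: string;
  contentType: string;
  lastModifiedDays: number;
}

interface Rollup {
  pages: number;
  avgLastModifiedDays: number;
}

// Groups pages by an arbitrary key (product, content type, ...) and
// returns the page count and average age in days for each group.
function rollupBy(
  pages: PageMeta[],
  key: (page: PageMeta) => string,
): Map<string, Rollup> {
  const totals = new Map<string, { pages: number; days: number }>();
  for (const page of pages) {
    const k = key(page);
    const t = totals.get(k) ?? { pages: 0, days: 0 };
    t.pages += 1;
    t.days += page.lastModifiedDays;
    totals.set(k, t);
  }

  const result = new Map<string, Rollup>();
  totals.forEach((t, k) => {
    result.set(k, { pages: t.pages, avgLastModifiedDays: t.days / t.pages });
  });
  return result;
}

// Invented sample data, only to show the call shape.
const samplePages: PageMeta[] = [
  { product: "dns", contentType: "how-to", lastModifiedDays: 63 },
  { product: "dns", contentType: "faq", lastModifiedDays: 37 },
  { product: "bots", contentType: "tutorial", lastModifiedDays: 7 },
];

const byProduct = rollupBy(samplePages, (p) => p.product);
```

Passing a different key function, for example `(p) => p.contentType`, gives the same summary per content type without changing the helper.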

What we track

At Cloudflare, we track the following information about different pages:

| Value | Description | Examples |
| --- | --- | --- |
| Product | The top-level subfolder of the page. | dns, bots |
| Product Group | The primary area that each product falls into. | Application Performance, Developer Platform |
| Tags | Specific attributes related to a page's content or purpose. | AI, JavaScript, Headers |
| Content type | The primary purpose of the page, which corresponds to our listed content types. | how-to, faq |
| Last modified | How many days ago was this page last updated? | 63 |
| Last reviewed (optional) | How many days ago was this page last reviewed? | 100 |

Of these values, Last reviewed needs some explanation. It differs from Last modified because a review is more thorough than an update: a review implies that all contents of the page have been vetted for accuracy.

Because of this extra effort, we only track Last reviewed for content types that are particularly important to the user journey and require an additional level of maintenance. At the moment, the only such content type is tutorials.


How we track

We set these values at two different levels: the folder level and the page level.

Folder-level attributes

We set two values at the folder level: Product and Product Group. We take this approach because we can assume that these values apply to every page within that folder.

For example, here's the content from our DNS folder.

dns.yaml
name: DNS
product:
  title: DNS
  url: /dns/
  group: Application performance
meta:
  title: Cloudflare DNS docs
  description: Cloudflare DNS provides the fastest, most resilient, and simplest
    managed DNS platform to meet your needs.
  author: "@cloudflare"
resources:
  community: https://community.cloudflare.com/tags/c/reliability/7/none
  dashboard_link: https://dash.cloudflare.com/?to=/:account/:zone/dns
  learning_center: https://www.cloudflare.com/learning/dns/what-is-dns/
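To make the folder-level idea concrete, here is a minimal sketch of how a page could inherit Product and Product Group from its top-level folder. The `FolderMeta` type, the `resolveFolderMeta` helper, and the example page paths are hypothetical; only the DNS values come from the file above, and the real site may resolve these values differently.

```typescript
// Folder-level values, keyed by top-level folder name.
// Only the "dns" entry mirrors the dns.yaml example; the structure is assumed.
interface FolderMeta {
  title: string;
  group: string;
}

const folderMeta = new Map<string, FolderMeta>([
  ["dns", { title: "DNS", group: "Application performance" }],
]);

// Looks up folder-level metadata from the first path segment of a page URL.
// Returns undefined when the page has no top-level folder or it is unknown.
function resolveFolderMeta(pagePath: string): FolderMeta | undefined {
  const segments = pagePath.split("/").filter((s) => s.length > 0);
  if (segments.length === 0) {
    return undefined;
  }
  return folderMeta.get(segments[0]);
}

// Hypothetical page path, used only to show the call shape.
const inherited = resolveFolderMeta("/dns/manage-dns-records/");
```

Because the lookup depends only on the folder name, a new page added under the same folder picks up the same Product and Product Group without any extra frontmatter.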

Page-level attributes

We primarily set page-level attributes through the page's frontmatter.

For example, here are the values set for our Build a Slackbot tutorial.

build-a-slackbot.mdx
---
updated: 2024-06-05
difficulty: Beginner
pcx_content_type: tutorial
title: Build a Slackbot
tags:
- Hono
languages:
- TypeScript
---

However, the last_modified value is pulled automatically from the git history of a file.
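One way to derive a "days since last change" number from git history is sketched below. This is an assumption about the general approach, not the site's actual implementation: the `lastCommitDate` and `daysSince` helpers are hypothetical names, and the sketch assumes `git` is on the PATH and the process runs inside the repository.

```typescript
import { execFileSync } from "node:child_process";

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Whole days elapsed between the last commit and `now`.
function daysSince(lastCommit: Date, now: Date): number {
  return Math.floor((now.getTime() - lastCommit.getTime()) / MS_PER_DAY);
}

// Asks git for the committer date (strict ISO 8601) of the most recent
// commit that touched `filePath`.
function lastCommitDate(filePath: string): Date {
  const iso = execFileSync(
    "git",
    ["log", "-1", "--format=%cI", "--", filePath],
    { encoding: "utf8" },
  ).trim();
  return new Date(iso);
}
```

Keeping the date arithmetic in `daysSince` separate from the git call means the calculation can be checked with fixed dates, for example `daysSince(new Date("2024-06-05T00:00:00Z"), new Date("2024-06-12T00:00:00Z"))`.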


How we use values

We choose to render all of these values as specific meta properties for each page.

For example, these are the meta properties and values on the AI Audit - Get Started page.

Get Started | AI Audit
<meta name="pcx_content_group" content="Core platform" >
<meta name="pcx_product" content="AI Audit" >
<meta name="pcx_content_type" content="get-started" >
<meta name="pcx_last_modified" content="7" >

We render these values using a custom override for our Head.astro file. If specific values are set, we then add them as meta tags onto the page.

Head.astro
// Emit the product title under both names when the folder defines one.
if (product.data.product.title) {
  ["pcx_product", "algolia_product_filter"].map((name) => {
    metaTags.push({
      name,
      content: product.data.product.title,
    });
  });
}

Benefits

We get two primary benefits from structuring our content this way.

First, our metadata is easily consumable by anyone who crawls our pages. We started by using these values for our Algolia search configuration and internal reporting, and have since begun sharing this data with other teams that consume our content for AI systems.
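To show how little a consumer needs in order to read these values, here is a minimal sketch that pulls the `pcx_` meta tags out of a page's HTML. The `extractPcxMeta` helper is hypothetical, and the regular expression is a shortcut that assumes double-quoted attributes with `name` before `content`, as in the AI Audit example earlier; a production crawler would more likely use a real HTML parser.

```typescript
// Collects every <meta name="pcx_..." content="..."> pair from an HTML string.
// Assumes double-quoted attributes in name-then-content order.
function extractPcxMeta(html: string): Record<string, string> {
  const result: Record<string, string> = {};
  const pattern = /<meta\s+name="(pcx_[^"]+)"\s+content="([^"]*)"\s*\/?>/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(html)) !== null) {
    result[match[1]] = match[2];
  }
  return result;
}

// The meta tags from the AI Audit - Get Started example.
const sampleHtml = `
<meta name="pcx_content_group" content="Core platform" >
<meta name="pcx_product" content="AI Audit" >
<meta name="pcx_content_type" content="get-started" >
<meta name="pcx_last_modified" content="7" >
`;

const pageMeta = extractPcxMeta(sampleHtml);
```

Note that every value arrives as a string, so a consumer that wants `pcx_last_modified` as a number has to convert it.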

Additionally, this decision means that our GitHub repo is always the source of truth. We do not have to keep a spreadsheet or mapping updated elsewhere, so our metadata is a lot more likely to be accurate than if we maintained multiple sources of truth.


How we ensure quality

It's difficult to avoid errors with this kind of metadata, specifically because we rely on freeform text entry in the frontmatter of individual files.

To catch these errors, we rely heavily on Zod schemas in our Astro site, which are defined in src/schemas/.

These schemas also allow us to provide IntelliSense guidance for contributors using IDEs for local development.

[Image: IntelliSense in action]
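The sketch below illustrates the kind of check a schema performs on freeform frontmatter. It is deliberately a dependency-free stand-in written in plain TypeScript, not the Zod schemas from src/schemas/: the `validateFrontmatter` helper is hypothetical, and the list of content types is an illustrative subset drawn from values mentioned on this page. In Zod, the closed list would typically be expressed with `z.enum`.

```typescript
// Illustrative subset of allowed values; the real list lives in the site's schemas.
const CONTENT_TYPES = ["how-to", "faq", "tutorial", "get-started"] as const;
type ContentType = (typeof CONTENT_TYPES)[number];

// Type guard: true only for strings that appear in the closed list.
function isContentType(value: unknown): value is ContentType {
  return (
    typeof value === "string" &&
    (CONTENT_TYPES as readonly string[]).indexOf(value) !== -1
  );
}

// Returns a list of human-readable problems; an empty list means valid.
function validateFrontmatter(data: Record<string, unknown>): string[] {
  const errors: string[] = [];

  const title = data["title"];
  if (typeof title !== "string" || title.trim() === "") {
    errors.push("title must be a non-empty string");
  }

  if (!isContentType(data["pcx_content_type"])) {
    errors.push(`pcx_content_type must be one of: ${CONTENT_TYPES.join(", ")}`);
  }

  // tags is optional, but when present it must be a list of strings.
  const tags = data["tags"];
  if (
    tags !== undefined &&
    !(Array.isArray(tags) && tags.every((t) => typeof t === "string"))
  ) {
    errors.push("tags must be a list of strings");
  }

  return errors;
}
```

For example, a page whose frontmatter misspells the content type as `tutoral` would produce one error naming the allowed values, which is the class of freeform-entry mistake described above.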