Build Data Loss Prevention (DLP) policies

To use Data Loss Prevention (DLP) tools within Cloudflare Zero Trust, you first need to define your DLP profiles. DLP profiles are complex objects with dictionaries, pre-built detections, and custom logic that you can reference as selectors within your Gateway policies.

Configure a DLP profile

You may either use DLP profiles predefined by Cloudflare, or create your own custom profiles based on regular expressions (regex), predefined detection entries, and DLP datasets.

Configure a predefined profile

In Zero Trust, go to DLP > DLP profiles.
Choose a predefined profile and select Configure.
Enable one or more Detection entries according to your preferences. A DLP profile matches using the OR logical operator: if multiple entries are enabled, your data needs to match only one of them.
Select Save profile.

Build a custom profile

In Zero Trust, go to DLP > DLP profiles.
Select Create profile.
Enter a name and optional description for the profile.
Add custom or existing detection entries.
Add a custom entry
1. Select Add custom entry and give it a name.
2. In Value, enter a regular expression (or regex) that defines the text pattern you want to detect. For example, test\d\d will detect the word test followed by two digits.
  - Regular expressions use Rust regex syntax. We recommend validating your regex with Rustexp.
  - DLP detects UTF-8 characters, which can be up to 4 bytes each. Custom text pattern detections are limited to 1024 bytes in length.
  - DLP does not support regular expressions with + or * operators because they are prone to exceeding the length limit. For example, the regex pattern a+ can match an unbounded number of a characters. We recommend using a{min,max} instead, such as a{1,1024}.
3. To save the detection entry, select Done.
Add existing entries

Existing entries include predefined and user-defined detection entries.
1. Select Add existing entries.
2. Choose which entries you want to add, then select Confirm.
3. To save the detection entry, select Done.
(Optional) Configure profile settings for the profile.
Select Save profile.
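
The restriction on + and * in custom entries can be checked before you paste a pattern into Value. The following is a minimal sketch, not a Cloudflare tool: check_dlp_regex is a hypothetical helper name, and the check is deliberately conservative, so it also rejects escaped literals such as \+ and \*.

```shell
#!/bin/sh
# Hypothetical pre-flight check for a custom DLP regex (not part of Cloudflare's tooling).
# Returns 1 if the pattern contains + or *, which DLP custom entries do not support.
# Conservative: escaped literals such as \+ or \* are rejected as well.
check_dlp_regex() {
  case "$1" in
    *[+*]*) return 1 ;;
  esac
  return 0
}

for pattern in 'a+' 'a{1,1024}' 'test\d\d'; do
  if check_dlp_regex "$pattern"; then
    printf 'ok: %s\n' "$pattern"
  else
    printf 'unsupported, use a bounded {min,max} quantifier: %s\n' "$pattern"
  fi
done
```

This only catches the unsupported quantifiers. It does not validate the pattern against the Rust regex engine, so a tool such as Rustexp is still useful for checking what the pattern matches.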

Build effective DLP profiles

For many Cloudflare users, Zero Trust is one of the only measures for preventing the loss of sensitive data. For other users, Zero Trust may be one of the early in-line measures in a complex Internet and SaaS app security strategy. No matter which model you most resemble, developing effective and appropriate DLP policies and practices starts with first-principles definitions.

Define your sensitive data

Existing data patterns

If your organization is most concerned about general data patterns that fit existing classifications such as personally identifiable information (PII), protected health information (PHI), financial information, or source code, we recommend using the default predefined profiles.

To better match the needs of your organization, you can also build a complex profile that matches data against both an existing library and a custom string detection or database. For example:

Dashboard

Selector	Operator	Value	Logic	Action
DLP Profile	in	Credentials and Secrets	Or	Block
DLP Profile	in	AWS Key Dataset

API

curl https://api.cloudflare.com/client/v4/accounts/$ACCOUNT_ID/gateway/rules \
--header "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
--header "Content-Type: application/json" \
--data '{
  "action": "block",
  "description": "Detect secrets and AWS keys",
  "enabled": true,
  "filters": [
    "http"
  ],
  "name": "Secrets and AWS keys",
  "precedence": 0,
  "traffic": "any(dlp.profiles[*] in <CREDENTIALS_DLP_PROFILE_UUID>) or any(dlp.profiles[*] in <AWS_DLP_PROFILE_UUID>)"
}'

Assorted data patterns

If your data patterns take many different forms and contexts, consider building a custom profile using one or more regexes.

For example, you can use a custom expression to detect when your users share product SKUs in the format CF1234-56789:

Dashboard

Build a custom profile with the following custom entry:

Detection entry name	Value
Product SKUs	CF[0-9]{1,4}-[0-9]{5}

Create an HTTP policy with the following expressions:

Selector	Operator	Value	Logic	Action
DLP Profile	in	Product SKUs	And	Block
User Email	matches regex	[a-z0-9]{0,15}@example.com

API

curl https://api.cloudflare.com/client/v4/accounts/$ACCOUNT_ID/gateway/rules \
--header "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
--header "Content-Type: application/json" \
--data '{
  "action": "block",
  "description": "Detect product SKUs shared by users in organization",
  "enabled": true,
  "filters": [
    "http"
  ],
  "name": "Detect product SKU leaks",
  "precedence": 0,
  "traffic": "any(dlp.profiles[*] in <SKU_DLP_PROFILE_UUID>)",
  "identity": "identity.email matches \"[a-z0-9]{0,15}@example.com\""
}'
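
Before saving the detection entry, it can help to sanity-check the SKU pattern locally. The sketch below uses POSIX grep -E rather than the Rust regex engine that DLP uses. This works as a rough check only because the pattern is limited to character classes and bounded repetition, which both engines handle the same way. The helper name matches_sku and the sample strings are illustrative assumptions.

```shell
#!/bin/sh
# Local sanity check of the SKU pattern with grep -E (POSIX ERE),
# not the Rust regex engine that DLP itself uses.
sku_pattern='CF[0-9]{1,4}-[0-9]{5}'

# Hypothetical helper: succeeds if the argument contains a SKU-shaped string.
matches_sku() {
  printf '%s\n' "$1" | grep -Eq "$sku_pattern"
}

for sample in 'Order CF1234-56789 shipped' 'CF12-00001' 'CF-56789' 'CF1234-567'; do
  if matches_sku "$sample"; then
    printf 'match:    %s\n' "$sample"
  else
    printf 'no match: %s\n' "$sample"
  fi
done
```

The first two samples match and the last two do not. The pattern is unanchored, so it also matches a SKU embedded in longer text, which is usually what you want when scanning traffic.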

DLP datasets

If your data is a distinct dataset you have defined, you can build a profile by uploading a database to use in an Exact Data Match or Custom Wordlist function. Exact Data Match and Custom Wordlist feature some key differences:

	Exact Data Match	Custom Wordlist
Encryption	Hashed and compared to encrypted traffic	Stored as plaintext
Payload logging	Matches redacted in logs	Matches appear in logs
Usage	PII (such as names, addresses, and credit card numbers)	Non-sensitive data (such as intellectual property and SKU numbers)

We recommend using Exact Data Match for highly sensitive datasets and Custom Wordlists for lists of keywords.

As your datasets change and grow, we recommend building a pipeline to update the data source in Cloudflare Zero Trust. For more information, contact your account team.
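
The shape of such a pipeline depends on your environment. As one possible first stage, the sketch below normalizes an exported list (one entry per line) before it is uploaded. The function name normalize_dataset and the sample entries are hypothetical, and the upload step is left out because it depends on your dataset type and account setup.

```shell
#!/bin/sh
# Hypothetical first stage of a dataset refresh pipeline:
# clean an exported list (one entry per line) before uploading it.
normalize_dataset() {
  # Strip Windows carriage returns, drop blank lines, then sort and deduplicate.
  tr -d '\r' | grep -v '^[[:space:]]*$' | sort -u
}

# Example input with a CRLF line ending, a blank line, and a duplicate entry.
printf 'CF1234-56789\r\n\nCF0001-00001\nCF1234-56789\n' | normalize_dataset
```

Running the example prints the two unique entries, CF0001-00001 and CF1234-56789, one per line.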

Microsoft Information Protection (MIP) labels

If your data already uses a Microsoft Information Protection (MIP) labeling schema, Cloudflare can detect those labels in transit automatically. To get started, connect your Microsoft 365 account with a CASB integration. Cloudflare will automatically pull your existing MIP definitions into Zero Trust. You can then use the MIP definitions to build DLP profiles for use in Gateway policies.

For more information, refer to Integration profiles.

Build DLP policies

The best way to start applying data loss prevention to your traffic, minimize the chance of false positives, and collect actionable data is to start with the known knowns in your sensitive data policies. Rather than building policies to detect sensitive data like SSNs or financial information across all of your traffic, you should start by building policies that target both sensitive data types and destinations that are known data sources or points of high risk. These sources can be inside or outside your organization.

Example

Many organizations want to detect and log financial information egressing from user devices to critical SaaS applications. To limit the risk of false positives and to filter out logging noise, Cloudflare recommends building your first series of policies to specify both target data and target destination. For example, you can block financial information from being sent to AI chatbots, such as ChatGPT and Gemini:

Dashboard

Selector	Operator	Value	Logic	Action
DLP Profile	in	Financial Information	And	Block
Content Categories	in	Artificial Intelligence

API

curl https://api.cloudflare.com/client/v4/accounts/$ACCOUNT_ID/gateway/rules \
--header "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
--header "Content-Type: application/json" \
--data '{
  "action": "block",
  "description": "Prevent financial information from being shared with AI tools",
  "enabled": true,
  "filters": [
    "http"
  ],
  "name": "Block AI financial info",
  "precedence": 0,
  "traffic": "any(dlp.profiles[*] in <FINANCIAL_INFO_DLP_PROFILE_UUID>) and any(http.request.uri.content_category[*] in {184})"
}'

Once you have analyzed the flow and magnitude of data from the known sources, you can begin focusing on more specialized or explicit datasets for more generalized sources. You may want to allow sources that are known internal locations where sensitive data is intentionally transferred.

After you have reviewed the logs and evaluated the false positive rate for both types of policies, you can experiment more broadly with data loss prevention policies with greater confidence.
