Unsafe and custom topic detection
AI Security for Apps can detect when an LLM prompt touches on unsafe or unwanted subjects. There are two layers of topic detection:
- Default unsafe topics: A built-in set of safety categories that detect harmful content such as violent crimes, hate speech, and sexual content.
- Custom topics: Topics you define to match your organization's specific policies, such as "competitors" or "financial-advice".
When AI Security for Apps is enabled, it automatically evaluates prompts against a set of default unsafe topic categories and populates two fields:
- LLM Unsafe topic detected (`cf.llm.prompt.unsafe_topic_detected`): `true` if any unsafe topic was found.
- LLM Unsafe topic categories (`cf.llm.prompt.unsafe_topic_categories`): An array of the specific categories detected.
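Both fields can be referenced in custom rule expressions. A minimal sketch, assuming the standard rule expression syntax and the category codes listed in the table below (verify field and operator availability on your plan):

```txt
cf.llm.prompt.unsafe_topic_detected
and any(cf.llm.prompt.unsafe_topic_categories[*] in {"S1" "S9" "S10"})
```

The first clause matches any unsafe prompt; the second narrows the match to specific categories.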
Default unsafe topic categories
| Category | Description |
|---|---|
| S1 | Violent crimes |
| S2 | Non-violent crimes |
| S3 | Sex-related crimes |
| S4 | Child sexual exploitation |
| S5 | Defamation |
| S6 | Specialized advice |
| S7 | Privacy |
| S8 | Intellectual property |
| S9 | Indiscriminate weapons |
| S10 | Hate |
| S11 | Suicide and self-harm |
| S12 | Sexual content |
| S13 | Elections |
| S14 | Code interpreter abuse |
Custom topic detection lets you define your own topics, and AI Security for Apps will score each prompt against them. You can then use these scores in custom rules or rate limiting rules to block, challenge, or log requests based on a relevance threshold that you choose.
This capability uses a zero-shot classification model that evaluates prompts at runtime. No model training is required.
- You define a list of up to 20 custom topics. Each topic consists of:
  - Label: A short, hyphenated identifier used in rule expressions and analytics (for example, `financial-advice`).
  - Topic description: The descriptive text the model uses to classify prompts (for example, `seeking financial advice`).
- When a request arrives at a `cf-llm` labeled endpoint, the model evaluates the prompt against all defined topic descriptions and returns a relevance score for each.
- Scores are written to the `cf.llm.prompt.custom_topic_categories` map field, keyed by label. You use labels (not topic descriptions) in rule expressions and analytics.
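Once populated, the map can be read in a rule expression by label. A minimal sketch, assuming a topic labeled `financial-advice` and the score semantics described later in this page (lower scores indicate a stronger match, so `lt 20` is a strict threshold):

```txt
cf.llm.prompt.custom_topic_categories["financial-advice"] lt 20
```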
You can manage custom topics from two places in the dashboard:
- Security Settings page: Go to Security > Settings and search for the AI Security for Apps section. Under Custom Topics, select Manage topics to add, edit, or remove topics.
- Expression builder sidebar: When creating or editing a custom rule, select the LLM Custom topic field. Then, select Manage custom topics to open a sidebar where you can manage topics without leaving the rule creation page.
Both methods update the same underlying topic list. Changes made in one are immediately reflected in the other.
1. In the Cloudflare dashboard, go to the Security Settings page.

   Alternatively, go to the custom rules creation page, select the LLM Custom topic field, and select Manage custom topics to open the sidebar.

2. Add a topic by providing:

   - Label: A short, hyphenated identifier (for example, `competitors`).
   - Topic Description: A descriptive English phrase the model uses for classification (for example, `seeking info on competitors`).

3. Select Save.
Update your custom topics list using a PUT request:

```bash
curl "https://api.cloudflare.com/client/v4/zones/$ZONE_ID/firewall-for-ai/custom-topics" \
  --request PUT \
  --header "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
  --json '{
    "topics": [
      { "label": "competitors", "topic": "seeking info on competitors" },
      { "label": "financial-advice", "topic": "seeking financial advice" },
      { "label": "hr-internal", "topic": "asking about internal HR policies" }
    ]
  }'
```

To retrieve your current topics, use a GET request:

```bash
curl "https://api.cloudflare.com/client/v4/zones/$ZONE_ID/firewall-for-ai/custom-topics" \
  --header "Authorization: Bearer $CLOUDFLARE_API_TOKEN"
```

| Parameter | Limit |
|---|---|
| Maximum number of topics | 20 |
| Topic string length | 2–50 printable ASCII characters |
| Label length | 2–20 characters |
| Label format | Lowercase letters, numbers, and hyphens (-) only |
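The limits above can be checked client-side before sending a PUT. A minimal sketch in shell; `validate_label` and `validate_topic` are hypothetical helpers, not part of the API:

```shell
#!/bin/sh
# Label: 2-20 characters; lowercase letters, numbers, and hyphens (-) only.
validate_label() {
  printf '%s' "$1" | grep -Eq '^[a-z0-9-]{2,20}$'
}

# Topic description: 2-50 printable ASCII characters.
validate_topic() {
  printf '%s' "$1" | LC_ALL=C grep -Eq '^[ -~]{2,50}$'
}

validate_label "financial-advice" && validate_topic "seeking financial advice" \
  && echo "topic passes the documented limits"
```

Running the checks locally avoids a round trip to the API just to discover a malformed label.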
The quality of custom topic detection depends on how you write your topic descriptions. The underlying model is a zero-shot classifier that compares the semantic meaning of the prompt against your topic description.
The most important principle: describe the user's intent, not just the subject.
The model performs semantic classification, not keyword matching. Topic descriptions that capture what the user is trying to do are significantly more accurate than descriptions that simply name a subject area. A short verb phrase (3–6 words) is usually the best trade-off between precision and coverage.
Compare how the same two topic descriptions perform against two prompts that both mention finance but with very different intent:
| Topic description | Prompt: "Should I invest my savings in index funds?" | Prompt: "Our finance team just finished the Q3 report." |
|---|---|---|
| financial advice (noun only) | Matches | Also matches. The word "finance" appears, even though no advice is being sought |
| seeking financial advice (intent phrase) | Matches | Correctly ignored. Mentions finance but has no advice-seeking intent |
| Quality | Topic description | Why |
|---|---|---|
| Best | seeking info on competitors | Captures intent. Only fires when users are actively asking about competitors |
| Okay | Acme Corp, Banana Co, Candy & Sons | Works for known names but misses unnamed competitors and catches casual mentions |
| Avoid | other companies | Far too vague. Matches nearly any prompt that mentions a business |
| Quality | Topic description | Why |
|---|---|---|
| Best | seeking financial advice | Intent-driven. Matches users asking for guidance, ignores passive mentions of finance |
| Okay | securities and investments | Reasonable subject scope but fires on news articles and factual mentions, not just advice-seeking |
| Avoid | finance | Extremely broad. Matches almost everything from expense reports to pricing questions |
- Be specific. Overly broad topics cause false positives; overly narrow topics cause false negatives.
- Avoid semantic overlap. If two topics mean nearly the same thing (for example, `seeking financial advice` and `asking for investment guidance`), they will score similarly on the same prompts and waste your 20-topic budget.
- Test and iterate. Send test prompts and review scores in Security Analytics. You can tune by adjusting the topic description (more or less specific) or the score threshold in your rule (`lt 20` is strict, `lt 50` is permissive).
- Do not list multiple values in one topic description. The model only evaluates against the first item in a comma-separated list. For example, `Toyota, Ford, Audi, BMW` will only match prompts about `Toyota`; the remaining items are ignored. Removing the commas does not improve results. Use a single intent-driven phrase such as `seeking info on competitors`, or create separate topics for each value.
| Label | Topic description |
|---|---|
| `competitors` | seeking info on competitors |
| `financial-advice` | seeking financial advice |
| `legal-advice` | asking for legal or regulatory advice |
| `sensitive-data` | requesting passwords or API keys |
| `job-seeking` | asking about job openings or careers |
| `bias` | comparing demographic groups as better or worse |