Type: Suggestion
Resolution: Unresolved
Fix Version/s: None
Component/s: Content - Labels, Macros - Content by Label
Labels:
- ewt-ctb-improve-existing

Support reference count: 1
EWT (CtB, RtB, DP): CtB - Improve Existing

When a public-facing Confluence instance has pages with labels, crawler bots
can reach the Labeled Content page at /label/<labelname>.

From that page, the "Related Labels" section (top-right) displays links to
all other labels that co-occur with the current label(s). Crucially, clicking
any "Related Label" appends it to the URL (e.g. /label/foo+bar+baz), and the
new page again shows its own "Related Labels", creating a near-infinite
combination of crawlable URLs.
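
To give a sense of scale, the following is a rough upper-bound sketch, not a measurement of any real instance. It assumes that every label co-occurs with every other label and that a crawler stops after a fixed number of appended labels. Whether Confluence treats /label/foo+bar and /label/bar+foo as distinct pages is not stated here, so both counts are shown. The function name and parameters are illustrative only.

```python
from math import comb, perm


def label_url_counts(n_labels, max_depth):
    """Count label combinations of size 1..max_depth drawn from n_labels.

    Returns (unordered, ordered):
      unordered treats /label/foo+bar and /label/bar+foo as one URL,
      ordered treats them as two distinct URLs.
    Assumes (pessimistically) that every label co-occurs with every other.
    """
    depths = range(1, max_depth + 1)
    unordered = sum(comb(n_labels, k) for k in depths)
    ordered = sum(perm(n_labels, k) for k in depths)
    return unordered, ordered


if __name__ == "__main__":
    print(label_url_counts(20, 5))
```

Under these assumptions, 20 labels followed to a depth of 5 already yield 21,699 unordered or 1,984,000 ordered URLs.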

The Atlassian Support KB article confirms this behavior:
https://support.atlassian.com/confluence/kb/web-crawler-bots-and-confluence-how-public-access-can-lead-to-performance-issues/

It shows real-world log examples of bots (PanguBot, bingbot, etc.)
generating requests such as:
GET /label/aggregate+coverage+database_management+estimation+eu+intra_regional_trade+qa+territory+world

These bots:

- Use unique user-agent strings
- Come from unique IP addresses
- Actively ignore robots.txt directives

The resulting near-unbounded set of request URLs degrades performance
and can cause outages on public Confluence Data Center instances.
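
Requests of this kind can be spotted in an access log by counting the labels in the request path. The sketch below is one possible way to do that. It assumes the label list is the path segment directly after /label/ and that labels are joined with "+", as in the KB example. The helper names and the threshold of 3 are arbitrary choices for illustration.

```python
from urllib.parse import urlsplit


def labels_in_request(path):
    """Return the labels in a /label/<a>+<b>+... request path, or []."""
    segments = urlsplit(path).path.split("/")
    if "label" not in segments:
        return []
    idx = segments.index("label")
    # The label list is assumed to be the segment right after "label".
    if idx + 1 >= len(segments) or not segments[idx + 1]:
        return []
    return segments[idx + 1].split("+")


def looks_combinatorial(path, threshold=3):
    """Flag requests that chain more labels than a human is likely to click."""
    return len(labels_in_request(path)) > threshold


if __name__ == "__main__":
    kb_example = ("/label/aggregate+coverage+database_management+estimation"
                  "+eu+intra_regional_trade+qa+territory+world")
    print(len(labels_in_request(kb_example)), looks_combinatorial(kb_example))
```

The KB example path contains nine labels, so it is flagged at any small threshold.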

This issue was partially addressed in ~~CONFSERVER-11940~~ (fixed in 2.8.2),
which added rel="nofollow" to label links. That fix does not appear
to cover the "Related Labels" links on the /label/ Labeled Content page
in modern versions of Confluence Data Center.

STEPS TO REPRODUCE
1. Set up a public-facing Confluence Data Center instance with anonymous access
2. Add labels to several pages (e.g. "kb-how-to-article", "troubleshooting")
3. Visit /label/kb-how-to-article
4. Inspect the HTML of the "Related Labels" section in the top-right
5. Observe that the label links do NOT have rel="nofollow"
6. A crawler bot follows each Related Label link and lands on a new
/label/ page with its own Related Labels, producing a combinatorial
explosion of URLs
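
Steps 4 and 5 can be automated along the following lines. This is a minimal sketch using only the Python standard library. The sample markup, including the "related-labels" class name, is hypothetical and is not taken from Confluence's actual page source. Note that the check flags every /label/ link on the page, not only those in the "Related Labels" section.

```python
from html.parser import HTMLParser


class LabelLinkAudit(HTMLParser):
    """Collect /label/ links lacking rel="nofollow" and any robots meta tag."""

    def __init__(self):
        super().__init__()
        self.followable = []     # hrefs a crawler is free to follow
        self.meta_robots = None  # content of <meta name="robots">, if present

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            self.meta_robots = attrs.get("content")
        elif tag == "a":
            href = attrs.get("href") or ""
            rel_tokens = (attrs.get("rel") or "").lower().split()
            if "/label/" in href and "nofollow" not in rel_tokens:
                self.followable.append(href)


# Hypothetical markup standing in for the saved HTML of a /label/ page.
SAMPLE_HTML = """
<div class="related-labels">
  <a href="/label/foo+bar">bar</a>
  <a href="/label/foo+baz" rel="nofollow">baz</a>
</div>
"""

if __name__ == "__main__":
    audit = LabelLinkAudit()
    audit.feed(SAMPLE_HTML)
    print("followable label links:", audit.followable)
    print("meta robots:", audit.meta_robots)
```

To audit a real instance, the page source saved from /label/kb-how-to-article would replace SAMPLE_HTML.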

EXPECTED BEHAVIOR

The "Related Labels" links on the /label/ Labeled Content page should
have rel="nofollow" and/or the page should include a
<meta name="robots" content="noindex,nofollow"> tag, preventing bots
from following the combinatorial label URL chains.

Alternatively:

Provide an admin-level option to disable the "Related Labels" feature
entirely, or restrict it to logged-in users only.

ACTUAL BEHAVIOR

Related Labels links are fully followable by crawlers, with no
nofollow attribute, creating a near-infinite crawl loop.

WORKAROUND (per Atlassian KB):

- Add "Disallow: /label" in robots.txt. But this does NOT work against
  bots that ignore robots.txt.
- Block IPs at firewall level. Impractical when bots use thousands of
  unique IP addresses.
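
The effect of the robots.txt rule on a compliant crawler can be checked with Python's standard urllib.robotparser, as sketched below. The "User-agent: *" line and the /display/DOC/Home path are assumptions added for illustration. Only the Disallow line comes from the KB workaround.

```python
from urllib.robotparser import RobotFileParser

# Only the Disallow line is from the workaround; the User-agent line is assumed.
ROBOTS_TXT = """\
User-agent: *
Disallow: /label
"""


def allowed(path, agent="*"):
    """Return True if a robots.txt-compliant crawler may fetch the path."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    print(allowed("/label/foo+bar+baz"))  # combinatorial label URL
    print(allowed("/display/DOC/Home"))   # hypothetical regular page
```

This only models crawlers that honor robots.txt. Bots that ignore the file, as described above, are unaffected by the rule.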

RELATED TICKETS

- ~~CONFSERVER-11940~~: Add nofollow to label links (Fixed in 2.8.2, but
  appears incomplete for modern DC versions)
- ~~CONFSERVER-12011~~: Multiple-label filter generates redundant URLs (Closed)
- CONFSERVER-8749: Make Confluence more configurable for web crawlers
- CONFCLOUD-82811: Disable "Show Details" for anonymous users (incl. labels)
- Atlassian Support KB: https://support.atlassian.com/confluence/kb/web-crawler-bots-and-confluence-how-public-access-can-lead-to-performance-issues/

Assignee: Unassigned
Reporter: Rigel Carbajal
Votes: 1
Watchers: 3

Created: 30/Apr/2026 11:46 PM
Updated: 25/May/2026 8:35 AM
