Confluence Data Center / CONFSERVER-11144

Attachments deleted when Google crawls a page with no 'nofollow' link

      When Google's crawler (or another search engine's) crawls Confluence, it follows every link on a page unless the page carries a "nofollow" directive. In 2.7.1, the following snippet appears in the source of the attachments page:

      <meta name="robots" content="noindex,nofollow">
      <meta name="robots" content="noarchive">

      This snippet does not appear on the same page in 2.7.2. If Google crawls this page and follows its links, it will come across this:
      <a id="removeAttachmentLink" href="/confluence/pages/removeattachment.action?pageId=1638408&fileName=xxx&version=1" onClick="javascript:if( confirm('Are you sure you want to remove attached file xxx?')) return true; else return false;" >Remove</a>

      If the crawler does not run JavaScript, the confirmation dialog never appears, so requesting the link actually removes the attachment. Users are then notified that the attachments have been deleted by the original owner.
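
      The behaviour described above can be reproduced with a small sketch. It is only an illustration, not how Google's crawler or Confluence is implemented: `CrawlerView`, `links_to_follow`, `page_271` and `page_272` are hypothetical names, and the two pages are reduced to the fragments quoted in this report. The sketch assumes a crawler that honours the "nofollow" robots directive and has no JavaScript engine, so `onClick` is never evaluated.

```python
from html.parser import HTMLParser


class CrawlerView(HTMLParser):
    """Collects what a simple non-JavaScript crawler sees on a page (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.robots = set()  # directives found in <meta name="robots">
        self.links = []      # href values of <a> tags

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            content = attrs.get("content") or ""
            self.robots.update(
                d.strip().lower() for d in content.split(",") if d.strip()
            )
        elif tag == "a" and attrs.get("href"):
            # The onClick handler is ignored: nothing here can run JavaScript.
            self.links.append(attrs["href"])


def links_to_follow(html):
    """Return the hrefs that a crawler honouring 'nofollow' would request."""
    view = CrawlerView()
    view.feed(html)
    return [] if "nofollow" in view.robots else view.links


# Fragments quoted in this report.
META = (
    '<meta name="robots" content="noindex,nofollow">\n'
    '<meta name="robots" content="noarchive">\n'
)
REMOVE_LINK = (
    '<a id="removeAttachmentLink" '
    'href="/confluence/pages/removeattachment.action?pageId=1638408&fileName=xxx&version=1" '
    'onClick="javascript:if( confirm(\'Are you sure you want to remove attached file xxx?\')) '
    'return true; else return false;" >Remove</a>'
)

page_271 = META + REMOVE_LINK  # attachments page as described for 2.7.1
page_272 = REMOVE_LINK         # attachments page as described for 2.7.2

print(links_to_follow(page_271))  # robots meta present: nothing is followed
print(links_to_follow(page_272))  # robots meta missing: the Remove link is followed
```

      With the robots meta tags present, the sketch returns no links. Without them, the Remove URL is in the list of links to request, and a plain GET on that URL is all it takes to delete the attachment.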

      This happens without any changes to robots.txt (configuring robots.txt is described in http://confluence.atlassian.com/display/DISC/Prevent+Search+Engine+Indexing+Using+Robots.txt)
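
      For comparison, here is a sketch of how a robots.txt rule would be evaluated by a crawler that respects it, using Python's standard `urllib.robotparser`. The `Disallow` rule, the `ExampleBot` user agent and the second URL are assumptions made for illustration. They are not taken from the linked documentation, and a robots.txt rule only helps against crawlers that choose to honour it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rule, written for this example only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /confluence/pages/removeattachment.action
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# URL of the Remove link quoted in this report.
remove_url = (
    "/confluence/pages/removeattachment.action"
    "?pageId=1638408&fileName=xxx&version=1"
)
# Illustrative URL of an ordinary page (assumed, not from the report).
other_url = "/confluence/display/DISC/Home"

remove_allowed = parser.can_fetch("ExampleBot", remove_url)
other_allowed = parser.can_fetch("ExampleBot", other_url)

print("remove link allowed:", remove_allowed)
print("ordinary page allowed:", other_allowed)
```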

        1. macros.vm (123 kB, dave)

              Assignee: Unassigned
              Reporter: Jeremy Largman (jlargman)
              Votes: 3
              Watchers: 4
