CONFSERVER-34137 (Confluence Data Center)

Import directory function is not working

Details

    Description

      When a directory contains a set of doc/docx files and we try to import it, the pages are not created and the following error is displayed in the logs:

      2014-07-01 14:49:44,970 ERROR [http-bio-8090-exec-3] [confluence.spaces.actions.ImportPagesAction] execute An error occurred importing: C:\Users\dluvison\Downloads\CSP126202\Test.docx
       -- space: 491521 | url: /spaces/doimportpages.action | userName: admin | referer: http://confluence:8090/spaces/importpages.action?key=TI | action: doimportpages
      java.lang.RuntimeException: org.owasp.validator.html.ScanException: org.w3c.dom.DOMException: INVALID_CHARACTER_ERR: An invalid or illegal XML character is specified. 
      	at com.atlassian.confluence.content.render.xhtml.PolicyConfiguredCleaner.antiSamyClean(PolicyConfiguredCleaner.java:221)
      	at com.atlassian.confluence.content.render.xhtml.PolicyConfiguredCleaner.cleanEntityBody(PolicyConfiguredCleaner.java:160)
      	at com.atlassian.confluence.content.render.xhtml.PolicyConfiguredCleaner.cleanQuietlyWithLogging(PolicyConfiguredCleaner.java:171)
      	at com.atlassian.confluence.content.render.xhtml.PolicyConfiguredCleaner.cleanQuietly(PolicyConfiguredCleaner.java:87)
      	at com.atlassian.confluence.content.render.xhtml.storage.XhtmlCleaningInterceptor$1.clean(XhtmlCleaningInterceptor.java:31)
      	at com.atlassian.confluence.core.BodyContent.cleanBody(BodyContent.java:134)
      	at com.atlassian.confluence.content.render.xhtml.storage.XhtmlCleaningInterceptor.cleanBodyContents(XhtmlCleaningInterceptor.java:115)
      	at com.atlassian.confluence.content.render.xhtml.storage.XhtmlCleaningInterceptor.onFlushDirty(XhtmlCleaningInterceptor.java:63)
      	at com.atlassian.confluence.core.persistence.hibernate.CompositeInterceptor.onFlushDirty(CompositeInterceptor.java:43)
      	at net.sf.hibernate.impl.SessionImpl.flushEntity(SessionImpl.java:2582)
      	...
      	at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1041)
      	at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:603)
      	at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:312)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      	at java.lang.Thread.run(Thread.java:745)
      Caused by: org.owasp.validator.html.ScanException: org.w3c.dom.DOMException: INVALID_CHARACTER_ERR: An invalid or illegal XML character is specified. 
      	at org.owasp.validator.html.scan.AntiSamyDOMScanner.scan(AntiSamyDOMScanner.java:174)
      	at org.owasp.validator.html.AntiSamy.scan(AntiSamy.java:113)
      	at org.owasp.validator.html.AntiSamy.scan(AntiSamy.java:92)
      	at com.atlassian.confluence.content.render.xhtml.PolicyConfiguredCleaner.antiSamyClean(PolicyConfiguredCleaner.java:202)
      	... 320 more
      Caused by: org.w3c.dom.DOMException: INVALID_CHARACTER_ERR: An invalid or illegal XML character is specified. 
      	at org.apache.xerces.dom.CoreDocumentImpl.createElement(Unknown Source)
      	at org.cyberneko.html.parsers.DOMFragmentParser.startElement(DOMFragmentParser.java:432)
      	at org.cyberneko.html.HTMLTagBalancer.callStartElement(HTMLTagBalancer.java:1136)
      	....
      	at org.cyberneko.html.parsers.DOMFragmentParser.parse(DOMFragmentParser.java:167)
      	at org.owasp.validator.html.scan.AntiSamyDOMScanner.scan(AntiSamyDOMScanner.java:172)
      	... 323 more
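
      The innermost cause in the trace is `createElement` rejecting a tag name. As a rough illustration of that mechanism only, the sketch below shows the JDK's default DOM implementation raising the same `INVALID_CHARACTER_ERR` when asked to create an element whose name is not a legal XML name. The class name, helper methods, and the sample tag names are hypothetical; the report does not say which tag in the Word-derived markup triggered the failure, and this is not the Confluence or AntiSamy code path itself.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.DOMException;
import org.w3c.dom.Document;

// Hypothetical demo class; not part of Confluence, AntiSamy, or NekoHTML.
public class InvalidElementNameDemo {

    /** Creates an empty DOM document using the JDK's default parser factory. */
    public static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("No DOM implementation available", e);
        }
    }

    /** Returns true if the DOM rejects the tag name with INVALID_CHARACTER_ERR. */
    public static boolean isRejected(Document doc, String tagName) {
        try {
            doc.createElement(tagName);
            return false;
        } catch (DOMException e) {
            // Same error code as the DOMException at the bottom of the trace.
            return e.code == DOMException.INVALID_CHARACTER_ERR;
        }
    }

    public static void main(String[] args) {
        Document doc = newDocument();
        // "p" is a legal XML name; "bad name" contains a space and is not.
        System.out.println("p rejected: " + isRejected(doc, "p"));
        System.out.println("'bad name' rejected: " + isRejected(doc, "bad name"));
    }
}
```

      If the imported documents produce markup with a tag name of this kind, a lenient HTML parser feeding a strict DOM builder would fail in a similar way, which is consistent with (but not proven by) the trace above.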
      

      Steps to Reproduce:

      1. Use Microsoft Office to create 4 doc/docx files with some content in them.
      2. Save them into a local directory.
      3. Create a space.
      4. Go to Space Tools > Content Tools > Import tab.
      5. In the import directory field, enter the path to the directory where you saved the files created in step 1.
      6. Click Import.
      7. The pages will not be created.
        Tested with Firefox and Chrome using customer data.

      Workaround

      1. Create a blank page.
      2. Click Tools > Import From Word Document.
      3. Select your Word document.


          People

            Assignee: Unassigned
            Reporter: Deividi Luvison (dluvison, Inactive)
            Votes: 1
            Watchers: 3
