Details
- Bug
- Resolution: Timed out
- Low
- None
- 5.5.2, 5.5.3
- 3
- Severity 3 - Minor
Description
When a directory contains a set of doc files and we try to import it, the pages are not created and the following error is displayed in the logs:
2014-07-01 14:49:44,970 ERROR [http-bio-8090-exec-3] [confluence.spaces.actions.ImportPagesAction] execute An error occurred importing: C:\Users\dluvison\Downloads\CSP126202\Test.docx
 -- space: 491521 | url: /spaces/doimportpages.action | userName: admin | referer: http://confluence:8090/spaces/importpages.action?key=TI | action: doimportpages
java.lang.RuntimeException: org.owasp.validator.html.ScanException: org.w3c.dom.DOMException: INVALID_CHARACTER_ERR: An invalid or illegal XML character is specified.
    at com.atlassian.confluence.content.render.xhtml.PolicyConfiguredCleaner.antiSamyClean(PolicyConfiguredCleaner.java:221)
    at com.atlassian.confluence.content.render.xhtml.PolicyConfiguredCleaner.cleanEntityBody(PolicyConfiguredCleaner.java:160)
    at com.atlassian.confluence.content.render.xhtml.PolicyConfiguredCleaner.cleanQuietlyWithLogging(PolicyConfiguredCleaner.java:171)
    at com.atlassian.confluence.content.render.xhtml.PolicyConfiguredCleaner.cleanQuietly(PolicyConfiguredCleaner.java:87)
    at com.atlassian.confluence.content.render.xhtml.storage.XhtmlCleaningInterceptor$1.clean(XhtmlCleaningInterceptor.java:31)
    at com.atlassian.confluence.core.BodyContent.cleanBody(BodyContent.java:134)
    at com.atlassian.confluence.content.render.xhtml.storage.XhtmlCleaningInterceptor.cleanBodyContents(XhtmlCleaningInterceptor.java:115)
    at com.atlassian.confluence.content.render.xhtml.storage.XhtmlCleaningInterceptor.onFlushDirty(XhtmlCleaningInterceptor.java:63)
    at com.atlassian.confluence.core.persistence.hibernate.CompositeInterceptor.onFlushDirty(CompositeInterceptor.java:43)
    at net.sf.hibernate.impl.SessionImpl.flushEntity(SessionImpl.java:2582)
    ...
    at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1041)
    at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:603)
    at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:312)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.owasp.validator.html.ScanException: org.w3c.dom.DOMException: INVALID_CHARACTER_ERR: An invalid or illegal XML character is specified.
    at org.owasp.validator.html.scan.AntiSamyDOMScanner.scan(AntiSamyDOMScanner.java:174)
    at org.owasp.validator.html.AntiSamy.scan(AntiSamy.java:113)
    at org.owasp.validator.html.AntiSamy.scan(AntiSamy.java:92)
    at com.atlassian.confluence.content.render.xhtml.PolicyConfiguredCleaner.antiSamyClean(PolicyConfiguredCleaner.java:202)
    ... 320 more
Caused by: org.w3c.dom.DOMException: INVALID_CHARACTER_ERR: An invalid or illegal XML character is specified.
    at org.apache.xerces.dom.CoreDocumentImpl.createElement(Unknown Source)
    at org.cyberneko.html.parsers.DOMFragmentParser.startElement(DOMFragmentParser.java:432)
    at org.cyberneko.html.HTMLTagBalancer.callStartElement(HTMLTagBalancer.java:1136)
    ....
    at org.cyberneko.html.parsers.DOMFragmentParser.parse(DOMFragmentParser.java:167)
    at org.owasp.validator.html.scan.AntiSamyDOMScanner.scan(AntiSamyDOMScanner.java:172)
    ... 323 more
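The innermost cause in the trace is a DOMException raised by createElement, which is what a DOM implementation throws when asked to create an element whose name is not a legal XML name. The sketch below only illustrates that class of failure using the DOM implementation bundled with the JDK. It is not Confluence code, the class name and the sample tag names are hypothetical, and the report does not say which element name in the converted Word content actually triggered the error.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.DOMException;
import org.w3c.dom.Document;

public class InvalidElementNameDemo {

    /**
     * Tries to create an element with the given name.
     * Returns the DOMException code if creation fails, or -1 if the name is accepted.
     */
    public static short tryCreateElement(String tagName) throws ParserConfigurationException {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        try {
            doc.createElement(tagName);
            return -1;
        } catch (DOMException e) {
            return e.code;
        }
    }

    public static void main(String[] args) throws ParserConfigurationException {
        // Hypothetical names: one legal, two that are not legal XML names.
        for (String name : new String[] {"p", "bad name", "1abc"}) {
            short code = tryCreateElement(name);
            String result;
            if (code == -1) {
                result = "accepted";
            } else if (code == DOMException.INVALID_CHARACTER_ERR) {
                result = "INVALID_CHARACTER_ERR";
            } else {
                result = "DOMException code " + code;
            }
            System.out.println("'" + name + "' -> " + result);
        }
    }
}
```

A name containing a space or starting with a digit is rejected with the same error code that appears in the log, which suggests the HTML produced from the Word file contained a tag name the cleaner could not turn into a DOM element.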
Steps to Reproduce:
- Use Microsoft Office to create 4 doc/docx files with some content in them.
- Save them into a local directory.
- Create a Space.
- Go to Space Tools > Content Tools > Import tab.
- In Import directory, enter the path to the directory where you saved the files created in step 1.
- Click Import.
- The pages will not be created.
Tested with Firefox and Chrome using customer data.
Workaround
- Create a blank page.
- Click Tools > Import From Word Document.
- Select your Word Document.
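Because the workaround imports one document at a time, it can help to have a list of the Word files in the directory to work through. The following is a minimal sketch of such a listing. It runs outside Confluence, does not call any Confluence API, and the class and method names are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class WordFileLister {

    /** Returns the .doc and .docx files directly inside dir, sorted by path. */
    public static List<Path> listWordFiles(Path dir) throws IOException {
        try (Stream<Path> entries = Files.list(dir)) {
            return entries
                    .filter(Files::isRegularFile)
                    .filter(p -> {
                        String name = p.getFileName().toString().toLowerCase(Locale.ROOT);
                        return name.endsWith(".doc") || name.endsWith(".docx");
                    })
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // The directory to scan is passed as the first argument; defaults to the current directory.
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        for (Path file : listWordFiles(dir)) {
            System.out.println(file.toAbsolutePath());
        }
    }
}
```

Each printed path can then be imported into its own blank page using the steps above.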