When adding pages to Shared Links, I found that some pages cannot be registered because they contain characters that take 4 bytes in UTF-8. The log shows:
2014-01-26 13:04:26,737 ERROR [TP-Processor11] [ext.usage.event.UsageEventListener] handleEvent Error creating task to index for event:com.atlassian.confluence.event.events.content.page.PageCreateEvent@1ee23bfd[page=page: Google、約50分のサービス障害について謝罪と説明 v.1 (38667643),timestamp=1390709066731] : org.springframework.jdbc.UncategorizedSQLException: Hibernate operation: could not insert: [com.atlassian.confluence.core.BodyContent#38733176]; uncategorized SQLException for SQL []; SQL state [HY000]; error code [1366]; Incorrect string value: '\xF0\x90\xB6\x90\xEF\xBF...' for column 'BODY' at row 1; nested exception is java.sql.SQLException: Incorrect string value: '\xF0\x90\xB6\x90\xEF\xBF...' for column 'BODY' at row 1
This is a well-known MySQL limitation: its utf8 character set stores at most 3 bytes per character, so MySQL has to be configured to use the utf8mb4 encoding (see http://stackoverflow.com/questions/10957238/incorrect-string-value-when-trying-to-insert-utf-8-into-mysql-via-jdbc).
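To check whether a given page body would trigger this error, it can help to look for characters above U+FFFF, since those are the ones UTF-8 encodes as 4 bytes. The following is only a minimal sketch in Python, not anything shipped with Confluence; the function name find_four_byte_chars is hypothetical, and the sample bytes are the first four bytes quoted in the error message above.

```python
def find_four_byte_chars(text):
    """Return (index, character, code point label) for each character above U+FFFF."""
    return [
        (i, ch, f"U+{ord(ch):04X}")
        for i, ch in enumerate(text)
        if ord(ch) > 0xFFFF  # such characters take 4 bytes in UTF-8
    ]


# First four bytes from the "Incorrect string value" message in the log.
sample = b"\xf0\x90\xb6\x90".decode("utf-8")

# Hypothetical page text: plain ASCII followed by the offending character.
hits = find_four_byte_chars("Google " + sample)
```

Text that yields an empty list from this check fits in MySQL's 3-byte utf8 character set; anything else needs utf8mb4 or has to be cleaned first.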
However, when I configure MySQL to use utf8mb4 encoding, Confluence won't start up because it rejects utf8mb4 encoding as unsupported:
2014-02-01 19:12:54,966 WARN [main] [atlassian.config.bootstrap.DefaultAtlassianBootstrapManager] checkMySQLCharacterSetAndCollation Your database uses non-recommended character set: utf8mb4. See https:
2014-02-01 19:12:54,985 ERROR [main] [atlassian.confluence.setup.BootstrapApplicationStartupListener] checkConfigurationOnStartup Detected tables with non-default character encoding. See https:
Please add utf8mb4 as a supported encoding.
An example of a page that fails is here:
http://www.itmedia.co.jp/news/articles/1401/25/news011.html
Workaround:
The XML Cleaner in this KB can strip 4-byte characters from an XML backup.
Documented here
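For content that has to be cleaned before it reaches the database, the same idea can be sketched in a few lines of Python. This is not the XML Cleaner from the KB, only an illustration of the approach under the assumption that replacing each 4-byte character is acceptable; the function name strip_four_byte_chars and the default replacement character U+FFFD are choices made here.

```python
import re

# Matches every code point above U+FFFF, i.e. the ones UTF-8 encodes as 4 bytes.
_FOUR_BYTE = re.compile("[\U00010000-\U0010FFFF]")


def strip_four_byte_chars(text, replacement="\ufffd"):
    """Replace each 4-byte character with `replacement` (pass "" to drop it)."""
    # A function is used as the substitution so that `replacement` is taken
    # literally, even if it contains backslashes.
    return _FOUR_BYTE.sub(lambda _match: replacement, text)


cleaned = strip_four_byte_chars("a\U00010D90b")
```

Note that this loses the original characters, so it is only a stopgap until utf8mb4 is supported.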