Confluence Data Center / CONFSERVER-32453

Confluence with MySQL should support utf8mb4 encoding


    • Type: Suggestion
    • Resolution: Duplicate
    • 7.3.1
    • None
    • Confluence 5.3.4, MySQL 5.5.35
    • We collect Confluence feedback from various sources, and we evaluate what we've collected when planning our product roadmap. To understand how this piece of feedback will be reviewed, see our Implementation of New Features Policy.

      NOTE: This suggestion is for Confluence Server. Using Confluence Cloud? See the corresponding suggestion.

      When adding pages to Shared Links, I found that some pages cannot be registered because they contain characters that are encoded as 4 bytes in UTF-8. The log shows:

      2014-01-26 13:04:26,737 ERROR [TP-Processor11] [ext.usage.event.UsageEventListener] handleEvent Error creating task to index for event:com.atlassian.confluence.event.events.content.page.PageCreateEvent@1ee23bfd[page=page: Google、約50分のサービス障害について謝罪と説明 v.1 (38667643),timestamp=1390709066731] : org.springframework.jdbc.UncategorizedSQLException: Hibernate operation: could not insert: [com.atlassian.confluence.core.BodyContent#38733176]; uncategorized SQLException for SQL []; SQL state [HY000]; error code [1366]; Incorrect string value: '\xF0\x90\xB6\x90\xEF\xBF...' for column 'BODY' at row 1; nested exception is java.sql.SQLException: Incorrect string value: '\xF0\x90\xB6\x90\xEF\xBF...' for column 'BODY' at row 1
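
To check whether a page body contains the kind of character that triggers this error, one can look for code points above U+FFFF, since those are exactly the ones UTF-8 encodes as 4 bytes. The following is a minimal Python sketch; the function name `four_byte_chars` is hypothetical and is not part of Confluence or MySQL. The sample bytes are the first four bytes quoted in the error message above.

```python
def four_byte_chars(text):
    """Return (index, character) pairs for code points above U+FFFF.

    These are the characters that UTF-8 encodes as 4 bytes.
    """
    return [(i, ch) for i, ch in enumerate(text) if ord(ch) > 0xFFFF]


# First four bytes quoted in the "Incorrect string value" message.
sample = b"\xF0\x90\xB6\x90".decode("utf-8")

print(hex(ord(sample)))             # code point of the offending character
print(len(sample.encode("utf-8")))  # number of bytes it needs in UTF-8
print(four_byte_chars("Google " + sample))
```

Ordinary Japanese text such as the page title in the log stays within 3 bytes per character, so only the characters reported by this check are affected.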
      

      This is a well-known MySQL limitation: the utf8 character set stores at most 3 bytes per character, so MySQL has to be configured to use utf8mb4 instead (see http://stackoverflow.com/questions/10957238/incorrect-string-value-when-trying-to-insert-utf-8-into-mysql-via-jdbc).
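
As a rough illustration of what switching to utf8mb4 involves on the MySQL side, the sketch below builds the `ALTER DATABASE` and `ALTER TABLE ... CONVERT TO CHARACTER SET` statements. The helper name `utf8mb4_statements`, the database name `confluence`, the table name `BODYCONTENT`, and the `utf8mb4_bin` collation are assumptions chosen for illustration; they are not taken from this report. The script only prints the statements and does not connect to a database.

```python
def utf8mb4_statements(database, tables, collation="utf8mb4_bin"):
    """Build MySQL statements that switch a database and its tables to utf8mb4.

    The names passed in are inserted as-is, so only use trusted identifiers.
    """
    statements = [
        f"ALTER DATABASE `{database}` CHARACTER SET utf8mb4 COLLATE {collation};"
    ]
    for table in tables:
        statements.append(
            f"ALTER TABLE `{table}` CONVERT TO CHARACTER SET utf8mb4 COLLATE {collation};"
        )
    return statements


# Hypothetical database and table names, for illustration only.
for statement in utf8mb4_statements("confluence", ["BODYCONTENT"]):
    print(statement)
```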

      However, when I configure MySQL to use utf8mb4, Confluence fails to start because it treats utf8mb4 as an unsupported character set:

      2014-02-01 19:12:54,966 WARN [main] [atlassian.config.bootstrap.DefaultAtlassianBootstrapManager] checkMySQLCharacterSetAndCollation Your database uses non-recommended character set: utf8mb4. See https://confluence.atlassian.com/x/IrYC
      2014-02-01 19:12:54,985 ERROR [main] [atlassian.confluence.setup.BootstrapApplicationStartupListener] checkConfigurationOnStartup Detected tables with non-default character encoding. See https://confluence.atlassian.com/x/TABrFw
      

      Please add utf8mb4 as a supported encoding.

      An example of a page that fails is here:
      http://www.itmedia.co.jp/news/articles/1401/25/news011.html

      Workaround:

      The XML Cleaner described in this KB article can strip 4-byte characters from an XML backup.
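
The idea behind this workaround can be sketched in a few lines of Python. This is not the XML Cleaner tool from the KB article, only an illustration of the same approach: remove every code point above U+FFFF so that the remaining text fits in MySQL's 3-byte utf8 character set. The function name `strip_four_byte_chars` is hypothetical.

```python
import re

# Matches any code point above U+FFFF, i.e. anything that needs 4 bytes in UTF-8.
_FOUR_BYTE = re.compile("[\U00010000-\U0010FFFF]")


def strip_four_byte_chars(text, replacement=""):
    """Return text with every 4-byte UTF-8 character replaced."""
    return _FOUR_BYTE.sub(replacement, text)


cleaned = strip_four_byte_chars("Google \U00010D90 outage")
print(cleaned)
```

Passing a visible replacement such as "?" instead of the empty string makes it easier to see afterwards where characters were dropped.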

      Documented here

            Assignee: Unassigned
            Reporter: Nagai Takayuki
            Votes: 37
            Watchers: 40
