• Suggestion
    • Resolution: Duplicate
    • 7.3.1
    • None
    • Confluence 5.3.4, MySQL 5.5.35
    • NOTE: This suggestion is for Confluence Server. Using Confluence Cloud? See the corresponding suggestion.

      When adding pages to Shared Links, I found that some pages cannot be registered because they contain characters that take 4 bytes in UTF-8. The log shows:

      2014-01-26 13:04:26,737 ERROR [TP-Processor11] [ext.usage.event.UsageEventListener] handleEvent Error creating task to index for event:com.atlassian.confluence.event.events.content.page.PageCreateEvent@1ee23bfd[page=page: Google、約50分のサービス障害について謝罪と説明 v.1 (38667643),timestamp=1390709066731] : org.springframework.jdbc.UncategorizedSQLException: Hibernate operation: could not insert: [com.atlassian.confluence.core.BodyContent#38733176]; uncategorized SQLException for SQL []; SQL state [HY000]; error code [1366]; Incorrect string value: '\xF0\x90\xB6\x90\xEF\xBF...' for column 'BODY' at row 1; nested exception is java.sql.SQLException: Incorrect string value: '\xF0\x90\xB6\x90\xEF\xBF...' for column 'BODY' at row 1
      

      This is a well-known MySQL limitation: its utf8 character set stores at most 3 bytes per character, so MySQL has to be configured to use the utf8mb4 encoding (see http://stackoverflow.com/questions/10957238/incorrect-string-value-when-trying-to-insert-utf-8-into-mysql-via-jdbc).
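
      To make the failure concrete, here is a minimal Python sketch (not Confluence or MySQL code; the helper name `four_byte_chars` is made up for illustration). It lists the positions of characters whose UTF-8 encoding takes 4 bytes, which are the ones a 3-byte utf8 column rejects. The byte sequence is the start of the rejected value in the error message above.

```python
def four_byte_chars(text):
    """Return (index, code point) for every character that needs 4 bytes in UTF-8."""
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(text)
        if len(ch.encode("utf-8")) == 4
    ]


# First four bytes of the rejected value in the log: \xF0\x90\xB6\x90
rejected = b"\xF0\x90\xB6\x90".decode("utf-8")

print(four_byte_chars("Google" + rejected))  # [(6, 'U+10D90')]
print(four_byte_chars("約50分"))  # [] because these characters take at most 3 bytes
```

      Any code point above U+FFFF falls into this category, which includes most emoji.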

      However, when I configure MySQL to use utf8mb4 encoding, Confluence won't start up because it rejects utf8mb4 encoding as unsupported:

      2014-02-01 19:12:54,966 WARN [main] [atlassian.config.bootstrap.DefaultAtlassianBootstrapManager] checkMySQLCharacterSetAndCollation Your database uses non-recommended character set: utf8mb4. See https://confluence.atlassian.com/x/IrYC
      2014-02-01 19:12:54,985 ERROR [main] [atlassian.confluence.setup.BootstrapApplicationStartupListener] checkConfigurationOnStartup Detected tables with non-default character encoding. See https://confluence.atlassian.com/x/TABrFw
      

      Please add utf8mb4 as a supported encoding.

      An example of a page that fails is here:
      http://www.itmedia.co.jp/news/articles/1401/25/news011.html

      Workaround:

      The XML Cleaner in this KB can strip 4-byte characters from an XML backup.
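
      The idea behind this workaround can be sketched in a few lines of Python. This is not the XML Cleaner from the KB, only an illustration of the same idea; the function name `strip_four_byte_chars` and the default replacement character are assumptions. Every character above U+FFFF (the ones that need 4 bytes in UTF-8) is replaced before the text reaches a utf8 MySQL column.

```python
import re

# Characters outside the Basic Multilingual Plane (above U+FFFF) take 4 bytes in UTF-8.
_FOUR_BYTE = re.compile("[\U00010000-\U0010FFFF]")


def strip_four_byte_chars(text, replacement="\uFFFD"):
    """Replace every 4-byte UTF-8 character with `replacement` (U+FFFD by default)."""
    return _FOUR_BYTE.sub(replacement, text)


cleaned = strip_four_byte_chars("status \U0001F600 ok")
print(cleaned.encode("utf-8"))  # b'status \xef\xbf\xbd ok'
```

      Note that this is lossy: the replaced characters are gone from the content afterwards.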

      Documented here

            [CONFSERVER-32453] Confluence with MySQL should support utf8mb4 encoding

            Paul Frischknecht added a comment - - edited

            There are 10 issues in my company about this, all reported to the people operating our Atlassian products, who cannot do anything to fix it. Please address this.


            flexdanmark added a comment -

            I had a ton of problems with utf8mb4 with Jira and MySQL: with upgrading, with some plugins, and in general, as utf8mb4 was getting used increasingly 'all over the place' now that people use tablets and phones to mail and interact with Atlassian products.

            For me, moving to PostgreSQL was something I should have done so much earlier; that entire subject just ... disappeared.

            This is just an encouragement to make a test setup with PostgreSQL and see what it does for you; it made my Atlassian life so much easier.

            Regards,
            Martin

            Anders added a comment - - edited

            watching 


            James Richards added a comment -

            Closing as duplicate of CONFSERVER-18509

            Craig Solinski added a comment -

            Version 6.3.1 of Confluence supports MySQL 5.7, and Confluence SHOULD support Unicode (all of it), i.e. the utf8mb4 format!

            cweiske added a comment -

            If we change that manually, migrations to new Confluence versions stop working. Please fix that natively.


            Deleted Account (Inactive) added a comment -

            I got it working, I think: Ubuntu 16.04 w/ MySQL 5.7.17, with JDBC connector 5.1.40. Confluence 6.0.4.

             

            Added the following to mysqld.conf under [mysqld]:

            • character-set-server=utf8mb4
            • collation-server=utf8mb4_bin
            • default-storage-engine=INNODB
            • max_allowed_packet=256MB
            • innodb_log_file_size=2GB
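
            A quick way to double-check such a fragment before restarting MySQL is to parse it and look at the two character-set options. The sketch below uses Python's standard `configparser`; the helper name `charset_options` is made up, and only the first three options from the list above are reproduced. (MySQL's documentation writes size suffixes as `K`, `M` and `G`, so the two size values may need to be `256M` and `2G`; that is a remark about MySQL option syntax in general, not something verified against this setup.)

```python
import configparser

FRAGMENT = """
[mysqld]
character-set-server=utf8mb4
collation-server=utf8mb4_bin
default-storage-engine=INNODB
"""


def charset_options(cnf_text):
    """Return the server character set and collation from a [mysqld] section."""
    parser = configparser.ConfigParser(
        interpolation=None, strict=False, allow_no_value=True
    )
    parser.read_string(cnf_text)
    mysqld = parser["mysqld"]
    return mysqld.get("character-set-server"), mysqld.get("collation-server")


charset, collation = charset_options(FRAGMENT)
print(charset, collation)
print("4-byte ready:", charset == "utf8mb4" and collation.startswith("utf8mb4"))
```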

             

            When I configured the external database, I set the options in the DB connection url to:

            • ?sessionVariables=default_storage_engine%3DInnoDB&useUnicode=true
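
            The `%3D` in that option string is a URL-encoded `=`, presumably so that the equals sign inside the `sessionVariables` value is not confused with the one separating option names from values. A small Python sketch of how such a URL could be assembled (the function name, host, port and database name are placeholders, not values from this issue):

```python
from urllib.parse import quote


def confluence_jdbc_url(host, port, database):
    """Build a MySQL JDBC URL with the options quoted in the comment above."""
    # "=" inside the sessionVariables value is percent-encoded as %3D.
    session_vars = quote("default_storage_engine=InnoDB", safe="")
    return (
        f"jdbc:mysql://{host}:{port}/{database}"
        f"?sessionVariables={session_vars}&useUnicode=true"
    )


# Placeholder connection details for illustration only.
print(confluence_jdbc_url("localhost", 3306, "confluence"))
```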

            It's complaining about the character set in the admin/diag screen, and some of the Java and filesystem settings look like they're still restricted to UTF8, but I was able to copy/paste some 4-byte Unicode into a page and it worked, so...

             

            I think Atlassian should be able to support this without too much additional work.


            Jakob Englisch added a comment -

            Is there anything blocking this issue? utf8mb4 has been supported since MySQL 5.5 ( https://dev.mysql.com/doc/refman/5.5/en/charset-unicode-utf8mb4.html ), and MySQL 5.5 is listed as the minimum under system requirements ( https://confluence.atlassian.com/doc/supported-platforms-207488198.html ). It should be simple to migrate all utf8 charsets to utf8mb4 (utf8mb4 is a superset of utf8). utf8 is simply broken. Using emojis with Confluence raises an exception (we have already had users paste emoji text into Confluence, leading to 500 errors), but only with MySQL, and those could easily be fixed.
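
            For reference, a migration of this kind usually comes down to statements like the ones generated below. This is a rough Python sketch, not an Atlassian-supported procedure (another comment in this thread reports that manual changes broke later upgrades). The function name, database name and table names are placeholders, and the `utf8mb4_bin` collation is taken from the configuration posted earlier in this thread.

```python
def utf8mb4_migration_sql(database, tables, collation="utf8mb4_bin"):
    """Generate MySQL statements that convert a database and its tables to utf8mb4."""
    statements = [
        f"ALTER DATABASE `{database}` CHARACTER SET utf8mb4 COLLATE {collation};"
    ]
    for table in tables:
        statements.append(
            f"ALTER TABLE `{table}` CONVERT TO CHARACTER SET utf8mb4 COLLATE {collation};"
        )
    return statements


# Placeholder names; a real schema would be listed from information_schema.TABLES.
for sql in utf8mb4_migration_sql("confluence", ["example_table_a", "example_table_b"]):
    print(sql)
```

            Anything like this should be tried against a copy of the database first, with a backup at hand.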

            Nagai Takayuki added a comment - - edited

            I found this problem is essentially related to this one:

            Make directory sync more robust when handling names with emoji characters
            https://jira.atlassian.com/browse/CONF-30465


              Assignee: Unassigned
              Reporter: Nagai Takayuki
              Votes: 37
              Watchers: 40
