• Bug
    • Resolution: Fixed
    • Medium
    • 1.4
    • 1.1.2
    • None
    • CF 1.1.2 Standalone / PostgreSQL / RH Enterprise Linux 3 / SUN JDK 1.4

      We are still experiencing problems with non-ASCII characters in page names.
      In our case "äüö" works fine; the trouble starts when there is an uppercase umlaut in the page name, e.g. "ÄÜÖ".
      The page is created and can be displayed and edited.
      The children macro will, however, produce a link that is displayed as pointing to a non-existent page and, when selected, tries to create a page by that name.

            [CONFSERVER-1570] Still trouble with non-ASCII characters in page names

            jens added a comment -

            This bug is fixed for 1.4-DR8.


            Mike Cannon-Brookes added a comment -

            Ingomar,

            We'll attack this for 1.4. I think we should be able to fix it now that Charles has tracked down the problem in detail. Sorry for taking so long; it's one of those things we non-i18n people don't think of (that the lowercase version of a character might be different in Java and in Postgres!).

            m

            Charles Miller (Inactive) added a comment -

            I've found the problem. We do a case-insensitive match to find pages by their title, so Bob, BOB and bob are the same page.

            The problem is that we do this by calling title.toLowerCase() in Java code, but matching against lower(title) in the database. So if Java and the database disagree on what the lower-case form of a particular letter is, then the whole thing blows up. Viz, the database leaves the uppercase umlauts untouched:

            confchars=# select lower('ÄÜÖ');
            lower
            -------
            ÄÜÖ
            (1 row)

            We should change the query so that the lower-casing of the search term happens in the query itself:

            select * from content where lower(title) = lower('ÄÜÖ');
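The mismatch described in the comment above can be reproduced without a database. The following is only a minimal sketch: the class name and the asciiOnlyLower helper are hypothetical, and the helper merely stands in for a database lower() that folds nothing but ASCII letters, an assumption that is consistent with the psql output shown but says nothing about how PostgreSQL actually implements it. Locale.ROOT is used instead of the default locale so the result does not depend on the machine.

```java
import java.util.Locale;

class LowerCaseMismatch {
    // Hypothetical stand-in for a database lower() that only folds A-Z.
    // This is an assumption for illustration, not PostgreSQL's implementation.
    static String asciiOnlyLower(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            sb.append(c >= 'A' && c <= 'Z' ? (char) (c + 32) : c);
        }
        return sb.toString();
    }

    // What the application side computes before it queries.
    static String javaLower(String s) {
        return s.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        String title = "\u00C4\u00DC\u00D6"; // the page title with uppercase umlauts

        // Key the "database" derives from the stored title.
        String storedKey = asciiOnlyLower(title);

        // Old approach: lower-case in Java, compare with the database's result.
        boolean oldLookup = javaLower(title).equals(storedKey);

        // Proposed approach: apply the same function to both sides.
        boolean newLookup = asciiOnlyLower(title).equals(storedKey);

        System.out.println("Java-side lower-casing matches: " + oldLookup);        // false
        System.out.println("Same function on both sides matches: " + newLookup);   // true
    }
}
```

Under these assumptions the first lookup fails because Java folds the umlauts and the stand-in does not, while the second succeeds because both operands go through the same function, which is the point of moving the lower-casing into the query.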

            kgbvax added a comment -

            Mhhm, I am not sure whether I am about to lose my sense of humor about this.

            Nick Faiz [OLD] (Inactive) added a comment -

            Ingomar,
            Yes, fair point; i18n is currently scheduled for 1.5.

            Cheers,
            Nick

            kgbvax added a comment -

            This issue now celebrates its 0.5 birthday.

            kgbvax added a comment -

            The problem persists in 1.3.
            I don't think this has anything to do with the DB, as it works fine for lowercase characters. I would assume that this is an encoding issue.

            kgbvax added a comment -

            I have another observation:

            Same scenario: I just created that page.
            Then I go back to the dashboard.
            The page shows up under "recently updated"; when I select THAT link, everything is fine.
            When I go to the page I created it from (by creating a link and selecting it), I get the silly "create page".

            I could reproduce this in the public test space:
            http://confluence.atlassian.com/display/TEST/CONF-1570+Test

            kgbvax added a comment -

            I am sorry to report that although I did the migration as suggested, I still receive the same error.

            Here is the database I use right now:
            cfu | cfu | UNICODE

            Charles Miller (Inactive) added a comment -

            Ah, this is most likely the problem. You have created your databases in "unknown 8 bit character set" mode, so they're behaving randomly when presented with multi-byte Unicode characters.

            You should probably do the following:

            Back up your Confluence database (using pg_dump). Drop the database and then re-create it with Unicode support using 'createdb -E UNICODE'. Import your dump file into the new database.

            I'm not entirely sure what effect this will have on existing high-bit or multi-byte data in the database, though.
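To make the "multi-byte" remark above concrete, here is a small Java sketch of what "ÄÜÖ" looks like as bytes. It is an illustration under stated assumptions only: the class and the byteWiseAsciiLower helper are hypothetical, and the idea that a database without a known character set treats text as opaque bytes and folds only ASCII letters is one plausible reading of the comment, not something the thread confirms.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class EncodingSketch {
    // Hypothetical byte-wise lower-casing that knows nothing about the encoding:
    // only the byte values for 'A'..'Z' are changed (an assumption for illustration).
    static byte[] byteWiseAsciiLower(byte[] in) {
        byte[] out = in.clone();
        for (int i = 0; i < out.length; i++) {
            if (out[i] >= 'A' && out[i] <= 'Z') {
                out[i] = (byte) (out[i] + 32);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String title = "\u00C4\u00DC\u00D6"; // ÄÜÖ

        byte[] utf8 = title.getBytes(StandardCharsets.UTF_8);
        byte[] latin1 = title.getBytes(StandardCharsets.ISO_8859_1);

        System.out.println("UTF-8 length: " + utf8.length);        // 6, two bytes per umlaut
        System.out.println("ISO-8859-1 length: " + latin1.length); // 3, one byte per umlaut

        // None of the UTF-8 bytes fall in the ASCII A-Z range, so a byte-wise
        // fold leaves the title exactly as it was.
        byte[] folded = byteWiseAsciiLower(utf8);
        System.out.println("Unchanged by byte-wise fold: " + Arrays.equals(folded, utf8)); // true
    }
}
```

If the database behaves like this helper, upper- and lowercase umlauts stay distinct on the database side, whereas a Unicode-aware lower-casing treats them as the same letter.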

              Unassigned Unassigned
              150ccb5cf9f8 kgbvax