Problems rendering four-byte UTF8 characters in some contexts

XMLWordPrintable

    • Type: Bug
    • Resolution: Answered
    • Priority: Low
    • N/A
    • Affects Version/s: 3.0.0, 3.9.0
    • Component/s: None
    • Severity 3 - Minor
    • 2

      I committed a file containing a four-byte UTF8 character (🌵). The commit comment contains the same character.

      $ echo -e "\xF0\x9F\x8C\xB5" > foo.txt
      $ svn commit foo.txt -m `cat foo.txt`
      

      When I view the file in Fisheye, both the comment and the content display as ��. The underlying HTML is:

      �&#57141
      

      It's returned correctly if I do:

      $ curl "https://localhost/rest-service-fe/revisionData-v1/revisionInfo/foo?path=/foo.txt&revision=1"
      <?xml version="1.0" encoding="UTF-8" standalone="yes"?><fileRevision totalLines="1" fileRevisionState="CHANGED" rev="16" path="baz.txt" linesRemoved="1" linesAdded="1" date="2015-08-14T23:06:10.007-04:00" csid="16" contentLink="/browse/~raw,r=16/foo/baz.txt" author="svn"><ancestor>15</ancestor><comment>🌵</comment></fileRevision>
      

      Three-byte characters like "\xE2\x98\xA0" (☠) work fine.

      My postgres database is encoded using UTF8 (although I don't think Fisheye uses it).

              Assignee:
              Unassigned
              Reporter:
              J
              Votes:
              1 Vote for this issue
              Watchers:
              4 Start watching this issue

                Created:
                Updated:
                Resolved: