[FE-5596] Mercurial: error indexing content of file with Unicode name on Windows

      During content indexing, the hg cat command is executed with the filename passed as an argument. Because Windows converts command-line arguments to the ANSI code page, a garbled filename reaches the hg process, e.g.:

      > hg --config ui.verbose=false cat -r 547a7f4d4890c363836ebe7ae0bfb7e2dc421095 αφιλε.τχτ
      ?????.???: no such file in rev 547a7f4d4890
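
The loss can be approximated outside of Fisheye. The sketch below is only an illustration, assuming the ANSI code page is windows-1250 (the default mentioned in this report); `to_ansi_argument` is a hypothetical helper, and the real Windows conversion may apply best-fit mappings that differ from Python's `errors="replace"` behaviour.

```python
# Hypothetical illustration: approximate the lossy Unicode -> ANSI conversion
# by encoding to a legacy code page and replacing unmappable characters.
def to_ansi_argument(name: str, codepage: str = "cp1250") -> str:
    """Return the name after a lossy round trip through the given code page."""
    return name.encode(codepage, errors="replace").decode(codepage)


# Greek letters have no mapping in cp1250, so each one becomes "?".
garbled = to_ansi_argument("αφιλε.τχτ")
print(garbled)

# Names that fit the code page survive unchanged.
unchanged = to_ansi_argument("readme.txt")
print(unchanged)
```

Under these assumptions the first name comes out as the same `?????.???` pattern that hg reports above, while a plain ASCII name is unaffected.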
      

      This causes a content-indexing error that is written to the log and shown in the Administration > Repository Status > Recent Errors section:

      Problem indexing content of "αφιλε.τχτ" for rev "547a7f4d4890c363836ebe7ae0bfb7e2dc421095" from repo "unicode-file-names" due to class com.atlassian.fisheye.dvcs.handler.DvcsProcessException - Error while communicating with VCS: ???.?: no such file in rev 547a7f4d4890

      An attempt to display the file content (a request for /browse/unicode-file-names/%CE%B1%CF%86%CE%B9%CE%BB%CE%B5.%CF%84%CF%87%CF%84?r=547a7f4d4890c363836ebe7ae0bfb7e2dc421095) also causes an error to be displayed.
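
For reference, the path segment in that request is simply the UTF-8 bytes of the filename, percent-encoded. A minimal check using Python's standard `urllib.parse` (the variable names are illustrative only):

```python
from urllib.parse import quote, unquote

filename = "αφιλε.τχτ"

# quote() percent-encodes the UTF-8 bytes of each non-ASCII character
# and leaves the "." untouched.
encoded = quote(filename)
print(encoded)

# Decoding the segment from the request URL gives the original name back.
decoded = unquote("%CE%B1%CF%86%CE%B9%CE%BB%CE%B5.%CF%84%CF%87%CF%84")
print(decoded == filename)
```

So the browser request itself carries the name intact; the damage happens later, when the name is handed to the hg process.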

      The problem occurs when file.encoding is set to UTF-8 (the value we recommend forcing at Fisheye startup). When Fisheye runs with the default Windows encoding (windows-1250), there are no indexing errors and file content can be displayed, but the filename is garbled:

      αφιλε.τχτ


            Atlassian Update – 4 February 2019

            Hello,

            We have recently reviewed this issue and the overall interest in the problem. As the issue hasn't collected votes, watchers, comments, or support cases from many customers during its lifetime, it's very low on our priority list and will not be fixed in the foreseeable future. Therefore, we have decided to close it.

            We want to be clear in managing your expectations. The Fisheye & Crucible team is focusing on issues that have broad impact and high value, reflected by the number of comments, votes, support cases, and customers interested. Please consult the Atlassian Bug Fixing Policy for more details.

            If you still see this bug occurring in the latest release and a fix is very important for you, please don't hesitate to share your feedback in the issue comments and vote on it. We will continue to watch the issue for further updates.

            Regards
            Marek Parfianowicz
            Fisheye/Crucible TL


            Bo added a comment -

            We have the same problem: filenames are encoded using windows-1252, but file content (mostly .cs files) is encoded using UTF-8.

            The Fisheye service is started with the '-Dfile.encoding=UTF-8' option, which works well for file content but garbles any filename that uses, for instance, the Danish 'æøå'.

            So we would like to have two options when configuring Fisheye:

            • File content encoding
            • Filename encoding
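
A rough sketch of what two independent settings could look like. This is not Fisheye code or configuration; `decode_filename`, `decode_content`, and their parameters are hypothetical names used only to illustrate the request.

```python
# Hypothetical sketch: decode filenames and file content with separate,
# independently configured encodings.
def decode_filename(raw: bytes, filename_encoding: str = "cp1252") -> str:
    """Decode raw filename bytes using the filename encoding setting."""
    return raw.decode(filename_encoding)


def decode_content(raw: bytes, content_encoding: str = "utf-8") -> str:
    """Decode raw file content using the content encoding setting."""
    return raw.decode(content_encoding)


raw_name = "æøå.cs".encode("cp1252")   # filename bytes as stored (windows-1252)
raw_body = "// æøå".encode("utf-8")    # file content bytes (UTF-8)

name = decode_filename(raw_name)
body = decode_content(raw_body)

# Using a single encoding (UTF-8) for both garbles the filename.
name_with_single_setting = raw_name.decode("utf-8", errors="replace")
```

With one shared setting, either the filename or the content is decoded with the wrong encoding; two settings allow both to round-trip.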


            Bartosz Borkowski added a comment - edited

            I just ran into this problem

            (Version:4.4.1 Build:20170515065843 2017-05-15)


              Assignee: Unassigned
              Reporter: Mikolaj Zyromski (mzyromski)
              Affected customers: 3
              Watchers: 6
