Mercurial: error indexing content of file with unicode name on Windows


    • Severity 3 - Minor

      The hg cat command is executed during content indexing with the filename passed as an argument. Because Windows converts the arguments to the ANSI codepage, a garbled filename reaches the hg process, e.g.:

      > hg --config ui.verbose=false cat -r 547a7f4d4890c363836ebe7ae0bfb7e2dc421095 αφιλε.τχτ
      ?????.???: no such file in rev 547a7f4d4890
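
The substitution seen above can be reproduced outside of hg. The following is a minimal sketch, assuming the ANSI codepage is windows-1250 (Python codec name `cp1250`) and that unmappable characters are replaced with `?`. The helper name `to_ansi_codepage` is hypothetical, and Python's `errors="replace"` only approximates the conversion Windows performs on process arguments.

```python
def to_ansi_codepage(name: str, codepage: str = "cp1250") -> str:
    """Round-trip a filename through a single-byte codepage.

    Characters that the codepage cannot represent are replaced with '?'.
    This approximates (as an assumption) what happens to a command-line
    argument when it is converted to the ANSI codepage.
    """
    return name.encode(codepage, errors="replace").decode(codepage)


if __name__ == "__main__":
    # windows-1250 has no Greek letters, so every letter is lost;
    # only the ASCII dot survives.
    print(to_ansi_codepage("αφιλε.τχτ"))  # prints ?????.???
```

Because the original characters cannot be recovered from `?`, hg has no way to match the argument against the name stored in the revision.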
      

      This causes a content-indexing error that is printed to the log and shown in the Administration > Repository Status > Recent Errors section:

      Problem indexing content of "αφιλε.τχτ" for rev "547a7f4d4890c363836ebe7ae0bfb7e2dc421095" from repo "unicode-file-names" due to class com.atlassian.fisheye.dvcs.handler.DvcsProcessException - Error while communicating with VCS: ???.?: no such file in rev 547a7f4d4890

      An attempt to display the file content (a request for /browse/unicode-file-names/%CE%B1%CF%86%CE%B9%CE%BB%CE%B5.%CF%84%CF%87%CF%84?r=547a7f4d4890c363836ebe7ae0bfb7e2dc421095) also causes an error to be displayed.
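
The path segment in that request is the UTF-8 percent-encoding of the same filename, so the name is still intact at the HTTP layer and is only lost later, when it is passed to hg. A short sketch using Python's standard `urllib.parse` to confirm this (the variable names are illustrative only):

```python
from urllib.parse import quote, unquote

FILENAME = "αφιλε.τχτ"
ENCODED_SEGMENT = "%CE%B1%CF%86%CE%B9%CE%BB%CE%B5.%CF%84%CF%87%CF%84"

# quote() percent-encodes the UTF-8 bytes of each non-ASCII character.
encoded = quote(FILENAME)

# unquote() decodes percent-escapes as UTF-8 by default.
decoded = unquote(ENCODED_SEGMENT)

if __name__ == "__main__":
    print(encoded == ENCODED_SEGMENT)  # True
    print(decoded == FILENAME)         # True
```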

      The problem occurs if file.encoding is set to UTF-8 (the value we recommend users force at FishEye startup). When run with the default Windows encoding (windows-1250), there are no indexing errors and file content can be displayed, but the filename is garbled:

      αφιλε.τχτ

            Assignee: Unassigned
            Reporter: Mikolaj Zyromski
            Votes: 3
            Watchers: 6
