Bitbucket seems to have trouble viewing and diffing files of various encodings.
      I created a Git repo, committed a set of files with different encodings, then tried to view them in the Bitbucket file browser. Some of the files render with a stray special character; others don't render at all.

      Then I branched, changed each file, committed, and opened a pull request. Most of the files failed to render in the diff; instead Bitbucket showed "This file cannot be rendered in your browser".

      Here are some results. I've attached the Git repository that I originally uploaded to our Bitbucket instance to perform the tests.

      File | Encoding | Result | file command result
      test-utf8.txt | UTF-8 | Good | ASCII text, with CRLF line terminators
      test-utf8bom.txt | UTF-8 with BOM | Displays \ufeff char at beginning of direct view, and during pull request diff. Diff renders correct otherwise | UTF-8 Unicode (with BOM) text, with CRLF line terminators
      test-utf16le.txt | UTF-16 LE | Can't be displayed directly nor can it be displayed in pull request diff | data
      test-utf16lebom.txt | UTF-16 LE with BOM | Displays \ufeff char at beginning of direct view. Diff renders only if transcoding is enabled. | Little-endian UTF-16 Unicode text, with CRLF line terminators
      test-utf16be.txt | UTF-16 BE | Can't be displayed directly nor can it be displayed in pull request diff | data
      test-utf16bebom.txt | UTF-16 BE with BOM | Displays \ufeff char at beginning of direct view. Diff renders only if transcoding is enabled. | Big-endian UTF-16 Unicode text, with CRLF line terminators
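
A comparable set of test files can be regenerated with a short Python sketch. The sample text and the function names are assumptions for illustration; the contents of the attached repository may differ. Only the file names and encodings come from the table above.

```python
import codecs
from pathlib import Path

# Hypothetical sample text with CRLF line endings; not the attached repo's contents.
SAMPLE = "line one\r\nline two\r\n"

# File name -> (BOM bytes to prepend, codec used for the text itself).
VARIANTS = {
    "test-utf8.txt": (b"", "utf-8"),
    "test-utf8bom.txt": (codecs.BOM_UTF8, "utf-8"),
    "test-utf16le.txt": (b"", "utf-16-le"),
    "test-utf16lebom.txt": (codecs.BOM_UTF16_LE, "utf-16-le"),
    "test-utf16be.txt": (b"", "utf-16-be"),
    "test-utf16bebom.txt": (codecs.BOM_UTF16_BE, "utf-16-be"),
}


def encode_variants(text=SAMPLE):
    """Return {file name: raw bytes} for each encoding listed in the table."""
    return {name: bom + text.encode(codec) for name, (bom, codec) in VARIANTS.items()}


def write_variants(directory, text=SAMPLE):
    """Write each variant as raw bytes, so no newline translation takes place."""
    out = Path(directory)
    out.mkdir(parents=True, exist_ok=True)
    for name, data in encode_variants(text).items():
        (out / name).write_bytes(data)
```

Calling `write_variants("encoding-test")` would create the six files in a directory of that (assumed) name, ready to be committed.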

      Environment:

      • Server
        • Bitbucket Server 4.0.2
        • cat /etc/redhat-release 
          Red Hat Enterprise Linux Server release 7.1 (Maipo)
        • uname -a
          3.10.0-229.11.1.el7.x86_64 #1 SMP Wed Jul 22 12:06:11 EDT 2015 x86_64 x86_64 x86_64 GNU/Linux
      • Client (where I created the files)
        • Windows 7 Enterprise SP1 64-bit
        • Sublime Text 3.3103
        • SourceTree 1.8.2.2 with embedded Git 1.9.5

      Attachments:

        1. cannot_render.png (19 kB)
        2. encoding-test.zip (10 kB)
        3. ueff_character_pullrequest.png (17 kB)

            [BSERV-8501] Bitbucket has trouble viewing files with UTF-16

            Cristan Szmajda (Inactive) added a comment

            This issue has been partly addressed by the fixes in BSERV-8428 and is partly covered by the encoding auto-detection improvements tracked in BSERV-7323. I'm closing this as a duplicate of those issues.

            Brent P added a comment (edited)

            Hi rnelson2,

            I'm a developer on Bitbucket Server. Thanks a ton for the excellent bug report. It was very easy to reproduce and explore your problem.

            We added support for transcoding non-UTF-8 files for diffs in Stash 3.1. You just need to enable it on the Repository Settings page. Please note the performance implications of enabling transcoding, which are discussed on that page.
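
Transcoding in this sense means decoding the file from its source encoding and re-encoding it as UTF-8 before the diff is computed. The following is a minimal sketch of that idea, not Bitbucket's actual implementation. The function name and the `fallback` parameter are hypothetical, and UTF-32 is deliberately not handled.

```python
import codecs

# BOM prefix -> codec for the bytes that follow the BOM.
_BOMS = (
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
)


def transcode_to_utf8(data, fallback="utf-8"):
    """Decode using the BOM if one is present, drop it, and return UTF-8 bytes."""
    for bom, codec in _BOMS:
        if data.startswith(bom):
            # Slicing off the BOM keeps U+FEFF out of the decoded text.
            return data[len(bom):].decode(codec).encode("utf-8")
    # No BOM: nothing in the file names its encoding, so an assumed one is used.
    return data.decode(fallback).encode("utf-8")
```

If the BOM were decoded along with the rest of the file instead of being sliced off, it would survive as a leading U+FEFF character, which matches the stray \ufeff seen in the report.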

            I tested your repo with transcoding enabled, and the diff now shows for the two UTF-16 with BOM files (LE and BE). Only the two UTF-16 files without a BOM still render as if they were binary.

            We'll open this issue to support UTF-16.
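
One plausible reason the BOM-less UTF-16 files look binary: ASCII-range text in UTF-16 has a zero byte in every code unit, and a common heuristic treats NUL bytes as a sign of binary content (Git applies a check of this kind to the first 8000 bytes of a file). Whether Bitbucket relies on exactly this check is an assumption. The sketch below shows the heuristic, plus a hypothetical guess at the byte order of BOM-less UTF-16 based on where the NULs fall.

```python
def looks_binary(data, sniff_len=8000):
    """Return True if a NUL byte occurs in the first sniff_len bytes."""
    return b"\x00" in data[:sniff_len]


def guess_bomless_utf16(data):
    """Guess 'utf-16-le' or 'utf-16-be' for mostly-ASCII text, else None.

    Hypothetical heuristic: ASCII characters encode as (byte, 0x00) in
    little-endian and (0x00, byte) in big-endian, so count the NUL positions.
    """
    if len(data) < 2 or len(data) % 2:
        return None
    even_nuls = data[0::2].count(0)  # NULs in the first byte of each unit
    odd_nuls = data[1::2].count(0)   # NULs in the second byte of each unit
    units = len(data) // 2
    if odd_nuls > units // 2 and even_nuls == 0:
        return "utf-16-le"
    if even_nuls > units // 2 and odd_nuls == 0:
        return "utf-16-be"
    return None
```

A guess like this only works for text that is mostly ASCII; a BOM or an explicitly configured encoding is more reliable.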


              colivier Charles Olivier (Inactive)
              65c73e965862 Rob Nelson