• Type: Suggestion
    • Resolution: Fixed
    • Fix Version: 1.9.5

      If a character in the output of a log call is badly encoded, it can stop FoundationKit from decoding the entire 200-line block of log lines as UTF-8, which in turn causes issues with other, properly encoded UTF-8 characters elsewhere in that block.

      Typically this is caused by someone committing from another platform using Latin-1 encoding or similar. Eventually the problem passes into history and extended characters start working again for recent log entries, but it can mess up the display for a while.

      The problem is that the FoundationKit methods for converting NSData to NSString simply abort the entire conversion when a badly encoded character is encountered. Right now we just fall back on other encodings so we can at least display something, but what we'd really like to do is identify where the decoding failed, decode up to that point, decode the problem character some other way, then decode the rest as UTF-8. Alternatively, we could force the interpretation as UTF-8 anyway and leave that one character garbled. I haven't found a way to do either in FoundationKit yet, but since this causes issues that don't appear in some other tools that use more tolerant decoders (like Java's), it deserves another look. Perhaps there is a third-party NSData -> NSString decoder we can use instead.
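Both strategies described above can be sketched outside FoundationKit. The Python example below is only an illustration of the idea, not the fix that shipped in SourceTree: the function name `decode_tolerant` and the choice of Latin-1 as the fallback encoding are assumptions made for this example.

```python
def decode_tolerant(data: bytes, fallback: str = "latin-1") -> str:
    """Decode as UTF-8, decoding only the undecodable bytes with `fallback`.

    `fallback` should be a single-byte codec that maps every byte value
    (Latin-1 does), so the fallback step itself cannot fail.
    """
    parts = []
    pos = 0
    while pos < len(data):
        try:
            # Happy path: everything from `pos` onwards is valid UTF-8.
            parts.append(data[pos:].decode("utf-8"))
            break
        except UnicodeDecodeError as err:
            # err.start / err.end are offsets relative to data[pos:].
            good_end = pos + err.start
            bad_end = pos + err.end
            # Decode up to the failure point as UTF-8 ...
            parts.append(data[pos:good_end].decode("utf-8"))
            # ... decode the problem bytes some other way ...
            parts.append(data[good_end:bad_end].decode(fallback))
            # ... then carry on decoding the rest as UTF-8.
            pos = bad_end
    return "".join(parts)


# The second strategy (force UTF-8 and leave the bad character garbled)
# is built into Python's decoder via the "replace" error handler.
garbled = b"Cl\xe1udio".decode("utf-8", errors="replace")
```

With this sketch, a single Latin-1 byte in the middle of otherwise valid UTF-8 affects only itself, and the text on either side is decoded normally.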

            [SRCTREE-1285] Improve handling of badly encoded characters

            Impactwave Admin added a comment - I'm the original poster, with the company login.

            I've installed Version 1.9.5.2 and the problem has, indeed, been fixed!
            In our case, it occurred with Portuguese characters.
            Thank you very much for this fix!
            Excellent work! :-D

            (I did have to wait a long time, though...)

            Steve Streeting (Inactive) added a comment - I finally managed to reproduce this to test my fix. It required creating a commit with Cyrillic 866 encoded characters in it (created on Windows, then pushed/pulled onto Mac). In this case I edited the .git/config file and changed the author name to (valid) Russian characters, but saved the file as Cyrillic 866 without also setting git's own i18n.commitencoding. Only the Cyrillic 866 encoding seemed to cause this specific issue; other Cyrillic encodings like Windows-1251 and ISO 8859-5 only corrupted a single line in the log.

            The changes I made fixed the issue and will be present in 1.9.5.
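One way to keep a single badly encoded line from affecting a whole block of log output is to decode the block line by line. The Python sketch below illustrates that idea only; it is not a description of the change made in 1.9.5. The function name `decode_log_block` and the default legacy encoding (`cp866`, matching the reproduction case above) are assumptions for this example.

```python
from __future__ import annotations


def decode_log_block(block: bytes, legacy: str = "cp866") -> list[str]:
    """Decode each log line independently.

    Lines that are valid UTF-8 are decoded as UTF-8. A line that is not
    valid UTF-8 is decoded with the `legacy` encoding instead, so the
    damage is confined to that one line.
    """
    lines = []
    for raw in block.split(b"\n"):
        try:
            lines.append(raw.decode("utf-8"))
        except UnicodeDecodeError:
            # Only this line falls back; errors="replace" guarantees
            # that the fallback itself never raises.
            lines.append(raw.decode(legacy, errors="replace"))
    return lines
```

For example, a block whose first line holds a CP866-encoded author name and whose second line is a UTF-8 commit message decodes to two readable lines, provided the legacy encoding is guessed correctly.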

            Malik Mangier (Inactive) added a comment - https://support.atlassian.com/browse/STSP-730

            Insomniac Software added a comment - Another vote up based on a related issue reported in SRCTREE-1247, which is probably the same issue as the Answers post. My issue was that when working with text with unsupported (CR) line endings (added to comments or in existing checked-in code), SourceTree doesn't handle the invalid characters well. Hopefully this fix will also handle bad line endings. (FYI, the CR endings come from a long-standing bug in Parallels Desktop: text copied from Windows to Mac is converted improperly from CRLF to CR instead of CRLF to LF. Hopefully it is fixed in PD v9.)

            KieranA added a comment - Another vote for this one here: https://answers.atlassian.com/questions/153376/differences-view-line-numbers

            Malik Mangier (Inactive) added a comment - I see. If the difference doesn't contain any problem characters, there are no problems. But I still have no way to use CP1251 or KOI8-R for projects if I want to.

            The editors I use, like Xcode, AppCode, MacVim, vim, TextMate, and TeXworks, skip erroneous characters and still show something (characters the editor can't understand are shown as-is), and in most of them I can set the encoding if I need to.

            Steve Streeting (Inactive) added a comment - Yes, but as one of the comments says, you probably don't want to do that: it essentially creates an entirely new repository by rewriting all the history, so any existing clones would become defunct and everyone would have to re-clone from the new repo.

            Thanks!

            Claudio Silva added a comment - I am attaching here a Git patch file of a commit that causes the problem mentioned in this issue.
            The offending character is in the author name: it's the accented A (in Cláudio).
            The relevant line is this:

            From: =?UTF-8?q?Cl=E1udio=20Silva?= <claudio.silva@impactwave.com>

            The accented A is encoded as just 1 byte (xE1). This matches the character's Unicode code point, but it is NOT valid UTF-8 (see this: Unicode Character 'LATIN SMALL LETTER A WITH ACUTE' (U+00E1)).

            Here is an example of a correctly encoded author name:

            From: =?UTF-8?q?Cla=CC=81udio=20Silva?= <claudio.silva@impactwave.com>

            The accented A is encoded as 3 bytes (a xCC x81). This commit causes no problems on SourceTree.
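The difference between the two author lines can be checked with a few lines of Python. This is an illustrative sketch only; the helper name `is_valid_utf8` is made up for the example, and the byte strings are the decoded payloads of the two `From:` headers quoted above.

```python
import unicodedata


def is_valid_utf8(data: bytes) -> bool:
    """Return True if `data` decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Payload of the problematic header: a lone 0xE1 byte, which is how
# Latin-1 encodes the accented a. It is not a complete UTF-8 sequence.
bad_author = b"Cl\xe1udio Silva"

# Payload of the working header: "a" followed by the combining acute
# accent U+0301, which UTF-8 encodes as the two bytes 0xCC 0x81.
good_author = b"Cla\xcc\x81udio Silva"

# The precomposed character U+00E1 has its own two-byte UTF-8 form.
precomposed_utf8 = "\u00e1".encode("utf-8")  # b'\xc3\xa1'

# NFC normalization folds "a" + U+0301 into the single code point U+00E1.
normalized = unicodedata.normalize("NFC", good_author.decode("utf-8"))
```

Note that valid UTF-8 for this name can take either form: the decomposed sequence shown in the working header, or the precomposed bytes xC3 xA1.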

              Assignee: Unassigned
              Reporter: Steve Streeting (Inactive)
              Votes: 3
              Watchers: 10
