Suggestion
Resolution: Unresolved
Mercurial has no native support for different file path encodings. It can determine the encoding of file content, but not of file paths.
Decoding ambiguities apply to file contents, as well as file names in the bytes-based manifest. This spec applies to the former only and does not address manifest parsing. (Tracking File Encoding in Mercurial)
File paths are stored as raw bytes, so their encoding cannot be determined reliably. Most modern systems use UTF-8, except for Windows, which uses its own code pages that are incompatible with UTF-8.
We strongly encourage our customers not to use non-ASCII characters in file paths: not only may Fisheye be unable to index the repository, but such paths also break compatibility between different systems.
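The ambiguity can be illustrated with a minimal Python sketch. The file name and the choice of code page 1252 are assumptions made for illustration only; they do not come from Fisheye or Mercurial.

```python
from typing import Optional

# Hypothetical file name, encoded the way two different clients might store it.
name = "résumé.txt"
raw_cp1252 = name.encode("cp1252")  # b'r\xe9sum\xe9.txt' (Windows code page 1252)
raw_utf8 = name.encode("utf-8")     # b'r\xc3\xa9sum\xc3\xa9.txt'


def try_decode(raw: bytes, encoding: str) -> Optional[str]:
    """Decode raw path bytes, returning None if they are invalid in that encoding."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return None


# Bytes written under code page 1252 are not valid UTF-8 here, so decoding fails.
print(try_decode(raw_cp1252, "utf-8"))
# UTF-8 bytes read as code page 1252 decode without error, but to the wrong text.
print(try_decode(raw_utf8, "cp1252"))
```

The stored bytes carry no marker of which encoding produced them, so a reader has to either guess or be told.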
More details about possible configurations
If a Mercurial repository contains no non-ASCII characters in file paths, any configuration should work correctly.
Otherwise, Fisheye may be unable to index the repository. The committers are the crucial factor here: if they work on Windows, file paths encoded in something other than UTF-8 can be committed, causing problems.
Possible platforms:
- Fisheye on Windows - cannot index non-ASCII file paths
- Fisheye on Linux - can index the repository if file paths are encoded in UTF-8
Workarounds
I'm not affected but I want to be sure it will not happen in future
- Encourage your team to use only ASCII characters in file paths.
- Ensure all committers use an OS with UTF-8 set as the default encoding (modern Unix-like systems such as macOS and Linux).
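A check for non-ASCII paths could be automated along these lines. This is a minimal sketch, not a Fisheye feature: the function names are hypothetical, and it assumes that `hg files -0` prints the tracked paths as the raw bytes stored in the repository, which may not hold on every platform.

```python
import subprocess
from typing import List


def non_ascii_paths(nul_separated: bytes) -> List[bytes]:
    """Return the paths that contain at least one byte outside the ASCII range."""
    paths = [p for p in nul_separated.split(b"\0") if p]
    return [p for p in paths if any(byte > 0x7F for byte in p)]


def list_tracked_paths(repo_dir: str) -> bytes:
    """Raw, NUL-separated tracked paths as printed by `hg files -0`.

    Assumption: the output bytes match the bytes stored in the manifest.
    """
    result = subprocess.run(
        ["hg", "files", "-0"],
        cwd=repo_dir,
        stdout=subprocess.PIPE,
        check=False,  # hg exits non-zero when no files match
    )
    return result.stdout
```

For example, `non_ascii_paths(list_tracked_paths("/path/to/repo"))` would list the paths to review, where the repository location is a placeholder.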
I'm affected, what can I do now?
- Affected paths can be added to the excluded paths. This requires a repository reindex; after the reindex, the excluded files will not be available in Fisheye.
- The repository can be converted using hg convert; it can rename files, so non-ASCII characters can be removed.
NOTE! This creates a new repository. Commit hashes and other internal elements will be different, which may break links in other tools. For example, existing Crucible code reviews will point to commit hashes that no longer exist.
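The rename step can be driven by a filemap passed to hg convert, which accepts `rename source destination` directives. The sketch below generates such a file. The helper names and the `%XX` replacement scheme are assumptions chosen for illustration, and it assumes that the affected paths contain no double quotes or backslashes and that the filemap must contain the same raw bytes as the stored paths.

```python
from typing import Iterable


def ascii_name(path: bytes) -> bytes:
    """Replace every non-ASCII byte with a %XX escape, leaving ASCII bytes as they are."""
    return b"".join(
        bytes([byte]) if byte < 0x80 else b"%%%02X" % byte
        for byte in path
    )


def build_filemap(paths: Iterable[bytes]) -> bytes:
    """Build filemap content with one rename directive per non-ASCII path."""
    lines = []
    for path in paths:
        target = ascii_name(path)
        if target != path:
            # Assumption: quoting is enough because the paths contain no '"' or '\'.
            lines.append(b'rename "' + path + b'" "' + target + b'"\n')
    return b"".join(lines)
```

The result would be written to a file in binary mode and used as `hg convert --filemap filemap.txt old-repo new-repo`, where the file and repository names are placeholders. It is worth checking the generated names for collisions with existing paths before converting.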
Proposed solution
Provide the ability to choose the file path encoding for Mercurial repositories, so that Fisheye can properly decode and encode paths.
Note: if the option is set at the repository level, all Mercurial clients committing non-ASCII paths must be configured to use the same path encoding, for example all committers running Windows with the same code page.
- depended on by: FE-5596 Mercurial: error indexing content of file with unicode name on Windows (Closed)
- is duplicated by: FE-7093 Add support for different file paths encoding on Mercurial (Closed)
- relates to: FE-6961 Add option to set default charset encoding for file content for Mercurial (Not Being Considered)