Details
Suggestion
Resolution: Unresolved
Description
Fisheye/Crucible has an isTest method that attempts to guess, as a last resort, whether a file is textual or binary.
If it can't identify the file as one of several known text types from its leading bytes, it examines the first 20k bytes of the file, and if any byte is outside the range 1 - 127, it decides the file must be binary.
This occasionally causes problems for files in ASCII-like but not strictly ASCII formats, which contain a few values outside that range.
It would probably be better to be a bit more lenient and assume that, if most of a file's bytes fall in a text-like range, it's still better to show it as text than as binary. If 5% of a file renders as boxes, then I'm probably still happy to look at it. The percentage could be a system property.
Alternatively, we should delete our own attempts to detect text and use a library for this. Some seem to exist.
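A rough sketch of the more lenient check suggested above, to make the idea concrete. This is not the existing Fisheye/Crucible code: the class name, method names, and the system property name are all hypothetical, "20k" is read as 20 * 1024 bytes, and treating an empty file as text is an assumption.

```java
// Illustrative only: a tolerance-based variant of the "bytes in 1 - 127" check.
class LenientTextCheck {
    // Hypothetical property name; not an existing Fisheye/Crucible setting.
    static final String THRESHOLD_PROPERTY = "example.text.maxNonTextPercent";
    // Assumption: "20k" means 20 * 1024 bytes.
    static final int SAMPLE_SIZE = 20 * 1024;
    // The 5% figure comes from the suggestion above.
    static final double DEFAULT_MAX_NON_TEXT_PERCENT = 5.0;

    /** Returns true if at most maxNonTextPercent of the sampled bytes fall outside 1..127. */
    static boolean looksLikeText(byte[] data, double maxNonTextPercent) {
        int sampled = Math.min(data.length, SAMPLE_SIZE);
        if (sampled == 0) {
            return true; // assumption: an empty file is shown as text
        }
        int nonText = 0;
        for (int i = 0; i < sampled; i++) {
            int value = data[i] & 0xFF; // convert the signed byte to 0..255
            if (value < 1 || value > 127) {
                nonText++;
            }
        }
        return nonText * 100.0 / sampled <= maxNonTextPercent;
    }

    /** Same check, with the threshold read from a system property if one is set. */
    static boolean looksLikeText(byte[] data) {
        double percent = DEFAULT_MAX_NON_TEXT_PERCENT;
        String configured = System.getProperty(THRESHOLD_PROPERTY);
        if (configured != null) {
            try {
                percent = Double.parseDouble(configured.trim());
            } catch (NumberFormatException e) {
                // Unparseable value: keep the default threshold.
            }
        }
        return looksLikeText(data, percent);
    }
}
```

With a threshold of 0 this behaves like the strict rule described above (any byte outside 1 - 127 means binary), so the property would also allow the old behaviour to be restored, for example with `-Dexample.text.maxNonTextPercent=0`.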