-
Bug
-
Resolution: Won't Fix
-
Low
-
None
-
None
-
Severity 3 - Minor
JDK 6 implements Unicode 4.0. In this version of Unicode, the zero-width whitespace character (0x200B) is treated as whitespace.
JDK 7 implements Unicode 6.0. In this version of Unicode, the zero-width whitespace has been reclassified into the 'format character' group (other characters in this group include, for example, the left-to-right and right-to-left text direction marks).
Thus, the Java compiler in JDK 6 allows 0x200B to be used as a normal whitespace character, e.g. to separate symbols.
The Java compiler since JDK 7 silently ignores 0x200B, which means that it can no longer be used to separate symbols. However, you can put this character virtually anywhere, e.g.:
void this<200B>Is<200B>MyMethod();
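The reclassification described above can be observed through `java.lang.Character`. The following is a minimal sketch and not part of the original report: the class name `ZeroWidthSpaceCheck` and the method `describe` are hypothetical, and the values reported depend on the Unicode version implemented by the JDK that runs it.

```java
public class ZeroWidthSpaceCheck {
    // The character discussed in this issue.
    static final int ZWSP = 0x200B;

    // Summarizes how the running JDK classifies the given code point.
    static String describe(int codePoint) {
        return String.format("U+%04X whitespace=%b identifierIgnorable=%b format=%b",
                codePoint,
                Character.isWhitespace(codePoint),
                Character.isIdentifierIgnorable(codePoint),
                Character.getType(codePoint) == Character.FORMAT);
    }

    public static void main(String[] args) {
        System.out.println(describe(ZWSP));
    }
}
```

On JDK 7 or later this should report the character as a format character that is ignorable inside identifiers and is not whitespace, which matches the javac behaviour described above.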
Clover fails to parse the 200B character:
Xyz.java:287:90:unexpected char: 0x200B
    at com.atlassian.clover.instr.java.Instrumenter.instrument(Instrumenter.java:166)
    at com.atlassian.clover.CloverInstr.execute(CloverInstr.java:76)
    at com.atlassian.clover.CloverInstr.mainImpl(CloverInstr.java:54)
    at ...
Planned fix:
- Ignore 200B characters in Java 7+. Treat the 200B character as a space in Java 6.
- Question: Should this be based on the source level setting or on the detected JDK?
- Question: Which other control characters need to be ignored by Clover?
- Question: Which other whitespace characters (other than space, \t, \n, \r) should be recognized by the Clover parser?
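The first bullet of the planned fix could be sketched roughly as below. This is an illustration only and not Clover's code: the class `ZeroWidthSpacePolicy`, the `Action` enum, and the integer `sourceLevel` parameter are all hypothetical. It also assumes the decision is driven by the source level, which the questions above leave open.

```java
public class ZeroWidthSpacePolicy {
    // What a lexer front end could do with a given code point.
    enum Action { TREAT_AS_SPACE, IGNORE, PASS_THROUGH }

    // sourceLevel is a hypothetical integer such as 6 or 7.
    static Action actionFor(int codePoint, int sourceLevel) {
        if (codePoint != 0x200B) {
            return Action.PASS_THROUGH;
        }
        // Java 6: whitespace. Java 7+: silently ignored.
        return sourceLevel <= 6 ? Action.TREAT_AS_SPACE : Action.IGNORE;
    }

    // Applies the policy to a whole source string before it is lexed.
    static String preprocess(String source, int sourceLevel) {
        StringBuilder out = new StringBuilder(source.length());
        for (int i = 0; i < source.length(); ) {
            int cp = source.codePointAt(i);
            i += Character.charCount(cp);
            switch (actionFor(cp, sourceLevel)) {
                case TREAT_AS_SPACE:
                    out.append(' ');
                    break;
                case IGNORE:
                    break; // drop the character entirely
                default:
                    out.appendCodePoint(cp);
            }
        }
        return out.toString();
    }
}
```

Dropping characters in a preprocessing step shifts column positions in later error messages, so a real implementation would more likely handle the character inside the lexer itself.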
Workaround:
Remove all 200B character occurrences from the source code.
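The workaround can be automated with a small utility. The following is a minimal sketch under stated assumptions: the class name `StripZeroWidthSpaces` is hypothetical, the sources are assumed to be UTF-8 encoded, the default root directory `src` is a placeholder, and Java 8 or later is needed to run it.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class StripZeroWidthSpaces {
    // Removes every occurrence of 0x200B from the given text.
    static String strip(String text) {
        return text.replace("\u200B", "");
    }

    public static void main(String[] args) throws IOException {
        // Root directory to scan; "src" is only a placeholder default.
        Path root = Paths.get(args.length > 0 ? args[0] : "src");
        List<Path> javaFiles;
        try (Stream<Path> paths = Files.walk(root)) {
            javaFiles = paths
                    .filter(p -> p.toString().endsWith(".java"))
                    .collect(Collectors.toList());
        }
        for (Path file : javaFiles) {
            // Assumption: source files are UTF-8 encoded.
            String original = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
            String cleaned = strip(original);
            if (!cleaned.equals(original)) {
                Files.write(file, cleaned.getBytes(StandardCharsets.UTF_8));
                System.out.println("Cleaned " + file);
            }
        }
    }
}
```

If the code base is compiled with JDK 6 and relies on 0x200B to separate symbols, replacing the character with a regular space is safer than deleting it, since deletion would merge the adjacent tokens.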
[CLOV-1835] Unicode 0x200B (zero-width whitespace) causes instrumentation failure
Resolution | New: Won't Fix [ 2 ] | |
Status | Original: Open [ 1 ] | New: Closed [ 6 ] |
Symptom Severity | New: Minor [ 14432 ] |
Workflow | Original: New Clover Workflow [ 983440 ] | New: New Clover Workflow - Restricted [ 1474581 ] |
Remote Link |
New:
This issue links to "Clover › All JDK Tests › |
Remote Link |
New:
This issue links to "Clover › All Ant Groovy Tests › |
Remote Link |
New:
This issue links to "Clover › Default › |
Rank | New: Ranked higher |
How javac (JDK7) treats control characters: