[CONFSERVER-8238] Header anchors do not work in Firefox with non-ASCII characters

Type: Bug
Resolution: Fixed
Priority: High
Fix Version/s: 2.5.1
Affects Version/s: 2.4.4
Component/s: None
Labels:
- affects-server
- editor
Environment: standalone

We noticed that the way header anchors are generated has changed, and our anchor links are now broken.
In v2.2.9 the output was:
<h3><a name="H3H4-UTF8notASCIIstring"></a>UTF8 not ASCII string</h3>
This works properly everywhere.

But in v2.4.4 it is:
<h3 id="H3H4-UTF8notASCIIstring">UTF8 not ASCII string</h3>
This works properly in IE and Opera.
Firefox cannot follow anchors by id if the id contains a non-ASCII string (e.g. Russian).

I suppose this is not a Confluence bug but a Firefox problem.
Still, it would be great if you could find a workaround, e.g. rolling back header anchor generation to the v2.2.9 behaviour.

is caused by

CONFSERVER-8032 Use HTML id instead of empty named anchors in headings

Closed

is related to

CONFSERVER-8859 {toc} macro not working for Japanese links

Closed

Mingyi Liu added a comment - 26/Apr/2007 10:39 AM

Hmm, so I guess the existing document fragment links are hard-coded, which is why you can't change the encoding now.

But anyway reverting back is probably better given that it's always a little worrisome that those elements (like <h2>) are forced to have an id, which could interfere with other macros (if any) that also need to modify element ids.

Tom Davies added a comment - 26/Apr/2007 6:22 AM

didn't make 2.5 cutoff

Tom Davies added a comment - 26/Apr/2007 1:53 AM

We have reverted to the original behaviour, i.e. a named anchor.

Christopher Owen [Atlassian] added a comment - 26/Apr/2007 12:21 AM

It isn't Firefox's fault. It is simply following the standards for HTML ID character sets. We could work around it by changing the encoding standard, but that has ramifications for existing links with fragment identifiers. Pity that.

David Peterson added a comment - 26/Apr/2007 12:06 AM

By the way, I understand the motivation for the original change - it's a pity FireFox doesn't play along, the 'id' arrangement is definitely much neater.

David Peterson added a comment - 26/Apr/2007 12:05 AM

While you're fixing the problem, could you possibly change the anchor generation algorithm so that it doesn't bail out as soon as it encounters a non-alphanumeric character? If you have a macro in a title (eg. ) all your anchors end up with the same name, i.e. 'PageName-'... Not terribly helpful.

Christopher Owen [Atlassian] added a comment - 25/Apr/2007 11:34 PM

We have been discussing this internally here at Atlassian. Given the limited character set that may be placed in the id of an element, it is now likely that we will revert to using an empty anchor with a name. This is the only way we can proceed while preserving existing links to document fragments.

Mingyi Liu added a comment - 25/Apr/2007 3:57 PM

Hmm, spoke too soon. It seems this is still a Confluence bug. I checked further and found that the Firefox implementation actually does conform to what's described at http://www.w3.org/TR/html401/struct/links.html. The allowed character set for HTML element IDs (http://www.w3.org/TR/html401/types.html#type-name) is only [A-Za-z0-9_.:-], and Firefox supports all of them. Additionally, ',', '%' etc. (in fact most printable characters) are supported in Firefox.

So it's really not a Firefox problem, as it does conform to the standards.
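
To make the character-set rule concrete, here is a minimal sketch of a validity check for HTML 4.01 ID tokens. The function name is_valid_html4_id is made up for illustration and is not part of Confluence. Besides the character set quoted above, HTML 4.01 also requires the token to begin with a letter, so the sketch enforces that as well.

```python
import re

# HTML 4.01 ID/NAME tokens: one letter, then any number of
# letters, digits, hyphens, underscores, colons and periods.
HTML4_ID = re.compile(r"[A-Za-z][A-Za-z0-9_.:-]*")


def is_valid_html4_id(value: str) -> bool:
    """Return True if value is a legal HTML 4.01 ID token (hypothetical helper)."""
    return HTML4_ID.fullmatch(value) is not None


print(is_valid_html4_id("H3H4-UTF8notASCIIstring"))  # plain ASCII heading id
print(is_valid_html4_id("%3A"))                      # percent-escaped id
```

Under this check a percent-escaped id such as "%3A" is rejected because '%' is outside the allowed set, and so is any id containing non-ASCII letters. Browsers are typically more lenient than the specification, so failing this check does not by itself mean a link will break.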

Upon further inspection, I found that the reason why some links do not work in Firefox is because of the escaping in the IDs. For example, the following situation would work in Firefox:

<a href="#:">test</a>
...
<h2 id=":">first heading</h2>

So does:

<a href="#%3A">test</a>
...
<h2 id=":">first heading</h2>

But not:

<a href="#%3A">test</a>
...
<h2 id="%3A">first heading</h2>

This suggests that Firefox unescapes only the URI in <a> (correct behavior) and not the ID (which I believe is correct too: why should browsers unescape characters in IDs? The Unicode argument does not apply here. In fact, it's surprising that the other browsers, including Opera, would all behave incorrectly, as the other user suggested).
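
The three cases above can be summarised in a small model. This is only a sketch of the behaviour observed in these tests, not Firefox's actual implementation, and link_resolves is a made-up name: the fragment from the href is percent-decoded, while the id attribute is compared exactly as written.

```python
from urllib.parse import unquote


def link_resolves(href: str, element_id: str) -> bool:
    """Model of the observed behaviour: decode the fragment, leave the id alone."""
    # Take everything after the first '#'; no '#' means no fragment.
    fragment = href.split("#", 1)[1] if "#" in href else ""
    # The fragment is unescaped; the id is used verbatim.
    return unquote(fragment) == element_id


for href, element_id in [("#:", ":"), ("#%3A", ":"), ("#%3A", "%3A")]:
    print(href, element_id, link_resolves(href, element_id))
```

In this model the third case fails because "#%3A" decodes to ":" and is then compared with the literal id "%3A", which is the mismatch seen with escaped ids.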

So here are my suggestions for addressing this problem:

1. One could change the code in com.atlassian.confluence.renderer.NoAnchorHeadingBlockRenderer to escape the string in <a> but not in <h2 id... etc. This, however, runs the risk that some characters MUST be escaped. For example, "

2. One could escape the string in both <a> and <h2 id... BUT get rid of the '%' after escaping it. This way you're left with a unique, standards-conforming string for both link and ID, and they'd work in all browsers.

What's more, I do not understand the logic of escaping everything except the space character, which is simply discarded. It seems to me that if the space is not regarded as important for uniqueness, then neither is any of the punctuation, all of which is escaped by your renderer after the spaces are removed. So method 2 above should be applied to the space character too, and the result would have guaranteed uniqueness in any situation. So instead of removing spaces and then escaping, the procedure should become: (do not remove spaces), escape, then remove the % characters.
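
The procedure in suggestion 2 can be sketched as follows. This is a rough illustration under stated assumptions, not the renderer's code: anchor_token is a made-up name, the escaping is assumed to be UTF-8 percent-encoding, and the page-name prefix that Confluence puts in front of the anchor (as in 'H3H4-...') is left out.

```python
from urllib.parse import quote


def anchor_token(heading: str) -> str:
    """Escape the whole heading (spaces included), then drop the '%' signs."""
    # safe="" so that no reserved character, not even '/', is left unescaped.
    escaped = quote(heading, safe="")
    return escaped.replace("%", "")


print(anchor_token("UTF8 not ASCII string"))  # UTF820not20ASCII20string
print(anchor_token(":"))                      # 3A
```

The same token would be used for both the href fragment and the id, so no unescaping is needed on either side. Two caveats: dropping '%' is not strictly collision-free (the headings ':' and '3A' both produce '3A'), and recent Python versions leave '~' unescaped in quote, which is outside the HTML 4.01 ID character set, so a real implementation would need to handle such cases.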

BTW, I also noticed that your renderer for some reason was using the deprecated ' in <a href='uri'> instead of <a href="uri">. " should be used instead of '.

Mingyi Liu added a comment - 25/Apr/2007 3:01 PM

More and more people in my company are taking up Firefox, so this is becoming a big issue for us. Based on what's described here (http://www.w3.org/TR/html401/struct/links.html), what you guys did was the correct thing. Ideally, Firefox should fix their bug. I'll file a bug report there, but based on my experience with an Ajax bug the Mozilla family has, it could take years before it gets fixed. In the meantime, I hope you guys can find a better way.

David Peterson added a comment - 25/Apr/2007 12:45 AM

This is apparently also a problem with commas, question marks and other non-alphanumeric characters.

Assignee: Christopher Owen [Atlassian]
Reporter: Sergey Zakharov
Affected customers: 2
Watchers: 2
Created: 10/Apr/2007 12:17 PM
Updated: 11/Oct/2018 9:10 AM
Resolved: 26/Apr/2007 1:53 AM
