  1. Bitbucket Data Center
  2. BSERV-4452

500 errors on PR page due to markdown renderer

Details

    Description

      The issue here is annoying, but easy to explain:

      Caused by: org.w3c.dom.DOMException: INVALID_CHARACTER_ERR: An invalid or illegal XML character is specified. 
      	at org.apache.xerces.dom.CoreDocumentImpl.createAttribute(Unknown Source) ~[xercesImpl-2.9.1.jar:na]
      	at org.apache.xerces.dom.ElementImpl.setAttribute(Unknown Source) ~[xercesImpl-2.9.1.jar:na]
      	at org.cyberneko.html.parsers.DOMFragmentParser.startElement(DOMFragmentParser.java:433) ~[nekohtml-1.9.7.jar:na]
      	at org.cyberneko.html.HTMLTagBalancer.callStartElement(HTMLTagBalancer.java:1019) ~[nekohtml-1.9.7.jar:1.9.7]
      	at org.cyberneko.html.HTMLTagBalancer.startElement(HTMLTagBalancer.java:652) ~[nekohtml-1.9.7.jar:1.9.7]
      	at org.cyberneko.html.filters.DefaultFilter.startElement(DefaultFilter.java:136) ~[nekohtml-1.9.7.jar:na]
      	at org.cyberneko.html.filters.NamespaceBinder.startElement(NamespaceBinder.java:278) ~[nekohtml-1.9.7.jar:na]
      	at org.cyberneko.html.HTMLScanner$ContentScanner.scanStartElement(HTMLScanner.java:2680) ~[nekohtml-1.9.7.jar:1.9.7]
      	at org.cyberneko.html.HTMLScanner$ContentScanner.scan(HTMLScanner.java:2012) ~[nekohtml-1.9.7.jar:1.9.7]
      	at org.cyberneko.html.HTMLScanner.scanDocument(HTMLScanner.java:910) ~[nekohtml-1.9.7.jar:1.9.7]
      	at org.cyberneko.html.HTMLConfiguration.parse(HTMLConfiguration.java:499) ~[nekohtml-1.9.7.jar:1.9.7]
      	at org.cyberneko.html.HTMLConfiguration.parse(HTMLConfiguration.java:452) ~[nekohtml-1.9.7.jar:1.9.7]
      	at org.cyberneko.html.parsers.DOMFragmentParser.parse(DOMFragmentParser.java:166) ~[nekohtml-1.9.7.jar:na]
      	at org.owasp.validator.html.scan.AntiSamyDOMScanner.scan(AntiSamyDOMScanner.java:172) ~[na:na]
      	... 340 common frames omitted
      

      Those 4 pull requests have a markdown description or comment that our HTML sanitizer (which guards against XSS attacks) fails on. This has been fixed in newer versions of Stash. For now, unless you want to upgrade, the easiest thing to do is to find the offending text and fix it directly in the database.

      More generally, Stash chokes on a particular XML-like pattern of text in comments. Based on reports from other customers with the same problem, we believe the text looks like "<SomeLettersOrNumbers " followed by a character that is invalid in an XML attribute name, e.g. "<a -" or "<abc !". (The quotes here are punctuation for this paragraph only; they are not part of the comment text in the database.)
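
To make the pattern above concrete, here is a minimal sketch of a checker you could run over text returned from the database. It is only a heuristic: the regular expression and the function name `find_suspect_fragments` are assumptions made for illustration, not the sanitizer's actual logic, and it only considers ASCII characters. It treats anything other than a letter, `_` or `:` as an invalid start for an attribute name, and skips `>` and `/` because those simply close the tag.

```python
import re

# Heuristic approximation (an assumption, not Stash's real check):
# "<" + letters/digits + whitespace + a character that cannot start
# an XML attribute name. ">" and "/" are excluded because they just
# end the tag rather than start an attribute.
SUSPECT = re.compile(r"<[A-Za-z0-9]+\s+[^A-Za-z_:\s>/]")


def find_suspect_fragments(text):
    """Return every fragment of `text` that matches the suspect pattern."""
    return [match.group(0) for match in SUSPECT.finditer(text)]


if __name__ == "__main__":
    samples = [
        "see <a - b",             # suspect: "-" cannot start an attribute name
        "wow <abc ! really",      # suspect: "!" cannot start an attribute name
        '<a href="x">link</a>',   # fine: "href" is a valid attribute name
        "line break <br /> here", # fine: "/" closes the tag
    ]
    for sample in samples:
        print(repr(sample), "->", find_suspect_fragments(sample))
```

A match only means the text deserves a closer look; text that passes this check may still trip the sanitizer for other reasons.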

      Starting from this error:

      2014-02-25 15:07:42,652 WARN [http-bio-8443-exec-1531] a_svn 907x552466x5 1ydvh8d 10.30.130.141 "GET /projects/DSST/repos/ca/pull-requests/524/overview HTTP/1.1" c.atlassian.stash.json.JsonRenderer Failed to marshal com.atlassian.stash.internal.pull.InternalPullRequest to JSON

      the following query will find all the comments that are probably suspect:

      select c.id, c.comment_text from sta_comment c
      inner join sta_pr_comment_activity prca on c.id = prca.comment_id
      inner join sta_pr_activity a on a.activity_id = prca.activity_id
      inner join sta_pull_request pr on a.pr_id = pr.id
      inner join repository repo on pr.to_repository_id = repo.id
      inner join project proj on repo.project_id = proj.id
      where pr.scoped_id =  and repo.slug='plugins' and proj.key='DSST'
      

      And the pull request description:

      select pr.description from sta_pull_request pr
      inner join repository repo on pr.to_repository_id = repo.id 
      inner join project proj on repo.project_id = proj.id 
      where pr.scoped_id = 524 and repo.slug='plugins' and proj.name='DSST'
      

      You just need to adjust the queries to use the project, repository slug and PR scoped ID of each failing PR. When you have the results, feel free to post them here, unless they match the description above and you feel confident updating them directly. As always, before making any direct changes to the database, please make sure you have backed it up appropriately.

          People

            cofarrell CharlesA
            cszmajda Cristan Szmajda (Inactive)