The Bitbucket backup client uses the Apache Xerces library to create and parse the XML database backup. Due to a bug in that library (XERCESJ-1668), parsing a large XML database backup that contains certain special characters can cause a database restoration or migration attempt to fail with a SAXParseException.
Note: This issue has primarily been reproduced with data from actual Bitbucket instances, so internal reproduction and investigation have relied on copies of that data. No steps are currently available to generate a database schema from scratch that reproduces the issue.
- Use the Bitbucket backup client to generate a backup archive for a Bitbucket instance with a large database that contains many special characters (such as pull request comments with emojis).
- Use the standard restore process to attempt to restore the generated backup to a new external database.
- Using the database migration wizard, attempt to migrate a large Bitbucket database containing many special characters (such as pull request comments with emojis) to a new database.
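When choosing a data set for either reproduction path above, it can help to estimate how many such characters a sample of text contains. The following is a minimal sketch, not part of Bitbucket or the backup client. It assumes that "special characters" refers to code points outside the Basic Multilingual Plane (above U+FFFF), which includes most emojis; the source only gives emojis as an example, so this may be narrower or broader than the actual trigger. The function names and sample comments are hypothetical.

```python
def count_supplementary(text: str) -> int:
    """Count code points outside the Basic Multilingual Plane (above U+FFFF)."""
    return sum(1 for ch in text if ord(ch) > 0xFFFF)


def summarize(comments):
    """Return (number of comments with such characters, total such characters)."""
    counts = [count_supplementary(c) for c in comments]
    return sum(1 for n in counts if n > 0), sum(counts)


# Hypothetical sample data standing in for pull request comments.
sample_comments = [
    "Looks good to me",
    "Ship it \U0001F680\U0001F389",   # two emojis, both above U+FFFF
    "Caf\u00e9 review done",          # accented letter, still inside the BMP
]

affected, total = summarize(sample_comments)
print(f"{affected} of {len(sample_comments)} comments contain "
      f"{total} supplementary characters")
```

A high count only suggests that the data resembles the instances where the issue was observed; it does not guarantee that the failure will occur.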
The database restoration completes as expected, with no exception being thrown.
The restore fails, and the following exception appears in the logs for the restoration attempt:
Follow the workaround steps listed in the following knowledge article: