-
Bug
-
Resolution: Won't Fix
-
Low
-
None
-
8.13.6, 8.16.1
-
None
-
8.13
-
Severity 3 - Minor
-
Issue Summary
Testing deployments of Jira on AArch64-based EC2 instances (c6g, t4g, etc) revealed a consistent failure to restore index snapshots when using SNAPPY format. In my testing, it did not matter if an x86_64 or AArch64 node created the snapshot – snapshots created on either architecture could not be restored by an AArch64 node.
Steps to Reproduce
- Deploy Jira cluster on AArch64 hardware (e.g., via AWS)
- Attempt to restore index from SNAPPY snapshot (either manually pulled from another node, when automatically pushed to other cluster nodes after foreground reindex, or from daily snapshot)
Expected Results
Jira should extract the snapshot and restore the index.
Actual Results
Snapshot creation completes successfully (in this example, via manual snapshot copy request from another node):
2021-05-12 20:40:32,074+0000 ClusterMessageHandlerServiceThread:thread-1 INFO [c.a.j.index.ha.DefaultIndexCopyService] Received message: "Backup Index" - request to create index snapshot from node: i-0b4522367c8b12a6f-ipd-10.234.25.233 on current node: i-032534fde76 2021-05-12 20:40:32,074+0000 ClusterMessageHandlerServiceThread:thread-1 INFO [c.a.j.index.ha.DefaultIndexCopyService] Index backup started. Requesting node: i-0b4522367c8b12a6f-ipd-10.234.25.233, currentNode: i-032534fde76ee8e5d-ipd-10.234.27.144 2021-05-12 20:40:32,080+0000 ClusterMessageHandlerServiceThread:thread-1 INFO [c.a.j.index.ha.IndexUtils] Creating index snapshot of type: SNAPPY in local temporary directory: /opt/atlassian/jira-software/current/temp/snapshot1620852032079/IndexSnapshot_10201.tar.sz 2021-05-12 20:40:32,080+0000 ClusterMessageHandlerServiceThread:thread-1 INFO [c.a.j.util.compression.ArchiveUtils] Creating archive of type: SNAPPY in path: /opt/atlassian/jira-software/current/temp/snapshot1620852032079/IndexSnapshot_10201.tar.sz 2021-05-12 20:40:32,080+0000 ClusterMessageHandlerServiceThread:thread-1 INFO [c.a.j.sharing.index.DefaultSharedEntityIndexManager] SharedEntityIndexer starts adding 2 indexes to backup 2021-05-12 20:40:32,080+0000 ClusterMessageHandlerServiceThread:thread-1 INFO [c.a.j.index.ha.SnapshotDeletionPolicyContributionStrategy] Creating 1 commit points for Lucene indexes 2021-05-12 20:40:32,081+0000 ClusterMessageHandlerServiceThread:thread-1 INFO [c.a.j.index.ha.SnapshotDeletionPolicyContributionStrategy] Adding 13 files from entities/searchrequest to backup... 2021-05-12 20:40:32,083+0000 ClusterMessageHandlerServiceThread:thread-1 INFO [c.a.j.index.ha.SnapshotDeletionPolicyContributionStrategy] Done adding 13 files from entities/searchrequest to backup, time: 1 ms 2021-05-12 20:40:32,083+0000 ClusterMessageHandlerServiceThread:thread-1 INFO [c.a.j.index.ha.SnapshotDeletionPolicyContributionStrategy] Releasing 1 commit points for Lucene indexes 2021-05-12 20:40:32,083+0000 ClusterMessageHandlerServiceThread:thread-1 INFO [c.a.j.index.ha.SnapshotDeletionPolicyContributionStrategy] Creating 1 commit points for Lucene indexes 2021-05-12 20:40:32,083+0000 ClusterMessageHandlerServiceThread:thread-1 INFO [c.a.j.index.ha.SnapshotDeletionPolicyContributionStrategy] Adding 13 files from entities/portalpage to backup... 2021-05-12 20:40:32,085+0000 ClusterMessageHandlerServiceThread:thread-1 INFO [c.a.j.index.ha.SnapshotDeletionPolicyContributionStrategy] Done adding 13 files from entities/portalpage to backup, time: 1 ms 2021-05-12 20:40:32,085+0000 ClusterMessageHandlerServiceThread:thread-1 INFO [c.a.j.index.ha.SnapshotDeletionPolicyContributionStrategy] Releasing 1 commit points for Lucene indexes 2021-05-12 20:40:32,085+0000 ClusterMessageHandlerServiceThread:thread-1 INFO [c.a.j.sharing.index.DefaultSharedEntityIndexManager] SharedEntityIndexer done adding 2 indexes to backup 2021-05-12 20:40:32,085+0000 ClusterMessageHandlerServiceThread:thread-1 INFO [c.a.j.issue.index.DefaultIssueIndexer] IssueIndexer starts adding 4 indexes to backup from: [issues, worklogs, comments, changes] 2021-05-12 20:40:32,085+0000 ClusterMessageHandlerServiceThread:thread-1 INFO [c.a.j.index.ha.SnapshotDeletionPolicyContributionStrategy] Creating 4 commit points for Lucene indexes 2021-05-12 20:40:32,087+0000 ClusterMessageHandlerServiceThread:thread-1 INFO [c.a.j.index.ha.SnapshotDeletionPolicyContributionStrategy] Adding 15 files from issues to backup... 2021-05-12 20:40:32,091+0000 ClusterMessageHandlerServiceThread:thread-1 INFO [c.a.j.index.ha.SnapshotDeletionPolicyContributionStrategy] Done adding 15 files from issues to backup, time: 4 ms 2021-05-12 20:40:32,091+0000 ClusterMessageHandlerServiceThread:thread-1 INFO [c.a.j.index.ha.SnapshotDeletionPolicyContributionStrategy] Adding 1 files from worklogs to backup... 2021-05-12 20:40:32,092+0000 ClusterMessageHandlerServiceThread:thread-1 INFO [c.a.j.index.ha.SnapshotDeletionPolicyContributionStrategy] Done adding 1 files from worklogs to backup, time: 0 ms 2021-05-12 20:40:32,092+0000 ClusterMessageHandlerServiceThread:thread-1 INFO [c.a.j.index.ha.SnapshotDeletionPolicyContributionStrategy] Adding 15 files from comments to backup... 2021-05-12 20:40:32,095+0000 ClusterMessageHandlerServiceThread:thread-1 INFO [c.a.j.index.ha.SnapshotDeletionPolicyContributionStrategy] Done adding 15 files from comments to backup, time: 3 ms 2021-05-12 20:40:32,095+0000 ClusterMessageHandlerServiceThread:thread-1 INFO [c.a.j.index.ha.SnapshotDeletionPolicyContributionStrategy] Adding 12 files from changes to backup... 2021-05-12 20:40:32,097+0000 ClusterMessageHandlerServiceThread:thread-1 INFO [c.a.j.index.ha.SnapshotDeletionPolicyContributionStrategy] Done adding 12 files from changes to backup, time: 1 ms 2021-05-12 20:40:32,097+0000 ClusterMessageHandlerServiceThread:thread-1 INFO [c.a.j.index.ha.SnapshotDeletionPolicyContributionStrategy] Releasing 4 commit points for Lucene indexes 2021-05-12 20:40:32,098+0000 ClusterMessageHandlerServiceThread:thread-1 INFO [c.a.j.issue.index.DefaultIssueIndexer] IssueIndexer done adding 4 indexes to backup from: [issues, worklogs, comments, changes] 2021-05-12 20:40:32,100+0000 ClusterMessageHandlerServiceThread:thread-1 INFO [c.a.j.index.ha.IndexUtils] Done creating index snapshot: IndexSnapshot_10201.tar.sz, number of index files processed: 69 2021-05-12 20:40:32,100+0000 ClusterMessageHandlerServiceThread:thread-1 INFO [c.a.j.index.ha.IndexUtils] Copying index snapshot from: /opt/atlassian/jira-software/current/temp/snapshot1620852032079/IndexSnapshot_10201.tar.sz to remote shared directory: /media/atl/j 2021-05-12 20:40:32,142+0000 ClusterMessageHandlerServiceThread:thread-1 INFO [c.a.j.index.ha.IndexUtils] Finished copying index snapshot from: /opt/atlassian/jira-software/current/temp/snapshot1620852032079/IndexSnapshot_10201.tar.sz to remote shared directory: /me 2021-05-12 20:40:32,155+0000 ClusterMessageHandlerServiceThread:thread-1 INFO [c.a.j.index.ha.IndexUtils] Deleted: 1 old index snapshot files 2021-05-12 20:40:32,155+0000 ClusterMessageHandlerServiceThread:thread-1 INFO [c.a.j.index.ha.IndexUtils] Finished taking a snapshot: /media/atl/jira/shared/caches/IndexSnapshot_10201.tar.sz. File size uncompressed: 70 KB, compressed: 32 KB. Compression time 20 ms, 2021-05-12 20:40:32,155+0000 ClusterMessageHandlerServiceThread:thread-1 INFO [c.a.j.index.ha.DefaultIndexCopyService] Sending message: "Index Backed Up":IndexSnapshot_10201.tar.sz - notification that current node: i-032534fde76ee8e5d-ipd-10.234.27.144 created an in 2021-05-12 20:40:32,165+0000 ClusterMessageHandlerServiceThread:thread-1 INFO [c.a.j.index.ha.DefaultIndexCopyService] Index backup complete. Snapshot file: IndexSnapshot_10201.tar.sz
On restore (in this case, on the requesting node), the below exception is thrown in the atlassian-jira.log file:
2021-05-12 20:40:32,916+0000 ClusterMessageHandlerServiceThread:thread-1 INFO [c.a.j.index.ha.DefaultIndexCopyService] Received message: "Index Backed Up" - notification that node: i-032534fde76ee8e5d-ipd-10.234.27.144 created an index snapshot which can be restored 2021-05-12 20:40:32,929+0000 ClusterMessageHandlerServiceThread:thread-1 INFO [c.a.j.index.ha.DefaultIndexCopyService] Index restore started. Total 0 issues on instance before loading Snapshot file: IndexSnapshot_10201.tar.sz 2021-05-12 20:40:33,012+0000 ClusterMessageHandlerServiceThread:thread-1 ERROR [c.a.jira.cluster.OfBizMessageHandlerService] There was a problem handling a cluster message java.lang.RuntimeException: com.atlassian.jira.issue.index.IndexException: java.io.IOException: FAILED_TO_UNCOMPRESS(5) at com.atlassian.jira.index.ha.DefaultIndexCopyService$MessageConsumer.restoreIndex(DefaultIndexCopyService.java:158) at com.atlassian.jira.index.ha.DefaultIndexCopyService$MessageConsumer.receive(DefaultIndexCopyService.java:182) at com.atlassian.jira.cluster.OfBizMessageHandlerService.sendLocalFromNode(OfBizMessageHandlerService.java:214) at com.atlassian.jira.cluster.OfBizMessageHandlerService.tryHandleMessage(OfBizMessageHandlerService.java:140) at com.atlassian.jira.cluster.OfBizMessageHandlerService.handleReceivedMessages(OfBizMessageHandlerService.java:124) at com.atlassian.jira.util.thread.JiraThreadLocalUtils.lambda$wrap$1(JiraThreadLocalUtils.java:156) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: com.atlassian.jira.issue.index.IndexException: java.io.IOException: FAILED_TO_UNCOMPRESS(5) at com.atlassian.jira.index.ha.DefaultIndexRecoveryManager.recoverIndexFromBackup(DefaultIndexRecoveryManager.java:190) at com.atlassian.jira.index.ha.DefaultIndexCopyService$MessageConsumer.restoreIndex(DefaultIndexCopyService.java:156) ... 12 more Caused by: java.io.IOException: FAILED_TO_UNCOMPRESS(5) at org.xerial.snappy.SnappyNative.throw_error(SnappyNative.java:98) at org.xerial.snappy.Snappy.uncompress(Snappy.java:547) at org.xerial.snappy.SnappyFramedInputStream.ensureBuffer(SnappyFramedInputStream.java:466) at org.xerial.snappy.SnappyFramedInputStream.read(SnappyFramedInputStream.java:207) at org.apache.commons.compress.utils.IOUtils.readFully(IOUtils.java:168) at org.apache.commons.compress.utils.IOUtils.readFully(IOUtils.java:142) at org.apache.commons.compress.archivers.tar.TarArchiveInputStream.readRecord(TarArchiveInputStream.java:428) at org.apache.commons.compress.archivers.tar.TarArchiveInputStream.getRecord(TarArchiveInputStream.java:397) at org.apache.commons.compress.archivers.tar.TarArchiveInputStream.getNextTarEntry(TarArchiveInputStream.java:274) at org.apache.commons.compress.archivers.tar.TarArchiveInputStream.getNextEntry(TarArchiveInputStream.java:552) at com.atlassian.jira.util.compression.AbstractArchiveDecompressor.decompress(AbstractArchiveDecompressor.java:44) at com.atlassian.jira.util.compression.ArchiveUtils.decompress(ArchiveUtils.java:72) at com.atlassian.jira.util.compression.ArchiveUtils.decompress(ArchiveUtils.java:90) at com.atlassian.jira.index.ha.DefaultIndexRecoveryManager.recoverIndexFromBackup(DefaultIndexRecoveryManager.java:138) ... 13 more
In case it helps, I've attached a sample archive file below produced on an AArch64 EC2 instance; the data used for this purpose is just one of the default sample projects.
Workaround
Use an alternative snapshot archive format, as documented in this KB article; e.g., via jira-config.properties:
jira.index.snapshot.copy.archiver = zip|tar
- mentioned in
-
Page Loading...