Uploaded image for project: 'Jira Data Center'
  1. Jira Data Center
  2. JRASERVER-72461

Restoring from SNAPPY snapshot on aarch64 throws java.io.IOException: FAILED_TO_UNCOMPRESS(5)

XMLWordPrintable

    • Icon: Bug Bug
    • Resolution: Won't Fix
    • Icon: Low Low
    • None
    • 8.13.6, 8.16.1
    • None

      Issue Summary

      Testing deployments of Jira on AArch64-based EC2 instances (c6g, t4g, etc) revealed a consistent failure to restore index snapshots when using SNAPPY format. In my testing, it did not matter if an x86_64 or AArch64 node created the snapshot – snapshots created on either architecture could not be restored by an AArch64 node.

      Steps to Reproduce

      1. Deploy Jira cluster on AArch64 hardware (e.g., via AWS)
      2. Attempt to restore index from SNAPPY snapshot (either manually pulled from another node, when automatically pushed to other cluster nodes after foreground reindex, or from daily snapshot)

      Expected Results

      Jira should extract the snapshot and restore the index.

      Actual Results

      Snapshot creation completes successfully (in this example, via manual snapshot copy request from another node):

      2021-05-12 20:40:32,074+0000 ClusterMessageHandlerServiceThread:thread-1 INFO [c.a.j.index.ha.DefaultIndexCopyService] Received message: "Backup Index" - request to create index snapshot from node: i-0b4522367c8b12a6f-ipd-10.234.25.233 on current node: i-032534fde76
      2021-05-12 20:40:32,074+0000 ClusterMessageHandlerServiceThread:thread-1 INFO [c.a.j.index.ha.DefaultIndexCopyService] Index backup started. Requesting node: i-0b4522367c8b12a6f-ipd-10.234.25.233, currentNode: i-032534fde76ee8e5d-ipd-10.234.27.144
      2021-05-12 20:40:32,080+0000 ClusterMessageHandlerServiceThread:thread-1 INFO [c.a.j.index.ha.IndexUtils] Creating index snapshot of type: SNAPPY in local temporary directory: /opt/atlassian/jira-software/current/temp/snapshot1620852032079/IndexSnapshot_10201.tar.sz
      2021-05-12 20:40:32,080+0000 ClusterMessageHandlerServiceThread:thread-1 INFO [c.a.j.util.compression.ArchiveUtils] Creating archive of type: SNAPPY in path: /opt/atlassian/jira-software/current/temp/snapshot1620852032079/IndexSnapshot_10201.tar.sz
      2021-05-12 20:40:32,080+0000 ClusterMessageHandlerServiceThread:thread-1 INFO [c.a.j.sharing.index.DefaultSharedEntityIndexManager] SharedEntityIndexer starts adding 2 indexes to backup
      2021-05-12 20:40:32,080+0000 ClusterMessageHandlerServiceThread:thread-1 INFO [c.a.j.index.ha.SnapshotDeletionPolicyContributionStrategy] Creating 1 commit points for Lucene indexes
      2021-05-12 20:40:32,081+0000 ClusterMessageHandlerServiceThread:thread-1 INFO [c.a.j.index.ha.SnapshotDeletionPolicyContributionStrategy] Adding 13 files from entities/searchrequest to backup...
      2021-05-12 20:40:32,083+0000 ClusterMessageHandlerServiceThread:thread-1 INFO [c.a.j.index.ha.SnapshotDeletionPolicyContributionStrategy] Done adding 13 files from entities/searchrequest to backup, time: 1 ms
      2021-05-12 20:40:32,083+0000 ClusterMessageHandlerServiceThread:thread-1 INFO [c.a.j.index.ha.SnapshotDeletionPolicyContributionStrategy] Releasing 1 commit points for Lucene indexes
      2021-05-12 20:40:32,083+0000 ClusterMessageHandlerServiceThread:thread-1 INFO [c.a.j.index.ha.SnapshotDeletionPolicyContributionStrategy] Creating 1 commit points for Lucene indexes
      2021-05-12 20:40:32,083+0000 ClusterMessageHandlerServiceThread:thread-1 INFO [c.a.j.index.ha.SnapshotDeletionPolicyContributionStrategy] Adding 13 files from entities/portalpage to backup...
      2021-05-12 20:40:32,085+0000 ClusterMessageHandlerServiceThread:thread-1 INFO [c.a.j.index.ha.SnapshotDeletionPolicyContributionStrategy] Done adding 13 files from entities/portalpage to backup, time: 1 ms
      2021-05-12 20:40:32,085+0000 ClusterMessageHandlerServiceThread:thread-1 INFO [c.a.j.index.ha.SnapshotDeletionPolicyContributionStrategy] Releasing 1 commit points for Lucene indexes
      2021-05-12 20:40:32,085+0000 ClusterMessageHandlerServiceThread:thread-1 INFO [c.a.j.sharing.index.DefaultSharedEntityIndexManager] SharedEntityIndexer done adding 2 indexes to backup
      2021-05-12 20:40:32,085+0000 ClusterMessageHandlerServiceThread:thread-1 INFO [c.a.j.issue.index.DefaultIssueIndexer] IssueIndexer starts adding 4 indexes to backup from: [issues, worklogs, comments, changes]
      2021-05-12 20:40:32,085+0000 ClusterMessageHandlerServiceThread:thread-1 INFO [c.a.j.index.ha.SnapshotDeletionPolicyContributionStrategy] Creating 4 commit points for Lucene indexes
      2021-05-12 20:40:32,087+0000 ClusterMessageHandlerServiceThread:thread-1 INFO [c.a.j.index.ha.SnapshotDeletionPolicyContributionStrategy] Adding 15 files from issues to backup...
      2021-05-12 20:40:32,091+0000 ClusterMessageHandlerServiceThread:thread-1 INFO [c.a.j.index.ha.SnapshotDeletionPolicyContributionStrategy] Done adding 15 files from issues to backup, time: 4 ms
      2021-05-12 20:40:32,091+0000 ClusterMessageHandlerServiceThread:thread-1 INFO [c.a.j.index.ha.SnapshotDeletionPolicyContributionStrategy] Adding 1 files from worklogs to backup...
      2021-05-12 20:40:32,092+0000 ClusterMessageHandlerServiceThread:thread-1 INFO [c.a.j.index.ha.SnapshotDeletionPolicyContributionStrategy] Done adding 1 files from worklogs to backup, time: 0 ms
      2021-05-12 20:40:32,092+0000 ClusterMessageHandlerServiceThread:thread-1 INFO [c.a.j.index.ha.SnapshotDeletionPolicyContributionStrategy] Adding 15 files from comments to backup...
      2021-05-12 20:40:32,095+0000 ClusterMessageHandlerServiceThread:thread-1 INFO [c.a.j.index.ha.SnapshotDeletionPolicyContributionStrategy] Done adding 15 files from comments to backup, time: 3 ms
      2021-05-12 20:40:32,095+0000 ClusterMessageHandlerServiceThread:thread-1 INFO [c.a.j.index.ha.SnapshotDeletionPolicyContributionStrategy] Adding 12 files from changes to backup...
      2021-05-12 20:40:32,097+0000 ClusterMessageHandlerServiceThread:thread-1 INFO [c.a.j.index.ha.SnapshotDeletionPolicyContributionStrategy] Done adding 12 files from changes to backup, time: 1 ms
      2021-05-12 20:40:32,097+0000 ClusterMessageHandlerServiceThread:thread-1 INFO [c.a.j.index.ha.SnapshotDeletionPolicyContributionStrategy] Releasing 4 commit points for Lucene indexes
      2021-05-12 20:40:32,098+0000 ClusterMessageHandlerServiceThread:thread-1 INFO [c.a.j.issue.index.DefaultIssueIndexer] IssueIndexer done adding 4 indexes to backup from: [issues, worklogs, comments, changes]
      2021-05-12 20:40:32,100+0000 ClusterMessageHandlerServiceThread:thread-1 INFO [c.a.j.index.ha.IndexUtils] Done creating index snapshot: IndexSnapshot_10201.tar.sz, number of index files processed: 69
      2021-05-12 20:40:32,100+0000 ClusterMessageHandlerServiceThread:thread-1 INFO [c.a.j.index.ha.IndexUtils] Copying index snapshot from: /opt/atlassian/jira-software/current/temp/snapshot1620852032079/IndexSnapshot_10201.tar.sz to remote shared directory: /media/atl/j
      2021-05-12 20:40:32,142+0000 ClusterMessageHandlerServiceThread:thread-1 INFO [c.a.j.index.ha.IndexUtils] Finished copying index snapshot from: /opt/atlassian/jira-software/current/temp/snapshot1620852032079/IndexSnapshot_10201.tar.sz to remote shared directory: /me
      2021-05-12 20:40:32,155+0000 ClusterMessageHandlerServiceThread:thread-1 INFO [c.a.j.index.ha.IndexUtils] Deleted: 1 old index snapshot files
      2021-05-12 20:40:32,155+0000 ClusterMessageHandlerServiceThread:thread-1 INFO [c.a.j.index.ha.IndexUtils] Finished taking a snapshot: /media/atl/jira/shared/caches/IndexSnapshot_10201.tar.sz. File size uncompressed: 70 KB, compressed: 32 KB. Compression time 20 ms, 
      2021-05-12 20:40:32,155+0000 ClusterMessageHandlerServiceThread:thread-1 INFO [c.a.j.index.ha.DefaultIndexCopyService] Sending message: "Index Backed Up":IndexSnapshot_10201.tar.sz - notification that current node: i-032534fde76ee8e5d-ipd-10.234.27.144 created an in
      2021-05-12 20:40:32,165+0000 ClusterMessageHandlerServiceThread:thread-1 INFO [c.a.j.index.ha.DefaultIndexCopyService] Index backup complete. Snapshot file: IndexSnapshot_10201.tar.sz

      On restore (in this case, on the requesting node), the below exception is thrown in the atlassian-jira.log file:

      2021-05-12 20:40:32,916+0000 ClusterMessageHandlerServiceThread:thread-1 INFO      [c.a.j.index.ha.DefaultIndexCopyService] Received message: "Index Backed Up" - notification that node: i-032534fde76ee8e5d-ipd-10.234.27.144 created an index snapshot which can be restored
      2021-05-12 20:40:32,929+0000 ClusterMessageHandlerServiceThread:thread-1 INFO      [c.a.j.index.ha.DefaultIndexCopyService] Index restore started. Total 0 issues on instance before loading Snapshot file: IndexSnapshot_10201.tar.sz
      2021-05-12 20:40:33,012+0000 ClusterMessageHandlerServiceThread:thread-1 ERROR      [c.a.jira.cluster.OfBizMessageHandlerService] There was a problem handling a cluster message
      java.lang.RuntimeException: com.atlassian.jira.issue.index.IndexException: java.io.IOException: FAILED_TO_UNCOMPRESS(5)
              at com.atlassian.jira.index.ha.DefaultIndexCopyService$MessageConsumer.restoreIndex(DefaultIndexCopyService.java:158)
              at com.atlassian.jira.index.ha.DefaultIndexCopyService$MessageConsumer.receive(DefaultIndexCopyService.java:182)
              at com.atlassian.jira.cluster.OfBizMessageHandlerService.sendLocalFromNode(OfBizMessageHandlerService.java:214)
              at com.atlassian.jira.cluster.OfBizMessageHandlerService.tryHandleMessage(OfBizMessageHandlerService.java:140)
              at com.atlassian.jira.cluster.OfBizMessageHandlerService.handleReceivedMessages(OfBizMessageHandlerService.java:124)
              at com.atlassian.jira.util.thread.JiraThreadLocalUtils.lambda$wrap$1(JiraThreadLocalUtils.java:156)
              at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
              at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
              at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
              at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
              at java.lang.Thread.run(Thread.java:748)
      Caused by: com.atlassian.jira.issue.index.IndexException: java.io.IOException: FAILED_TO_UNCOMPRESS(5)
              at com.atlassian.jira.index.ha.DefaultIndexRecoveryManager.recoverIndexFromBackup(DefaultIndexRecoveryManager.java:190)
              at com.atlassian.jira.index.ha.DefaultIndexCopyService$MessageConsumer.restoreIndex(DefaultIndexCopyService.java:156)
              ... 12 more
      Caused by: java.io.IOException: FAILED_TO_UNCOMPRESS(5)
              at org.xerial.snappy.SnappyNative.throw_error(SnappyNative.java:98)
              at org.xerial.snappy.Snappy.uncompress(Snappy.java:547)
              at org.xerial.snappy.SnappyFramedInputStream.ensureBuffer(SnappyFramedInputStream.java:466)
              at org.xerial.snappy.SnappyFramedInputStream.read(SnappyFramedInputStream.java:207)
              at org.apache.commons.compress.utils.IOUtils.readFully(IOUtils.java:168)
              at org.apache.commons.compress.utils.IOUtils.readFully(IOUtils.java:142)
              at org.apache.commons.compress.archivers.tar.TarArchiveInputStream.readRecord(TarArchiveInputStream.java:428)
              at org.apache.commons.compress.archivers.tar.TarArchiveInputStream.getRecord(TarArchiveInputStream.java:397)
              at org.apache.commons.compress.archivers.tar.TarArchiveInputStream.getNextTarEntry(TarArchiveInputStream.java:274)
              at org.apache.commons.compress.archivers.tar.TarArchiveInputStream.getNextEntry(TarArchiveInputStream.java:552)
              at com.atlassian.jira.util.compression.AbstractArchiveDecompressor.decompress(AbstractArchiveDecompressor.java:44)
              at com.atlassian.jira.util.compression.ArchiveUtils.decompress(ArchiveUtils.java:72)
              at com.atlassian.jira.util.compression.ArchiveUtils.decompress(ArchiveUtils.java:90)
              at com.atlassian.jira.index.ha.DefaultIndexRecoveryManager.recoverIndexFromBackup(DefaultIndexRecoveryManager.java:138)
              ... 13 more
      

      In case it helps, I've attached a sample archive file below produced on an AArch64 EC2 instance; the data used for this purpose is just one of the default sample projects.

      IndexSnapshot_10700.tar.sz

      Workaround

      Use an alternative snapshot archive format, as documented in this KB article; e.g., via jira-config.properties:

      jira.index.snapshot.copy.archiver = zip|tar
      

              Unassigned Unassigned
              lgoolsbee Lee Goolsbee
              Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

                Created:
                Updated:
                Resolved: