  1. Bitbucket Data Center
  2. BSERV-19375

Mesh migration fails with ERROR Sidecar failed to repair the primary Mesh repository. Aborting migration

Details

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Medium
    • Fix Version/s: None
    • Affects Version/s: 8.9.9
    • Component/s: Mesh

    Description

      Issue Summary

      When migrating a repository to the remote Mesh after it was previously migrated from the remote Mesh back to the NFS share, the migration fails with the following error message:

      2024-04-03 14:45:12,004 ERROR [dc-migration:thread-1] danny *M4OZ1Vx846x420925x13 nb7hm4 10.151.208.210,10.150.3.43 "POST /rest/api/latest/migration/mesh HTTP/1.1" c.a.s.i.m.DefaultMeshMigrationService Migration of hierarchy 4b112484deaa515877c5 failed
      com.atlassian.bitbucket.dmz.migration.MeshMigrationFailedException: [TEST/server[21436]] Sidecar failed to repair the primary Mesh repository. Aborting migration
      

      This is reproducible on Data Center: (no)

      Steps to Reproduce

      The issue could not be reproduced locally.

      Expected Results

      The repository is successfully migrated back to the remote Mesh.

      Actual Results

      The migration fails, and the following exception is logged in the atlassian-bitbucket.log file:

      2024-04-03 14:45:11,992 ERROR [mesh-grpc-request:thread-130] danny *M4OZ1Vx846x420925x13 nb7hm4 10.151.208.210,10.150.3.43 "POST /rest/api/latest/migration/mesh HTTP/1.1" c.a.s.i.s.g.m.DefaultErrorTranslator ABORTED: Repair of p/000c/h/4b112484deaa515877c5/r/21436 is already running
      2024-04-03 14:45:12,004 ERROR [dc-migration:thread-1] danny *M4OZ1Vx846x420925x13 nb7hm4 10.151.208.210,10.150.3.43 "POST /rest/api/latest/migration/mesh HTTP/1.1" c.a.s.i.m.DefaultMeshMigrationService Migration of hierarchy 4b112484deaa515877c5 failed
      com.atlassian.bitbucket.dmz.migration.MeshMigrationFailedException: [~TEST/server[21436]] Sidecar failed to repair the primary Mesh repository. Aborting migration
      	at com.atlassian.stash.internal.scm.git.mesh.RepositoryMeshMigrator$GitHierarchyMigration.repairFromSidecar(RepositoryMeshMigrator.java:382)
      	at com.atlassian.stash.internal.scm.git.mesh.RepositoryMeshMigrator$GitHierarchyMigration.stage(RepositoryMeshMigrator.java:315)
      	at com.atlassian.stash.internal.migration.DefaultMeshMigrationService$MeshMigrationVisitor.stageRepository(DefaultMeshMigrationService.java:665)
      	at com.atlassian.stash.internal.migration.DefaultMeshMigrationService$MeshMigrationVisitor.visit(DefaultMeshMigrationService.java:459)
      	at com.atlassian.stash.internal.migration.DefaultMeshMigrationService$MeshMigrationVisitor.visit(DefaultMeshMigrationService.java:372)
      	at com.atlassian.bitbucket.scope.RepositoryScope.accept(RepositoryScope.java:26)
      	at com.atlassian.stash.internal.migration.DefaultMeshMigrationService.lambda$migrateRepositories$7(DefaultMeshMigrationService.java:224)
      	at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
      	at java.util.Spliterators$ArraySpliterator.forEachRemaining(Spliterators.java:948)
      	at java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:647)
      	at java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:272)
      	at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
      	at java.util.LinkedList$LLSpliterator.forEachRemaining(LinkedList.java:1235)
      	at java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:647)
      	at java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:272)
      	at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
      	at java.util.Iterator.forEachRemaining(Iterator.java:116)
      	at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
      	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
      	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
      	at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
      	at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
      	at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
      	at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:485)
      	at java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:272)
      	at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
      	at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
      	at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
      	at java.util.HashMap$KeySpliterator.forEachRemaining(HashMap.java:1580)
      	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
      	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
      	at java.util.stream.StreamSpliterators$WrappingSpliterator.forEachRemaining(StreamSpliterators.java:313)
      	at java.util.stream.Streams$ConcatSpliterator.forEachRemaining(Streams.java:743)
      	at java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:647)
      	at java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:272)
      	at java.util.HashMap$EntrySpliterator.forEachRemaining(HashMap.java:1723)
      	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
      	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
      	at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
      	at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
      	at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
      	at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:485)
      	at com.atlassian.stash.internal.migration.DefaultMeshMigrationService.migrateRepositories(DefaultMeshMigrationService.java:222)
      	at com.atlassian.stash.internal.migration.DefaultMigrationService.lambda$startMeshMigration$11(DefaultMigrationService.java:505)
      	at com.atlassian.sal.core.executor.ThreadLocalDelegateRunnable.run(ThreadLocalDelegateRunnable.java:34)
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      	at java.lang.Thread.run(Thread.java:750)
      	... 14 frames trimmed
      Caused by: com.atlassian.bitbucket.scm.CommandFailedException: 'Unknown' exited with code -1
      	at com.atlassian.stash.internal.scm.git.mesh.DefaultErrorTranslator.translateDefault(DefaultErrorTranslator.java:167)
      	at com.atlassian.stash.internal.scm.git.mesh.DefaultErrorTranslator.translate(DefaultErrorTranslator.java:104)
      	at com.atlassian.stash.internal.scm.git.mesh.DefaultErrorTranslator.translateIfKnownCause(DefaultErrorTranslator.java:269)
      	at com.atlassian.stash.internal.scm.git.mesh.DefaultErrorTranslator.maybeTranslate(DefaultErrorTranslator.java:57)
      	at com.atlassian.stash.internal.scm.git.mesh.AbstractFutureResponseObserver.maybeTranslate(AbstractFutureResponseObserver.java:209)
      	at com.atlassian.stash.internal.scm.git.mesh.AbstractFutureResponseObserver.lambda$asFuture$1(AbstractFutureResponseObserver.java:123)
      	at java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:884)
      	at java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:866)
      	at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)
      	at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1990)
      	at com.atlassian.stash.internal.scm.git.mesh.AbstractFutureResponseObserver.onError(AbstractFutureResponseObserver.java:99)
      	at io.grpc.stub.ClientCalls$StreamObserverToCallListenerAdapter.onClose(ClientCalls.java:487)
      	at io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39)
      	at io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23)
      	at io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40)
      	at com.atlassian.stash.internal.scm.git.mesh.LastSeenClientInterceptor$LastSeenClientListener.onClose(LastSeenClientInterceptor.java:40)
      	at io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39)
      	at io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23)
      	at io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40)
      	at com.atlassian.stash.internal.scm.git.mesh.StatefulClientCallListener.onClose(StatefulClientCallListener.java:34)
      	at com.atlassian.stash.internal.scm.git.mesh.ErrorHandlingClientInterceptor$ErrorHandlingCall$1.onClose(ErrorHandlingClientInterceptor.java:149)
      	at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:562)
      	at io.grpc.internal.ClientCallImpl.access$300(ClientCallImpl.java:70)
      	at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInternal(ClientCallImpl.java:743)
      	at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:722)
      	at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
      	at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133)
      	... 3 common frames omitted
      Caused by: io.grpc.StatusRuntimeException: ABORTED: Repair of p/000c/h/4b112484deaa515877c5/r/21436 is already running
      	at io.grpc.Status.asRuntimeException(Status.java:535)
      	... 19 common frames omitted
      

      Remote Mesh Log Errors

      2024-04-03 14:45:11,977 DEBUG [grpc-server:thread-3860] danny 5J57O7SDx885x17403530x9 *M4OZ1Vx846x420925x13,3HI0PCCJx885x135654x4 10.150.3.45 "RepositoryService/Repair" (>1 <0) c.a.b.mesh.repair.RepairTarget [p/000c/h/4b112484deaa515877c5/r/21436] Starting repair from ds/0/h/4b112484deaa515877c5/r/21436
      2024-04-03 14:45:11,978 WARN  [grpc-server:thread-3860] danny 5J57O7SDx885x17403530x9 *M4OZ1Vx846x420925x13,3HI0PCCJx885x135654x4 10.150.3.45 "RepositoryService/Repair" (>1 <0) c.a.b.mesh.repair.RepairTarget [p/000c/h/4b112484deaa515877c5/r/21436] Repair failed
      io.grpc.StatusRuntimeException: ABORTED: Repair of p/000c/h/4b112484deaa515877c5/r/21436 is already running
      	at io.grpc.Status.asRuntimeException(Status.java:535)
      	at com.atlassian.bitbucket.mesh.AbstractStatusException.toStatusException(AbstractStatusException.java:44)
      	at com.atlassian.bitbucket.mesh.repair.RepairTarget.handleError(RepairTarget.java:225)
      	at com.atlassian.bitbucket.mesh.repair.RepairTarget.onNext(RepairTarget.java:181)
      	at com.atlassian.bitbucket.mesh.repair.RepairTarget.onNext(RepairTarget.java:61)
      	at com.atlassian.bitbucket.mesh.grpc.GrpcServiceAdvice$ErrorTranslatingStreamObserver.onNext(GrpcServiceAdvice.java:133)
      	at io.grpc.stub.ServerCalls$StreamingServerCallHandler$StreamingServerCallListener.onMessage(ServerCalls.java:262)
      	at io.grpc.ForwardingServerCallListener.onMessage(ForwardingServerCallListener.java:33)
      	at com.atlassian.bitbucket.mesh.request.RequestServerCallListener.onMessage(RequestServerCallListener.java:29)
      	at io.grpc.ForwardingServerCallListener.onMessage(ForwardingServerCallListener.java:33)
      	at com.atlassian.bitbucket.mesh.grpc.ExecutionContextServerCallListener.lambda$onMessage$3(ExecutionContextServerCallListener.java:36)
      	at io.grpc.Context.run(Context.java:536)
      	at com.atlassian.bitbucket.mesh.execution.GrpcExecutionManager$GrpcExecutionContext.run(GrpcExecutionManager.java:232)
      	at com.atlassian.bitbucket.mesh.grpc.ExecutionContextServerCallListener.onMessage(ExecutionContextServerCallListener.java:36)
      	at io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.messagesAvailableInternal(ServerCallImpl.java:330)
      	at io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.messagesAvailable(ServerCallImpl.java:313)
      	at io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1MessagesAvailable.runInContext(ServerImpl.java:834)
      	at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
      	at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133)
      	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
      	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
      	at java.base/java.lang.Thread.run(Thread.java:829)
      Caused by: com.atlassian.bitbucket.mesh.repair.RepositoryRepairAlreadyRunningException: Repair of p/000c/h/4b112484deaa515877c5/r/21436 is already running
      	at com.atlassian.bitbucket.mesh.repair.RepairGate.acquireTicket(RepairGate.java:55)
      	at com.atlassian.bitbucket.mesh.repair.DefaultInteractiveRepairHelper.startRepair(DefaultInteractiveRepairHelper.java:136)
      	at com.atlassian.bitbucket.mesh.repair.RepairTarget.startRepair(RepairTarget.java:199)
      	at com.atlassian.bitbucket.mesh.repair.RepairTarget.onNext(RepairTarget.java:142)
      	... 18 common frames omitted
      2024-04-03 14:45:11,978 DEBUG [grpc-server:thread-3860] danny 5J57O7SDx885x17403530x9 *M4OZ1Vx846x420925x13,3HI0PCCJx885x135654x4 10.150.3.45 "RepositoryService/Repair" (>1 <0) c.a.b.mesh.grpc.GrpcServiceAdvice The RPC was closed twice 
      java.lang.IllegalStateException: call already closed
      	at com.google.common.base.Preconditions.checkState(Preconditions.java:512)
      	at io.grpc.internal.ServerCallImpl.closeInternal(ServerCallImpl.java:216)
      	at io.grpc.internal.ServerCallImpl.close(ServerCallImpl.java:209)
      	at io.grpc.PartialForwardingServerCall.close(PartialForwardingServerCall.java:48)
      	at io.grpc.ForwardingServerCall.close(ForwardingServerCall.java:22)
      	at io.grpc.ForwardingServerCall$SimpleForwardingServerCall.close(ForwardingServerCall.java:39)
      	at com.atlassian.bitbucket.mesh.grpc.LoggingServerInterceptor$LoggingServerCall.close(LoggingServerInterceptor.java:37)
      	at io.grpc.PartialForwardingServerCall.close(PartialForwardingServerCall.java:48)
      	at io.grpc.ForwardingServerCall.close(ForwardingServerCall.java:22)
      	at io.grpc.ForwardingServerCall$SimpleForwardingServerCall.close(ForwardingServerCall.java:39)
      	at com.atlassian.bitbucket.mesh.request.RequestServerCall.close(RequestServerCall.java:30)
      	at io.grpc.stub.ServerCalls$ServerCallStreamObserverImpl.onError(ServerCalls.java:389)
      	at com.atlassian.bitbucket.mesh.grpc.BackoffStreamObserver.onError(BackoffStreamObserver.java:57)
      	at com.atlassian.bitbucket.mesh.repair.RepairTarget.handleError(RepairTarget.java:231)
      	at com.atlassian.bitbucket.mesh.repair.RepairTarget.onNext(RepairTarget.java:181)
      	at com.atlassian.bitbucket.mesh.repair.RepairTarget.onNext(RepairTarget.java:61)
      	at com.atlassian.bitbucket.mesh.grpc.GrpcServiceAdvice$ErrorTranslatingStreamObserver.onNext(GrpcServiceAdvice.java:133)
      	at io.grpc.stub.ServerCalls$StreamingServerCallHandler$StreamingServerCallListener.onMessage(ServerCalls.java:262)
      	at io.grpc.ForwardingServerCallListener.onMessage(ForwardingServerCallListener.java:33)
      	at com.atlassian.bitbucket.mesh.request.RequestServerCallListener.onMessage(RequestServerCallListener.java:29)
      	at io.grpc.ForwardingServerCallListener.onMessage(ForwardingServerCallListener.java:33)
      	at com.atlassian.bitbucket.mesh.grpc.ExecutionContextServerCallListener.lambda$onMessage$3(ExecutionContextServerCallListener.java:36)
      	at io.grpc.Context.run(Context.java:536)
      	at com.atlassian.bitbucket.mesh.execution.GrpcExecutionManager$GrpcExecutionContext.run(GrpcExecutionManager.java:232)
      	at com.atlassian.bitbucket.mesh.grpc.ExecutionContextServerCallListener.onMessage(ExecutionContextServerCallListener.java:36)
      	at io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.messagesAvailableInternal(ServerCallImpl.java:330)
      	at io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.messagesAvailable(ServerCallImpl.java:313)
      	at io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1MessagesAvailable.runInContext(ServerImpl.java:834)
      	at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
      	at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133)
      	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
      	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
      	at java.base/java.lang.Thread.run(Thread.java:829)
      

      Workaround

      Move the existing hierarchy directory out of the partition on all Mesh nodes, then retry the Mesh migration.

      In the logs above, the repository path p/000c/h/4b112484deaa515877c5/r/21436 shows that the hierarchy directory is 4b112484deaa515877c5, located under partition 000c on the filesystem.
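
      To find the partition and hierarchy directory for other affected repositories, the path can be pulled out of the "Repair of ... is already running" log line. The following is a minimal sketch, not an Atlassian tool: the function and class names are hypothetical, and the p/<partition>/h/<hierarchy>/r/<repository id> layout, including the hexadecimal IDs, is inferred only from the log lines in this ticket.

```python
import re
from typing import NamedTuple, Optional

# Assumed layout, inferred from the logs in this ticket:
#   p/<partition>/h/<hierarchy>/r/<repository id>
MESH_PATH = re.compile(
    r"\bp/(?P<partition>[0-9a-f]+)/h/(?P<hierarchy>[0-9a-f]+)/r/(?P<repo_id>\d+)"
)


class MeshRepoPath(NamedTuple):
    partition: str
    hierarchy: str
    repo_id: int


def parse_mesh_path(log_line: str) -> Optional[MeshRepoPath]:
    """Return the first Mesh repository path found in a log line, or None."""
    match = MESH_PATH.search(log_line)
    if match is None:
        return None
    return MeshRepoPath(
        match["partition"], match["hierarchy"], int(match["repo_id"])
    )


if __name__ == "__main__":
    line = (
        "ABORTED: Repair of p/000c/h/4b112484deaa515877c5/r/21436 "
        "is already running"
    )
    print(parse_mesh_path(line))
    # MeshRepoPath(partition='000c', hierarchy='4b112484deaa515877c5', repo_id=21436)
```

      The sketch only identifies the directory names. Where the partition lives on each Mesh node depends on the local installation and is not covered here.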

            People

              Assignee: Unassigned
              Reporter: Danny Samuel
              Votes: 0
              Watchers: 1
