DeferredReplicationCachedReference 'get' operation triggers unnecessary replication events


      Issue Summary

      The get operation on a DeferredReplicationCachedReference is deferred, so every call creates a new DeferredReplicationStrategy instance, registers it in a ThreadLocal variable, and runs it. The execution is almost always a no-op, because most 'get' invocations do not change the cache in a way that needs to be replicated; the only exception is the initialization of the reference on first use.

      This behaviour generates unnecessary computational overhead. Caches should be optimized for reads as much as possible; for regular 'get's (which have no side effects to replicate) there should be a non-deferred implementation. For comparison, read-only cache operations such as get are not deferred on DeferredReplicationCache instances.

      In addition, in certain JVM states, such as the one described in JRASERVER-71058, registering the cacheReplicationStrategy ThreadLocal may consume very high CPU.
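
      The difference between a deferred and a non-deferred read path can be sketched as follows. This is a minimal illustration only, not Jira's actual implementation: ReplicationSketch, SketchCachedReference, runDeferred as written here, getDeferred and the deferredRuns counter are all hypothetical names invented for the example. It assumes the loader never returns null.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical stand-in for the replication machinery; not Jira's real API. */
final class ReplicationSketch {
    // Counts how often the deferred path (ThreadLocal set + remove) is entered.
    static final AtomicInteger deferredRuns = new AtomicInteger();
    private static final ThreadLocal<Object> STRATEGY = new ThreadLocal<>();

    static <T> T runDeferred(Supplier<T> action) {
        deferredRuns.incrementAndGet();
        STRATEGY.set(new Object()); // stands in for a new strategy instance per call
        try {
            return action.get();
        } finally {
            STRATEGY.remove();      // cleanup is paid on every call as well
        }
    }
}

/** Hypothetical cached reference showing both read paths side by side. */
final class SketchCachedReference<T> {
    private final Supplier<T> loader; // assumed never to return null
    private volatile T value;

    SketchCachedReference(Supplier<T> loader) {
        this.loader = loader;
    }

    /** Behaviour described in this issue: every read takes the deferred path. */
    T getDeferred() {
        return ReplicationSketch.runDeferred(this::loadIfAbsent);
    }

    /** Read-optimized variant: only the initializing call is deferred. */
    T get() {
        T current = value;
        if (current != null) {
            return current; // plain read, no ThreadLocal traffic
        }
        return ReplicationSketch.runDeferred(this::loadIfAbsent);
    }

    private synchronized T loadIfAbsent() {
        if (value == null) {
            value = loader.get();
        }
        return value;
    }
}
```

      In this sketch, repeated calls to get() enter runDeferred only once (for initialization), whereas each call to getDeferred() pays for a ThreadLocal set and remove even when nothing changes.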

      Steps to Reproduce

      Always reproducible

      Expected Results

      Read-only operations such as 'get' should not trigger a replication event on DeferredReplicationCachedReference instances.

      Actual Results

      The get operation is deferred and therefore goes through the deferred replication logic.

      Notes

      Example of a merged thread dump containing com.atlassian.jira.cache.DeferredReplicationCachedReference.get:

      "*" count=364 prio=5 tid=* nid=* *
         java.lang.Thread.State: RUNNABLE
      100.00% [364/364]X at com.atlassian.jira.cache.DeferredReplicationCachedReference.get(DeferredReplicationCachedReference.java:28)
       57.69% [210/364]|\ at com.atlassian.jira.cluster.cache.ehcache.BlockingParallelCacheReplicator.runDeferred(BlockingParallelCacheReplicator.java:174)
       57.69% [210/364]| \ at java.lang.ThreadLocal.remove(ThreadLocal.java:222)
       57.69% [210/364]|  X at java.lang.ThreadLocal$ThreadLocalMap.access$200(ThreadLocal.java:298)
       51.65% [188/364]|  |X at java.lang.ThreadLocal$ThreadLocalMap.remove(ThreadLocal.java:499)
        2.20%   [8/364]|  ||V at java.lang.ThreadLocal$ThreadLocalMap.expungeStaleEntry(ThreadLocal.java:603)
       49.45% [180/364]|  | V at java.lang.ThreadLocal$ThreadLocalMap.expungeStaleEntry(ThreadLocal.java:617)
        6.04%  [22/364]|   V at java.lang.ThreadLocal$ThreadLocalMap.remove(ThreadLocal.java:496)
       22.53%  [82/364]|\ at com.atlassian.jira.cluster.cache.ehcache.BlockingParallelCacheReplicator.runDeferred(BlockingParallelCacheReplicator.java:170)
       22.53%  [82/364]| \ at java.lang.ThreadLocal.set(ThreadLocal.java:203)
       22.53%  [82/364]|  X at java.lang.ThreadLocal$ThreadLocalMap.access$100(ThreadLocal.java:298)
       21.43%  [78/364]|   V at java.lang.ThreadLocal$ThreadLocalMap.set(ThreadLocal.java:467)
       18.41%  [67/364] \ at com.atlassian.jira.cluster.cache.ehcache.BlockingParallelCacheReplicator.runDeferred(BlockingParallelCacheReplicator.java:172)
      

      Note: only about 18% of the sampled threads (67/364) are doing the actual runDeferred work; the remaining threads are busy with ThreadLocal operations (set and remove).

      Note on fix

      As part of the change, the following settings are now obsolete:

      • atlassian.disable.deferred.cache.replication
      • jira.cache.replication.localq.disabled

      Workaround

      None

            Assignee:
            Unassigned
            Reporter:
            Kurtcebe Eroglu
            Votes:
            0
            Watchers:
            10
