Details

Type: Bug
Resolution: Fixed
Priority: Highest
Fix Version/s: 7.4.6, 7.6.3, 7.8.3, 7.9.0
Affects Version/s: 7.0.1, 7.0.5, 7.1.2, 7.2.0, 7.2.2, 7.3.5, 7.4.3, 7.5.2, 7.6.2, 7.7.2
Component/s: Data Center - Core
Labels:
- hazelcast
- qko-rating-check-week-43

Fixed in Long Term Support Release/s: 7.4
Support reference count: 12
Symptom Severity: Severity 1 - Critical
UIS: 254

Description

Issue Summary

If distributed cache sizes are modified, cluster nodes cannot be restarted without shutting down the whole cluster first. A restarting node tries to register the new cache size, Hazelcast detects that it differs from the size the running nodes are using, and it throws an exception.

This only affects Confluence 7.0.1 Data Center onwards.

Changes to local or hybrid cache sizes do not trigger this issue. Only caches marked as Distributed in the advanced view are affected.

Steps to Reproduce

1. Start with all nodes in the cluster running.
2. Modify one of the distributed caches:
- Go to Confluence Administration >> General Configuration >> Cache Management >> Show Advanced View, find 'Content Properties', and change the size from 10000 to 10050.
3. Shut down Confluence on one of the nodes.
4. Start Confluence on the node that was shut down. On startup, the node attempts to configure the distributed cache with the new size of 10050 (specified in cache-settings-overrides.properties in the shared home directory). This does not match the size the other nodes are still using (10000), so the error below occurs.

Expected Results

As in versions before 7.0.1 (Hazelcast 3.8.6 and below), Confluence cluster nodes should start with no errors after cache tuning.

Actual Results

From 7.0.1 onwards (Hazelcast 3.11.4 and 3.11.6), cluster nodes fail to restart after distributed cache sizes are modified. The following exception is written to the atlassian-confluence.log file:

com.hazelcast.config.ConfigurationException: Cannot add a dynamic configuration 'MapConfig{name='atlassian-cache.Cache.com.atlassian.confluence.user.ConfluenceUserPropertySetFactory.propertysets', inMemoryFormat=BINARY', backupCount=0, asyncBackupCount=0, timeToLiveSeconds=3600, maxIdleSeconds=3600, evictionPolicy='LFU', mapEvictionPolicy='null', evictionPercentage=25, minEvictionCheckMillis=100, maxSizeConfig=MaxSizeConfig{maxSizePolicy='PER_NODE', size=10050}, readBackupData=false, hotRestart=HotRestartConfig{enabled=false, fsync=false}, nearCacheConfig=NearCacheConfig{name=default, inMemoryFormat=OBJECT, invalidateOnChange=true, timeToLiveSeconds=3600, maxIdleSeconds=3600, maxSize=10050, evictionPolicy='LFU', evictionConfig=EvictionConfig{size=10050, maxSizePolicy=ENTRY_COUNT, evictionPolicy=LFU, comparatorClassName=null, comparator=null}, cacheLocalEntries=true, localUpdatePolicy=INVALIDATE, preloaderConfig=NearCachePreloaderConfig{enabled=false, directory=, storeInitialDelaySeconds=600, storeIntervalSeconds=600}}, mapStoreConfig=MapStoreConfig{enabled=false, className='null', factoryClassName='null', writeDelaySeconds=0, writeBatchSize=1, implementation=null, factoryImplementation=null, properties={}, initialLoadMode=LAZY, writeCoalescing=true}, mergePolicyConfig=MergePolicyConfig{policy='com.atlassian.confluence.cluster.hazelcast.AlwaysNullMapMergePolicy', batchSize=100}, wanReplicationRef=null, entryListenerConfigs=[], mapIndexConfigs=[], mapAttributeConfigs=[], quorumName=null, queryCacheConfigs=[], cacheDeserializedValues=INDEX_ONLY}' as there is already a conflicting configuration 'MapConfig{name='atlassian-cache.Cache.com.atlassian.confluence.user.ConfluenceUserPropertySetFactory.propertysets', inMemoryFormat=BINARY', backupCount=0, asyncBackupCount=0, timeToLiveSeconds=3600, maxIdleSeconds=3600, evictionPolicy='LFU', mapEvictionPolicy='null', evictionPercentage=25, minEvictionCheckMillis=100, maxSizeConfig=MaxSizeConfig{maxSizePolicy='PER_NODE', size=10000}, readBackupData=false, hotRestart=HotRestartConfig{enabled=false, fsync=false}, nearCacheConfig=NearCacheConfig{name=default, inMemoryFormat=OBJECT, invalidateOnChange=true, timeToLiveSeconds=3600, maxIdleSeconds=3600, maxSize=10000, evictionPolicy='LFU', evictionConfig=EvictionConfig{size=10000, maxSizePolicy=ENTRY_COUNT, evictionPolicy=LFU, comparatorClassName=null, comparator=null}, cacheLocalEntries=true, localUpdatePolicy=INVALIDATE, preloaderConfig=NearCachePreloaderConfig{enabled=false, directory=, storeInitialDelaySeconds=600, storeIntervalSeconds=600}}, mapStoreConfig=MapStoreConfig{enabled=false, className='null', factoryClassName='null', writeDelaySeconds=0, writeBatchSize=1, implementation=null, factoryImplementation=null, properties={}, initialLoadMode=LAZY, writeCoalescing=true}, mergePolicyConfig=MergePolicyConfig{policy='com.atlassian.confluence.cluster.hazelcast.AlwaysNullMapMergePolicy', batchSize=100}, wanReplicationRef=null, entryListenerConfigs=[], mapIndexConfigs=[], mapAttributeConfigs=[], quorumName=null, queryCacheConfigs=[], cacheDeserializedValues=INDEX_ONLY}'

The actual 'name' in MapConfig will differ, depending on which cache value was adjusted.
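
Because the exception is a single very long line, it can be hard to see which cache and which two sizes are in conflict. The following is a minimal Python sketch, not part of Confluence or Hazelcast, that pulls those three values out of a message shaped like the one above. The function name `summarize_conflict` and the abbreviated sample message are hypothetical, and the regular expressions assume the `MapConfig{...}` text format shown in this issue.

```python
import re

# Patterns assume the MapConfig text layout shown in the log excerpt above.
NAME_RE = re.compile(r"MapConfig\{name='([^']+)'")
SIZE_RE = re.compile(
    r"maxSizeConfig=MaxSizeConfig\{maxSizePolicy='[^']*',\s*size=(\d+)\}"
)


def summarize_conflict(message):
    """Return the cache name and the two conflicting sizes, or None.

    The first MapConfig in the message is the configuration the restarting
    node tried to add; the second is the one already held by the cluster.
    """
    names = NAME_RE.findall(message)
    sizes = [int(s) for s in SIZE_RE.findall(message)]
    if len(names) < 2 or len(sizes) < 2:
        return None
    return {
        "cache": names[0],
        "new_size": sizes[0],
        "existing_size": sizes[1],
    }


# Hypothetical, heavily abbreviated message for illustration only.
sample = (
    "com.hazelcast.config.ConfigurationException: Cannot add a dynamic "
    "configuration 'MapConfig{name='atlassian-cache.Cache.example', "
    "maxSizeConfig=MaxSizeConfig{maxSizePolicy='PER_NODE', size=10050}}' "
    "as there is already a conflicting configuration "
    "'MapConfig{name='atlassian-cache.Cache.example', "
    "maxSizeConfig=MaxSizeConfig{maxSizePolicy='PER_NODE', size=10000}}'"
)

result = summarize_conflict(sample)
print(result)
```

Running this against the full line from atlassian-confluence.log should report the cache named in MapConfig together with the new and existing sizes, provided the log format matches the excerpt.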

Workaround

Use one of the following workarounds to temporarily fix the problem:

Change the modified cache setting back to its previous value
or
Shut down all cluster nodes, then restart them one by one
or
Add the JVM parameter '-Dhazelcast.dynamicconfig.ignore.conflicts=true' to the startup command of every cluster node (in setenv.sh/bat).
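
Since the JVM parameter has to be present on all nodes, it may help to check each node's setenv file before restarting. The sketch below is one hedged way to do that in Python. The function name `has_ignore_conflicts_flag` is hypothetical, and the sample file content (including the use of `CATALINA_OPTS`) is an assumption about how the file is laid out, not something this issue specifies.

```python
FLAG = "-Dhazelcast.dynamicconfig.ignore.conflicts=true"


def has_ignore_conflicts_flag(setenv_text):
    """Return True if any non-comment line of the file contains the flag."""
    for line in setenv_text.splitlines():
        stripped = line.strip()
        lowered = stripped.lower()
        # Skip shell comments (#) and batch comments (rem, ::).
        if (
            stripped.startswith("#")
            or stripped.startswith("::")
            or lowered == "rem"
            or lowered.startswith("rem ")
        ):
            continue
        if FLAG in stripped:
            return True
    return False


# Hypothetical setenv.sh content: the first occurrence is commented out,
# the second is active.
sample = """
# CATALINA_OPTS="-Dhazelcast.dynamicconfig.ignore.conflicts=true ${CATALINA_OPTS}"
CATALINA_OPTS="-Dhazelcast.dynamicconfig.ignore.conflicts=true ${CATALINA_OPTS}"
export CATALINA_OPTS
"""

flag_present = has_ignore_conflicts_flag(sample)
print(flag_present)
```

In practice the text would be read from the setenv.sh or setenv.bat file on each node. A node that returns False would still fail to start with the exception above.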

Issue Links

Discovered while testing

CONFSERVER-60416 The max entries of a distributed cache is reset to the default value (instead of the overridden value) if a cluster node is restarted

Short Term Backlog

links to

PSR-470

relates to: PSR-470, PSR-515

People

Assignee: Tam Tran

Reporter: Malcolm Ninnes

Votes: 4

Watchers: 31

Dates

Created: 04/Aug/2020 12:29 AM

Updated: 21/Aug/2022 9:05 PM

Resolved: 03/Nov/2020 7:38 AM