-
Bug
-
Resolution: Obsolete
-
Low
-
6.4.14, 7.2.13, 7.6.6
-
6.04
-
Severity 2 - Major
-
Problem
During startup, Jira performs a consistency check for various items, including a check to see if the index locations already have an existing lock. There are no explicit waits on this, so at times the check may run after existing services/jobs have already started. This may cause the consistency check to fail, which in turn causes Jira startup to fail.
- This has been found to occur in a Data Center environment where the NodeReindexServiceThread starts indexing before the consistency checker. This results in index write.lock check errors.
Steps to Reproduce
One situation found to delay the consistency checker is JMX monitoring (see JRASERVER-67614 for details), so we'll use it as an example.
- Start up a Jira Data Center node with JMX enabled
- Notice that there is a long delay during startup, in this case ~28s:
2018-07-13 17:55:09,844 localhost-startStop-1 INFO [c.a.j.instrumentation.external.DatabaseExternalGauges] Installing DBCP monitoring instruments: DatabaseExternalGauges.JiraDbcpInstruments[instruments=[DBCP_MAX, DBCP_ACTIVE, DBCP_IDLE],objectName=com.atlassian.jira:name=BasicDataSource]
2018-07-13 17:55:38,300 localhost-startStop-1 WARN [c.a.jira.health.HealthChecks] Your database is using an unsupported collation
- After waiting for a longer period, we see:
********************************************************************************************************************************************************************************************************
Index lock file(s) found. This occurs either because JIRA was not cleanly shutdown or because there is another instance of this JIRA installation currently running.
Please ensure that no other instance of this JIRA installation is running and then remove the following lock file(s) and restart JIRA:
UtilConcurrentLock{lock=true, fullLockName=/opt/atlassian/home/caches/indexes/comments/write.lock}
UtilConcurrentLock{lock=true, fullLockName=/opt/atlassian/home/caches/indexes/issues/write.lock}
UtilConcurrentLock{lock=true, fullLockName=/opt/atlassian/home/caches/indexes/changes/write.lock}
UtilConcurrentLock{lock=true, fullLockName=/opt/atlassian/home/caches/indexes/worklogs/write.lock}
Once restarted you will need to reindex your data to ensure that indexes are up to date.
Do NOT delete the lock file(s) if there is another JIRA running with the same index directory instead cleanly shutdown the other instance.
********************************************************************************************************************************************************************************************************
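The two symptoms in these steps (a long pause between consecutive startup log lines, followed by the lock-file banner) can be picked out of a log file mechanically. The following is a minimal sketch, not an Atlassian tool: it assumes every log line begins with a `YYYY-MM-DD HH:MM:SS,mmm` timestamp as in the excerpt above, and the function names `largest_gap` and `reported_lock_files` are hypothetical helpers introduced for illustration. It only reads text and never removes lock files.

```python
import re
from datetime import datetime

# Assumed line prefix, matching the log excerpt: "2018-07-13 17:55:09,844 ..."
TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) ")

# Lock entries as they are printed in the startup banner.
LOCK_ENTRY = re.compile(r"UtilConcurrentLock\{lock=true, fullLockName=([^}]+)\}")


def largest_gap(lines):
    """Return (seconds, earlier_line, later_line) for the widest gap between
    consecutive timestamped lines, or None if fewer than two lines carry a
    timestamp. Lines without a timestamp are skipped."""
    stamped = []
    for line in lines:
        match = TIMESTAMP.match(line)
        if match:
            when = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S,%f")
            stamped.append((when, line))

    best = None
    for (t0, line0), (t1, line1) in zip(stamped, stamped[1:]):
        gap = (t1 - t0).total_seconds()
        if best is None or gap > best[0]:
            best = (gap, line0, line1)
    return best


def reported_lock_files(text):
    """Return the lock file paths named in the 'Index lock file(s) found' banner."""
    return LOCK_ENTRY.findall(text)
```

Feeding the startup portion of the application log to `largest_gap` points at the pair of lines surrounding the delay, and `reported_lock_files` lists which index directories were already locked when the consistency checker ran.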
Diagnosing
If you receive the index write.lock errors, you can diagnose the cause as follows:
- Enable DEBUG logging on the following packages via log4j.properties and restart Jira:
com.atlassian.jira.index.LuceneCorruptionChecker
com.atlassian.jira.upgrade.ConsistencyCheckerImpl
com.atlassian.jira.startup.JiraStartupLogger
com.atlassian.jira.util.LuceneDirectoryUtilsImpl.UtilConcurrentLock
- This will show whether index operations are occurring before the consistency checker runs
- Capture thread dumps at short intervals (every 2 seconds) to examine what's happening during the startup delay.
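Both diagnostic steps can be scripted. The sketch below is illustrative only and rests on stated assumptions: the logger entries use log4j 1.x properties syntax (`log4j.logger.<name> = DEBUG`), the JDK's `jstack` tool is on the PATH and is run as the user that owns the Jira process, and the helper names, default dump count, and output file naming are hypothetical choices rather than anything Jira provides.

```python
import subprocess
import time
from pathlib import Path

# Logger names copied from the list above.
DEBUG_LOGGERS = [
    "com.atlassian.jira.index.LuceneCorruptionChecker",
    "com.atlassian.jira.upgrade.ConsistencyCheckerImpl",
    "com.atlassian.jira.startup.JiraStartupLogger",
    "com.atlassian.jira.util.LuceneDirectoryUtilsImpl.UtilConcurrentLock",
]


def log4j_debug_lines(loggers=DEBUG_LOGGERS):
    """Build log4j.properties entries (log4j 1.x syntax assumed)."""
    return [f"log4j.logger.{name} = DEBUG" for name in loggers]


def run_jstack(pid):
    """Return one thread dump as text. Assumes `jstack` (JDK) is on PATH."""
    result = subprocess.run(
        ["jstack", "-l", str(pid)],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def capture_thread_dumps(pid, out_dir, count=10, interval=2.0,
                         dump=run_jstack, sleep=time.sleep):
    """Write `count` thread dumps to out_dir, pausing `interval` seconds
    between consecutive dumps. `dump` and `sleep` are injectable so the
    loop can be exercised without a running JVM."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for i in range(count):
        path = out / f"threaddump-{i:03d}.txt"
        path.write_text(dump(pid))
        written.append(path)
        if i < count - 1:
            sleep(interval)  # no pause needed after the final dump
    return written
```

The lines returned by `log4j_debug_lines()` can be appended to log4j.properties before the restart, and `capture_thread_dumps(<jira pid>, "dumps")` can be started as soon as the node begins booting so that the dumps cover the delay.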
Workaround
In the case where JMX monitoring is causing startup delays in Jira Data Center, disable JMX monitoring before starting up the new node. Once startup is complete, JMX monitoring can be re-enabled.
- is caused by
-
JRASERVER-67614 JMX Monitoring may take a long time to load during Jira Startup
- Gathering Impact
-
JRASERVER-67619 Start of Jira datacenter replication thread NodeReindexServiceThread is not synchronised with Jira start localhost-startStop thread
- Gathering Impact
- relates to
-
DELTA-374