-
Bug
-
Resolution: Fixed
-
High
-
2.5.2
-
None
-
None
Frequent deadlocks are reported by (Oracle) DB when accessing the capability table.
Databases wait to actually write (flush) data until someone wants to read it.
The hibernate session is being flushed and for some reason the capability object is marked as dirty.
Hibernate issues an update to persist the capability. Not usually a big deal but with a large number of plans and the fact that this is invoked from an event, we could have events that try to lock competing resources.
- atlassian-bamboo-buildmaven2-sql.log.gz
- 581 kB
- Krystian Brazulewicz
- atlassian-bamboo-buildmaven2-sql-update.log.gz
- 0.6 kB
- Krystian Brazulewicz
- atlassian-bamboo-createbuildbamboo-sql.log.gz
- 339 kB
- Krystian Brazulewicz
- atlassian-bamboo-createbuildbamboo-sql-update.log.gz
- 0.9 kB
- Krystian Brazulewicz
- bamboo_audit.xls
- 999 kB
- Qualcomm Support
[BAM-5456] CAPABILITY and NOTIFICATIONS table access with Empty values causes (Oracle) Deadlock in larger instances
more similar cases
https://support.atlassian.com/browse/BSP-2349?focusedCommentId=903505&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_903505
https://support.atlassian.com/browse/BSP-2836 (NOtIFICATION_SETS table)
Update from qualcom:
Thanks for the work around. It appears to have solved that issue.
Looks like the second time in 2 weeks that we have been affected by this type of thing. The last one was the JIRA roadmap gadget and null descriptions.
Can I ask why I'm still seeing so many SELECTs that are identical? Shouldn't they be in the EHCache?
I think I saw some cache flushes for the entire CAPABILITY cache. However, the capabilities aren't changing at all. Why do they need to be flushed?
I haven't gone through the other SELECTs but there seems to be way more activity than we should be seeing for the builds that are running.
It seems that for each change detection there are exactly 3 writes to the Capability table. X-referencing the XLS of the database table audit, we see that the Capability being updated are:
update CAPABILITY set KEY_IDENTIFIER=:1, VALUE=:2, CAPABILITY_SET=:3 where CAPABILITY_ID=:4 #1(28):system.builder.script.Script #2(0): #3(6):163842 #4(6):196609 #1(3):pds #2(0): #3(6):163842 #4(8):17924133 #1(5):qctit #2(0): #3(6):163842 #4(8):13107233
from the logs, the 3 of 24 objects being updated matched to:
CapabilityImpl{id=196609, capabilitySet=LocalCapabilitySet#163842, value=, key=system.builder.script.Script} CapabilityImpl{id=17924133, capabilitySet=LocalCapabilitySet#163842, value=, key=pds} CapabilityImpl{id=13107233, capabilitySet=LocalCapabilitySet#163842, value=, key=qctit} ... CapabilityImpl{id=2621441, capabilitySet=LocalCapabilitySet#163842, value=/local/mnt/apps/java/qant-1.0.123, key=system.builder.ant.QAnt 1.0.123} CapabilityImpl{id=7012353, capabilitySet=LocalCapabilitySet#163842, value=/local/mnt/apps/java/jdk1.6.0_11_UE, key=system.jdk.JDK 1.6 Unlimited Encryption} CapabilityImpl{id=13107211, capabilitySet=LocalCapabilitySet#163842, value=/local/mnt/apps/java/java1.4, key=system.jdk.JDK 1.4} CapabilityImpl{id=13107212, capabilitySet=LocalCapabilitySet#163842, value=/local/mnt/apps/java/java5, key=system.jdk.JDK 1.5} ...
The only thing in common is that the the value is empty. We ensure that the CapabilityImpl's getValue is @NotNull by returning an empty String when value is null. Most dBs will store the empty String but I believe that Oracle will convert this to a null value, thus Hibernate will assume that these values are different, and mark the objects as dirty and try to persist them whenever it's retrieved.
So it looks like this innocuous looking piece of code in CapabilityImpl is the cause of the pain:
@NotNull public String getValue() { return StringUtils.defaultString(value); }
As a workaround, can you see if setting these capabilities to a non-null value or removing them will ease your problems?
Bug is due to NULL capability values being set to "". Setting the value to anything is a work-around. No more deadlocks.
Guys,
Thanks for the info here. We'll take a look and see what's happening.
Cheers,
Mark C
We just started an audit on the table and we are seeing 400 update statements per second. I think we need a patch.
We have around 500 plans. It seems each local build event reads the capabilities and somehow dirties the collection. That causes an update to all the local agent capabilities on every build event when the session is flushed. The only thing suspicious is the constructor of the CapabilitySetImpl that re-references the collection. Hibernate collections should not be re-referenced even if they contain the same objects.
log snippets:
2010-02-19 16:22:52,418 INFO [TP-Processor7] [AccessLogFilter] rsnover http://ci.qualcomm.com/bamboo/build/admin/create/createBuildRepository.action 630826kb
2010-02-19 16:22:55,674 INFO [TP-Processor7] [AccessLogFilter] rsnover http://ci.qualcomm.com/bamboo/build/admin/create/addDefaultsForBuilder.action 595990kb
2010-02-19 16:22:58,999 WARN [BAM::Events:pool-1-thread-11] [JDBCExceptionReporter] SQL Error: 60, SQLState: 61000
2010-02-19 16:22:58,999 ERROR [BAM::Events:pool-1-thread-11] [JDBCExceptionReporter] ORA-00060: deadlock detected while waiting for resource
2010-02-19 16:22:58,999 ERROR [BAM::Events:pool-1-thread-11] [SessionImpl] Could not synchronize database state with session
2010-02-19 16:22:59,007 ERROR [BAM::Events:pool-1-thread-11] [NamedThreadFactory] Uncaught exception in thread BAM::Events:pool-1-thread-11
org.springframework.jdbc.UncategorizedSQLException: Hibernate operation: could not update: com.atlassian.bamboo.v2.build.agent.capability.CapabilityImpl#196609; uncategorized SQLException for SQL []; SQL state [61000]; error code [60]; ORA-00060: deadlock detected while waiting for resource
; nested exception is java.sql.SQLException: ORA-00060: deadlock detected while waiting for resource
Caused by: java.sql.SQLException: ORA-00060: deadlock detected while waiting for resource
at oracle.jdbc.driver.DatabaseError.throwSqlException(DatabaseError.java:125)
at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:316)
at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:282)
at oracle.jdbc.driver.T4C8Oall.receive(T4C8Oall.java:639)
at oracle.jdbc.driver.T4CPreparedStatement.doOall8(T4CPreparedStatement.java:185)
at oracle.jdbc.driver.T4CPreparedStatement.execute_for_rows(T4CPreparedStatement.java:633)
at oracle.jdbc.driver.OracleStatement.doExecuteWithTimeout(OracleStatement.java:1086)
at oracle.jdbc.driver.OraclePreparedStatement.executeInternal(OraclePreparedStatement.java:2984)
at oracle.jdbc.driver.OraclePreparedStatement.executeUpdate(OraclePreparedStatement.java:3057)
at org.apache.tomcat.dbcp.dbcp.DelegatingPreparedStatement.executeUpdate(DelegatingPreparedStatement.java:102)
at net.sf.hibernate.impl.NonBatchingBatcher.addToBatch(NonBatchingBatcher.java:22)
at net.sf.hibernate.persister.EntityPersister.update(EntityPersister.java:661)
at net.sf.hibernate.persister.EntityPersister.update(EntityPersister.java:621)
at net.sf.hibernate.impl.ScheduledUpdate.execute(ScheduledUpdate.java:52)
at net.sf.hibernate.impl.SessionImpl.execute(SessionImpl.java:2449)
at net.sf.hibernate.impl.SessionImpl.executeAll(SessionImpl.java:2435)
at net.sf.hibernate.impl.SessionImpl.execute(SessionImpl.java:2393)
at net.sf.hibernate.impl.SessionImpl.flush(SessionImpl.java:2261)
at net.sf.hibernate.transaction.JDBCTransaction.commit(JDBCTransaction.java:61)
at org.springframework.orm.hibernate.HibernateTransactionManager.doCommit(HibernateTransactionManager.java:510)
at org.springframework.transaction.support.AbstractPlatformTransactionManager.processCommit(AbstractPlatformTransactionManager.java:662)
at org.springframework.transaction.support.AbstractPlatformTransactionManager.commit(AbstractPlatformTransactionManager.java:632)
at org.springframework.transaction.interceptor.TransactionAspectSupport.commitTransactionAfterReturning(TransactionAspectSupport.java:314)
at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:116)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:171)
at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
at $Proxy15.getSharedLocalCapabilitySet(Unknown Source)
at com.atlassian.bamboo.v2.build.agent.capability.CapabilityContextImpl.getCapabilitySet(CapabilityContextImpl.java:33)
at com.atlassian.bamboo.build.VariableBuilderBeanImpl.getVariablesForAdminAndCapabilityContext(VariableBuilderBeanImpl.java:76)
at com.atlassian.bamboo.build.VariableBuilderBeanImpl.getVariables(VariableBuilderBeanImpl.java:39)
============================================================================
2010-02-25 07:37:54,453 INFO [TP-Processor23] [AccessLogFilter] devsingh http://ci.qualcomm.com/bamboo/build/admin/edit/updateBuildBuilder.action 294243kb
2010-02-25 07:37:57,817 WARN [BAM::Events:pool-1-thread-113] [JDBCExceptionReporter] SQL Error: 60, SQLState: 61000
2010-02-25 07:37:57,817 ERROR [BAM::Events:pool-1-thread-113] [JDBCExceptionReporter] ORA-00060: deadlock detected while waiting for resource
2010-02-25 07:37:57,817 ERROR [BAM::Events:pool-1-thread-113] [SessionImpl] Could not synchronize database state with session
2010-02-25 07:37:57,821 ERROR [BAM::Events:pool-1-thread-113] [NamedThreadFactory] Uncaught exception in thread BAM::Events:pool-1-thread-113
org.springframework.jdbc.UncategorizedSQLException: Hibernate operation: could not update: com.atlassian.bamboo.v2.build.agent.capability.CapabilityImpl#196609; uncategorized SQLException for SQL []; SQL state [61000]; error code [60]; ORA-00060: deadlock detected while waiting for resource
; nested exception is java.sql.SQLException: ORA-00060: deadlock detected while waiting for resource
Caused by: java.sql.SQLException: ORA-00060: deadlock detected while waiting for resource
===============================================================
2010-03-03 14:16:56,136 INFO [TP-Processor14] [AccessLogFilter] jvictor http://ci.qualcomm.com/bamboo/build/admin/create/createBuild.action 230433kb
2010-03-03 14:16:56,356 INFO [TP-Processor14] [AccessLogFilter] jvictor http://ci.qualcomm.com/bamboo/build/admin/create/addBuildRepository.action 222776kb
1055639.796: [GC [PSYoungGen: 307900K->13608K(326848K)] 807469K->513354K(1025920K), 0.0287420 secs] [Times: user=0.07 sys=0.00, real=0.03 secs]
1055647.921: [GC [PSYoungGen: 317800K->12455K(326848K)] 817546K->520871K(1025920K), 0.0382920 secs] [Times: user=0.14 sys=0.01, real=0.04 secs]
1055652.501: [GC [PSYoungGen: 316647K->9283K(328192K)] 825063K->525663K(1027264K), 0.0315140 secs] [Times: user=0.11 sys=0.00, real=0.03 secs]
1055667.630: [GC [PSYoungGen: 315267K->6051K(327296K)] 831647K->527463K(1026368K), 0.0259940 secs] [Times: user=0.06 sys=0.00, real=0.02 secs]
2010-03-03 14:17:32,282 INFO [TP-Processor19] [AccessLogFilter] jvictor http://ci.qualcomm.com/bamboo/build/admin/create/createBuildRepository.action 433439kb
2010-03-03 14:17:32,614 INFO [TP-Processor19] [AccessLogFilter] jvictor http://ci.qualcomm.com/bamboo/build/admin/create/addDefaultsForBuilder.action 431461kb
2010-03-03 14:17:33,687 INFO [TP-Processor18] [AccessLogFilter] c_gerryc http://ci.qualcomm.com/bamboo/build/admin/triggerManualBuild.action?buildKey=QCTCOLLABCMN-DAT 379239kb
2010-03-03 14:17:33,961 INFO [TP-Processor18] [SvnRepository] Collecting changes for 'QCTCOLLABCMN-DAT' on path 'https://svnit.qualcomm.com/qctit/prd/common/documentum/dat/trunk' from version 45477 to 45612
2010-03-03 14:17:34,059 INFO [TP-Processor18] [DefaultChangeDetectionManager] Found one change for plan 'QCTCOLLABCMN-DAT', waiting 30 seconds to see if there are any more... (4 retries remaining)
1055683.681: [GC [PSYoungGen: 312035K->12673K(328640K)] 833447K->535046K(1027712K), 0.0289720 secs] [Times: user=0.07 sys=0.00, real=0.03 secs]
1055693.684: [GC [PSYoungGen: 320321K->4026K(328512K)] 842694K->530023K(1027584K), 0.0213580 secs] [Times: user=0.05 sys=0.01, real=0.02 secs]
2010-03-03 14:17:55,021 INFO [TP-Processor8] [AccessLogFilter] jvictor http://ci.qualcomm.com/bamboo/build/admin/create/createBuildBuilder.action 378724kb
2010-03-03 14:17:58,361 WARN [BAM::Events:pool-1-thread-213] [JDBCExceptionReporter] SQL Error: 60, SQLState: 61000
2010-03-03 14:17:58,361 ERROR [BAM::Events:pool-1-thread-213] [JDBCExceptionReporter] ORA-00060: deadlock detected while waiting for resource
2010-03-03 14:17:58,361 ERROR [BAM::Events:pool-1-thread-213] [SessionImpl] Could not synchronize database state with session
2010-03-03 14:17:58,367 ERROR [BAM::Events:pool-1-thread-213] [NamedThreadFactory] Uncaught exception in thread BAM::Events:pool-1-thread-213
org.springframework.jdbc.UncategorizedSQLException: Hibernate operation: could not update: com.atlassian.bamboo.v2.build.agent.capability.CapabilityImpl#196609; uncategorized SQLException for SQL []; SQL state [61000]; error code [60]; ORA-00060: deadlock detected while waiting for resource
; nested exception is java.sql.SQLException: ORA-00060: deadlock detected while waiting for resource
Caused by: java.sql.SQLException: ORA-00060: deadlock detected while waiting for resource
=========================================================
in https://support.atlassian.com/browse/BSP-2836 non-existing Notification Recipients are stored as StringUtils.EMPTY in the NOTIFICATIONS table.