  1. Bitbucket Data Center
  2. BSERV-18850

Change the default value of com.sun.jndi.ldap.connect.pool.timeout on Bitbucket from 0 (unlimited) to a lower value

Details

    Description

      Issue Summary

      Change the default value of `com.sun.jndi.ldap.connect.pool.timeout` from 0 (unlimited) to a lower value, specifically 30 seconds (30000 ms). This change aims to fix the LDAP directory sync issues and authentication failures that Bitbucket reports when stale pooled LDAP connections are reused.

      The default value of 0 (zero) means that the idle time is unlimited, so connections will never be timed out.

      This adjustment aligns with the settings for Jira and Confluence, ensuring that idle connections are evicted and stale connections do not persist in the LDAP pool.

      Crowd also recommends changing this value from 0.
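
Because the property is expressed in milliseconds, it is easy to confuse "30" with "30000". The following is a minimal sketch of how a value could be interpreted, assuming (as described above) that 0 or an unset value means no idle timeout. The class `PoolTimeoutSetting` and the method `idleLimit` are hypothetical names used only for illustration; they are not part of Bitbucket or the JDK.

```java
import java.time.Duration;
import java.util.Optional;

class PoolTimeoutSetting {
    // Property name taken from this issue; the value is in milliseconds.
    static final String PROPERTY = "com.sun.jndi.ldap.connect.pool.timeout";

    /**
     * Interprets a raw property value.
     * An empty Optional means "no idle timeout" (unset, blank, or 0);
     * otherwise the Optional holds the idle limit as a Duration.
     */
    static Optional<Duration> idleLimit(String rawValue) {
        if (rawValue == null || rawValue.trim().isEmpty()) {
            return Optional.empty();
        }
        long millis = Long.parseLong(rawValue.trim());
        return millis > 0 ? Optional.of(Duration.ofMillis(millis)) : Optional.empty();
    }

    public static void main(String[] args) {
        // Reads the property from the JVM this sketch runs in.
        Optional<Duration> limit = idleLimit(System.getProperty(PROPERTY));
        System.out.println(limit
                .map(d -> "idle connections expire after " + d.getSeconds() + " s")
                .orElse("idle connections never expire"));
    }
}
```

Running it with `-Dcom.sun.jndi.ldap.connect.pool.timeout=30000` reports a 30 second limit; running it without the flag reports that idle connections never expire.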

      This issue is reproducible on Data Center (yes).

      Steps to Reproduce

      1. Occasionally, the LDAP server or AD closes a pooled LDAP connection, and the FIN, ACK it sends is dropped due to network issues, so Bitbucket continues to believe the connection is open.
      2. Consequently, the same pooled LDAP connection is subsequently used for authentication and LDAP syncs.
      3. This, in turn, causes authentication or directory sync requests to fail once the timeouts set in Bitbucket are reached.

      Expected Results

      Idle LDAP connections are evicted from the pool after a short interval (seconds to a few minutes), which reduces the chance that a stale connection is reused.
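
The effect of an idle timeout on the scenario in the reproduction steps can be illustrated with a toy model. This is a simplified sketch, not the JDK's actual LDAP pool implementation: `ToyLdapPool`, `Conn`, and `reusesStale` are hypothetical names, time is passed in explicitly as milliseconds, and the "server closed the connection but the client never noticed" condition is modelled with a plain boolean.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class ToyLdapPool {
    /** A pooled connection; closedByServer models a close the client never saw. */
    static final class Conn {
        final int id;
        long lastUsedMillis;
        boolean closedByServer;

        Conn(int id, long nowMillis) {
            this.id = id;
            this.lastUsedMillis = nowMillis;
        }
    }

    private final Deque<Conn> idle = new ArrayDeque<>();
    private final long idleTimeoutMillis; // 0 means "never evict", as in the issue
    private int nextId = 1;

    ToyLdapPool(long idleTimeoutMillis) {
        this.idleTimeoutMillis = idleTimeoutMillis;
    }

    /** Drops every idle connection that has been unused for at least the timeout. */
    void evictExpired(long nowMillis) {
        if (idleTimeoutMillis <= 0) {
            return; // unlimited idle time: nothing is ever evicted
        }
        idle.removeIf(c -> nowMillis - c.lastUsedMillis >= idleTimeoutMillis);
    }

    /** Hands out an idle connection if one survives eviction, else opens a new one. */
    Conn borrow(long nowMillis) {
        evictExpired(nowMillis);
        Conn c = idle.pollFirst();
        return c != null ? c : new Conn(nextId++, nowMillis);
    }

    void release(Conn c, long nowMillis) {
        c.lastUsedMillis = nowMillis;
        idle.addLast(c);
    }

    /** True if a borrow after idleGapMillis returns a connection the server already closed. */
    static boolean reusesStale(long timeoutMillis, long idleGapMillis) {
        ToyLdapPool pool = new ToyLdapPool(timeoutMillis);
        Conn c = pool.borrow(0);
        pool.release(c, 0);
        c.closedByServer = true; // server closed it; the FIN, ACK was lost
        return pool.borrow(idleGapMillis).closedByServer;
    }

    public static void main(String[] args) {
        System.out.println("timeout 0, idle 10 min: stale reused = " + reusesStale(0, 600_000));
        System.out.println("timeout 30 s, idle 10 min: stale reused = " + reusesStale(30_000, 600_000));
        System.out.println("timeout 30 s, idle 10 s: stale reused = " + reusesStale(30_000, 10_000));
    }
}
```

In this model, a timeout of 0 always hands back the stale connection, while a 30 second timeout discards it once it has been idle long enough. A connection that goes stale and is reused within the timeout window is still handed out, which is why the timeout reduces the chance of reuse rather than eliminating it.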

      Actual Results

      Bitbucket currently reuses the stale LDAP pool connection, leading to timeouts when the read timeout or LDAP timeout value is reached.

      ldap.read.timeout: 360000
      
      2023-08-24 11:26:14,847 ERROR [Caesium-1-4]  c.a.c.d.DbCachingRemoteDirectory Exception occured when performing full synchronization
      com.atlassian.crowd.exception.OperationFailedException: java.util.concurrent.ExecutionException: com.atlassian.crowd.exception.OperationFailedException: org.springframework.ldap.UncategorizedLdapException: Uncategorized exception occured during LDAP processing; nested exception is javax.naming.NamingException: LDAP response read timed out, timeout used: 360000 ms.; remaining name 'DC=instenv-180103-ldne,DC=local'
      	at com.atlassian.crowd.directory.synchronisation.cache.UsnChangedCacheRefresher.synchroniseAllUsers(UsnChangedCacheRefresher.java:209)
      	at com.atlassian.crowd.directory.synchronisation.cache.AbstractCacheRefresher.synchroniseAll(AbstractCacheRefresher.java:45)
      	at com.atlassian.crowd.directory.synchronisation.cache.UsnChangedCacheRefresher.synchroniseAll(UsnChangedCacheRefresher.java:174)
      	at com.atlassian.crowd.directory.DbCachingRemoteDirectory.synchroniseCache(DbCachingRemoteDirectory.java:1098)
      	at com.atlassian.crowd.manager.directory.DirectorySynchroniserImpl.lambda$synchronise$0(DirectorySynchroniserImpl.java:85)
      	at com.atlassian.crowd.audit.NoOpAuditLogContext.withAuditLogSource(NoOpAuditLogContext.java:17)
      	at com.atlassian.crowd.manager.directory.DirectorySynchroniserImpl.synchronise(DirectorySynchroniserImpl.java:83)
      	at jdk.internal.reflect.GeneratedMethodAccessor1439.invoke(Unknown Source)
      	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at com.atlassian.crowd.directory.DbCachingDirectoryPoller.pollChanges(DbCachingDirectoryPoller.java:48)
      	at com.atlassian.crowd.manager.directory.monitor.poller.DirectoryPollerJobRunner.runJob(DirectoryPollerJobRunner.java:92)
      	at com.atlassian.scheduler.core.JobLauncher.runJob(JobLauncher.java:134)
      	at com.atlassian.scheduler.core.JobLauncher.launchAndBuildResponse(JobLauncher.java:106)
      	at com.atlassian.scheduler.core.JobLauncher.launch(JobLauncher.java:90)
      	at com.atlassian.scheduler.caesium.impl.CaesiumSchedulerService.launchJob(CaesiumSchedulerService.java:435)
      	at com.atlassian.scheduler.caesium.impl.CaesiumSchedulerService.executeClusteredJob(CaesiumSchedulerService.java:430)
      	at com.atlassian.scheduler.caesium.impl.CaesiumSchedulerService.executeClusteredJobWithRecoveryGuard(CaesiumSchedulerService.java:454)
      	at com.atlassian.scheduler.caesium.impl.CaesiumSchedulerService.executeQueuedJob(CaesiumSchedulerService.java:382)
      	at com.atlassian.scheduler.caesium.impl.SchedulerQueueWorker.executeJob(SchedulerQueueWorker.java:66)
      	at com.atlassian.scheduler.caesium.impl.SchedulerQueueWorker.executeNextJob(SchedulerQueueWorker.java:60)
      	at com.atlassian.scheduler.caesium.impl.SchedulerQueueWorker.run(SchedulerQueueWorker.java:35)
      	at java.base/java.lang.Thread.run(Thread.java:829)
      	... 10 frames trimmed
      Caused by: java.util.concurrent.ExecutionException: com.atlassian.crowd.exception.OperationFailedException: org.springframework.ldap.UncategorizedLdapException: Uncategorized exception occured during LDAP processing; nested exception is javax.naming.NamingException: LDAP response read timed out, timeout used: 360000 ms.; remaining name 'DC=instenv-180103-ldne,DC=local'
      	at java.base/java.util.concurrent.FutureTask.report(FutureTask.java:122)
      	at java.base/java.util.concurrent.FutureTask.get(FutureTask.java:191)
      	at com.atlassian.crowd.directory.synchronisation.cache.UsnChangedCacheRefresher.synchroniseAllUsers(UsnChangedCacheRefresher.java:197)
      	... 22 common frames omitted
      Caused by: com.atlassian.crowd.exception.OperationFailedException: org.springframework.ldap.UncategorizedLdapException: Uncategorized exception occured during LDAP processing; nested exception is javax.naming.NamingException: LDAP response read timed out, timeout used: 360000 ms.; remaining name 'DC=instenv-180103-ldne,DC=local'
      	at com.atlassian.crowd.directory.SpringLDAPConnector.pageSearchResults(SpringLDAPConnector.java:366)
      	at com.atlassian.crowd.directory.SpringLDAPConnector.searchEntitiesWithRequestControls(SpringLDAPConnector.java:399)
      	at com.atlassian.crowd.directory.SpringLDAPConnector.searchEntities(SpringLDAPConnector.java:383)
      	at com.atlassian.crowd.directory.SpringLDAPConnector.searchUserObjects(SpringLDAPConnector.java:586)
      	at com.atlassian.crowd.directory.SpringLDAPConnector.searchUsers(SpringLDAPConnector.java:931)
      	at com.atlassian.crowd.directory.synchronisation.cache.UsnChangedCacheRefresher.lambda$synchroniseAll$0(UsnChangedCacheRefresher.java:148)
      	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
      	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
      	... 2 common frames omitted
      Caused by: org.springframework.ldap.UncategorizedLdapException: Uncategorized exception occured during LDAP processing; nested exception is javax.naming.NamingException: LDAP response read timed out, timeout used: 360000 ms.; remaining name 'DC=instenv-180103-ldne,DC=local'
      	at org.springframework.ldap.support.LdapUtils.convertLdapException(LdapUtils.java:228)
      	at org.springframework.ldap.core.LdapTemplate.search(LdapTemplate.java:397)
      	at com.atlassian.crowd.directory.ldap.SpringLdapTemplateWrapper$3.timedGet(SpringLdapTemplateWrapper.java:143)
      	at com.atlassian.crowd.directory.ldap.SpringLdapTemplateWrapper$3.timedGet(SpringLdapTemplateWrapper.java:139)
      	at com.atlassian.crowd.directory.ldap.monitoring.TimedSupplier.get(TimedSupplier.java:37)
      	at com.atlassian.crowd.directory.ldap.SpringLdapTemplateWrapper.invokeWithContextClassLoader(SpringLdapTemplateWrapper.java:85)
      	at com.atlassian.crowd.directory.ldap.SpringLdapTemplateWrapper.search(SpringLdapTemplateWrapper.java:139)
      	at com.atlassian.crowd.directory.SpringLDAPConnector.pageSearchResults(SpringLDAPConnector.java:340)
      	... 9 common frames omitted
      Caused by: javax.naming.NamingException: LDAP response read timed out, timeout used: 360000 ms.
      	at java.naming/com.sun.jndi.ldap.LdapRequest.getReplyBer(LdapRequest.java:129)
      	at java.naming/com.sun.jndi.ldap.Connection.readReply(Connection.java:443)
      	at java.naming/com.sun.jndi.ldap.LdapClient.getSearchReply(LdapClient.java:639)
      	at java.naming/com.sun.jndi.ldap.LdapClient.search(LdapClient.java:562)
      	at java.naming/com.sun.jndi.ldap.LdapCtx.doSearch(LdapCtx.java:2014)
      	at java.naming/com.sun.jndi.ldap.LdapCtx.searchAux(LdapCtx.java:1873)
      	at java.naming/com.sun.jndi.ldap.LdapCtx.c_search(LdapCtx.java:1798)
      	at java.naming/com.sun.jndi.toolkit.ctx.ComponentDirContext.p_search(ComponentDirContext.java:392)
      	at java.naming/com.sun.jndi.toolkit.ctx.PartialCompositeDirContext.search(PartialCompositeDirContext.java:358)
      	at java.naming/javax.naming.directory.InitialDirContext.search(InitialDirContext.java:276)
      	at java.base/jdk.internal.reflect.GeneratedMethodAccessor1158.invoke(Unknown Source)
      	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at org.springframework.ldap.transaction.compensating.manager.TransactionAwareDirContextInvocationHandler.invoke(TransactionAwareDirContextInvocationHandler.java:90)
      	at com.atlassian.crowd.directory.ldap.SpringLdapTemplateWrapper$3.lambda$timedGet$0(SpringLdapTemplateWrapper.java:141)
      	at org.springframework.ldap.core.LdapTemplate.search(LdapTemplate.java:363)
      	... 15 common frames omitted
      

      Workaround

      To address this issue, set `com.sun.jndi.ldap.connect.pool.timeout` to a lower value, such as 30 seconds. The property is specified in milliseconds, so 30 seconds is 30000.

      1. On each Bitbucket node, set JVM_SUPPORT_RECOMMENDED_ARGS in <BITBUCKET_INSTALL>/bin/_start-webapp.sh to the following value, or append the flag if the variable is already set:
        JVM_SUPPORT_RECOMMENDED_ARGS="-Dcom.sun.jndi.ldap.connect.pool.timeout=30000"
        
      2. Restart the Bitbucket application to apply the updated JVM arguments.
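
After a restart, it can be useful to confirm that the flag actually reached the JVM. The sketch below shows one way to check a JVM's input arguments using the standard `java.lang.management` API. `JvmFlagCheck` and `flagValue` are hypothetical names, and the program inspects only the JVM it runs in, so it illustrates the check rather than a ready-made tool for a Bitbucket node.

```java
import java.lang.management.ManagementFactory;
import java.util.List;
import java.util.Optional;

class JvmFlagCheck {
    // Property name taken from this issue.
    static final String PROPERTY = "com.sun.jndi.ldap.connect.pool.timeout";

    /**
     * Returns the value of -D<property>=<value> from a list of JVM arguments.
     * If the flag appears more than once, the last occurrence is returned.
     */
    static Optional<String> flagValue(List<String> jvmArgs, String property) {
        String prefix = "-D" + property + "=";
        Optional<String> found = Optional.empty();
        for (String arg : jvmArgs) {
            if (arg.startsWith(prefix)) {
                found = Optional.of(arg.substring(prefix.length()));
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // Arguments the current JVM was started with (not the program arguments).
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println(flagValue(jvmArgs, PROPERTY)
                .map(v -> PROPERTY + " is set to " + v + " ms")
                .orElse(PROPERTY + " is not set on the command line"));
    }
}
```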

            People

              3314e5447912 Ben Longhurst
              5338c0aa34f9 Danny Samuel