  1. Jira Service Management Data Center
  2. JSDSERVER-6808

Slowly typing query into JSD Project Workload search causes service unavailability

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Low
    • Fix Version/s: 5.12.6, 5.15.2
    • Affects Version/s: 4.5.4, 3.16.13
    • Component/s: Reports

    Description

      Issue Summary

      Slowly typing query into JSD Project Workload search causes service unavailability

      Steps to Reproduce

      1. Visit /jira/projects/project-key/reports/workload
      2. Begin typing your search slowly (waiting more than 0.3 seconds between key presses)
        • e.g.: a..b..c..d..e..f..g..h..
      3. Observe the client sending a new request to JSD for every key press, each to this URL with the query typed so far:
        • /jira/rest/servicedesk/1/pages/people/agents/project-key/search?query=abcdefg
      4. Observe slow response times to client:
        • Initial search (one character) returned in 10 seconds.
        • Subsequent searches returned results to client in 20 to 100 seconds.
      5. If the simultaneous requests to workload search continue, observe:
        • Increased response times
        • High JVM memory usage
        • Continuous stop-the-world garbage collection
        • CPU usage on the server climbing to 100%
        • Service unavailability:
          • 500 errors from Tomcat
          • Timeouts from load balancer
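      The 0.3-second threshold in step 2 suggests that the search box debounces input with a quiet period of roughly that length. The source does not confirm this, so treat it as an assumption. Under that assumption, the following sketch simulates a trailing-edge debounce and shows why slow typing produces one request per key press while fast typing produces only one. The names debounced_requests and typing are hypothetical and exist only for this illustration.

```python
def debounced_requests(keystrokes, wait=0.3):
    """Return the queries a trailing-edge debounce would send.

    keystrokes: list of (timestamp_seconds, character) in typing order.
    wait: quiet period after which the pending query is sent
          (0.3 s is an assumption inferred from the reproduction steps).
    """
    sent = []
    query = ""
    for i, (t, ch) in enumerate(keystrokes):
        query += ch
        is_last = i + 1 == len(keystrokes)
        # The timer fires only if no further key arrives within `wait` seconds.
        if is_last or keystrokes[i + 1][0] - t > wait:
            sent.append(query)
    return sent


def typing(text, gap):
    """Build keystrokes for `text` typed with a fixed gap between keys."""
    return [(i * gap, ch) for i, ch in enumerate(text)]


slow = debounced_requests(typing("abcdefg", gap=0.5))
fast = debounced_requests(typing("abcdefg", gap=0.1))
print(len(slow), slow[-1])  # 7 abcdefg
print(len(fast), fast)      # 1 ['abcdefg']
```

      With a 0.5-second gap every prefix (a, ab, abc, ...) is sent as its own search, which matches the request pattern described in step 3.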

      Expected Results

      JSD Project Workload search should be faster and the service should remain available.

      Actual Results

      Stuck thread exceptions in server logs:

      WARNING [ContainerBackgroundProcessor[StandardEngine[Catalina]]] org.apache.catalina.valves.StuckThreadDetectionValve.notifyStuckThreadDetected Thread [http-nio-8080-exec-52] (id=[15055]) has been active for [122,276] milliseconds to serve the same request for [https://jira-hostname/jira/rest/servicedesk/1/pages/people/agents/project-key/search?query=abcdefghijklmnopqrstuvwxyz&_=1576189764646] and may be stuck (configured threshold for this StuckThreadDetectionValve is [120] seconds). There is/are [29] thread(s) in total that are monitored by this Valve and may be stuck.
      java.lang.Throwable
          at com.atlassian.crowd.manager.application.AggregatorImpl.constrainResults(ResultsAggregator.java:153)
          at com.atlassian.crowd.manager.application.AggregatorImpl.constrainResults(ResultsAggregator.java:141)
          at com.atlassian.crowd.manager.application.InMemoryNonAggregatingSearchStrategy.searchNestedGroupRelationships(InMemoryNonAggregatingSearchStrategy.java:164)
          at com.atlassian.crowd.manager.application.ApplicationServiceGeneric.searchNestedGroupRelationships(ApplicationServiceGeneric.java:1751)
          at com.atlassian.crowd.embedded.core.CrowdServiceImpl.searchNestedGroupRelationships(CrowdServiceImpl.java:216)
          at com.atlassian.crowd.embedded.core.CrowdServiceImpl.search(CrowdServiceImpl.java:157)
          at com.atlassian.jira.security.groups.DefaultGroupManager.getGroupsForUser(DefaultGroupManager.java:393)
          at com.atlassian.jira.security.groups.RequestCachingGroupManager.lambda$new$0(RequestCachingGroupManager.java:43)
          at com.atlassian.jira.security.groups.RequestCachingGroupManager$$Lambda$174/2024690402.load(Unknown Source)
          at com.atlassian.jira.cache.request.RequestCacheImpl.get(RequestCacheImpl.java:42)
          at com.atlassian.jira.security.groups.RequestCachingGroupManager.lambda$new$1(RequestCachingGroupManager.java:46)
          at com.atlassian.jira.security.groups.RequestCachingGroupManager$$Lambda$175/791274807.load(Unknown Source)
          at com.atlassian.jira.cache.request.RequestCacheImpl.get(RequestCacheImpl.java:42)
          at com.atlassian.jira.security.groups.RequestCachingGroupManager.getGroupNamesForUser(RequestCachingGroupManager.java:188)
          at com.atlassian.jira.security.groups.RequestCachingGroupManager.getGroupNamesForUser(RequestCachingGroupManager.java:193)
          at com.atlassian.jira.security.DefaultGlobalPermissionManager.loadPermissions(DefaultGlobalPermissionManager.java:332)
          at com.atlassian.jira.security.DefaultGlobalPermissionManager.hasPermissionIgnoreRecovery(DefaultGlobalPermissionManager.java:347)
          at com.atlassian.jira.security.DefaultGlobalPermissionManager.hasPermission(DefaultGlobalPermissionManager.java:289)
      

      Multiple threads were searching simultaneously, each serving a different prefix of the typed query:

      [http-nio-8080-exec-49] ab
      [http-nio-8080-exec-20] abcdef
      [http-nio-8080-exec-30] abc
      [http-nio-8080-exec-32] abcdefghijk
      [http-nio-8080-exec-6] a
      [http-nio-8080-exec-73] abcdefghi
      [...]
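
      A thread-to-query listing like the one above can be pulled out of the stuck-thread warnings in the server log. The following is a minimal sketch, assuming the warnings follow the wording of the StuckThreadDetectionValve message quoted in this report. The function name stuck_searches is hypothetical, and the sample line is an excerpt of that message.

```python
import re
from urllib.parse import parse_qs, urlsplit

# Matches the thread name and request URL in a stuck-thread warning.
STUCK = re.compile(
    r"Thread \[(?P<thread>[^\]]+)\].*?"
    r"to serve the same request for \[(?P<url>[^\]]+)\]"
)


def stuck_searches(log_lines):
    """Yield (thread_name, query) for each stuck-thread warning found."""
    for line in log_lines:
        match = STUCK.search(line)
        if not match:
            continue
        params = parse_qs(urlsplit(match.group("url")).query)
        # A missing "query" parameter is reported as an empty string.
        yield match.group("thread"), params.get("query", [""])[0]


sample = (
    "Thread [http-nio-8080-exec-52] (id=[15055]) has been active for "
    "[122,276] milliseconds to serve the same request for "
    "[https://jira-hostname/jira/rest/servicedesk/1/pages/people/agents/"
    "project-key/search?query=abcdefghijklmnopqrstuvwxyz&_=1576189764646] "
    "and may be stuck"
)
for thread, query in stuck_searches([sample]):
    print(f"[{thread}] {query}")
```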
      

      Thread dumps indicated that the common activity across these threads was a global permission check resolving the user's group memberships through Crowd:

      --- com.atlassian.jira.security.DefaultGlobalPermissionManager.hasPermission
      --- com.atlassian.jira.security.DefaultGlobalPermissionManager.loadPermissions
      --- com.atlassian.jira.security.groups.RequestCachingGroupManager.getGroupNamesForUser(RequestCachingGroupManager.java:193)
      --- com.atlassian.jira.cache.request.RequestCacheImpl.get(RequestCacheImpl.java:42)
      --- com.atlassian.jira.security.groups.RequestCachingGroupManager$$Lambda$175/791274807.load(Unknown Source)
      --- com.atlassian.jira.cache.request.RequestCacheImpl.get(RequestCacheImpl.java:42)
      --- com.atlassian.jira.security.groups.RequestCachingGroupManager.lambda$new$0(RequestCachingGroupManager.java:43)
      --- com.atlassian.jira.security.groups.DefaultGroupManager.getGroupsForUser(DefaultGroupManager.java:393)
      --- com.atlassian.crowd.embedded.core.CrowdServiceImpl.search(CrowdServiceImpl.java:157)
      --- com.atlassian.crowd.embedded.core.CrowdServiceImpl.searchNestedGroupRelationships(CrowdServiceImpl.java:216)
      --- com.atlassian.crowd.manager.application.ApplicationServiceGeneric.searchNestedGroupRelationships(ApplicationServiceGeneric.java:1751)
      

      Workaround

      1. Type faster, keeping the pause between key presses under 0.3 seconds so that fewer requests are sent.
      2. Block the workload search API endpoint.
      3. Decrease search complexity by reducing the number of users, SLAs, nested Crowd groups, etc.
      4. Upgrade from JSD 3.x to JSD ER 4.5+ for approximately 25% better workload search performance.
      5. Dedicate a separate JSD instance to handling only workload search requests.
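
      For workaround 2, whatever does the blocking (a reverse proxy rule, a filter, or similar) needs to recognise the search requests. The sketch below shows one way to express that match, using the URL from the reproduction steps. It assumes the /jira context path from that example, which may differ per installation, and the name is_workload_search is hypothetical. It is not a proxy configuration.

```python
import re
from urllib.parse import urlsplit

# Path from the reproduction steps; the project key segment may be any
# single path component. The "/jira" context path is an assumption.
WORKLOAD_SEARCH = re.compile(
    r"^/jira/rest/servicedesk/1/pages/people/agents/[^/]+/search$"
)


def is_workload_search(url):
    """Return True if the URL targets the workload search endpoint."""
    return bool(WORKLOAD_SEARCH.match(urlsplit(url).path))


print(is_workload_search(
    "/jira/rest/servicedesk/1/pages/people/agents/project-key/search?query=abcdefg"
))  # True
print(is_workload_search("/jira/projects/project-key/reports/workload"))  # False
```

      Matching on the path only, and ignoring the query string, means every search is blocked while the Workload report page itself stays reachable.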

            People

              c8bcca445054 Benjamin Suess
              wrichardson Wayne Richardson (Inactive)
              Votes: 2
              Watchers: 3