With a large number of users (in our case, more than 1 million), typing to search for a user in the "Alert user" field of a "THEN" automation workflow action causes a spike in CPU utilization, followed by continuous garbage collection. The affected node in the cluster then cannot respond to any other requests until the service is manually restarted.
- Navigate to the "Project settings" area of a Service Desk project
- Click "Automation" in the sidebar
- Either edit an existing rule or create a new one
- If necessary (if creating a new rule), add an applicable "WHEN" action (such that adding a "THEN" action is possible)
- Add a "THEN" action; select the "Alert user" action
- In the resulting text field, type some characters to select a user
- Wait for the list of users to return; observe that the response takes a long time (30+ sec) and may never return (the error message shown is: "The call to the Jira server did not complete within the timeout period. We are unsure of the result of this operation.")
Searching for a user should return results in a reasonable amount of time for the specificity of the query, and the service should remain available.
- Each character typed into the field results in a request like the following: /rest/servicedesk/automation/servicedeskmodules/1/then/alertuser/search?usernameOrEmail=b&projectKey=ABC&_=1591399088734
- Every time the text field gains/loses focus, the same request is sent again
- If the user interprets the long response time as an indication that the user they're searching for doesn't exist, they may choose to clear the field with their backspace key; each press to erase a character also sends a request
- A search for one or two characters, or a copy/pasted username (a single request), may return successfully, though only after a long delay (30+ sec). However, a user who types a long username or email address, waits, sees no response, then backspaces their way back through the query can easily generate 20-30 requests in just a few seconds. This activity is enough to cause:
  - High CPU
  - High JVM memory usage
  - Continuous garbage collection
  - Service instability and potential unavailability, if the user is bounced to other nodes in the cluster as nodes become unhealthy and attempts the search again there
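
To make the request count above concrete, here is a rough Python sketch of the behavior described: one request per character typed, one per focus gain/loss, and one per backspace press. The function name, the sample email address, and the assumption that the final backspace sends a request with an empty search term are illustrative only and are not taken from the Jira front-end code.

```python
def requests_for_typing(query: str, focus_changes: int = 0) -> list[str]:
    """Return the search terms sent, one entry per request (illustrative model)."""
    sent = []
    # One request per character typed: "u", "us", "use", ...
    for i in range(1, len(query) + 1):
        sent.append(query[:i])
    # One request each time the field gains or loses focus, repeating the current value.
    sent.extend([query] * focus_changes)
    # One request per backspace press while clearing the field
    # (assumption: the last press sends an empty search term).
    for i in range(len(query) - 1, -1, -1):
        sent.append(query[:i])
    return sent


# Hypothetical 12-character email address, typed, blurred/refocused once, then erased.
terms = requests_for_typing("user@abc.com", focus_changes=2)
print(len(terms))  # 12 typed + 2 focus + 12 backspace = 26 requests
```

Under this model, a single 12-character query produces 26 requests, which falls within the 20-30 range observed.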
At the load balancer or other proxy in front of Jira, block requests on the path /rest/servicedesk/automation/servicedeskmodules/1/then/alertuser/search to prevent these search requests from reaching Jira. This has the side effect of disabling the functionality entirely, but it prevents users from triggering the issue.
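
In practice this rule would be written in the configuration language of whatever load balancer or proxy is in use. As a neutral illustration of the matching logic only, the following minimal sketch expresses it as a Python WSGI middleware. The function name, the 403 response, and the message text are assumptions, and the sketch assumes Jira is served from the root context path (a deployment under a prefix such as /jira would need that prefix added).

```python
BLOCKED_PATH = "/rest/servicedesk/automation/servicedeskmodules/1/then/alertuser/search"


def block_alert_user_search(app):
    """Wrap a WSGI app and reject requests to the "Alert user" search path."""

    def middleware(environ, start_response):
        # PATH_INFO excludes the query string, so the usernameOrEmail,
        # projectKey and cache-busting parameters do not affect the match.
        if environ.get("PATH_INFO", "") == BLOCKED_PATH:
            body = b"Alert user search is temporarily disabled.\n"
            start_response(
                "403 Forbidden",
                [
                    ("Content-Type", "text/plain"),
                    ("Content-Length", str(len(body))),
                ],
            )
            return [body]
        # Everything else is passed through to the wrapped application unchanged.
        return app(environ, start_response)

    return middleware
```

Matching on the exact path keeps the rest of the REST API and the automation settings pages reachable; only the user search behind the "Alert user" field is refused.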