[JRASERVER-31971] Issue collector causes performance issues on host JIRA instance

      NOTE: This bug report is for JIRA Server. Using JIRA Cloud? See the corresponding bug report.

      TL;DR: JIC needs a "scalable" mode where collectors can be deployed into tens of thousands of websites without crashing the JIRA instance hosting JIC. Nor should JIC be a single point of failure for those websites.

      When JIC has a "scalable" mode, we have to update the jira-feedback-plugin in JIRA to use that mode.


      I'm currently investigating performance issues on JAC and the issue collector is causing quite a bit of overhead. (https://extranet.atlassian.com/jira/browse/ADM-33614)

      In particular the request for /rest/collectors/1.0/configuration/trigger/d3de7fb5 (which any client with an issue collector will send) causes quite a bit of load on the server hosting issue collectors.

      When this resource is hit it causes DB lookups:

      "http-127.0.0.1-9080-119" daemon prio=10 tid=0x0b4db000 nid=0x1d01 runnable [0x55699000]
         java.lang.Thread.State: RUNNABLE
      	at java.net.SocketInputStream.socketRead0(Native Method)
      	at java.net.SocketInputStream.read(SocketInputStream.java:129)
      	at org.postgresql.core.VisibleBufferedInputStream.readMore(VisibleBufferedInputStream.java:135)
      	at org.postgresql.core.VisibleBufferedInputStream.ensureBytes(VisibleBufferedInputStream.java:104)
      	at org.postgresql.core.VisibleBufferedInputStream.read(VisibleBufferedInputStream.java:73)
      	at org.postgresql.core.PGStream.ReceiveChar(PGStream.java:259)
      	at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1620)
      	at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:257)
      	- locked <0x78e4b558> (a org.postgresql.core.v3.QueryExecutorImpl)
      	at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:479)
      	at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:367)
      	at org.postgresql.jdbc2.AbstractJdbc2Statement.executeQuery(AbstractJdbc2Statement.java:271)
      	at org.apache.tomcat.dbcp.dbcp.DelegatingPreparedStatement.executeQuery(DelegatingPreparedStatement.java:96)
      	at org.apache.tomcat.dbcp.dbcp.DelegatingPreparedStatement.executeQuery(DelegatingPreparedStatement.java:96)
      	at org.ofbiz.core.entity.jdbc.SQLProcessor.executeQuery(SQLProcessor.java:597)
      	at org.ofbiz.core.entity.GenericDAO.selectListIteratorByCondition(GenericDAO.java:1061)
      	at org.ofbiz.core.entity.GenericDAO.selectByAnd(GenericDAO.java:608)
      	at org.ofbiz.core.entity.GenericHelperDAO.findByAnd(GenericHelperDAO.java:131)
      	at org.ofbiz.core.entity.GenericDelegator.findByAnd(GenericDelegator.java:788)
      	at org.ofbiz.core.entity.GenericDelegator.findByAnd(GenericDelegator.java:773)
      	at org.ofbiz.core.entity.GenericDelegator.findByAnd(GenericDelegator.java:750)
      	at com.opensymphony.module.propertyset.ofbiz.OFBizPropertySet.findPropertyEntry(OFBizPropertySet.java:298)
      	at com.opensymphony.module.propertyset.ofbiz.OFBizPropertySet.getType(OFBizPropertySet.java:159)
      	at com.atlassian.jira.propertyset.JiraCachingPropertySet.getType(JiraCachingPropertySet.java:661)
      	at sun.reflect.GeneratedMethodAccessor277.invoke(Unknown Source)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
      	at java.lang.reflect.Method.invoke(Method.java:597)
      	at com.atlassian.sal.jira.pluginsettings.LazyProjectMigratingPropertySet$PropertySetInvocationHandler.invoke(LazyProjectMigratingPropertySet.java:57)
      	at $Proxy1841.getType(Unknown Source)
      	at com.atlassian.sal.jira.pluginsettings.JiraPluginSettings.getActual(JiraPluginSettings.java:47)
      	at com.atlassian.sal.core.pluginsettings.AbstractStringPluginSettings.get(AbstractStringPluginSettings.java:126)
      	at com.atlassian.jira.collector.plugin.components.CollectorStoreImpl.getCollector(CollectorStoreImpl.java:103)
      	at com.atlassian.jira.collector.plugin.components.CollectorServiceImpl.getCollector(CollectorServiceImpl.java:51)
      	at com.atlassian.jira.collector.plugin.rest.ConfigurationResource.findCollectorById(ConfigurationResource.java:55)
      	at com.atlassian.jira.collector.plugin.rest.ConfigurationResource.getTriggerConfiguration(ConfigurationResource.java:41)
      	at sun.reflect.GeneratedMethodAccessor315.invoke(Unknown Source)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
      	at java.lang.reflect.Method.invoke(Method.java:597)
      	at com.atlassian.plugins.rest.common.interceptor.impl.DispatchProviderHelper$ResponseOutInvoker$1.invoke(DispatchProviderHelper.java:234)
      	at com.atlassian.plugins.rest.common.interceptor.impl.DispatchProviderHelper$1.intercept(DispatchProviderHelper.java:100)
      	at com.atlassian.plugins.rest.common.interceptor.impl.DefaultMethodInvocation.invoke(DefaultMethodInvocation.java:61)
      	at com.atlassian.plugins.rest.common.expand.interceptor.ExpandInterceptor.intercept(ExpandInterceptor.java:38)
      	at com.atlassian.plugins.rest.common.interceptor.impl.DefaultMethodInvocation.invoke(DefaultMethodInvocation.java:61)
      	at com.atlassian.plugins.rest.common.interceptor.impl.DispatchProviderHelper.invokeMethodWithInterceptors(DispatchProviderHelper.java:132)
      	at com.atlassian.plugins.rest.common.interceptor.impl.DispatchProviderHelper$ResponseOutInvoker._dispatch(DispatchProviderHelper.java:230)
      	at com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodDispatcher.java:75)
      	at com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:288)
      	at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
      	at com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassRule.java:108)
      	at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
      	at com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootResourceClassesRule.java:84)
      	at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1469)
      	at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1400)
      	at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1349)
      	at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImp
      

      Also the response is returned with no-cache headers:

      Cache-Control:no-cache, no-store, no-transform

      This means that if an issue collector is enabled in OnDemand, for example, every OnDemand user triggers a request to JAC on every page-pop, and each request can hit the DB if the propertysets aren't cached in memory.

      I think we should consider:

      • Moving issue collectors to ActiveObjects which may be more efficient than the currently ugly pluginsettings storage (where everything is split across multiple DB rows)
      • Adding temporary cache headers to this response so that not every end-user page-pop hits JAC. I think we omit cache headers so that edits to collectors show up immediately, but surely waiting several hours to see edits on the client is acceptable for a feature like this.

      Effectively, the issue collector in OnDemand is currently launching a DDoS attack on JAC.
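
A minimal sketch of the cache-header suggestion above, assuming a per-collector TTL where zero keeps today's behaviour. The class and method names are hypothetical and not part of the issue collector plugin; only the no-cache value is taken from the response quoted in this report.

```java
import java.time.Duration;

/** Hypothetical helper (not plugin code): picks a Cache-Control value for the trigger response. */
final class TriggerCacheControl {
    /** The header value currently returned, as quoted in the report. */
    static final String NO_CACHE = "no-cache, no-store, no-transform";

    private TriggerCacheControl() {}

    /**
     * A zero or negative TTL keeps the current no-cache behaviour;
     * any positive TTL lets clients reuse the response for that many seconds.
     */
    static String headerValue(Duration ttl) {
        if (ttl.isZero() || ttl.isNegative()) {
            return NO_CACHE;
        }
        return "max-age=" + ttl.getSeconds();
    }
}
```

With a four-hour TTL (an arbitrary reading of "several hours"), `TriggerCacheControl.headerValue(Duration.ofHours(4))` yields `max-age=14400`, so a browser would re-request the trigger configuration at most once per four hours instead of on every page-pop.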

            Oswaldo Hernandez (Inactive) added a comment -

            Yes davidrr,

            The request and database caching added to the issue collector in JIRA 6.0.3 should cause a significant improvement in the performance of fetching the trigger configuration for the collector.

            Regards,

            Oswaldo Hernández.
            JIRA Bugmaster.
            [Atlassian].
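
For readers unfamiliar with the idea, database caching of this kind can be sketched as a small read-through cache with expiry. This is an illustration only and not the JIRA 6.0.3 implementation; the class name, the loader function, and the injected clock are all assumptions.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Hypothetical read-through cache keyed by collector id; not the actual plugin code. */
final class ExpiringCollectorCache<V> {
    private static final class Entry<V> {
        final V value;
        final long expiresAtMillis;

        Entry(V value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Entry<V>> entries = new ConcurrentHashMap<>();
    private final Function<String, V> loader; // stands in for the DB-backed store lookup
    private final long ttlMillis;
    private final LongSupplier clock;         // injected so that expiry can be tested

    ExpiringCollectorCache(Function<String, V> loader, long ttlMillis, LongSupplier clock) {
        this.loader = loader;
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    V get(String collectorId) {
        long now = clock.getAsLong();
        Entry<V> entry = entries.get(collectorId);
        if (entry == null || entry.expiresAtMillis <= now) {
            // Missing or stale: consult the backing store and remember the result.
            // Concurrent callers may both load here; that is harmless for read-only data.
            entry = new Entry<>(loader.apply(collectorId), now + ttlMillis);
            entries.put(collectorId, entry);
        }
        return entry.value;
    }
}
```

With a cache like this in front of the store, repeated requests for the same trigger configuration reach the database at most once per TTL window rather than once per request.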

            David Yu added a comment -

            Hi guys, I noticed this is resolved but has anything actually been changed that would be applicable for non-OnDemand users?

            Luis Miranda (Inactive) added a comment -

            I think we could go in one of two directions here.

            Continue with the caching approach

            Pros:
            • This is a generic solution that would benefit Atlassian and our customers alike.
            Cons:
            • More effort than the other solution.
            • We risk gold-plating a feature when we could be doing something else that customers care more about.

            The idea would be to continue lwlodarczyk's work and:

            • add a configuration to collectors to allow setting trigger TTL (0 to disable; maybe use a slider)
            • in WebDriver tests, use this configuration option to disable caching
            • test the caching functionality using REST tests where we verify the cache-control headers directly (e.g. using Jersey-client)

            Use the same approach as OnDemand

            Pros:
            • Fast to implement
            Cons:
            • Only helps us out with JAC, not a generic solution for our customers

            For our OnDemand deployments we have included the jira-feedback-plugin that has a static trigger. Therefore JAC does not get contacted until a user clicks the "Got Feedback" button. It would be very easy to do the same for our EAP releases. See JIRA's use of 'Feedback' (Issue Collector) in JIRA itself for how it works.
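
The static-trigger behaviour described above can be modelled roughly as follows. This is a language-neutral sketch written in Java rather than the jira-feedback-plugin's real client code; the class name, the return strings, and the `BooleanSupplier` standing in for the remote call to the hosting instance are all assumptions.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical model of a static trigger: the button renders without any remote call. */
final class StaticTrigger {
    private final String label;
    private final BooleanSupplier remoteEnabledCheck; // stands in for the request to the host instance

    StaticTrigger(String label, BooleanSupplier remoteEnabledCheck) {
        this.label = label;
        this.remoteEnabledCheck = remoteEnabledCheck;
    }

    /** Rendering uses only locally embedded data, so page views cost the host nothing. */
    String render() {
        return "[" + label + "]";
    }

    /** The host instance is contacted only when a user actually clicks the button. */
    String click() {
        return remoteEnabledCheck.getAsBoolean()
                ? "show feedback form"
                : "feedback via this button has been disabled";
    }
}
```

The trade-off is the one raised elsewhere in this thread: a disabled collector is only discovered at click time, so the button stays visible until the embedding site is changed.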

            Andreas Knecht (Inactive) added a comment -

            So I had a look in the access logs on JAC and we're now getting around 450-500K requests for "/rest/collectors" in a 24-hour period (which is far better than the ~20M requests in 24 hours previously). JAC isn't affected by this now, but I'm not sure that's going to remain the case as our number of BTF evaluators grows. We may just be delaying the problem if we don't fix this now by adding caching instructions etc.

            Luis Miranda (Inactive) added a comment -

            Taking a step back for a moment, the motivation for implementing this has been greatly reduced since JRADEV-19583 made it into OnDemand back in JIRA 6.0-OD9. I've just been browsing NewRelic and can't find any trace of the JIC REST in the slow traces view, but maybe that's just me being a NR noob.

            andreask@atlassian.com do you know if JAC is still affected by this problem at all? If not, I would say the problem is not worth fixing.

            Matt Quail (Inactive) added a comment -

            "they have to re-deploy their site"

            Or use a "feature flag" or any other configuration on their side. Yes, I agree this does not seem unreasonable for a "scale" mode.

            ig (Inactive) added a comment -

            I'm with the JIC "scalable" mode where no configuration calls are made until you click the button. It can display an error message at that point saying that feedback via this button has been disabled. If people want the button to go away on their site once they've disabled their issue collector, they have to re-deploy their site. That doesn't seem like an unreasonable restriction.

            ig (Inactive) added a comment -

            "He's done some recent investigation into AO vs. plugin settings and found that AO actually isn't significantly faster than plugin settings"

            The wider company would love to see this investigation on EAC, sginter. I've frequently heard AO being chosen over plugin settings due to its perceived increased performance, but if this isn't the case at the moment, that's an interesting finding.

            Matt Quail (Inactive) added a comment -

            Gliffy had this problem too (spud has contact details)

            Matt Quail (Inactive) added a comment -

            Whenever JIC gets a "scalable" mode, we have to update the jira-feedback-plugin in JIRA to use that mode.

              ohernandez@atlassian.com Oswaldo Hernandez (Inactive)
              andreask@atlassian.com Andreas Knecht (Inactive)