
Some REST calls return 200 with no body and AUTHENTICATED_FAILED

      NOTE: This bug report is for JIRA Server. Using JIRA Cloud? See the corresponding bug report.

      So far this bug has only been reported on OnDemand, and we have some reason to believe that it is also related to server load.

      Expected behaviour: return a JSON response.

      Problems:

      • 200 means success and should never have an empty body. Successful responses with an empty body are supposed to use code 204: http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html
      • An empty body is not valid JSON, so the response cannot be parsed.
      • If it is an authentication failure, the response MUST use a 401 or 403 code.
      • This is not a real authentication failure, because we are 100% sure that the credentials are right (using basic_auth).
      • If the server is not able to respond due to other causes, it MUST reply with a 503 code, optionally with a Retry-After header that tells the client when to retry the request.

      As stated above, this bug uncovers several serious deviations from the HTTP standard, probably caused by several broken pieces of code.

      It may be useful to note the following response header, and the fact that, so far, all reports occurred while using basic_auth:

      'x-seraph-loginreason': 'OUT, AUTHENTICATED_FAILED'

      Atlassian support suggested using alternative authentication options as a temporary workaround. However, our tests showed that the other authentication methods are even more prone to fail. Also, BASIC_AUTH is documented in several places as the recommended authentication to use with REST, that being one of the reasons we call it REST.

        1. Authenticated-Failed.zip
          21 kB
          Matthias Gaiser [K15t]
        2. modified-pycontrib.zip
          114 kB
          Julia Simon

            [JRASERVER-41559] Some REST calls return 200 with no body and AUTHENTICATED_FAILED

            Dorene Watson added a comment - Having the same issue in Jira Server

            Moxarth Rathod added a comment - - edited

            Facing the same issue while calling an endpoint `/projects` with an invalid username/service account.


            Madison added a comment -

            Having the same issue. 

             


             

            Matthew Bradbury added a comment - Having the same issue on Jira Server  

            Jason Kemp added a comment - Has there been an update to correcting the problem on either Crowd or Jira side?

            Daniel Rauf added a comment -

            Hey alexander.weickmann!

            We have an idea of what's happening and are tackling the issue on the Crowd side at the moment.

            There's a race condition that can cause concurrent Basic Auth requests to fail with a 500 error. Our Crowd engineers are working on changing the way we handle Basic Auth requests to fix this. A workaround, for now, would be to use OAuth-based requests.

            Then, on the Jira side, we do not handle the 500 error from Crowd correctly. We set the error header but do not change the response code, which results in an empty 200 response. Unfortunately, a fix is not straightforward, because the response object passes through so many levels of abstraction. Thus, we are focusing on fixing the root cause (the handling of Basic Auth requests) for now, without prioritizing this bug in Jira.

            I hope this answers some of your questions.
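The behaviour described in the comment above (error header set, status code left at 200, body empty) can be recognised on the client side. The following is a minimal sketch, not an official API: the function name `classifyResponse` and the category labels are assumptions made for illustration; only the `x-seraph-loginreason` header and its `AUTHENTICATED_FAILED` value come from this report.

```javascript
// Hypothetical helper: classify a REST response using only the signals
// mentioned in this report (status code, body, X-Seraph-LoginReason header).
function classifyResponse(statusCode, headers, body) {
  // HTTP header names are case-insensitive, so normalise them first.
  const normalised = {};
  for (const [name, value] of Object.entries(headers || {})) {
    normalised[name.toLowerCase()] = String(value);
  }
  const loginReason = normalised['x-seraph-loginreason'] || '';
  const authFailed = loginReason.includes('AUTHENTICATED_FAILED');
  const emptyBody = body == null || String(body).trim() === '';

  // A real authentication failure is expected to use 401 or 403.
  if (statusCode === 401 || statusCode === 403) return 'auth-rejected';
  // The case from this report: 200, no body, failure flagged in the header.
  if (statusCode === 200 && emptyBody && authFailed) return 'spurious-auth-failure';
  // 200 with no body but without the header (as reported for Confluence).
  if (statusCode === 200 && emptyBody) return 'empty-body';
  if (statusCode >= 500) return 'server-error';
  return 'ok';
}
```

A client could call this right after receiving a response and treat `spurious-auth-failure` as a transient error rather than as invalid credentials.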

             

            Deleted Account (Inactive) added a comment - Any news on this from official side? Running into this when performing Gatling performance tests (about 10% of requests fail due to this).  

            thebradness added a comment -

            Haven't seen this listed so I'll post it in case it helps someone:

            I was having this same problem (200 status, no response body, AUTH_FAILED).

            Local Jira server.

            When we create accounts, we leave the password blank and it authenticates to Active Directory.

            I was using my Active Directory account the whole time. As soon as I created a local account that wasn't in AD (with a password) it worked fine.

            Rob Russo added a comment - - edited

            Finally I fixed this issue. Definitely a bug on their part. What's happening here is that they're using CSRF/XSRF protection on the endpoint. What does this mean for you?

            Essentially you need to create a session before you can make a request. Here's my code using Node JS and the request module:

             

            Sorry if it's messy. The important part is the request.jar() method, which creates a cookie jar (here called cookieJar) that holds the session cookies. This cookieJar object is then passed with each request, ensuring the CSRF token is passed with each request.

             

            // Assumes an Express app; `router` was not defined in the original snippet.
            var express = require('express');
            var request = require('request');
            var router = express.Router();

            /* GET home page. */
            router.get('/', function (req, res, next) {
              // The cookie jar keeps the session cookies between requests.
              var cookieJar = request.jar();
              var options = {
                method: 'GET',
                url: 'https://test.atlassian.net/wiki/rest/api/content',
                jar: cookieJar,
                qs: { type: 'page', title: 'title' },
                headers: {
                  'cache-control': 'no-cache',
                  authorization: 'Basic yourEncodedAuthHere'
                }
              };

              // The first request establishes the session; the second reuses its cookies.
              request(options, function (error, response, body) {
                if (error) throw new Error(error);
                request(options, function (error, response, body) {
                  if (error) throw new Error(error);
                  console.log(body);
                });
              });

              res.send('hello');
            });


            Bill Gray added a comment -

            I, too, am getting this. PHP 5.3 using Curl (using the "@" method).


            Daniel Jurek added a comment - - edited

            @jeescuyos1702111598, being fairly new to the Confluence API, I kept digging and resolved my issue by appending this to the request url:

            &expand=body.view

             

             


            John Edward ESCUYOS added a comment - - edited

            I don't know exactly what is happening. With the Confluence REST API, the response body is empty when I use JavaScript AJAX libraries (request, axios), but it works fine in the browser and in Postman.


            Daniel Jurek added a comment - - edited

            I ended up here by searching for the same issue, but I'm seeing it in Confluence. I don't see a failed authentication anywhere, though.


            Naveen Kasthuri added a comment - - edited

            This has been happening with a frequency of ~ 20%. Even after implementing a retry mechanism, the same request fails again. Using JIRA cloud.


            Ignat (Inactive) added a comment -

            Hi matthias13,

            I've taken over the issue watch from Oswaldo. Unfortunately, there are no updates on the issue, since we've been investigating critical problems.
            Hopefully we'll get back to this in the mid-term.

            Cheers,
            Ignat
            JIRA Bugmaster.

            Matthias Gaiser [K15t] added a comment -

            Hi Oswaldo,

            Is there any news in this regard?
            This is quite annoying, since we have to implement workarounds such as retrying the request, but cannot distinguish between cases where the authentication actually fails and cases where this error happens for no reason.

            Cheers,
            Matthias

            Oswaldo Hernandez (Inactive) added a comment -

            Hi jmorgan5,

            We were able to reproduce it here, but so far we have been unable to pinpoint an exact cause. We will need to investigate more to do so.

            Consequently, we are unable to provide any ETA for a fix. Please do watch this issue, as we will update it as soon as we know more.

            Regards,

            Oswaldo Hernández.
            JIRA Bugmaster.
            [Atlassian].

            James Morgan added a comment -

            Hi Oswaldo,
            This bug is very reproducible in our environment (and very annoying) - please can you give an ETA on a fix?

            Jason Spafford added a comment - It's fairly reproducible for me as well on our JIRA Cloud when accessing the HTTP API with basic auth. The symptoms and issue are exactly the same.

            Matthias Gaiser [K15t] added a comment - - edited

            Hi Oswaldo,

            I've attached a sample project which makes some REST calls in concurrent threads. After some tries, it fails with the AUTHENTICATED_FAILED error.

            My test environment is a JIRA 6.4.13 (Server) with an embedded database and Crowd/SSO-enabled. Until now, we've experienced this behaviour only in Crowd/SSO-enabled scenarios, so you might want to look in that direction.
            One of our customers is having this behaviour for 6.4.11 (Server), I'm still waiting for more details from another customer.

            As a workaround solution, I retry the request and it works fine - so I wonder what the error tries to tell me.

            Best wishes,
            Matthias.


            Oswaldo Hernandez (Inactive) added a comment -

            Hi matthias13,

            We were never able to fully understand what is behind this and hence were unable to come up with a root cause or a fix.

            If you are able to reliably reproduce this in your environment, could you please raise a support ticket with us at https://support.atlassian.com so that one of our engineers can work with you on finding the root cause? That would let the development team understand what the potential fix would be.

            Thanks

            Regards,

            Oswaldo Hernández.
            JIRA Bugmaster.
            [Atlassian].

            Matthias Gaiser [K15t] added a comment -

            This bug is not only valid for JIRA Cloud, but also for JIRA Server. So far I have encountered it only after enabling Crowd integration. I have two issues.

            1. Why is it happening at all? Because of too many requests?
            2. The HTTP code 200 and the empty response break my Jersey implementation.

            Any status updates?

            Lawrence Luo added a comment - I can get this bug to reproduce reliably when I try to access the HTTP API using 8 threads at a time.
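A reproduction along these lines can be sketched as a small concurrent harness. This is only an illustration under assumptions: `sendRequest` is a hypothetical caller-supplied function that performs one REST call and resolves to `{ statusCode, body }`, and the function name `countEmptySuccesses` is made up for this example. The concurrency of 8 comes from the comment above.

```javascript
// Fire `concurrency` requests at once, `rounds` times, and count how many
// come back as 200 with an empty body (the symptom of this bug).
// `sendRequest` is a hypothetical function returning a Promise of
// { statusCode, body }; plug in any HTTP client.
async function countEmptySuccesses(sendRequest, concurrency, rounds) {
  let total = 0;
  let emptySuccess = 0;
  for (let round = 0; round < rounds; round++) {
    // Start all requests of this round before awaiting any of them.
    const batch = Array.from({ length: concurrency }, () => sendRequest());
    const responses = await Promise.all(batch);
    for (const response of responses) {
      total++;
      const empty = response.body == null || String(response.body).trim() === '';
      if (response.statusCode === 200 && empty) emptySuccess++;
    }
  }
  return { total, emptySuccess };
}
```

For example, `countEmptySuccesses(callJira, 8, 50)` would issue 400 requests in batches of 8, where `callJira` stands for whatever request function is under test.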

            Julia Simon (Inactive) added a comment -

            skalsi, would my attached Python script help?
            You just have to run python setup.py test in parallel in 3-4 different terminals.

            Sorin Sbarnea (Citrix) added a comment -

            I am not surprised. The problem is there and requires good operations skills to debug. I haven't complained much about the unreliability of the REST API of the cloud-hosted Jira instances only because I implemented an implicit retry mechanism and workaround inside the python-jira library, not because these issues were resolved.

            The Travis logs are public and it is easy to check them for the errors.

            Paul Marquess added a comment -

            The plot thickens. I've found another issue that is more serious than the 200+no-body.

            I have my code set up to retry the REST action for the 200+no-body case and for any 5xx error. This had been working fine for a few months until last week, when we got a pile of 504 errors from JIRA.

            Unlike the 200+no-body issue, where you know that the REST action actually failed, in the 504 case you can't tell if the action worked or not. My scripts crashed when attempting to redo REST actions that had actually worked. See https://groups.google.com/forum/#!topic/atlassian-connect-dev/s9CuuCGvThk for another report of the same thing.

            I'm not sure if there is an easy code workaround that can automatically recover from this non-deterministic 504 issue. It would possibly mean something like this:

            1. Run REST action X
            2. 504 detected
            3. Run a verification REST action (Y) to check if action X worked
            4. If Y worked, then do nothing
            5. If Y failed, rerun action X

            Just been through this in ticket JST-139269. The only advice given was to minimise the impact by stopping all REST traffic for a while and checking JIRA manually to see if it is working properly.
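The five steps in the comment above can be sketched as follows. This is a hedged illustration, not a tested workaround from the thread: `runAction` and `verifyAction` are hypothetical caller-supplied functions (the first performs REST action X and resolves to `{ statusCode, body }`, the second performs check Y and resolves to `true` if X already took effect), and the function name is invented for this example.

```javascript
// Sketch of "verify before retrying" for an ambiguous 504.
// runAction:    hypothetical, performs REST action X -> Promise<{ statusCode, body }>
// verifyAction: hypothetical, performs check Y -> Promise<boolean>
//               (true means X already took effect on the server)
async function runWithVerification(runAction, verifyAction) {
  // Step 1: run REST action X.
  const first = await runAction();
  // Anything other than 504 is returned to the caller unchanged.
  if (first.statusCode !== 504) {
    return { response: first, retried: false, verified: false };
  }
  // Steps 2-3: 504 detected, so check whether X actually worked.
  const alreadyApplied = await verifyAction();
  // Step 4: X worked despite the 504, so do nothing more.
  if (alreadyApplied) {
    return { response: first, retried: false, verified: true };
  }
  // Step 5: X did not take effect, so rerun it once.
  const second = await runAction();
  return { response: second, retried: true, verified: false };
}
```

The sketch reruns the action at most once; how to implement the verification call depends entirely on the action, for instance searching for an issue that a create call was supposed to produce.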

            Sorin Sbarnea (Citrix) added a comment -

            In case this helps: recent versions of the Python JIRA library are able to bypass failures caused by this bug by retrying, even if this does not follow the HTTP specification.

            From my experience of running unit/integration tests of Python JIRA using Travis, up to 1% of REST calls fail due to this bug. You are welcome to see the logs: https://travis-ci.org/pycontribs/jira
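A retry workaround of the kind mentioned above might look like the sketch below. It is not the Python JIRA library's implementation; the names `requestWithRetry` and `sendRequest`, the attempt count, and the linear back-off are all assumptions chosen for illustration. It retries only on the symptom from this report (status 200 with an empty body).

```javascript
// Resolve after the given number of milliseconds.
function sleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// True for the failure mode in this report: status 200 with an empty body.
function isEmptySuccess(response) {
  const body = response.body;
  return response.statusCode === 200 && (body == null || String(body).trim() === '');
}

// Call sendRequest (a hypothetical function returning a Promise of
// { statusCode, body }) until it yields a non-empty response or the
// attempts run out. The last response is returned either way.
async function requestWithRetry(sendRequest, maxAttempts = 3, delayMs = 500) {
  let response;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    response = await sendRequest();
    if (!isEmptySuccess(response)) return { response, attempts: attempt };
    // Wait a little longer before each further attempt (linear back-off).
    if (attempt < maxAttempts) await sleep(delayMs * attempt);
  }
  return { response, attempts: maxAttempts };
}
```

Callers still need to check the returned response, since the final attempt can fail in the same way; one commenter in this thread reports that retried requests sometimes fail again.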

            Sorin Sbarnea (Citrix) added a comment - This is still happening. It seems that the operations team has much bigger problems than this more or less random authentication failure.

            Paul Marquess added a comment - FYI I reported the same issue using https://support.atlassian.com/servicedesk/customer/portal/23/JST-114687

            Sorin Sbarnea added a comment - Added a link to the HTTP RFC, something that the REST team developers should know word for word.

              Assignee: Unassigned
              Reporter: Sorin Sbarnea (Citrix)
              Affected customers: 78
              Watchers: 74