  1. Bitbucket Data Center
  2. BSERV-7398

Performance regression for instances with a large disparity of licensed to unlicensed users

Details

    Description

      Stash 3.8 introduced filtering of unlicensed users, which were previously returned in user searches such as those used by the @mention functionality. However, a performance regression exists on Stash instances where there is a large disparity between the number of licensed and unlicensed users.

      To reproduce this problem:

      1. Have just one or two licensed users in Stash
      2. Sync an LDAP directory of more than 10,000 users to Stash (say the users all have an email address with the domain example.com)
      3. In a comment, type "@ex"
      4. After 30 seconds, an error dialog is presented (see the attached mention-error.png)

      The server continues running the search even after this error occurs, but the result set is never used in the UI.

      The slowness can also be reproduced via the REST interface. No error occurs, but such a query on the Stash instance described above takes many minutes and eventually returns the one or two licensed users (if they match the query string).

      curl -u admin:admin 'http://localhost:7990/stash/rest/api/latest/users?avatarSize=48&permission=LICENSED_USER&filter=ex'
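
To put a number on the slowness, the same request can be timed from a script. The following is a minimal sketch using only the Python standard library; the helper names `build_url` and `timed_search` and the 30-minute timeout are assumptions made for illustration, while the URL, query parameters and credentials are copied from the curl command above.

```python
import base64
import time
import urllib.request
from urllib.parse import urlencode

# Endpoint taken from the curl command above.
BASE_URL = "http://localhost:7990/stash/rest/api/latest/users"


def build_url(filter_text, base_url=BASE_URL):
    """Build the user-search URL with the same parameters as the curl call."""
    params = {
        "avatarSize": 48,
        "permission": "LICENSED_USER",
        "filter": filter_text,
    }
    return f"{base_url}?{urlencode(params)}"


def timed_search(filter_text, user="admin", password="admin", timeout=1800.0):
    """Run the search with HTTP basic auth; return (seconds, response bytes).

    The timeout is an arbitrary, generous value (an assumption), chosen
    because the request is reported to take many minutes.
    """
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(
        build_url(filter_text),
        headers={"Authorization": f"Basic {token}"},
    )
    started = time.monotonic()
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = response.read()
    return time.monotonic() - started, len(body)
```

Calling `timed_search("ex")` against the instance set up in the reproduction steps should show the multi-minute response time described above.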
      

      Root cause: The Crowd search interface (ApplicationServiceGeneric.search()) is inefficient when paging is used. For each page, the database is queried (with the filter text) and the entire result is read into memory. No limit is applied to the result, so in the above example the result set consists of all 10,000 users, even if only the first 25 are of interest.

      So with a page size of 25, and 10,000 users in the directory, the query is executed 400 times. If we set a page size of 10,000 the query is run just once.
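
The arithmetic can be checked with a small cost model. This is a sketch that assumes every page has to be visited (as happens when almost no users are licensed) and that each page request re-reads the full, unbounded result; `paging_cost` is a hypothetical helper, not part of Stash or Crowd.

```python
import math


def paging_cost(total_matches, page_size):
    """Return (queries executed, rows read into memory) to walk every page.

    Assumes each page request re-runs the unbounded query and loads
    every matching row, as described in the root cause above.
    """
    queries = math.ceil(total_matches / page_size)
    rows_read = queries * total_matches
    return queries, rows_read


print(paging_cost(10_000, 25))      # (400, 4000000)
print(paging_cost(10_000, 10_000))  # (1, 10000)
```

Under these assumptions the small page size reads four million rows to serve one search, while the large page size reads ten thousand.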

      Although inefficient, Crowd does this for good reasons, described in CWD-2807. The obvious alternative, restricting the selection in the query via OFFSET and LIMIT, is not viable.

      This problem can currently seriously degrade the performance of a Stash instance. This issue tracks working around it by using a large page size.
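
The effect of the workaround can be illustrated with a toy simulation. This is a sketch only: `FakeDirectory` and `find_licensed` are hypothetical stand-ins written for this example and do not reflect the actual Stash or Crowd code. The fake directory materialises the whole filtered result on every call, mirroring the behaviour described in the root cause, and the caller keeps paging until it has enough licensed users or runs out of pages.

```python
from dataclasses import dataclass


@dataclass
class FakeDirectory:
    """Hypothetical stand-in for the user directory; counts queries."""
    users: list
    queries: int = 0

    def search(self, text, start, limit):
        # Every call filters the full user list before slicing out the
        # requested page, like the unbounded query described above.
        self.queries += 1
        matches = [u for u in self.users if text in u]
        page = matches[start:start + limit]
        is_last = start + limit >= len(matches)
        return page, is_last


def find_licensed(directory, licensed, text, wanted, page_size):
    """Page through the directory until `wanted` licensed users are found."""
    found, start = [], 0
    while len(found) < wanted:
        page, is_last = directory.search(text, start, page_size)
        found.extend(u for u in page if u in licensed)
        if is_last:
            break
        start += page_size
    return found[:wanted]


users = [f"user{i}@example.com" for i in range(10_000)]
licensed = {"user0@example.com", "user1@example.com"}

small = FakeDirectory(users)
find_licensed(small, licensed, "ex", wanted=25, page_size=25)

large = FakeDirectory(users)
find_licensed(large, licensed, "ex", wanted=25, page_size=10_000)

print(small.queries, large.queries)  # 400 1
```

Because only two users are licensed, a page of 25 licensed users can never be filled, so the small page size forces a walk over every page. The large page size returns the same two users after a single query.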

      Attachments

        1. mention-error.png
          18 kB
        2. testing-notes.txt
          2 kB

            People

              behumphreys Ben Humphreys