  1. Bitbucket Data Center
  2. BSERV-4829

Scaling Stash with lite Stash Balancer


Details

    • Suggestion
    • Resolution: Duplicate

    Description

      Here is another possible solution. This one may be more practical than the one described in https://jira.atlassian.com/browse/STASH-4822

      Let me use the following terms: Master Stash (a regular, licensed Stash instance), Stash Balancer (a smart, Git-aware balancer that can rewrite TCP packets to redirect requests to other hosts transparently), and Stash Slaves (lite Stash instances that are exact copies of Master Stash but are used only for read-only access to repositories).

      Master Stash and the Stash Slaves expose the REST resource projects/$PROJECT/repos/$repo/has-commit/?sha1=$sha1
      This resource returns true if the Stash instance contains the commit with the given sha1 in the specified repository.
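
      A minimal sketch of how a client such as the Balancer might address this resource. The base URL, the function names, and the plain-text "true"/"false" response body are assumptions made for illustration; the proposal only specifies the path and the sha1 query parameter.

```python
from urllib.parse import quote, urlencode


def has_commit_url(base_url: str, project: str, repo: str, sha1: str) -> str:
    """Build the URL of the proposed has-commit resource for one host.

    base_url is a hypothetical REST root of a Stash instance.
    """
    path = "projects/{}/repos/{}/has-commit/".format(
        quote(project, safe=""), quote(repo, safe="")
    )
    return "{}/{}?{}".format(base_url.rstrip("/"), path, urlencode({"sha1": sha1}))


def parse_has_commit(body: str) -> bool:
    """Interpret the response body, assuming it is the text 'true' or 'false'."""
    return body.strip().lower() == "true"
```

      The Balancer would call this once per slave and treat anything other than "true" as "not yet in sync".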

      Master Stash has a Git push hook that sends the Stash Balancer the project/repo and the sha1 hash.
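
      As an illustration only, the hook's message could be built like this. The sketch assumes the hook sees the standard Git post-receive input (one "old-sha new-sha ref" line per updated ref) and that the Balancer accepts a small JSON document; the JSON field names and the function name are hypothetical, and how the message is delivered is left open.

```python
import json

# Git reports a deleted ref with an all-zero new sha1.
ZERO_SHA = "0" * 40


def notifications_from_push(project, repo, stdin_lines):
    """Turn post-receive style lines into one JSON message per updated ref."""
    messages = []
    for line in stdin_lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # ignore malformed lines
        _old_sha, new_sha, _ref = parts
        if new_sha == ZERO_SHA:
            continue  # ref was deleted, there is no new commit to wait for
        messages.append(
            json.dumps({"repo": "{}/{}".format(project, repo), "sha1": new_sha})
        )
    return messages
```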

      The Balancer keeps a list of active reals: the Stash Slaves come first in this list, and the last entry is always the Stash Master. The Balancer serves requests only from hosts in this list.
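
      The list could be modelled roughly as follows. This is a sketch with hypothetical class and method names; how the Balancer picks a host from the list for a given request is not specified in the proposal and is not shown.

```python
class Reals:
    """Active hosts of the Balancer: slaves first, the master always last."""

    def __init__(self, master):
        self.master = master
        self._slaves = []

    def add_slave(self, host):
        # The master is never stored as a slave, and duplicates are ignored.
        if host != self.master and host not in self._slaves:
            self._slaves.append(host)

    def clear_slaves(self):
        # Leaves the master as the only active host.
        self._slaves.clear()

    def hosts(self):
        # Returns a new list, so callers cannot reorder the internal state.
        return self._slaves + [self.master]
```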

      When the Balancer receives a new hash for a repository, it removes all slave hosts from the list, leaving only the Stash Master. Then, in a loop, it checks through REST whether each Stash Slave contains the commit with that sha1 in the given repository. If a slave does not contain the hash, it performs git fetch from Master Stash until the REST call returns true.
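
      The loop could look roughly like this. It is a sketch under stated assumptions: has_commit and fetch are hypothetical callables standing in for the REST check and for triggering git fetch on a slave, the list of reals is a plain Python list with the master as its last element, and the polling interval and round limit are arbitrary.

```python
import time


def resync_after_push(reals, all_slaves, repo, sha1, has_commit, fetch,
                      poll_interval=1.0, max_rounds=60):
    """Reset reals to the master only, then re-admit slaves once they have sha1.

    Returns the slaves that were still out of sync after max_rounds.
    """
    reals[:] = [reals[-1]]  # keep only the master, which is always last
    pending = list(all_slaves)
    for _ in range(max_rounds):
        still_pending = []
        for slave in pending:
            if has_commit(slave, repo, sha1):
                reals.insert(len(reals) - 1, slave)  # insert before the master
            else:
                fetch(slave, repo)  # ask the slave to fetch from the master
                still_pending.append(slave)
        pending = still_pending
        if not pending:
            break
        time.sleep(poll_interval)
    return pending
```

      Injecting has_commit and fetch keeps the loop independent of how the slaves are actually reached, which also makes it easy to exercise with in-memory fakes.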

      The Balancer holds a full table of:

      project1/repo1: sha1_1
      project1/repo2: sha1_2
      project2/repo1: sha1_3
      ...
      

      Every host in the Balancer's reals list must contain the listed sha1 for each repository in this table. A host cannot be added to the list until it contains all of these commits.
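
      This admission rule amounts to a check over the whole table. A minimal sketch, assuming the table is a mapping from "project/repo" to sha1 and that has_commit is a hypothetical callable wrapping the REST check:

```python
def may_join_reals(host, table, has_commit):
    """Return True only if host contains the listed sha1 for every repository."""
    return all(has_commit(host, repo, sha1) for repo, sha1 in table.items())
```

      For example, a slave that has sha1_1 for project1/repo1 but not yet sha1_2 for project1/repo2 stays out of the list.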

      As a result, developers push to Master Stash but read (git fetch) through the Stash Balancer. For the first few seconds after a push, the Balancer serves requests from Master Stash only; after a while it starts serving requests from the Stash Slaves as well, which reduces CPU and network load on the master.

      This is especially useful for CI servers such as Bamboo or TeamCity, where you may have a very large number of build agents, each of which can run git fetch after one big git push to Stash. Today this completely hangs a single Stash instance with big repositories (~7GB) and 150 build agents. With the Stash Balancer it should be more stable.


            People

              Assignee: Unassigned
              Reporter: Alexey Efimov
              Votes: 4
              Watchers: 7
