Details
Suggestion
Resolution: Answered
Description
This is something like "rocket science".
Torrents are built on one idea: pieces of content are located on several servers and downloaded separately.
Git also has "pieces". If you need to get commits, you could download them by ranges from different servers. The only catch is that you need protocol support for this. Today a fetch has two stages:
1. The client performs git fetch and sends the leaves (tips) of its git tree as SHA-1 ids.
2. The server calculates the deltas, compresses them and sends them back.
In stage 2, the server can ask its connected peers, "who can serve these ranges?", and each peer quickly answers, "me, I have ranges: from..to, from..to, etc."
The server then analyses the answers and splits the one request into several (small) deltas, then sends instructions to the client in the form:
range aaa..bbb: first.server.acme.com/project/repo.git
range ccc..ddd: second.server.acme.com/project/repo.git
range eee..fff: main.server.acme.com/project/repo.git
...
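To make the server side of this concrete, here is a minimal sketch of how the split could be planned. Nothing here exists in Stash or git: `plan_fetch`, the idea that peers advertise ranges as plain strings, and the "least loaded peer wins" rule are all assumptions made only for illustration.

```python
from typing import Dict, List, Set

# Hypothetical fallback: the main instance serves whatever no peer advertised.
MAIN = "main.server.acme.com/project/repo.git"


def plan_fetch(wanted: List[str], peers: Dict[str, Set[str]], main: str = MAIN) -> List[str]:
    """Assign every wanted range to one server and return instruction lines.

    `wanted` holds opaque range labels such as "aaa..bbb"; `peers` maps a
    peer URL to the set of ranges that peer said it can serve.
    """
    load = {url: 0 for url in peers}  # ranges already assigned to each peer
    lines = []
    for rng in wanted:
        candidates = [url for url, have in peers.items() if rng in have]
        if candidates:
            # Assumed policy: pick the least-loaded peer, ties broken by name.
            target = min(candidates, key=lambda u: (load[u], u))
            load[target] += 1
        else:
            target = main
        lines.append(f"range {rng}: {target}")
    return lines
```

With two peers where only the first has `aaa..bbb` and both have `ccc..ddd`, this plan would send `aaa..bbb` to the first peer, `ccc..ddd` to the second, and `eee..fff` to the main server, which is the shape of the instruction list above.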
The client then makes asynchronous requests to each server and joins the results.
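The client side could look roughly like the sketch below: parse the instruction lines, request every range concurrently, and join the results in the order the server listed them. This is only an assumption about how it might work. `parse_instructions`, `fetch_all` and the pluggable `fetch_range` coroutine are hypothetical names, and the real transfer (which git does not support today) is left to whatever `fetch_range` the caller passes in.

```python
import asyncio
from typing import Awaitable, Callable, List, Tuple

# Hypothetical transfer function: (server_url, range) -> bytes of that delta.
Fetcher = Callable[[str, str], Awaitable[bytes]]


def parse_instructions(text: str) -> List[Tuple[str, str]]:
    """Turn 'range aaa..bbb: host/path.git' lines into (range, url) pairs."""
    pairs = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("range "):
            continue  # skip blank lines and anything that is not an instruction
        rng, _, url = line[len("range "):].partition(": ")
        if url:
            pairs.append((rng.strip(), url.strip()))
    return pairs


async def fetch_all(text: str, fetch_range: Fetcher) -> List[bytes]:
    """Request all ranges at once; results keep the order of the instructions."""
    pairs = parse_instructions(text)
    # asyncio.gather returns results in the order the awaitables were passed.
    results = await asyncio.gather(*(fetch_range(url, rng) for rng, url in pairs))
    return list(results)


async def fake_fetch(url: str, rng: str) -> bytes:
    """Stand-in for a real download, used only to exercise the flow."""
    await asyncio.sleep(0)
    return f"{rng}@{url}".encode()
```

Calling `asyncio.run(fetch_all(instructions, fake_fetch))` exercises the whole flow without any network access. A real client would still have to check that the joined pieces form a complete pack before using them.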
This makes it possible to split a Stash instance across several machines: a main Stash instance plus connected Stash slave git peers. When Stash gets an update (git push), it sends the push on to the peers over the standard git protocol.
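The replication half needs no protocol change, so it can be sketched with stock git. The example below assumes the main instance simply runs `git push --mirror` against every peer after it has accepted a push. `mirror_commands`, `replicate` and the peer URLs are hypothetical, and how Stash would actually trigger this (for example from a hook) is not covered.

```python
import subprocess
from typing import Callable, List


def mirror_commands(repo_dir: str, peer_urls: List[str]) -> List[List[str]]:
    """Build one 'git push --mirror' command per peer for the given repository."""
    return [["git", "-C", repo_dir, "push", "--mirror", url] for url in peer_urls]


def replicate(repo_dir: str, peer_urls: List[str], run: Callable = subprocess.run) -> List[str]:
    """Push to every peer and return the URLs of the peers that failed."""
    failed = []
    for cmd in mirror_commands(repo_dir, peer_urls):
        result = run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            failed.append(cmd[-1])  # the last argument is the peer URL
    return failed
```

A peer that shows up in the returned list is out of date, so under this scheme the main instance would have to stop advertising that peer's ranges until a later push succeeds.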
And the hardest part is that we need to patch the git client so it can do a two-phase fetch: first ask where the deltas are located, then download them from several servers. Linus will be angry, I suppose.
This ticket is only a proposal for discussion. Our team is at the edge of the limits of a single Stash instance. Today we upgraded our channel from 1 Gbit/s to 2 Gbit/s and it started working; next we will upgrade to 10 Gbit/s. But we need to think about a universal scaling solution. Upgrading the network is not a good way, you know.