
HG: Fix slurping issues for hgconvert/hgsubversion

    • Type: Bug
    • Resolution: Won't Fix
    • Priority: Low
    • None
    • Affects Version/s: 2.3.0
    • Component/s: Indexing

      hgconvert and hgsubversion produce repos that differ from those hg creates when used natively: file revs are included in a changeset even when the files were just merged.

      Document this more and come up with some way to work more optimally for ex-svn repos.

            [FE-3071] HG: Fix slurping issues for hgconvert/hgsubversion

            Marek Parfianowicz made changes -
            Component/s New: Indexing [ 41890 ]
            Component/s Original: vcs-mercurial [ 13281 ]
            Labels Original: hg red New: hg mercurial red
            Owen made changes -
            Workflow Original: FE-CRUC Bug Workflow [ 2943787 ] New: JAC Bug Workflow v3 [ 2958511 ]
            Owen made changes -
            Workflow Original: FECRU Development Workflow - Triage - Restricted [ 1517321 ] New: FE-CRUC Bug Workflow [ 2943787 ]
            Owen made changes -
            Workflow Original: FECRU Development Workflow - Triage [ 941076 ] New: FECRU Development Workflow - Triage - Restricted [ 1517321 ]
            Piotr Swiecicki made changes -
            Workflow Original: FECRU Development Workflow (Triage) [ 311674 ] New: FECRU Development Workflow - Triage [ 941076 ]
            Seb Ruiz (Inactive) made changes -
            Workflow Original: Simple review flow with triage [ 210121 ] New: FECRU Development Workflow (Triage) [ 311674 ]
            mwatson made changes -
            Component/s New: vcs-mercurial [ 13281 ]
            Component/s Original: FE-vcs-hg [ 13191 ]
            Key Original: CRUC-3468 New: FE-3071
            Project Original: Crucible [ 11771 ] New: FishEye [ 11830 ]
            Affects Version/s New: 2.3.0 [ 15272 ]
            Affects Version/s Original: 2.3-M3 [ 15097 ]
            Reporter Original: mwatson [ mwatson@atlassian.com ]
            mwatson made changes -
            Resolution New: Won't Fix [ 2 ]
            Status Original: Open [ 1 ] New: Closed [ 6 ]

            mwatson added a comment -

            See the last comment:

            The proposed fix here was really a hack to reduce indexing times, and it can sometimes misrepresent the data we have. A better way to reduce indexing times is, as previously mentioned, to pull only certain branches into a "working" repository and index that.


            mwatson added a comment -

            Hi Tim,

            I see the support engineers are looking into your indexing issues. We had similar issues ourselves, so hopefully they can come up with an answer for you.

            Great to know your developers loved FishEye and Crucible!

            The main issue this JIRA addresses arises when hgconvert/hgsubversion creates file revisions in the converted repository where the underlying file revision in a merge commit is actually the same as one created on a branch. We can try to detect this and, instead of processing a load of diffs as though it were a new version of the file on the merged-to branch, simply use the file revision created on the merged-from branch as the parent revision for subsequent changes. Note that this "merge" happens in SVN and there is no merge commit in the converted hg repo, but the underlying file revisions in hg contain enough information to do this.

            This would speed up indexing, but at the cost of not accurately representing what happened in the underlying hg repo. It also may not apply to you at all (we haven't tested against repos created using cvs2hg).
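The detection idea described above can be illustrated with a minimal sketch. This is not FishEye's implementation and it does not call any Mercurial API: the `Changeset` class, its fields, and the `find_reused_file_revs` function are hypothetical names, and the sketch assumes each changeset has already been reduced to a mapping from file path to a file revision identifier, listed in history order.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Changeset:
    """Hypothetical, simplified view of a converted changeset."""
    node: str              # changeset identifier
    branch: str            # named branch the changeset lives on
    files: Dict[str, str]  # path -> file revision id recorded by this changeset


def find_reused_file_revs(changesets: List[Changeset]) -> List[Tuple[str, str, str]]:
    """Find file revisions that were first recorded on a different branch.

    Returns (changeset node, path, originating changeset node) triples.
    Assumes `changesets` is ordered so that ancestors come first.
    """
    first_seen: Dict[Tuple[str, str], Changeset] = {}
    reused: List[Tuple[str, str, str]] = []
    for cs in changesets:
        for path, file_rev in cs.files.items():
            origin = first_seen.get((path, file_rev))
            if origin is None:
                # First time this exact file revision appears anywhere.
                first_seen[(path, file_rev)] = cs
            elif origin.branch != cs.branch:
                # Same file revision already exists on another branch:
                # a candidate for reuse instead of re-diffing.
                reused.append((cs.node, path, origin.node))
    return reused
```

An indexer following this idea could treat each reported triple as "reuse the originating revision as the parent" rather than processing the file as a new version, which is exactly where the accuracy trade-off mentioned above comes from.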

            Our testing has indicated that the speedup gained is not significant enough for us to implement this feature. We achieved much better indexing speed (with an approach that may also work better for repos converted from CVS using cvs2hg) by keeping a "full" repository, the result of a complete conversion, and cloning only certain active branches from it into a "light" or "working" clone that people develop on and that FishEye indexes. This excluded a lot of closed heads that hgsubversion created to represent complex tags in Subversion (which in turn were mostly produced by cvs2svn years ago); these had HUGE diffs and took a long time to index. The beauty of this approach is that the tags are still available if we want to migrate them to the light repo (by hg pull full-repo -r TAG; hg push light-repo), and FishEye can index them as you need them rather than spending ages on them all at once.
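The "full" and "light" repository workflow above can be sketched as a small helper that only builds the hg command lines, without running them. The function names are hypothetical, and the repository paths and branch names in the usage example are placeholders. The sketch relies on `hg clone -b BRANCH` (which may be repeated) and on the `hg pull ... -r TAG; hg push ...` sequence quoted in the comment, which is assumed to run inside a working clone.

```python
import shlex
from typing import List


def light_clone_commands(full_repo: str, light_repo: str,
                         branches: List[str]) -> List[List[str]]:
    """Build the hg command that clones only `branches` into `light_repo`."""
    if not branches:
        raise ValueError("at least one branch is required")
    clone = ["hg", "clone"]
    for branch in branches:
        # -b/--branch limits the clone to that branch and its ancestors.
        clone += ["-b", branch]
    clone += [full_repo, light_repo]
    return [clone]


def migrate_tag_commands(full_repo: str, light_repo: str,
                         tag: str) -> List[List[str]]:
    """Build the pull/push pair that brings one tag into the light repo.

    Mirrors "hg pull full-repo -r TAG; hg push light-repo" from the comment.
    """
    return [
        ["hg", "pull", full_repo, "-r", tag],
        ["hg", "push", light_repo],
    ]


if __name__ == "__main__":
    # Placeholder paths and branch names, for illustration only.
    commands = light_clone_commands("full-repo", "light-repo", ["default"])
    commands += migrate_tag_commands("full-repo", "light-repo", "TAG")
    for cmd in commands:
        print(shlex.join(cmd))
```

Building the commands as argument lists keeps the sketch free of side effects; they could be passed to `subprocess.run` once the paths are real.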

            This is not optimal, but we are limited somewhat by the speed of Mercurial producing these huge diffs (just doing an hg diff between the tag and some other point, like its parent commit, takes a long time) and by the need to get these diffs per file rather than for a whole commit at once. We are working on other performance improvements (http://jira.atlassian.com/browse/CRUC-3883) that should speed up indexing in other ways.

            Hope this helps,
            Matt


              Assignee: Unassigned
              Reporter: Anonymous
              Affected customers: 1
              Watchers: 2


                  Original Estimate: 4h
                  Remaining Estimate: 4h
                  Time Spent: Not Specified