When a large branch is copied within an SVN repository (a typical branch here being around 8,000-24,000 directories and 90,000-350,000 files, not counting deleted content), the copy generates a single changeset. When FishEye starts to index this changeset, the instance may become unusable for several hours, even when it is otherwise well tuned. The usual tuning options do not help:
- Decrease block size: A Subversion copy is a single revision, so decreasing the number of fetched revisions from 400 to (say) 100 has no effect on this particular problem.
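For context, an SVN branch copy of any size is committed as a single new revision; the server records it as a cheap copy rather than re-storing the tree. A sketch of the operation (the repository URL and branch names are hypothetical):

```shell
# Copying the entire trunk, however many directories and files it
# contains, produces exactly one new revision in the repository --
# which FishEye then sees as one enormous changeset to index.
svn copy -m "Create release branch" \
  https://svn.example.com/repo/trunk \
  https://svn.example.com/repo/branches/release-1.0
```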
- Increase heap size: After increasing to -Xmx1536m the problem may actually get worse, because garbage collection pauses grow longer and can eventually cause timeouts even when accessing other repositories.
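For reference, the JVM heap for a FishEye/Crucible instance is typically set through the FISHEYE_OPTS environment variable before the instance is started. A minimal sketch, assuming a standard standalone installation (the instance path is illustrative):

```shell
# Set JVM heap options for FishEye/Crucible before starting it.
# -Xmx1536m matches the value tried in the report above; raising it
# further mainly lengthens GC pauses rather than fixing the indexing cost.
export FISHEYE_OPTS="-Xms512m -Xmx1536m"

# Start the instance (path is illustrative; adjust to your install).
/opt/fisheye/bin/start.sh
```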
- Decrease the number of initial/incremental indexing threads: Depending on how many repositories an instance hosts, dropping below 2 means other repositories have to wait while one repository performs a time-consuming job, like the example given above.
- Exclude files and directories: This is infeasible in such a big repository, because FishEye/Crucible only allows fixed paths for exclusion; setting this up would take days for a single branch, and would then have to be repeated for hundreds of existing branches and for every new branch after that.
- Split the repository into logical components: This may also be infeasible when customers need to be able to follow history back to when files were first created, and when commit messages reference the associated JIRA ticket numbers.
So the suggestions here are:
- FishEye/Crucible understanding that Subversion copies are effectively links, so copied content is not reindexed.
- FishEye/Crucible copying the index data from the source path to the destination instead of regenerating it.
- FishEye/Crucible not bottlenecking while indexing when threads are available. For example, an instance may have 3 indexing threads available, but only one is used per repository.