Description
Issue Summary
Bitbucket 7.1 upgraded the bundled Elasticsearch to version 7.5.2. Since version 7.0, Elasticsearch enforces a default maximum number of open scroll contexts; earlier versions had no limit. Bitbucket can exceed that maximum because of a bug where scroll contexts are not closed correctly.
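For context, a scroll context stays open on the Elasticsearch side until it is cleared explicitly or its keep-alive expires (300s in the request shown in the search log in this report). The following is a minimal sketch of a well-behaved scroll lifecycle against the Elasticsearch REST API; it is not Bitbucket's indexing code. The ES_URL, ES_USER and ES_PASSWORD variables, the helper function names, and the match_all query are assumptions made for illustration. The host, port, user and index name are taken from this report.

```shell
#!/bin/sh
# Illustrative sketch only; not Bitbucket's implementation.
# ES_URL, ES_USER and ES_PASSWORD are assumed variables; set them for your install.
ES_URL="${ES_URL:-http://localhost:7992}"
ES_USER="${ES_USER:-bitbucket}"

# Pull the _scroll_id value out of a search response read from stdin.
extract_scroll_id() {
  grep -o '"_scroll_id" *: *"[^"]*"' | cut -d'"' -f4
}

# Build the request body for the clear-scroll API.
clear_scroll_payload() {
  printf '{"scroll_id":"%s"}' "$1"
}

# Open a scroll; Elasticsearch keeps its context until it is cleared or expires.
open_scroll() {
  curl -s -u "$ES_USER:${ES_PASSWORD:-}" -X POST \
    "$ES_URL/bitbucket-search/_search?scroll=300s" \
    -H 'Content-Type: application/json' \
    -d '{"size":100,"query":{"match_all":{}}}'
}

# Release the scroll context instead of waiting for the keep-alive to expire.
clear_scroll() {
  curl -s -u "$ES_USER:${ES_PASSWORD:-}" -X DELETE "$ES_URL/_search/scroll" \
    -H 'Content-Type: application/json' \
    -d "$(clear_scroll_payload "$1")"
}
```

A client that runs id=$(open_scroll | extract_scroll_id) and later clear_scroll "$id" releases its context promptly. A client that skips the second call leaves the context open for the full keep-alive, which is how many concurrent scrolls can add up to the limit.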
Steps to Reproduce
- Run Bitbucket 7.1.0 with the bundled Elasticsearch 7.5.2 and more than 500 repositories, each containing at least 50 files
- Clear the Elasticsearch index and let Bitbucket reindex (do not use the REST endpoint to synchronize indexes)
- Restart Bitbucket
Expected Results
- Bitbucket comes up and all repositories are indexed
Actual Results
- Bitbucket fails to index repositories
The following exception is logged in the atlassian-bitbucket.log file:
2020-04-09 17:01:36,208 ERROR [search-indexing:thread-1] c.a.b.i.s.i.e.DefaultIndexEventWorker Indexing - Failed for repository TEST/my-repo-424 (id: 425) with error: The request to Elasticsearch failed with HTTP status code: 500
com.atlassian.bitbucket.internal.search.indexing.exceptions.IndexException: The request to Elasticsearch failed with HTTP status code: 500
    at com.atlassian.bitbucket.internal.search.indexing.util.ElderScroll$ScrollProducer.lambda$null$2(ElderScroll.java:135)
    at rx.internal.operators.OnSubscribeMap$MapSubscriber.onNext(OnSubscribeMap.java:69)
    at rx.internal.producers.SingleProducer.request(SingleProducer.java:65)
    ... lots of RXJava calls ...
    at rx.Observable.subscribe(Observable.java:10390)
    at rx.Observable.subscribe(Observable.java:10271)
    at com.atlassian.bitbucket.internal.search.indexing.util.Observables.consume(Observables.java:34)
    at com.atlassian.bitbucket.internal.search.indexing.util.ElderScroll$ScrollProducer.fetchMore(ElderScroll.java:159)
    at com.atlassian.bitbucket.internal.search.indexing.util.ElderScroll$ScrollProducer.call(ElderScroll.java:84)
    at com.atlassian.bitbucket.internal.search.indexing.util.ElderScroll.lambda$search$0(ElderScroll.java:56)
    at rx.Observable.unsafeSubscribe(Observable.java:10327)
    ... lots of RXJava calls ...
    at rx.Observable.subscribe(Observable.java:10423)
    at rx.Observable.subscribe(Observable.java:10390)
    at rx.Observable.subscribe(Observable.java:10230)
    at com.atlassian.bitbucket.internal.search.indexing.event.DefaultIndexEventQueueProcessor.processEvent(DefaultIndexEventQueueProcessor.java:204)
    at com.atlassian.bitbucket.internal.search.indexing.event.DefaultIndexEventQueueProcessor.processEvents(DefaultIndexEventQueueProcessor.java:229)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.lang.Thread.run(Thread.java:834)
    ... 1 frame trimmed
Caused by: rx.exceptions.OnErrorThrowable$OnNextValue: OnError while emitting onNext value: com.atlassian.elasticsearch.client.search.SearchResponse.class
    at rx.exceptions.OnErrorThrowable.addValueAsLastCause(OnErrorThrowable.java:118)
    at rx.internal.operators.OnSubscribeMap$MapSubscriber.onNext(OnSubscribeMap.java:73)
    ... 270 common frames omitted
The following exception is logged in the search/bitbucket_search.log file:
[2020-04-09T17:01:36,176][WARN ][r.suppressed ] [bitbucket_bundled]path: /bitbucket-search/_search, params: {scroll=300s, index=bitbucket-search}
org.elasticsearch.action.search.SearchPhaseExecutionException: all shards failed
    at org.elasticsearch.action.search.AbstractSearchAsyncAction.onPhaseFailure(AbstractSearchAsyncAction.java:534) [elasticsearch-7.5.2.jar:7.5.2]
    at org.elasticsearch.action.search.AbstractSearchAsyncAction.executeNextPhase(AbstractSearchAsyncAction.java:305) [elasticsearch-7.5.2.jar:7.5.2]
    at org.elasticsearch.action.search.AbstractSearchAsyncAction.onPhaseDone(AbstractSearchAsyncAction.java:563) [elasticsearch-7.5.2.jar:7.5.2]
    at org.elasticsearch.action.search.AbstractSearchAsyncAction.onShardFailure(AbstractSearchAsyncAction.java:384) [elasticsearch-7.5.2.jar:7.5.2]
    at org.elasticsearch.action.search.AbstractSearchAsyncAction.access$200(AbstractSearchAsyncAction.java:65) [elasticsearch-7.5.2.jar:7.5.2]
    at org.elasticsearch.action.search.AbstractSearchAsyncAction$1.onFailure(AbstractSearchAsyncAction.java:241) [elasticsearch-7.5.2.jar:7.5.2]
    at org.elasticsearch.action.search.SearchExecutionStatsCollector.onFailure(SearchExecutionStatsCollector.java:73) [elasticsearch-7.5.2.jar:7.5.2]
    at org.elasticsearch.action.ActionListenerResponseHandler.handleException(ActionListenerResponseHandler.java:59) [elasticsearch-7.5.2.jar:7.5.2]
    at org.elasticsearch.action.search.SearchTransportService$ConnectionCountingHandler.handleException(SearchTransportService.java:423) [elasticsearch-7.5.2.jar:7.5.2]
    at org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler.handleException(TransportService.java:1120) [elasticsearch-7.5.2.jar:7.5.2]
    at org.elasticsearch.transport.TransportService$DirectResponseChannel.processException(TransportService.java:1229) [elasticsearch-7.5.2.jar:7.5.2]
    at org.elasticsearch.transport.TransportService$DirectResponseChannel.sendResponse(TransportService.java:1203) [elasticsearch-7.5.2.jar:7.5.2]
    at org.elasticsearch.transport.TaskTransportChannel.sendResponse(TaskTransportChannel.java:60) [elasticsearch-7.5.2.jar:7.5.2]
    at org.elasticsearch.action.support.ChannelActionListener.onFailure(ChannelActionListener.java:56) [elasticsearch-7.5.2.jar:7.5.2]
    at org.elasticsearch.action.ActionListener$1.onFailure(ActionListener.java:71) [elasticsearch-7.5.2.jar:7.5.2]
    at org.elasticsearch.action.ActionListener$1.onResponse(ActionListener.java:65) [elasticsearch-7.5.2.jar:7.5.2]
    at org.elasticsearch.action.ActionRunnable.lambda$supply$0(ActionRunnable.java:58) [elasticsearch-7.5.2.jar:7.5.2]
    at org.elasticsearch.action.ActionRunnable$2.doRun(ActionRunnable.java:73) [elasticsearch-7.5.2.jar:7.5.2]
    at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-7.5.2.jar:7.5.2]
    at org.elasticsearch.common.util.concurrent.TimedRunnable.doRun(TimedRunnable.java:44) [elasticsearch-7.5.2.jar:7.5.2]
    at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:773) [elasticsearch-7.5.2.jar:7.5.2]
    at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-7.5.2.jar:7.5.2]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
    at java.lang.Thread.run(Thread.java:834) [?:?]
Caused by: org.elasticsearch.ElasticsearchException: Trying to create too many scroll contexts. Must be less than or equal to: [1024]. This limit can be set by changing the [search.max_open_scroll_context] setting.
    at org.elasticsearch.search.SearchService.createAndPutContext(SearchService.java:542) ~[elasticsearch-7.5.2.jar:7.5.2]
    at org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:348) ~[elasticsearch-7.5.2.jar:7.5.2]
    at org.elasticsearch.search.SearchService.lambda$executeQueryPhase$1(SearchService.java:340) ~[elasticsearch-7.5.2.jar:7.5.2]
    at org.elasticsearch.action.ActionListener.lambda$map$2(ActionListener.java:146) ~[elasticsearch-7.5.2.jar:7.5.2]
    at org.elasticsearch.action.ActionListener$1.onResponse(ActionListener.java:63) ~[elasticsearch-7.5.2.jar:7.5.2]
    ... 9 more
Workaround
Increase the search.max_open_scroll_context setting in Elasticsearch to a significantly higher value. Elasticsearch versions prior to 7.0 had no limit. We recommend a value of 10,000 or greater. You can set the value with the following command, or with any other method of making a REST call to Elasticsearch.
curl -u 'bitbucket:<password from $BITBUCKET_HOME/shared/search/buckler/buckler.yml>' \
    -X PUT http://localhost:7992/_cluster/settings \
    -H 'Content-Type: application/json' \
    -d '{ "persistent": { "search.max_open_scroll_context": 10000 }, "transient": { "search.max_open_scroll_context": 10000 } }'
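After applying the workaround, it can help to confirm that the new limit is in effect and to watch how many scrolls are open while reindexing runs. The sketch below is one possible way to do that using the Elasticsearch cluster settings and node stats REST endpoints. The ES_URL, ES_USER and ES_PASSWORD variables and the function names are assumptions for illustration, and the grep-based parsing is a simplification that avoids a dependency on a JSON tool.

```shell
#!/bin/sh
# Illustrative sketch; variable and function names are assumptions.
ES_URL="${ES_URL:-http://localhost:7992}"
ES_USER="${ES_USER:-bitbucket}"

# Print the cluster settings in flat form so that
# search.max_open_scroll_context can be read off directly.
show_cluster_settings() {
  curl -s -u "$ES_USER:${ES_PASSWORD:-}" \
    "$ES_URL/_cluster/settings?flat_settings=true"
}

# Extract every scroll_current counter from node stats JSON on stdin,
# printing one number per line (one per node in the response).
scroll_current_values() {
  grep -o '"scroll_current" *: *[0-9][0-9]*' | grep -o '[0-9][0-9]*$'
}

# Fetch the per-node search statistics and print the open scroll counts.
show_open_scrolls() {
  curl -s -u "$ES_USER:${ES_PASSWORD:-}" \
    "$ES_URL/_nodes/stats/indices/search" | scroll_current_values
}
```

Running show_cluster_settings should list search.max_open_scroll_context under both persistent and transient if the workaround was applied as written. Calling show_open_scrolls repeatedly during a reindex gives a rough view of whether the count keeps climbing toward the limit.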