  1. Bitbucket Data Center
  2. BSERV-13368

Searching for text/code on Bitbucket Search is not returning all the results

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: High
    • Fix Version/s: 7.21.0
    • Affects Version/s: 7.18.0, 7.19.0, 7.17.6
    • Component/s: Search

    Description

      Issue Summary

      We have identified this issue on Bitbucket with the embedded Elasticsearch (version 7.10.2). When using Bitbucket search to look for text or code, not all matching results are displayed. For example, the same text is present in 4 files, but only 3 files are returned.

      We could not find a specific pattern that causes text to be ignored. In summary, it looks like whenever a line such as text_text_c is immediately followed by a line such as anothertext_anothertext_c, Elasticsearch can no longer match any text that appears later in that file.

      Steps to Reproduce

      • Bring up a new instance of Bitbucket 7.17.6 using Instant Environment
      • Create a new project and repository
      • Push a new text file (test.txt) with the following contents:
        Territory_Type_Geo_Name__c
        Territory_Type_Region_Name__c
        Territory_Type_Territory_Name__c
        Territory_Type_Sub_Territory_Name__c 
        
      • Search for Territory_Type_Territory_Name__c or Territory_Type_Sub_Territory_Name__c. Neither search returns test.txt, even though the file contains both strings.

       
      Bitbucket sends the following query to Elasticsearch:

      '{
        "query": {
          "bool": {
            "must": {
              "bool": {
                "should": [
                  {
                    "match": {
                      "content": { "query": "Territory_Type_Sub_Territory_Name__c", "operator": "and" }
                    }
                  },
                  {
                    "match": {
                      "path": { "query": "Territory_Type_Sub_Territory_Name__c", "operator": "and" }
                    }
                  },
                  {
                    "match_phrase": { "filename": "Territory_Type_Sub_Territory_Name__c" }
                  }
                ]
              }
            },
            "should": {
              "term": { "fork": false }
            },
            "filter": {
              "terms": { "repositoryId": [ 1 ] }
            }
          }
        }
      }'
      

      Executing the same query with curl directly against Elasticsearch returns the same result as Bitbucket. This appears to be an Elasticsearch-side issue, since the same search works with OpenSearch.
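
      For anyone who wants to repeat the direct check against Elasticsearch for other search terms, the following is a minimal Python sketch that rebuilds the query from this report and prepares a request for the standard _search endpoint. The function names, the base URL, and the index name (my-index) are placeholders introduced for illustration and are not taken from this report; substitute the values (and any authentication) used by your instance.

```python
import json
import urllib.request


def build_search_body(term, repository_ids):
    """Rebuild the bool query shown in this report for one search term."""
    return {
        "query": {
            "bool": {
                "must": {
                    "bool": {
                        "should": [
                            {"match": {"content": {"query": term, "operator": "and"}}},
                            {"match": {"path": {"query": term, "operator": "and"}}},
                            {"match_phrase": {"filename": term}},
                        ]
                    }
                },
                "should": {"term": {"fork": False}},
                "filter": {"terms": {"repositoryId": list(repository_ids)}},
            }
        }
    }


def build_search_request(base_url, index, body):
    """Prepare (but do not send) a POST to <base_url>/<index>/_search."""
    url = f"{base_url.rstrip('/')}/{index}/_search"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    body = build_search_body("Territory_Type_Sub_Territory_Name__c", [1])
    print(json.dumps(body, indent=2))
    # Placeholder URL and index name (assumptions, not from the report).
    request = build_search_request("http://localhost:9200", "my-index", body)
    print(request.get_method(), request.full_url)
    # To actually run it: urllib.request.urlopen(request), then inspect
    # the "hits" section of the JSON response.
```

      Keeping the request unsent by default makes it easy to compare the generated body with the query above before pointing it at a real server.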

      Here is the weird part:

      • Searching for Territory_Type_Geo_Name__c returns that file.
      • Searching for Territory_Type_Region_Name__c also returns that file.
      • Searching for anything that appears after Territory_Type_Region_Name__c returns nothing. Even if I add apple after that line and search for apple, the file is not returned.
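
      One possible way to check whether the text after the second line is tokenized at all is Elasticsearch's _analyze API, which returns the tokens an index would produce for a given field and text. The sketch below only prepares such a request and extracts the token strings from a response. It assumes the index has a field named content (as used in the query in this report); the base URL, the index name (my-index), and the function names are placeholders, not values from this report.

```python
import json
import urllib.request

# File contents from the steps to reproduce.
FILE_LINES = [
    "Territory_Type_Geo_Name__c",
    "Territory_Type_Region_Name__c",
    "Territory_Type_Territory_Name__c",
    "Territory_Type_Sub_Territory_Name__c",
]


def build_analyze_request(base_url, index, field, text):
    """Prepare (but do not send) a POST to <base_url>/<index>/_analyze."""
    url = f"{base_url.rstrip('/')}/{index}/_analyze"
    payload = {"field": field, "text": text}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def tokens_from_response(response_json):
    """Return the token strings from a parsed _analyze response."""
    return [entry["token"] for entry in response_json.get("tokens", [])]


if __name__ == "__main__":
    # Placeholder URL and index name (assumptions, not from the report).
    request = build_analyze_request(
        "http://localhost:9200", "my-index", "content", "\n".join(FILE_LINES)
    )
    print(request.get_method(), request.full_url)
    # To actually run it:
    #   with urllib.request.urlopen(request) as response:
    #       print(tokens_from_response(json.load(response)))
```

      If the returned tokens stop after the second line, that would point at the analysis step rather than at the query itself.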

       
      I tried several variations and could not identify the exact cause, only that the file needs two lines of the form text_text_c, after which everything else is ignored. If I remove the _c from the first two lines, the query works.
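
      The variations described in this report can be generated with a small Python sketch like the one below, which writes the original file, a copy with the suffix removed from the first two lines, and a copy with apple appended. The output file names and the directory are hypothetical. The suffix defaults to __c to match the file contents above; since the text refers to it as _c, it is left as a parameter.

```python
from pathlib import Path

# File contents from the steps to reproduce.
ORIGINAL_LINES = [
    "Territory_Type_Geo_Name__c",
    "Territory_Type_Region_Name__c",
    "Territory_Type_Territory_Name__c",
    "Territory_Type_Sub_Territory_Name__c",
]


def strip_suffix(lines, count, suffix):
    """Remove `suffix` from the first `count` lines; leave the rest untouched."""
    result = []
    for position, line in enumerate(lines):
        if suffix and position < count and line.endswith(suffix):
            line = line[: -len(suffix)]
        result.append(line)
    return result


def build_variants(suffix="__c", extra_word="apple"):
    """Map hypothetical file names to the three file bodies to push and search."""
    return {
        "test_original.txt": "\n".join(ORIGINAL_LINES) + "\n",
        "test_no_suffix.txt": "\n".join(strip_suffix(ORIGINAL_LINES, 2, suffix)) + "\n",
        "test_with_apple.txt": "\n".join(ORIGINAL_LINES + [extra_word]) + "\n",
    }


def write_variants(directory):
    """Write each variant into `directory`, creating it if needed."""
    target = Path(directory)
    target.mkdir(parents=True, exist_ok=True)
    for name, text in build_variants().items():
        (target / name).write_text(text, encoding="utf-8")


if __name__ == "__main__":
    write_variants("repro-files")  # hypothetical output directory
```

      Pushing the three files to the test repository and searching for a term from the last line of each should show which variants are affected.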

      Expected Results

      • Search should return all files with the above pattern.

      Actual Results

      • Elasticsearch does not match text that follows the above pattern, so the affected files are missing from the Bitbucket search results.

      Workaround

      • Use an external Elasticsearch 7.16.2 server

            People

              Assignee: Unassigned
              Reporter: Guilherme Franchi (Inactive)