
      To be able to handle Unicode characters, I want the default build environment to use a better locale. Right now it uses POSIX, and using en_US.UTF-8 would help.

      #!yaml
      
      pipelines:
        branches:
          master:
            - step:
                script:
                  - locale
      
      

      gives

      #!bash
      
      
      
      + locale
      LANG=
      LANGUAGE=
      LC_CTYPE="POSIX"
      LC_NUMERIC="POSIX"
      LC_TIME="POSIX"
      LC_COLLATE="POSIX"
      LC_MONETARY="POSIX"
      LC_MESSAGES="POSIX"
      LC_PAPER="POSIX"
      LC_NAME="POSIX"
      LC_ADDRESS="POSIX"
      LC_TELEPHONE="POSIX"
      LC_MEASUREMENT="POSIX"
      LC_IDENTIFICATION="POSIX"
      LC_ALL=
      
      

      Workarounds

      One workaround is to create your own Docker image, as described here:
      https://answers.atlassian.com/questions/39140980/how-do-i-create-a-docker-image-for-bitbucket-pipelines

      The Dockerfile can look something like this:

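      #!dockerfile

      # Some smart base image
      FROM gcc

      # Whatever your project needs beyond what is on the base image

      # Set the locale. Assumption: the base image provides locale-gen and it
      # accepts a locale argument; otherwise install the locales package first.
      RUN locale-gen en_US.UTF-8
      ENV LANG=en_US.UTF-8
      ENV LANGUAGE=en_US:en
      ENV LC_ALL=en_US.UTF-8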
      

      Another even easier workaround is to set these environment variables in your build script or Pipelines settings.
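
As a minimal sketch of this second workaround, the lines below could go at the top of a build script. The helper name `pick_utf8_locale` and the candidate list are assumptions made for illustration; the idea is to export only a locale that `locale -a` reports as available on the image, rather than one that may not be installed.

```bash
#!/usr/bin/env bash
# Sketch: pick a UTF-8 locale that the image actually provides, then export it.
# pick_utf8_locale and the candidate list are hypothetical; adjust for your image.
pick_utf8_locale() {
  # List installed locales; tolerate images without the locale tool.
  available="$(locale -a 2>/dev/null || true)"
  for candidate in C.UTF-8 C.utf8 en_US.UTF-8 en_US.utf8; do
    if printf '%s\n' "$available" | grep -qxF "$candidate"; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}

if chosen="$(pick_utf8_locale)"; then
  export LANG="$chosen" LC_ALL="$chosen"
  echo "using locale: $chosen"
else
  echo "no UTF-8 locale found; leaving the locale unchanged" >&2
fi
```

If the image is known to ship a given locale, a plain `export LC_ALL=C.UTF-8` line (or the same variable in Pipelines settings) is enough; the lookup only guards against exporting a locale that is not installed.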

            [BCLOUD-13085] Change Pipelines default image locale to C.UTF-8

            Raul Gomis added a comment -

            c95b0e829572 what version of the image are you using?


            lululombard added a comment -

            I just encountered an issue on a package because of this.

            Matt Ryall added a comment -

            Thanks for your patience on this issue. We're planning a few updates to the default image in the coming months, so I'll see if we can get this one in as well (cc @rgomish).

            Agree that setting LC_ALL=C.UTF-8 seems like the correct fix to use UTF-8 without selecting a specific language. We'll go with that if we do it.

            Also a reminder that there are two good workarounds for this issue in the meantime:

            • Use or build a proper Docker image for your build environment. Our default image doesn't get frequently updated, so if you want the latest tool versions, you should be using another image.
            • Alternatively, include LC_ALL (and others as needed) in your pipeline environment variables, either via Pipelines settings or as an export line in your build script.


            ncoghlan added a comment -

            For me, the main issue is the fact that having the legacy "C" locale configured tells tools like Python 3 that they should use ASCII to interface with the operating system for things like filesystem paths and environment variables. Armin Ronacher has a decent write-up of the problems this can cause in the click documentation: http://click.pocoo.org/5/python3/#python-3-surrogate-handling

            While I have some changes in the works to tell Python 3 to coerce the C locale to C.UTF-8 instead (as discussed at http://bugs.python.org/issue28180 ), a more immediate solution is for infrastructure providers to configure C.UTF-8 as their default rather than relying on components to either override or ignore the locale setting.
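
A quick way to observe the behaviour described in this comment is to ask each locale which character set it advertises. The sketch below assumes the `locale` utility is present on the image; `charmap_for` is a hypothetical helper name. On glibc-based images the legacy C locale typically reports an ASCII charmap, while C.UTF-8 reports UTF-8 only if that locale is installed.

```bash
#!/usr/bin/env bash
# Report the character set a given locale advertises to programs.
# charmap_for is a hypothetical helper used only for this sketch.
charmap_for() {
  # Run `locale charmap` with LC_ALL overridden for that one command.
  cm="$(LC_ALL="$1" locale charmap 2>/dev/null || true)"
  # Fall back to "unknown" when the tool is missing or prints nothing.
  printf '%s\n' "${cm:-unknown}"
}

for loc in C C.UTF-8; do
  printf '%-8s -> %s\n' "$loc" "$(charmap_for "$loc")"
done
```

Tools such as Python 3 consult the same locale information at startup, which is why the value shown for the active locale matters for filesystem paths and environment variables.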


            Matt Ryall added a comment -

            We'd like to see additional information on specific problems that this causes. If you hit issues with locale/character encoding in your pipeline, please provide information here.

            Opening to consider for work in the future.


            ncoghlan added a comment -

            I came to report the same problem, but my suggested resolution would be different: use the C.UTF-8 locale, as this scenario is exactly what it's for (i.e. properly handling UTF-8 encoded text without making any other locale-specific assumptions).


            dismine added a comment -

            I confirm that this solution works for me. Thanks Sigge.

            Geoff created issue -

              Unassigned Unassigned
              sbirgisson Sigurdur Birgisson (Inactive)