REST API GET title parameter not working with UTF-8 characters

      NOTE: This bug report is for Confluence Server. Using Confluence Cloud? See the corresponding bug report.

      Summary

      REST API GET title parameter doesn't work with UTF-8 characters

      Steps to Reproduce

      1. Create a page with the "Обзор" title
      2. Run the following cURL command:
        curl -u admin:admin -X GET "http://localhost:8090/confluence/rest/api/content?title=Обзор&spaceKey=<space-key>"
        

      Expected Results

      The GET command returns all the page's information.

      Actual Results

      The request returns an empty result set instead:

      {"results":[],"start":0,"limit":25,"size":0,"_links":{"self":"http://localhost:8090/confluence/rest/api/content?spaceKey=GP&title=%C3%90%C2%9E%C3%90%C2%B1%C3%90%C2%B7%C3%90%C2%BE%C3%91%C2%80","base":"http://localhost:8090/confluence","context":"/confluence"}}
      
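The title echoed in the "self" link above is not the plain percent-encoding of "Обзор". A minimal Python sketch (not part of the original report) suggests it is what results when the title's UTF-8 bytes are read as ISO-8859-1 and then encoded as UTF-8 again. Where in the request path that happens is an assumption the report does not confirm; the variable names are illustrative only.

```python
from urllib.parse import quote, unquote

title = "Обзор"

# Percent-encoding the UTF-8 bytes directly: the form a client would normally send.
correct = quote(title)

# Reading the same UTF-8 bytes as ISO-8859-1 and encoding the result as UTF-8
# again reproduces the value echoed in the "self" link of the empty response.
double_encoded = quote(title.encode("utf-8").decode("latin-1"))

# Value copied from the "self" link in the Actual Results section.
echoed = "%C3%90%C2%9E%C3%90%C2%B1%C3%90%C2%B7%C3%90%C2%BE%C3%91%C2%80"

print(correct)                   # %D0%9E%D0%B1%D0%B7%D0%BE%D1%80
print(double_encoded == echoed)  # True

# Reversing the mis-decoding recovers the original title.
recovered = unquote(echoed).encode("latin-1").decode("utf-8")
print(recovered == title)        # True
```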

      Notes

      A POST request with "Обзор" in the title works:

      curl -u admin:admin -X POST -H 'Content-Type: application/json' -d'{"type":"page","title":"1. Обзор","space":{"key":"GP"},"body":{"storage":{"value":"<p>This is a new page</p>","representation":"storage"}}}' http://localhost:8090/confluence/rest/api/content/
      

      Workaround

      Use the pageid= parameter instead of title=:

      curl -u admin:admin -X GET "http://localhost:8090/confluence/rest/api/content?pageid=10813474&spaceKey=GP"
      

            [CONFSERVER-42670] REST API GET title parameter not working with UTF-8 characters

            I don't think this is a bug. Running the following curl command results in a legitimate server exception:

            curl -u admin:admin -X GET "http://localhost:8080/confluence/rest/api/content?spaceKey=HAS&title=æøå"
            java.lang.IllegalArgumentException: Invalid character found in the request target. The valid characters are defined in RFC 7230 and RFC 3986
            at org.apache.coyote.http11.Http11InputBuffer.parseRequestLine(Http11InputBuffer.java:467)
            at org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:667)
            at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:66)

            This is a hint that the problem lies in the way the request is made: "æøå" and similar non-ASCII characters need to be percent-encoded in the URL.

            The correct way to do this with curl is as follows:

            curl -u admin:admin -X GET "http://localhost:8080/confluence/rest/api/content?spaceKey=HAS" --data-urlencode "title=æøå" -v -G
            * Trying ::1...
            * Connected to localhost (::1) port 8080 (#0)
            * Server auth using Basic with user 'admin'
            > GET /confluence/rest/api/content?spaceKey=HAS&title=%C3%A6%C3%B8%C3%A5 HTTP/1.1
            > Host: localhost:8080
            > Authorization: Basic YWRtaW46YWRtaW4=
            > User-Agent: curl/7.43.0
            > Accept: */*
            >
            < HTTP/1.1 200
            < X-ASEN: SEN-3390403
            < Set-Cookie: JSESSIONID=833525524BD2F548E4BA5F12118B8606;path=/confluence;HttpOnly
            < X-Seraph-LoginReason: OK
            < X-AUSERNAME: admin
            < Cache-Control: no-cache, must-revalidate
            < Expires: Thu, 01 Jan 1970 00:00:00 GMT
            < X-Content-Type-Options: nosniff
            < Content-Type: application/json
            < Transfer-Encoding: chunked
            < Date: Thu, 08 Jun 2017 22:10:13 GMT
            <
            * Connection #0 to host localhost left intact
            {"results":[{"id":"2147680291","type":"page","status":"current","title":"æøå","extensions":{"position":"none"},"_links":{"webui":"/pages/viewpage.action?pageId=2147680291","edit":"/pages/resumedraft.action?draftId=2147680291&draftShareId=c8462d98-743c-44de-b9ea-d4182650a8d2","tinyui":"/x/IwADg","self":"http://localhost:8080/confluence/rest/api/content/2147680291"},"_expandable":{"container":"/rest/api/space/HAS","metadata":"","operations":"","children":"/rest/api/content/2147680291/child","history":"/rest/api/content/2147680291/history","ancestors":"","body":"","version":"","descendants":"/rest/api/content/2147680291/descendant","space":"/rest/api/space/HAS"}}],"start":0,"limit":25,"size":1,"_links":{"self":"http://localhost:8080/confluence/rest/api/content?spaceKey=HAS&title=%C3%A6%C3%B8%C3%A5","base":"http://localhost:8080/confluence","context":"/confluence"}}

            You have to use the --data-urlencode option, together with -G to append the data as a query parameter (otherwise it is treated as a POST request body).

            The relevant curl documentation:

            -G, --get
            When used, this option will make all data specified with -d, --data, --data-binary or --data-urlencode to be used in an HTTP GET request instead of the POST request that otherwise would be used. The data will be appended to the URL with a '?' separator.
            If used in combination with -I, --head, the POST data will instead be appended to the URL with a HEAD request.
            If this option is used several times, only the first one is used. This is because undoing a GET doesn't make sense, but you should then instead enforce the alternative method you prefer.
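For clients that do not go through curl, the same percent-encoding can be sketched in Python. This is an illustration only: build_content_search_url is a hypothetical helper, not part of Confluence or curl, and the base URL and space key are the ones used in the example above.

```python
from urllib.parse import urlencode


def build_content_search_url(base_url: str, space_key: str, title: str) -> str:
    """Build the content search URL with a percent-encoded title (hypothetical helper)."""
    # urlencode percent-encodes non-ASCII characters as UTF-8 bytes by default.
    query = urlencode({"spaceKey": space_key, "title": title})
    return f"{base_url}/rest/api/content?{query}"


url = build_content_search_url("http://localhost:8080/confluence", "HAS", "æøå")
print(url)
# http://localhost:8080/confluence/rest/api/content?spaceKey=HAS&title=%C3%A6%C3%B8%C3%A5
```

The resulting query string matches the request line curl sends with --data-urlencode and -G. Note that urlencode writes spaces as "+", which is valid in a query string but differs from the "%20" form.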

             

             

            Comment added by Hasnae (Inactive)

            I've observed the following during development in relation to UTF-8 Titles:

            1) The initial GET fails as shown above after creating the page via the REST API (no results are returned even though the title should match).

            2) Retrying the exact same request succeeds (it fixes itself). I have also observed a glitch in the Confluence UI itself: create an article via REST as shown above, but instead of calling GET afterwards, browse to it in the UI. On my instance the title is shown corrupted as ????? in the UI, but refreshing the page fixes it. After the refresh, the GET works as well.

            3) Posting an Update to the content of the article with a UTF-8 title permanently corrupts the title. Refreshing and retrying does not correct it.

            I have made sure to set the following headers:
            Accept: application/json;charset=UTF-8
            Content-Type: application/json;charset=UTF-8

            I've also made sure that the title query parameter is properly URL-encoded, but the problem persists. This only happens with the title; any UTF-8 characters in the article content are fine.

            In my case, I am querying by title to see whether an article already exists, in order to decide between a create and an update for a given block of content. I can't query by page ID because I don't yet know whether the page exists, so the suggested workaround of using the page ID does not work for me.
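The create-versus-update decision described here can be sketched as follows. This is a hypothetical illustration, not the commenter's actual code: find_page_id and choose_action are made-up names, and the only assumption about the API is the response shape shown in the outputs above (a "results" list whose entries carry "id" and "title").

```python
from typing import Optional


def find_page_id(search_response: dict, title: str) -> Optional[str]:
    """Return the ID of the first result whose title matches exactly, else None."""
    for page in search_response.get("results", []):
        if page.get("title") == title:
            return page.get("id")
    return None


def choose_action(search_response: dict, title: str) -> str:
    """Decide between 'update' (page found) and 'create' (no match)."""
    return "update" if find_page_id(search_response, title) is not None else "create"


# Trimmed copies of the two response shapes quoted in this issue.
found = {"results": [{"id": "2147680291", "type": "page", "title": "æøå"}], "size": 1}
empty = {"results": [], "start": 0, "limit": 25, "size": 0}

print(choose_action(found, "æøå"))    # update
print(choose_action(empty, "Обзор"))  # create
```

With the behavior reported in this issue, an empty result for a non-ASCII title may be a false negative, so a "create" decision made this way can produce a duplicate page.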

            I observed this on Confluence 5.9.7. I upgraded my instance to 5.9.10 and it did not fix the problem.

            Comment added by Aaron Knight

              hrehioui Hasnae (Inactive)
              gviana Guilherme V. (Inactive)