Is there a simple way to change the platform default encoding? By platform default, do you mean the OS?
I believe you can set the system property file.encoding to the character encoding you want to use. I haven't 100% verified what's going on yet but changing it to UTF-8 might get the request to complete properly. I can't make guarantees that other bad things won't happen if you change the default; it would be wise to test first.
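For anyone wanting to test this, here is a minimal sketch of how to check which default the VM actually picked up. The class name `DefaultCharsetCheck` is made up for illustration. It assumes the property is passed on the command line at launch, since the default charset is determined at startup and changing the property later is not a reliable way to alter it.

```java
import java.nio.charset.Charset;

public class DefaultCharsetCheck {
    /** Returns the charset the JVM uses when no charset is given explicitly. */
    public static Charset platformDefault() {
        return Charset.defaultCharset();
    }

    public static void main(String[] args) {
        // Launch with, for example:
        //   java -Dfile.encoding=UTF-8 DefaultCharsetCheck
        System.out.println("file.encoding   = " + System.getProperty("file.encoding"));
        System.out.println("default charset = " + platformDefault());
    }
}
```

Running it once with and once without the `-Dfile.encoding` option should show whether the option has any effect on the VM in question.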
Upon further investigation, it seems XML-RPC handles Japanese characters fine if you set LANG to a UTF-8 locale. Perhaps we need to mention this somewhere in the startup.sh file?
That's because the VM uses a mix of locale and system properties to determine the platform default charset on startup. Unfortunately this doesn't really solve the problem: it only works in this case because your VM happens to be running with a default charset that matches the one the client happens to be sending the request in. The sanest way to do this is to parse the request using the normal rules for determining XML encodings: use the encoding specified explicitly in the request or, in its absence, follow the XML rules for detecting the document encoding. (Without a byte order mark or an explicit encoding, the XML should be decoded as UTF-8.)
Some more specific information from the investigation: it appears that this behaviour is related to the SAX parser used by default in the Apache XML-RPC server (MinML). Looking through the source of this parser shows that request input streams are wrapped directly in an InputStreamReader with no charset specified, and InputStreamReader uses the platform default in that case.
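To make the failure mode concrete, here is a small sketch of what happens when UTF-8 request bytes are read through an InputStreamReader with the wrong charset. This is not the MinML code. The class `CharsetMismatchDemo` and its `decode` helper are made up for illustration, and ISO-8859-1 simply stands in for whatever non-UTF-8 platform default the server might have.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetMismatchDemo {
    /** Decodes the bytes through an InputStreamReader using the given charset. */
    public static String decode(byte[] bytes, Charset charset) {
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(bytes), charset)) {
            StringBuilder sb = new StringBuilder();
            int c;
            while ((c = reader.read()) != -1) {
                sb.append((char) c);
            }
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Three Japanese characters, sent by the client as UTF-8.
        String original = "<value>\u65e5\u672c\u8a9e</value>";
        byte[] wire = original.getBytes(StandardCharsets.UTF_8);

        // Charset matches the sender: the text survives.
        System.out.println(original.equals(decode(wire, StandardCharsets.UTF_8)));      // true

        // Stand-in for a mismatched platform default: every UTF-8 byte
        // becomes its own character, so the text is corrupted.
        System.out.println(original.equals(decode(wire, StandardCharsets.ISO_8859_1))); // false
    }
}
```

Setting LANG or file.encoding only moves the server into the first case for clients that happen to send UTF-8. A client sending any other encoding would still hit the second.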
We should investigate configuring XML-RPC to use a more capable (read: correct) SAX parser.
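As a rough illustration of what a correct parser does, the sketch below hands raw bytes to the JDK's built-in JAXP SAX parser and lets it work out the encoding itself. It is only a stand-in: it does not show how to plug a different parser into Apache XML-RPC, which still needs investigating. The class `XmlEncodingDemo` and its `parseText` helper are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.helpers.DefaultHandler;

public class XmlEncodingDemo {
    /** Parses raw XML bytes and returns the concatenated character data. */
    public static String parseText(byte[] xmlBytes) {
        StringBuilder text = new StringBuilder();
        try {
            // The parser gets an InputStream, not a Reader, so it can apply
            // the XML encoding rules (BOM, declaration, UTF-8 fallback) itself.
            SAXParserFactory.newInstance().newSAXParser().parse(
                new ByteArrayInputStream(xmlBytes),
                new DefaultHandler() {
                    @Override
                    public void characters(char[] ch, int start, int length) {
                        text.append(ch, start, length);
                    }
                });
        } catch (Exception e) {
            throw new IllegalStateException("XML parsing failed", e);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        String japanese = "\u65e5\u672c\u8a9e";
        String doc = "<value>" + japanese + "</value>";

        // No BOM, no declaration: must be treated as UTF-8.
        byte[] utf8 = doc.getBytes(StandardCharsets.UTF_8);
        // Java's "UTF-16" charset writes a byte order mark, which the parser detects.
        byte[] utf16 = doc.getBytes(StandardCharsets.UTF_16);

        System.out.println(japanese.equals(parseText(utf8)));  // true
        System.out.println(japanese.equals(parseText(utf16))); // true
    }
}
```

Neither result depends on LANG or file.encoding, because the charset decision is taken from the document itself.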
I'll investigate how we can prevent this stupidity.