I was having some internationalization (i18n) issues with one of my sites recently. Even at this (relatively) advanced stage of the web, I was still using the ANSI charset to encode my views. This was a big problem, as end-users often entered characters that fell outside of ANSI. These characters were being saved correctly in the database, yet were rendered incorrectly every time. In fact, in Firefox, simply switching to ISO-8859-1 encoding would resolve all the issues. But I didn’t want it to be like this: I wanted the browser to use ISO-8859-1 all the time, rather than having the user switch around.
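For the curious, this kind of garbling can be reproduced outside the browser. Here is a minimal Python sketch (my own illustration, not part of the original setup) of what happens when Latin-1/ANSI bytes are decoded as if they were UTF-8:

```python
# An accented word, as it would be stored in a Latin-1 (ANSI) view file.
text = "café"
latin1_bytes = text.encode("latin-1")  # b'caf\xe9'

# A browser expecting UTF-8 cannot decode the lone 0xE9 byte,
# so the accented character gets mangled (shown here as U+FFFD).
garbled = latin1_bytes.decode("utf-8", errors="replace")
print(garbled)  # caf�
```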
To solve this problem, I tried four solutions:
1. Use the meta HTML tag.
This did not work at all in my trials. Firefox (or rather: Iceweasel) would switch to Windows-1252, even though I was on Debian 4.0r0. Opera would blithely use UTF-8, but words with accents and graves showed up as Chinese characters! Oops.
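For reference, the tag I tried was something like the standard pre-HTML5 form (reconstructed from memory, not the exact markup I used):

```html
<head>
  <!-- Declares the document charset; browsers may still override it
       if the server's Content-Type header disagrees. -->
  <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
```

A real HTTP Content-Type header takes precedence over the meta tag, which is likely why this approach had no effect.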
2. Use the header() call to set the Content-Type header; e.g.
header("Content-Type: text/html; charset=UTF-8");
The browser rendered the HTML page as plain text instead. No go.
3. Check php.ini to see if any encoding had been set there:
;default_charset = "ISO-8859-1"
Nope… it was commented out, so the developer is actually free to define the charset.
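Had it not been commented out, that line would have forced a site-wide default; a hypothetical php.ini fragment illustrating this:

```ini
; php.ini — uncommenting this makes PHP send
; "Content-Type: text/html; charset=ISO-8859-1" by default.
default_charset = "ISO-8859-1"
```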
4. The final (and correct) solution lay in Googling (but of course): forum.mamboserver.com.
Turns out the wget command is really useful here: it lets you see exactly what character set the server is sending out.
$ wget --server-response https://kzhiwei.wordpress.com/
Look for the Content-Type header in the response. My server was declaring the content type as UTF-8, whilst my views were saved in ANSI. This mismatch in character sets caused the garbling, so the solution was to open every view and re-save it as UTF-8.
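Rather than re-saving each file by hand in an editor, the conversion can also be scripted. A sketch using iconv (this assumes the views really are Latin-1/ANSI; try it on a copy first):

```shell
# Create a sample "view" saved in Latin-1 (ANSI), as my files were.
# \351 is the octal escape for 0xE9, Latin-1 "é".
printf 'caf\351\n' > view.php

# Re-encode it as UTF-8; iconv writes the converted copy to a new file.
iconv -f ISO-8859-1 -t UTF-8 view.php > view.utf8.php
mv view.utf8.php view.php

# The accent is now the two-byte UTF-8 sequence 0xC3 0xA9.
od -An -tx1 view.php
```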