I recently migrated a web server hosting an old PHP shop that works in a rather tricky way, so let me explain first. If you want to post this on thedailywtf or codinghorror, you're welcome to.
The shop owner runs an exe on his own machine to update the products. The program generates its own PHP files (from an MDB Access database) and then uploads them to the target server, together with a special updatedb.php and a db.sql file (which is publicly accessible from the wwwroot). The exe then sends a POST to updatedb.php with a security parameter that prevents outsiders from invoking the PHP script (I won't say which public text file holds the secret code in its first line).
What does the PHP do then? It just clears the DB and re-inserts all the data, even though the actual shop pages don't load their data from MySQL: each generated PHP file contains static descriptions and image links. The updatedb.php script loads the data from the db.sql file and runs each and every line in a new SQL connection.
Configuration differences
The old and new servers both run Apache 2 and vsftpd. I found that the locale differs between the two: the old server uses the ISO-8859-15 charmap and the new server uses UTF-8.
The vsftpd configuration is identical.
The charset problem
I found that most non-ASCII symbols (like €uro, ° and accented letters àèéìòù) get mangled, so the SQL statements containing them fail (and, as you may guess, no escaping is done; I can confirm that!)
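To make the mismatch concrete, here is a minimal Python sketch of what probably happens to those symbols. It assumes the uploaded db.sql is encoded as ISO-8859-15 (as the old server's locale suggests) and is then read as UTF-8 on the new server; the sample string is made up for illustration.

```python
# Hypothetical sample text with the symbols that break: euro, degree, accented letters.
sample = "Prezzo: 10 €, 20 °C, àèéìòù"

# Bytes as an ISO-8859-15 writer would produce them (assumption about the upload).
latin9_bytes = sample.encode("iso-8859-15")

# The euro sign is the single byte 0xA4 in ISO-8859-15 ...
euro_latin9 = "€".encode("iso-8859-15")
# ... but a three-byte sequence in UTF-8.
euro_utf8 = "€".encode("utf-8")


def decodes_as_utf8(data: bytes) -> bool:
    """Return True if data is valid UTF-8, False otherwise."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# A strict UTF-8 reader rejects the ISO-8859-15 bytes outright ...
strict_ok = decodes_as_utf8(latin9_bytes)
# ... and a lenient one substitutes U+FFFD for the invalid bytes.
mangled = latin9_bytes.decode("utf-8", errors="replace")
```

Either outcome would explain both symptoms: a strict consumer stops at the first bad byte, while a lenient one passes damaged text on to the query.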
The question
The customer doesn't want to scrap their shop, which has already forced me to add several special PHP configuration directives for their vhost. So how do I fix the character encoding for files transferred over FTP?
I thought about this configuration in vsftpd.conf (source):
# Enable character convertion. Supported UTF-8 (Russian chars) = UTF8,
# Win-1251 = WIN1251 or 1251, Koi8-r = KOI8R or 878, IBM 866 = DOS or 866.
convert_charset_enable=1
# Define charset local
local_charset=UTF8
# Define default charset on remote host
remote_charset=ISO-8859-15
But forcing the remote charset to ISO-8859-15 doesn't seem great. Fortunately, this is the only customer who uses FTP transfers from Windows XP.
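For comparison, the same conversion could in principle be applied to the uploaded file on the server rather than inside the FTP daemon. The following is only a sketch in Python under the assumption that db.sql really arrives as ISO-8859-15; the function names and paths are hypothetical and not part of the shop's code.

```python
def latin9_to_utf8(data: bytes) -> bytes:
    """Re-encode ISO-8859-15 bytes as UTF-8 (assumes the input really is ISO-8859-15)."""
    return data.decode("iso-8859-15").encode("utf-8")


def convert_file(src_path: str, dst_path: str) -> None:
    """Read src_path as raw bytes, transcode, and write the result to dst_path."""
    with open(src_path, "rb") as src:
        raw = src.read()
    with open(dst_path, "wb") as dst:
        dst.write(latin9_to_utf8(raw))
```

Note that this is not idempotent: running it on a file that is already UTF-8 would double-encode the non-ASCII characters, so it only makes sense where the source encoding is known.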
[Update] I just found my old question "Euro character messed up during FTP transfer" about a similar problem. In that case, the SQL strings were being cut off at the € sign. Now the queries are being executed with a messed-up € sign.