I'm not sure I've fully grasped the issue here, so if I haven't, just say so and I'll edit the title.
My problem is the following:
I have an Ubuntu 12.04 server (UTF-8 locale) to which users upload files via a web app or through the shell, so I have no control over naming conventions. These names are then stored in a UTF-8 MySQL database table.
Unfortunately it seems some of the files contain special characters that my database does not like.
One such example would be e followed by a combining acute accent (e + U+0301) in place of the precomposed é (U+00E9). My database does not enjoy this one bit and replaces such instances with e?. The shell itself has either displayed the name correctly when ls was used, or has shown broken, "nonexistent" symbols in the current folder path. I've also seen the likes of E?? in place of E followed by a combining acute accent (E + U+0301), which FYI should be É (U+00C9).
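To make the difference between the two forms concrete, here is a minimal Python sketch. It is only an illustration of the two encodings, not something taken from my server setup; the variable names are made up for the example.

```python
import unicodedata

decomposed = "e\u0301"   # "e" followed by U+0301 COMBINING ACUTE ACCENT
precomposed = "\u00e9"   # U+00E9 LATIN SMALL LETTER E WITH ACUTE

# The two strings look the same on screen but are different code point sequences.
print(decomposed == precomposed)        # False

# Their UTF-8 byte sequences differ as well.
print(decomposed.encode("utf-8"))       # b'e\xcc\x81'
print(precomposed.encode("utf-8"))      # b'\xc3\xa9'

# NFC normalization composes the pair into the single precomposed character.
print(unicodedata.normalize("NFC", decomposed) == precomposed)  # True
```

So both names are valid UTF-8; they simply use different Unicode normalization forms (decomposed versus precomposed).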
This is a headache, as I can't even seem to run a find command on files with such characters.
So my first question is: is there a shell command I can use to convert filenames on upload (something I could run recursively on a folder)? Ideally it would convert them to the appropriate precomposed equivalent, but I don't mind if any such Unicode sequences have to be replaced with an arbitrary character such as "_".
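For reference, the behaviour I'm after could be sketched in Python roughly as follows, though a shell equivalent would suit me better. This is only an illustration under assumptions: `normalize_tree` is a hypothetical helper name, NFC is assumed to be the right target form, the filenames are assumed to be valid UTF-8, and an entry is left untouched if the normalized name already exists.

```python
import os
import unicodedata


def normalize_tree(root, form="NFC"):
    """Rename entries under root whose names are not in the given Unicode form.

    Returns a list of (old_path, new_path) pairs for the renames performed.
    """
    renamed = []
    # Walk bottom-up so a directory is renamed only after its contents.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in dirnames + filenames:
            fixed = unicodedata.normalize(form, name)
            if fixed == name:
                continue  # already normalized
            src = os.path.join(dirpath, name)
            dst = os.path.join(dirpath, fixed)
            if os.path.lexists(dst):
                # Assumption: never overwrite an existing entry; skip instead.
                continue
            os.rename(src, dst)
            renamed.append((src, dst))
    return renamed
```

Calling `normalize_tree("/path/to/uploads")` (a placeholder path) would rename, for example, a file stored as "cafe" + U+0301 + ".txt" to the precomposed "café.txt".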
Thanks in advance.