iconv cannot convert from CP1256 to ISO-8859-6

2

I tried to convert a file from CP1256 to ISO-8859-6 using iconv under Cygwin, but the conversion fails.

Any help?

$ iconv -f CP1256 -t ISO-8859-6 cca.txt > cca1.txt

iconv: cca.txt:791:41: cannot convert

The result of sed -n '791p' cca.txt | od -c was posted as an image (not preserved here).

user1200219

Posted 2013-11-03T22:22:34.377

Reputation: 153

Please post the output of this command: sed -n '791p' cca.txt | od -c – jlliagre – 2013-11-03T23:20:06.177

Answers

3

If you look at the character maps for Windows-1256 (CP1256) and ISO-8859-6, you can see that Windows-1256 assigns a character to every byte value, while ISO-8859-6 leaves many positions undefined. So when the input contains a character that has no equivalent in the target encoding, iconv stops with "cannot convert".
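To see the mismatch for yourself, here is a minimal sketch that reproduces the error on a throwaway file. The three-character sample (alef, peh, alef) is my own illustration, not taken from the asker's cca.txt; peh (U+067E, byte 0x81 in CP1256) is one of the characters that has no slot in ISO-8859-6. I am assuming an iconv that reports unconvertible characters as an error, as the one in the question does.

```shell
# Hypothetical sample: alef (0xC7), peh (0x81), alef (0xC7), newline, in CP1256.
sample=$(mktemp)
printf '\307\201\307\n' > "$sample"

# Dump the raw bytes so each one can be looked up in the CP1256 table.
bytes=$(od -An -tx1 "$sample" | tr -d ' \n')
echo "bytes: $bytes"

# A strict conversion (no -c) should stop at the peh.
if iconv -f CP1256 -t ISO-8859-6 "$sample" > /dev/null 2>&1; then
    strict=ok
else
    strict=failed
fi
echo "strict conversion: $strict"

rm -f "$sample"
```

On the real file, the same od trick applied to line 791 (as jlliagre asked for in the comments) shows which byte sits at column 41.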

Depending on which version of iconv you have, you may be able to use the -c option, which drops the non-convertible characters, so the output file will be shorter than the input. Or you can use something like --unicode-subst="@", which replaces each non-convertible character with @, which is convertible. Note that there is quite a bit of flexibility with that substitution: the argument is a printf-style format string, so it can expand to more than one character (e.g. "[%u]" will write the character's Unicode value in brackets).
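A rough sketch of both options, again on a made-up three-character sample rather than the asker's file. The -c part should work with most iconv builds; --unicode-subst is, as far as I know, specific to GNU libiconv, so the second half is written to fall back gracefully if your iconv rejects it. The "<U+%04X>" format string is just one possible choice.

```shell
# Hypothetical sample: alef (0xC7), peh (0x81), alef (0xC7), newline, in CP1256.
sample=$(mktemp)
printf '\307\201\307\n' > "$sample"

# -c drops whatever ISO-8859-6 cannot represent (here: the peh).
# Some iconv builds still exit non-zero after dropping, hence "|| true".
dropped=$(iconv -c -f CP1256 -t ISO-8859-6 "$sample" | od -An -tx1 | tr -d ' \n' || true)
echo "after -c: $dropped"

# GNU libiconv only (assumption): mark each unconvertible character visibly.
if marked=$(iconv --unicode-subst='<U+%04X>' -f CP1256 -t ISO-8859-6 "$sample" 2>/dev/null); then
    printf '%s\n' "$marked" | od -c
else
    echo "this iconv build rejected --unicode-subst"
fi

rm -f "$sample"
```

Marking the characters instead of dropping them makes it easier to find and fix them by hand afterwards.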

If Cygwin does not have these options, you can try a recent Linux or Mac OS X.

In any case, the resulting file can only have characters that are actually in ISO-8859-6.

Ken

Posted 2013-11-03T22:22:34.377

Reputation: 7 497