I have one or more ZIP files whose entries have filenames stored in a non-UTF-8 encoding. I know which encoding was used, but I still don't know how to extract the files so that the names come out correctly.
Here is an example file; it contains one file, "【SSK字幕组】The Vampire Diaries 吸血鬼日记S06E12.ass".
I know the encoding used is GB18030 (Chinese).
The question: how do I unpack this file on FreeBSD, using unzip or another CLI utility, so that the filename is decoded correctly? I tried everything I could think of, but the result was never right. Please help.
I tried on OSX:
MBP1:test 2ge$ bsdtar xf gb18030.zip
MBP1:test 2ge$ ls
%A1%BESSK%D7%D6Ļ%D7顿The Vampire Diaries %CE%FCѪ%B9%ED%C8ռ%C7S06E12/ gb18030.zip
MBP1:test 2ge$ cd %A1%BESSK%D7%D6Ļ%D7顿The\ Vampire\ Diaries\ %CE%FCѪ%B9%ED%C8ռ%C7S06E12/
MBP1:%A1%BESSK%D7%D6Ļ%D7顿The Vampire Diaries %CE%FCѪ%B9%ED%C8ռ%C7S06E12 2ge$ ls
%A1%BESSK%D7%D6Ļ%D7顿The Vampire Diaries %CE%FCѪ%B9%ED%C8ռ%C7S06E12.ass*
MBP1:%A1%BESSK%D7%D6Ļ%D7顿The Vampire Diaries %CE%FCѪ%B9%ED%C8ռ%C7S06E12 2ge$ find . | iconv -f gb18030 -t utf-8
.
./%A1%BESSK%D7%D6L抬%D7椤縏he Vampire Diaries %CE%FC血%B9%ED%C8占%C7S06E12.ass
MBP1:%A1%BESSK%D7%D6Ļ%D7顿The Vampire Diaries %CE%FCѪ%B9%ED%C8ռ%C7S06E12 2ge$ convmv -r -f gb18030 -t utf-8 --notest .
Skipping, already UTF-8: ./%A1%BESSK%D7%D6Ļ%D7顿The Vampire Diaries %CE%FCѪ%B9%ED%C8ռ%C7S06E12.ass
Ready!
I tried the same with unzip and got a similar result.
Thanks. I am now trying on FreeBSD, which I connect to over SSH from OS X (Terminal):
# locale
LANG=
LC_CTYPE="C"
LC_COLLATE="C"
LC_TIME="C"
LC_NUMERIC="C"
LC_MONETARY="C"
LC_MESSAGES="C"
LC_ALL=C
The first thing I would like to do is display Chinese names properly. I set:
setenv LC_ALL zh_CN.GB18030
setenv LANG zh_CN.GB18030
Then I downloaded the file and ran "ls" to see whether the characters display properly, but no luck. So I think I first have to get the Chinese locale working, so that I have something to compare against and can tell when I get the correct result. Can you help me with this as well, please?
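One likely reason the ls test fails: Terminal on OS X normally expects UTF-8, so a GB18030 locale on the remote side would send bytes the terminal cannot render. A sketch of the alternative, keeping the session in a UTF-8 locale and converting the names to UTF-8 instead. pick_utf8_locale is a hypothetical helper, and it assumes the host's locale -a output lists names such as zh_CN.UTF-8:

```shell
#!/bin/sh
# pick_utf8_locale: hypothetical helper. Reads `locale -a` style output
# on stdin and prints one UTF-8 locale, preferring zh_CN if it is listed
# and otherwise falling back to the first UTF-8 locale seen.
pick_utf8_locale() {
    awk '
        tolower($0) ~ /\.utf-?8$/ {
            if (first == "") first = $0
            if ($0 ~ /^zh_CN\./ && zh == "") zh = $0
        }
        END {
            if (zh != "") print zh
            else if (first != "") print first
        }
    '
}

# Typical use on the remote host:
#   locale -a | pick_utf8_locale
```

In csh or tcsh the result would then be applied with setenv, for example setenv LC_ALL zh_CN.UTF-8, assuming that locale exists on the machine.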
For zips created by Greek Windows I had success with this method and encoding CP737 – ndemou – 2017-09-21T09:01:18.967
Bravo! I double-checked the man page: it actually works but is totally undocumented, and not even the zsh completion has this parameter. – ttimasdf – 2018-03-29T06:46:46.533
unzip does not have this option in Mac OS X and always creates percent-encoded filenames. @javacom's unar suggestion worked as a charm. – Phil Krylov – 2018-04-10T18:48:42.447
Looks like a Debian-specific functionality. My unzip tells it's UnZip 6.00 of 20 April 2009, by Info-ZIP. Maintained by C. Spieler and doesn't provide such options. – L29Ah – 2019-04-11T19:06:11.917
@L29Ah My unzip in Debian 9 is exactly the same version and has no such options. Probably Ubuntu specific? – Arnie97 – 2019-04-16T14:20:48.907
@Arnie97 and L29Ah: The unzip on CentOS 7.6.1810 (not Debian family) reports itself as UnZip 6.00 of 20 April 2009, by Info-ZIP. Maintained by C. Spieler. and it has these options. – mbdevpl – 2019-04-18T01:31:57.377
why this is not accepted answer? – Wang – 2019-04-25T10:33:45.523
You can use the -O option on any distribution. First, download the source with apt source unzip on Ubuntu (a live environment is enough). Second, copy the unzip-6.0 directory to your system. Third, cd into the directory. Finally, execute sudo make --file=unix/Makefile generic && sudo make --file=unix/Makefile install to compile and install. The default prefix is /usr/local (not just /usr). For a detailed explanation, read README and INSTALL. This procedure is confirmed on Arch Linux, whose original unzip doesn't supply the -O option. – ynn – 2019-09-19T16:33:33.710
@ynn Or you can pick only unzip-6.0/debian/patches/20-unzip60-alt-iconv-utf8.patch and apply it to the official Info-ZIP source and then compile and install. This procedure is also confirmed on Arch Linux. (On Arch, you can asp checkout unzip, then makepkg -o, then apply the patch, then makepkg -ei.) – ynn – 2019-09-19T18:04:25.800
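Since the comments show that the same "UnZip 6.00" version string can come with or without -O, the version alone does not say whether a given build is patched. A hedged sketch of a wrapper that checks first and falls back to unar; it assumes that patched builds list "-O CHARSET" in the text printed by unzip -h, and both function names are hypothetical:

```shell
#!/bin/sh
# has_charset_option: succeeds if the help text on stdin advertises the
# charset option. Assumption: patched builds print a "-O CHARSET" line.
has_charset_option() {
    grep -q -e '-O CHARSET'
}

# extract_with_encoding ENCODING ARCHIVE
# Hypothetical wrapper: use a patched unzip if one is installed,
# otherwise unar, otherwise report failure.
extract_with_encoding() {
    enc=$1
    archive=$2
    if unzip -h 2>&1 | has_charset_option; then
        unzip -O "$enc" "$archive"
    elif command -v unar >/dev/null 2>&1; then
        unar -e "$enc" "$archive"
    else
        echo "no extractor with filename-encoding support found" >&2
        return 1
    fi
}

# Example call for the archive from the question:
#   extract_with_encoding GB18030 gb18030.zip
```

The extracted names come out in the encoding of the current locale, so this is best run from a UTF-8 session.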