Command-line program to extract archives with automatic subdirectory detection

1

0

I'm looking for the pure command-line counterpart to ark -ba <path> (on KDE), or file-roller -h <path> (on GNOME/Unity).

Unfortunately, both ark and file-roller require X to be running. I'm aware that it is relatively simple to write a tool that detects archives based on their file extension, and then runs the appropriate program:

#!/bin/bash
# Pick an extraction tool based on the file extension.
# "$1" is quoted throughout so paths containing spaces work.
if [[ -f "$1" ]] ; then
    case "$1" in
        *.tar.bz2) tar xjvf "$1" ;;
        *.tar.gz) tar xzvf "$1" ;;
        *.bz2) bunzip2 "$1" ;;
        *.rar) rar x "$1" ;;
        *.gz) gunzip "$1" ;;
        *.tar) tar xf "$1" ;;
        *.tbz2) tar xjvf "$1" ;;
        *.tgz) tar xzvf "$1" ;;
        *.zip) unzip "$1" ;;
        *.Z) uncompress "$1" ;;
        *.7z) 7z x "$1" ;;
        *) echo "'$1' cannot be extracted with this utility" ;;
    esac
else
    echo "path '$1' does not exist or is not a file"
fi

However, that doesn't take care of subdirectory detection (and in fact, many extraction programs do not even supply such an option).

So might there be a program that does exactly that?

该用户不存在

Posted 2012-12-09T04:44:05.957

Reputation: 23

The question is perfectly on topic and welcome to stay here. If you don't get a good answer, you can always flag it for migration to Unix & Linux.

– terdon – 2012-12-09T04:51:17.573

What do you mean by "subdirectory detection"? – terdon – 2012-12-09T04:53:28.927

I am also unclear on what you mean by subdirectory detection. Do you want to avoid extracting any subdirectories and dump all files in the current directory? Or do you want to avoid that and make sure everything ends up in a directory (without more subdirs in that dir), etc.? Also, in the example: why verbose on compressed tarballs but not on regular ones? And why #!/bin/bash rather than the more modern #!/usr/bin/env bash? – Hennes – 2012-12-09T05:18:39.720

Both ark and file-roller are able to automatically detect whether or not the archive stores its files in a subdirectory, i.e. (as a virtual path) "somearchive.zip/somesubdirectory/whatever.html", compared to (again, as a virtual path) "somearchive.zip/whatever.html". In the latter case, both ark and file-roller would create a directory named "somearchive" (or "somearchive-<number>" if the directory already exists in $PWD). That's what I mean by "subdirectory detection". – 该用户不存在 – 2012-12-09T05:33:41.490

1

You could look into atool. The aunpack command should be able to do what you need. Also see here.

– xuhdev – 2014-02-20T07:59:56.197

Answers

3

What you describe as "subdirectory detection" should happen by default. In this example with GNU tar:

$ tree
.
├── dir1
│   └── file4
├── dir2
│   ├── file5
│   └── file6
├── file1
├── file2
└── file3

Archive:

$ tar cvf all.tar *
$ mkdir new_dir
$ mv all.tar new_dir
$ cd new_dir
$ tar xvf all.tar
$ tree
.
├── all.tar
├── dir1
│   └── file4
├── dir2
│   ├── file5
│   └── file6
├── file1
├── file2
└── file3

If you are using an archive program that does not keep the directory structure when creating an archive (are you sure about this by the way? I don't know of any that don't do this), then the information is lost. There is no way to recreate the directory structure unless it has been saved in the archive itself, in which case it should be recreated upon archive extraction by default.


If you want to mimic the behavior of ark -a:

  -a, --autosubfolder            Archive contents will be read, and if detected
                                 to not be a single folder archive, a subfolder
                                 with the name of the archive will be created.

You could create a wrapper script that extracts the archive to a temporary directory. If the temp dir then contains just one directory, the script moves that directory into your current working directory and deletes the temp dir; if it contains multiple files/dirs, the script renames the temp dir to the name of the archive. Something like this:

#!/usr/bin/env bash

## Make globs expand to nothing when there is no match,
## and let them match hidden files too
shopt -s nullglob dotglob

for file in "$@"
do
    ## Get the file's extension
    ext=${file##*.}
    ## Special case for compressed tar files. They sometimes
    ## have extensions like tar.bz2 or tar.gz etc.
    if [[ "$(basename "$file" ."$ext")" =~ \.tar$ ]]; then
        if [[ "$ext" = "gz" ]]; then
            ext="tgz"
        elif [[ "$ext" = "bz2" ]]; then
            ext="tbz"
        fi
    fi

    ## Create the temp dir in the current directory, so that the
    ## final mv is a plain rename on the same filesystem
    tmpDir=$(mktemp -d XXXXXX) || exit 1
    case $ext in
        7z)
            ## "x" keeps the directory structure ("e" would flatten it);
            ## -o takes its argument without a space
            7z x -o"$tmpDir" "$file"
            ;;
        tar)
            tar xf "$file" -C "$tmpDir"
            ;;
        tbz|tbz2)
            tar xjf "$file" -C "$tmpDir"
            ;;
        tgz)
            tar xzf "$file" -C "$tmpDir"
            ;;
        rar)
            ## "x" keeps paths; the destination must end with a slash
            unrar x "$file" "$tmpDir/"
            ;;
        zip)
            unzip "$file" -d "$tmpDir"
            ;;
        *)
            echo "Unknown extension: '$ext', skipping..." >&2
            rmdir "$tmpDir"
            continue
            ;;
    esac

    ## Get the tmp dir's structure
    tmpContents=( "$tmpDir"/* )
    if (( ${#tmpContents[@]} == 0 )); then
        echo "Nothing was extracted from '$file', skipping..." >&2
        rmdir "$tmpDir"
        continue
    fi

    ## If the tmpDir contains only one item and that is a directory
    if [[ ${#tmpContents[@]} = 1 ]] && [[ -d "${tmpContents[0]}" ]]
    then
        ## Move that directory to the current working directory
        ## and delete the tmpDir afterwards.
        src=${tmpContents[0]}
        baseName=${src##*/}
    else
        ## If the tmpDir contains anything more than a single directory,
        ## rename the tmpDir to the name of the archive sans extension.
        src=$tmpDir
        baseName="${file##*/}"      ## strip path
        baseName="${baseName%%.*}"  ## strip extension(s)
    fi

    ## If a file/dir of that name already exists, add a counter.
    dirName=$baseName
    c=1
    while [[ -e "$dirName" ]]; do
        dirName="$baseName.$c"
        ((c++))
    done
    mv "$src" "$dirName"
    ## In the single-directory case the now-empty tmpDir is still there
    [[ -d "$tmpDir" ]] && rmdir "$tmpDir"

    printf "Archive '%s' extracted to %s\n" "$file" "$dirName" >&2
done
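
An alternative to extracting first and inspecting afterwards is to read the archive listing up front, which is closer to how the ark -a help text describes it. The following is only a minimal sketch for tar archives, not a drop-in replacement for the script above: the function names has_single_top_dir and extract_tar_auto are made up for illustration, and it assumes a tar that detects compression automatically when reading (GNU tar and bsdtar should).

```shell
#!/usr/bin/env bash

# Hypothetical helper: succeeds if every member of the tar archive "$1"
# lives under one single top-level directory.
has_single_top_dir() {
    tar tf "$1" | awk -F/ '
        { sub(/^\.\//, "") }      # drop a leading "./" from member paths
        $0 == "" { next }         # skip the bare "./" entry, if present
        {
            top[$1] = 1           # remember each first path component
            if (NF > 1) isdir[$1] = 1   # a slash means it is a directory
        }
        END {
            n = 0
            for (t in top) { n++; name = t }
            exit !(n == 1 && (name in isdir))
        }'
}

# Hypothetical wrapper: extract in place if the archive already has a single
# top-level directory, otherwise into a directory named after the archive.
extract_tar_auto() {
    local archive=$1 dir
    if has_single_top_dir "$archive"; then
        tar xf "$archive"
    else
        dir=${archive##*/}   # strip path
        dir=${dir%%.*}       # strip extension(s)
        # mkdir fails if the directory exists, so nothing is overwritten
        mkdir -- "$dir" && tar xf "$archive" -C "$dir"
    fi
}
```

Unlike the script above, this sketch does not add a counter when the target directory already exists; it simply refuses to extract. Usage would be along the lines of extract_tar_auto somearchive.tar.gz.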

terdon

Posted 2012-12-09T04:44:05.957

Reputation: 45 216

@terdon You might've misunderstood the question; the OP probably has to deal with archives they didn't create themselves. Sometimes people create archives with several files/dirs in the root. What ark -ba does in this case is create a subdirectory with the same name as the archive sans extensions and unpack inside this subdir. I too would welcome a CLI program that would do this... – kralyk – 2016-05-04T09:03:43.537

@kralyk since the OP has accepted this answer, it presumably answers their question. Have a look at the comments under the question where I ask for clarification of what they mean by "subdirectory detection". What you seem to be asking for is different. You might want to ask a new question, either here or on [unix.se]. – terdon – 2016-05-04T09:10:57.400

@terdon I believe I'm on the same page as the OP: ark -a is what this is about. Check out ark --help to see what ark -a does. – kralyk – 2016-05-04T09:13:26.570

@kralyk have a look at the updated answer. I added a script that should do what you want. If it doesn't, please ask a separate question (here is fine but [unix.se] might be a better choice) and let me know about it and I'll try and answer. – terdon – 2016-05-04T10:19:01.430

No worries, I'll delete my comment. It was uncalled for, I agree. Nevertheless, I appreciate your effort. My apologies. – 该用户不存在 – 2012-12-09T22:50:38.700

Fair enough, deleting mine too, sorry I couldn't be of more help. – terdon – 2012-12-09T23:42:24.570

2

aunpack from atool does this by default.

Usage: aunpack <archive file>

Available from most Linux distribution repositories.
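
For several archives at once, a small loop around aunpack may be enough. This is only a sketch: unpack_each is a made-up wrapper name, and it assumes nothing about aunpack beyond the usage line above.

```shell
#!/usr/bin/env bash

# Hypothetical wrapper: run aunpack on each archive given as an argument.
unpack_each() {
    # Bail out early if atool's aunpack is not available
    if ! command -v aunpack >/dev/null 2>&1; then
        echo "aunpack not found; install atool first" >&2
        return 127
    fi
    local archive
    for archive in "$@"; do
        # Keep going with the remaining archives if one fails
        aunpack "$archive" || echo "failed to unpack '$archive'" >&2
    done
}
```

It could be called as unpack_each first.zip second.tar.gz, leaving the subdirectory decision for each archive to aunpack.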

kralyk

Posted 2012-12-09T04:44:05.957

Reputation: 211