How to tar directory and then remove originals including the directory?

32

7

I'm trying to tar a collection of files in a directory called 'my_directory' and remove the originals by using the command:

tar -cvf files.tar my_directory --remove-files

However it is only removing the individual files inside the directory and not the directory itself (which is what I specified in the command). What am I missing here?

EDIT:

Yes, I suppose the 'remove-files' option is fairly literal, although I too found the man page unclear on that point. (In Linux I tend not to distinguish much between directories and files, and sometimes forget that they are not the same thing.) It looks like the consensus is that it doesn't remove directories.

However, my main reason for asking this question stems from tar's handling of absolute paths. Because you must specify a relative path to the file(s) to be archived, you must change to the parent directory to tar it properly. As I see it, using any kind of follow-on 'rm' command is potentially dangerous in that situation. Thus I was hoping to simplify things by making tar itself do the remove.

For example, imagine a backup script where the directory to back up (i.e. tar) is held in a shell variable. If that variable's value was entered incorrectly, the script could end up deleting files from whatever directory you happened to be in last.
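To make that scenario concrete, here is a rough sketch of how such a script might guard itself. The function name archive_and_remove and every path in the demonstration are made up for illustration; the only tar feature relied on is the -C option, which lets tar change into the parent directory itself so the script never has to cd.

```shell
#!/bin/sh
# Sketch only: archive_and_remove is a hypothetical helper, not part of tar.
archive_and_remove() {
    target=$1
    archive=$2

    # Refuse empty or non-directory arguments before anything is deleted.
    if [ -z "$target" ] || [ -z "$archive" ] || [ ! -d "$target" ]; then
        echo "usage: archive_and_remove DIRECTORY ARCHIVE" >&2
        return 2
    fi

    parent=$(dirname "$target")
    name=$(basename "$target")
    case $name in
        /|.|..) echo "refusing to archive '$target'" >&2; return 2 ;;
    esac

    # -C makes tar enter the parent directory itself; the script never cd's.
    tar -cf "$archive" -C "$parent" "./$name" || return 1

    # Reached only if tar succeeded; the path is built from the checked parts.
    rm -rf -- "$parent/$name"
}

# Demonstration in a scratch directory (all paths here are made up).
scratch=$(mktemp -d)
mkdir -p "$scratch/my_directory/sub"
echo data > "$scratch/my_directory/sub/file"
archive_and_remove "$scratch/my_directory" "$scratch/files.tar"
```

Because the rm path is rebuilt from the same checked values that were handed to tar, a badly entered variable should cause an early return rather than a delete in the current directory.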

Nicholas

Posted 2010-01-17T11:25:07.563

Reputation: 741

@isync I seem to be experiencing --remove-files deleting directories on Ubuntu 14.04. Except in my case I don't want it to. Haha – Bradley Odell – 2016-12-07T01:22:32.973

Nicholas, your point that it adds danger to have to remove the directory tree in an extra step is absolutely valid. I think it should be possible to have this done safely by the archiver. I also believe this was the intention of the creators of GNU tar, at least it should have been ;-) – mit – 2011-11-14T14:58:58.447

I've found that the --remove-files option does indeed remove the containing dir - at least on some platforms/in some versions - and in my case. Might be that in your case the remaining dir wasn't completely empty due to some files being modified after being tar'ed. – isync – 2012-04-24T19:23:05.187

Answers

12

You are missing the part which says the --remove-files option removes files after adding them to the archive.

You could follow the archive and file-removal operation with a command like,

find /path/to/be/archived/ -depth -type d -empty -exec rmdir {} \;
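As a small illustration of what that find command does, the sketch below builds a throwaway tree (the directory names are invented) and runs the same cleanup on it. It assumes a find that supports -empty, as GNU and BSD find do.

```shell
#!/bin/sh
# Build a scratch tree: two nested empty directories and one with a leftover file.
scratch=$(mktemp -d)
mkdir -p "$scratch/archived/empty_a/empty_b" "$scratch/archived/kept"
echo leftover > "$scratch/archived/kept/file"

# -depth visits children before their parents, so empty_b is removed first,
# which leaves empty_a empty by the time find tests it.
find "$scratch/archived" -depth -type d -empty -exec rmdir {} \;
```

Directories that still contain something are left alone, so a file that tar failed to remove keeps its parent directory in place.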


Update: You may be interested in reading this short Debian discussion on,
Bug 424692: --remove-files complains that directories "changed as we read it".

nik

Posted 2010-01-17T11:25:07.563

Reputation: 50 788

--remove-files bug was fixed in tar-1.19. – x-yuri – 2016-02-13T13:08:18.773

Maybe it's actually: -c changes directory before tar starts its work (and kind of does not return until done)? I guess it would have deleted subdirectories, if those were included in the archive (but I have not tested that). – Arjan – 2010-01-17T11:56:08.023

@Arjan, I do not think 'c' has anything to do with this; 'remove-files' intentionally does not remove directories. – nik – 2010-01-17T12:56:25.780

Aha, I find the short explanation "remove files after adding them to the archive" from the man pages not too clear on that, but I assume you're right. Still, I'd not expect the directory mentioned for -c to be removed even if tar did remove directories as well. (To me, that would be like removing the current directory, hence including the archive itself, when not using -c...?) But if --remove-files always leaves directories in place then I'm surely just complicating things here. ;-) – Arjan – 2010-01-17T16:21:22.127

19

Since the --remove-files option only removes files, you could try

tar -cvf files.tar my_directory && rm -R my_directory

so that the directory is removed only if tar returns an exit status of 0.
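If you want a little more reassurance than the exit status alone, one possible variation is to read the archive back before deleting anything. This is only a sketch run in a scratch directory; the extra tar -t step is my own addition, not something the command above requires.

```shell
#!/bin/sh
# Work in a throwaway directory so nothing real is touched.
scratch=$(mktemp -d)
cd "$scratch" || exit 1
mkdir my_directory
echo data > my_directory/file

# Create the archive, then list it with -t as a basic sanity check;
# rm runs only if both tar invocations exit with status 0.
tar -cf files.tar my_directory &&
    tar -tf files.tar > /dev/null &&
    rm -R my_directory
```

Listing the archive does not prove every byte is intact, but it does catch a truncated or unreadable file before the originals are gone.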

pavium

Posted 2010-01-17T11:25:07.563

Reputation: 5 956

--remove-files bug was fixed in tar-1.19. – x-yuri – 2016-02-13T13:08:41.513

&& will only run the following command, if the previous command exited 0 (success). If it exits > 0, the following command won't be run. You can also reverse that with || -- only run if the first command failed. Good way to do terrible content check-restarts that one. – Kirrus – 2019-10-15T10:28:04.917

Except that you should check the exit status of tar before doing the rm! Otherwise you might be left with no tar archive and no files... – Kim – 2010-01-17T13:17:44.857

When using one-level directories, I believe a safer option would be using 'rmdir' rather than 'rm', as it would only remove an empty directory. [See question edits] – Nicholas – 2010-01-19T22:23:07.387

But rmdir only removes empty directories. The idea was to remove the directory and the files in it (provided tar is successful) – pavium – 2010-01-20T02:13:44.860

7

Have you tried to put --remove-files directive after archive name? It works for me.

tar -cvf files.tar --remove-files my_directory

Robert Grubba

Posted 2010-01-17T11:25:07.563

Reputation: 71

It's more likely that the behavior of tar has changed since this question was posed. For me, there's no difference in putting --remove-files before or after my_directory; in both cases, the directory is removed. – redburn – 2015-05-17T09:04:57.017

--remove-files bug was fixed in tar-1.19. – x-yuri – 2016-02-13T13:08:46.520

1

source={directory argument}

e.g.

source={FULL ABSOLUTE PATH}/my_directory

 

parent={parent directory of argument}

e.g.

parent={ABSOLUTE PATH of the parent of 'my_directory'}/

 

logFile={path to a run log that captures status messages}

Then you could execute something along the lines of:

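# Stop if the parent directory cannot be entered, so nothing is archived or removed from the wrong place
cd "${parent}" || exit 1
# Timestamped archive name, e.g. Tar_File.20100117_112507
tar cvf "Tar_File.$(date +%Y%m%d_%H%M%S)" "${source}"
if [ $? -ne 0 ]
then
 echo "Backup FAILED for ${source} at $(date)" >> "${logFile}"
else
 echo "Backup SUCCESS for ${source} at $(date)" >> "${logFile}"
 # Remove the originals only after tar reported success
 rm -rf "${source}"
fi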

shellking

Posted 2010-01-17T11:25:07.563

Reputation: 96

1

This was probably a bug.

Also, the word "file" is ambiguous in this case. But because this is a command-line switch, I would expect it to cover directories as well, because in Unix/Linux everything is a file, including a directory. (The other interpretation is of course also valid, but it makes no sense to keep directories in such a case. I would consider it unexpected and confusing behavior.)

But I have found that on some distributions GNU tar actually removes the directory tree. That is another indication that keeping the tree was a bug, or at least a workaround until they fixed it.

This is what I tried out on an Ubuntu 10.04 console:

mit:/var/tmp$ mkdir tree1                                                                                               
mit:/var/tmp$ mkdir tree1/sub1                                                                                          
mit:/var/tmp$ > tree1/sub1/file1                                                                                        

mit:/var/tmp$ ls -la                                                                                                    
drwxrwxrwt  4 root root 4096 2011-11-14 15:40 .                                                                              
drwxr-xr-x 16 root root 4096 2011-02-25 03:15 ..
drwxr-xr-x  3 mit  mit  4096 2011-11-14 15:40 tree1

mit:/var/tmp$ tar -czf tree1.tar.gz tree1/ --remove-files

# AS YOU CAN SEE THE TREE IS GONE NOW:

mit:/var/tmp$ ls -la
drwxrwxrwt  3 root root 4096 2011-11-14 15:41 .
drwxr-xr-x 16 root root 4096 2011-02-25 03:15 ..
-rw-r--r--  1 mit   mit    159 2011-11-14 15:41 tree1.tar.gz                                                                   


mit:/var/tmp$ tar --version                                                                                             
tar (GNU tar) 1.22                                                                                                           
Copyright © 2009 Free Software Foundation, Inc.

If you want to see it on your machine, paste this into a console at your own risk:

tar --version                                                                                             
cd /var/tmp
mkdir -p tree1/sub1                                                                                          
> tree1/sub1/file1                                                                                        
tar -czf tree1.tar.gz tree1/ --remove-files
ls -la

mit

Posted 2010-01-17T11:25:07.563

Reputation: 1 369