How can I recursively copy files by file extension, preserving directory structure?

73

40

At the Linux command line, I'd like to copy a (very large) set of .txt files from one directory (and its subdirectories) to another.

I need the directory structure to stay intact, and I need to ignore files except those ending in .txt.

unclaimedbaggage

Posted 2011-06-21T04:41:29.883

Reputation: 845

2Having cp and find as tags in your question, does it mean that you're tied to these options? Since your dataset is very large, it makes sense to assume the copying process can get interrupted for some reason and you'll have to restart it. I'm not sure the find/cp approach will be able to resume the transfer and copy only the missing part. If you aren't tied to find/cp, you could consider rsync, which is smarter. Its --include and --exclude options will allow you to copy only the .txt files. – vtest – 2011-06-21T08:42:34.580

Fair call - rsync probably is the better option. Not tied to find/cp. (I used them anyway - rsync wasn't installed on the remote machine, it was a live web server & I wanted to leave as small of a footprint as possible) – unclaimedbaggage – 2011-06-22T00:08:17.757
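For reference, a minimal sketch of the rsync approach suggested in the comments above, assuming rsync is installed. The sandbox directory and file names are made up for illustration. The order of the filter rules matters: directories are kept so rsync can descend into them, .txt files are kept, and everything else is dropped.

```shell
#!/bin/sh
# Hypothetical sandbox; substitute real source and destination paths.
sandbox=$(mktemp -d)
mkdir -p "$sandbox/src/a/b" "$sandbox/dest"
touch "$sandbox/src/a/b/note.txt" "$sandbox/src/a/skip.log"

if command -v rsync >/dev/null 2>&1; then
    # -a: recurse and preserve metadata
    # -m: prune directories that would end up empty at the destination
    # Filter rules are checked in order: keep directories, keep *.txt, drop the rest.
    rsync -am --include='*/' --include='*.txt' --exclude='*' \
        "$sandbox/src/" "$sandbox/dest/"
fi
```

If the transfer is interrupted, running the same command again should skip the files that already arrived intact.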

Answers

98

You can use find and cpio to do this:

cd /top/level/to/copy
find . -name '*.txt' | cpio -pdm /path/to/destdir

(Use -updm instead to overwrite files that already exist in the destination.)

user35787

Posted 2011-06-21T04:41:29.883

Reputation:

Why -m? I thought it's just to retain the file modification date. – Mubashar – 2017-10-27T00:03:26.757
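On the -m question: as far as I can tell from the GNU cpio manual, the flags break down as in the commented sketch below, so -m is optional if you don't care about timestamps. The sandbox paths are invented for the demo, and the block does nothing if cpio is not installed.

```shell
#!/bin/sh
# Hypothetical sandbox; substitute real source and destination paths.
sandbox=$(mktemp -d)
mkdir -p "$sandbox/src/docs/old" "$sandbox/dest"
touch "$sandbox/src/docs/old/readme.txt" "$sandbox/src/docs/image.png"

if command -v cpio >/dev/null 2>&1; then
    (
        cd "$sandbox/src" || exit 1
        # -p: pass-through (copy) mode, reading file names from stdin
        # -d: create leading directories in the destination as needed
        # -m: keep each file's modification time instead of the copy time
        find . -type f -name '*.txt' | cpio -pdm "$sandbox/dest"
    )
fi
```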

7

cd /source/path
find -type f -name \*.txt -exec install -D {} /dest/path/{} \;

sborsky

Posted 2011-06-21T04:41:29.883

Reputation: 391

1You're missing a . after find. Also on macOS 10.13.1, this worked: find . -type f -name "*.txt" -exec install -v {} /dest/path/{} \; – grim – 2017-12-02T18:43:06.607
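Two caveats with the install approach, sketched below under the assumption of GNU find and GNU install on Linux: install sets mode 755 on the copy unless told otherwise, so -m 644 is probably what you want for text files, and substituting {} inside a longer argument such as /dest/path/{} is not guaranteed by POSIX. The sandbox paths are hypothetical.

```shell
#!/bin/sh
# Hypothetical sandbox; substitute real source and destination paths.
sandbox=$(mktemp -d)
mkdir -p "$sandbox/src/a/b" "$sandbox/dest"
touch "$sandbox/src/a/b/note.txt" "$sandbox/src/a/skip.log"

(
    cd "$sandbox/src" || exit 1
    # -D: create all leading directories of the target path
    # -m 644: plain read/write permissions instead of install's default 755
    find . -type f -name '*.txt' -exec install -D -m 644 {} "$sandbox/dest/{}" \;
)
```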

3

Another approach

find . -name '*.txt' -exec rsync -R {} path/to/dext \;

Marc

Posted 2011-06-21T04:41:29.883

Reputation: 131

I like this solution. I used find . -iname '*.txt' -exec rsync -Rptgon {} path/to/dext \; to do a case-insensitive match and to preserve ownership and permissions. – MountainX – 2018-03-04T08:25:19.233

1

I was trying to do the same thing on macOS, but none of the options really worked for me, until I discovered ditto.

I had to copy many .wav files, and have it skip Video files... So here is what I came up with:

find . -type f -iname "*.wav" -ls -exec ditto {} /destination/folder/{} \;

  • find . - Runs find in the current folder. Make sure you cd /source/folder before you start.
  • -type f - Restricts the search to regular files.
  • -iname "*.wav" - Matches *.wav case-insensitively.
  • -ls - Prints each file as it is processed. Otherwise nothing is shown.
  • -exec ditto {} /destination/folder/{} \; - Does all the work of copying the files and recreating the same directory tree.

Benjamin McGuire

Posted 2011-06-21T04:41:29.883

Reputation: 11

1

How about you first copy everything over with

cp -r /old/folder /new/folder

then go to the new folder and run

find . -type f ! -iname "*.txt" -delete

or just

cp -r /old/folder /new/folder && find /new/folder -type f ! -iname "*.txt" -delete

Edit: OK, you want one command that filters while copying (I have not tested this because my system doesn't have the cpio command!). Here is where I found it: http://www.gnu.org/software/findutils/manual/html_mono/find.html#Copying-A-Subset-of-Files

find . -name "*.txt" -print0 |
     cpio -pmd0 /dest-dir

Please test this first, because I haven't tried it yet. If someone would verify, that would be great.

Dennis

Posted 2011-06-21T04:41:29.883

Reputation: 308

You should keep the 0 in -pmd0 and add -print0 to the end of the find command (just before the |). – G-Man Says 'Reinstate Monica' – 2014-09-11T21:51:42.667

why? (this is classic of the voodoo involved in *sh command line invocations) – jheriko – 2019-10-31T16:02:05.490

nods Cheers - this would work, but without filtering to .txt I'm looking at a few million files (coming out at a few hundred GB). If need be I may have to, but I'd love to filter while copying if possible – unclaimedbaggage – 2011-06-21T05:09:11.383

1Cheers, edited version works if I remove the '0' from -pmd0 – unclaimedbaggage – 2011-06-22T00:06:28.820
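On the "why?" above: -print0 and cpio's 0 flag only make sense as a pair. With -print0, find separates names with a NUL byte instead of a newline, and the reader has to be told to split on NUL as well. The point is that a file name may legally contain a newline but never a NUL. A small sketch of the difference, using a made-up sandbox:

```shell
#!/bin/sh
# Hypothetical sandbox with one ordinary name and one name containing a newline.
sandbox=$(mktemp -d)
mkdir -p "$sandbox/src/sub"
nl='
'
touch "$sandbox/src/sub/plain.txt" "$sandbox/src/sub/two${nl}lines.txt"

# Newline-delimited output: a line-based reader sees more "names" than files.
by_line=$(find "$sandbox/src" -type f -name '*.txt' | wc -l)

# NUL-delimited output: counting the NUL separators gives one per file.
by_nul=$(find "$sandbox/src" -type f -name '*.txt' -print0 | tr -cd '\0' | wc -c)

echo "newline-delimited entries: $by_line, NUL-delimited entries: $by_nul"
```

If none of your file names contain newlines, the plain pipeline without either flag behaves the same, which would be consistent with it working once the 0 was removed.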

1

Easiest way that worked for me:

cp --parents -R jobs/**/*.xml ./backup/

One catch: you have to navigate to the desired directory first so that the parent path is correct.

Also make sure that you enabled recursive globs in bash:

shopt -s globstar

icyerasor

Posted 2011-06-21T04:41:29.883

Reputation: 229

0

Navigate to directory:

find . -regex '<regexp_to_get_directories_and_files_you_want>' | xargs -I {} cp -r --parents {} path/to/destination

It's a bit more straightforward and powerful, if you are comfortable with regular expressions.
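To make the placeholder concrete, here is a hedged sketch assuming GNU find, xargs and cp. Note that -regex matches against the whole path, not just the file name, and that -regextype posix-extended (a GNU extension) avoids backslash-escaping the group and alternation. The extensions and sandbox paths are just examples.

```shell
#!/bin/sh
# Hypothetical sandbox; substitute real source and destination paths.
sandbox=$(mktemp -d)
mkdir -p "$sandbox/src/a/b" "$sandbox/dest"
touch "$sandbox/src/a/b/note.txt" "$sandbox/src/a/readme.md" "$sandbox/src/a/skip.log"

(
    cd "$sandbox/src" || exit 1
    # The regex must match the full path (e.g. ./a/b/note.txt), hence the leading .*
    # -print0 / -0 keep names with spaces or newlines intact.
    find . -regextype posix-extended -type f -regex '.*\.(txt|md)' -print0 |
        xargs -0 -I {} cp --parents {} "$sandbox/dest"
)
```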

keywalker

Posted 2011-06-21T04:41:29.883

Reputation: 1

-1

Navigate to directory:

cp *.css /path/to/destination

You'll have to navigate to each folder in the directory, but this is better than most of the options I've seen so far.

Phoenix

Posted 2011-06-21T04:41:29.883

Reputation: 1

This method isn't recursive, meaning that for large directories you could be doing this for quite a while... – Iain Reid – 2017-02-16T16:33:31.437
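If the goal is to avoid visiting each folder by hand, the same per-folder copy can be driven by find. This is a sketch that sticks to POSIX tools (find, mkdir -p, cp), so it should not depend on GNU-only options like cp --parents. The sandbox paths and the .css extension are only for illustration.

```shell
#!/bin/sh
# Hypothetical sandbox; substitute real source and destination paths.
sandbox=$(mktemp -d)
mkdir -p "$sandbox/src/site/theme" "$sandbox/dest"
touch "$sandbox/src/site/theme/main.css" "$sandbox/src/site/index.html"

(
    cd "$sandbox/src" || exit 1
    # find hands batches of matching paths to a small inline script.
    # The first argument is the destination root; the rest are relative paths.
    find . -type f -name '*.css' -exec sh -c '
        dest=$1
        shift
        for f do
            # Recreate the parent directory, then copy, keeping mode and times.
            mkdir -p "$dest/$(dirname "$f")" && cp -p "$f" "$dest/$f"
        done' sh "$sandbox/dest" {} +
)
```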