Traverse a directory tree and output a list of all files

2

1

Write the shortest code that traverses a directory tree and outputs a flat list of all files.

  • It should traverse the whole directory tree
  • The order in which it enters sub-directories doesn't matter
  • The directories should not be output
  • The output should be a plain list — the definition is flexible, but it shouldn't contain irrelevant data
  • You are free to use any programming technique

Mirzhan Irkegulov

Posted 2014-03-14T13:53:41.183

Reputation: 185

2Two questions. 1: what about files in subdirectories? 2: If I do it in bash, can I just use ls (or, recursively, ls -R)? – user12205 – 2014-03-14T14:04:24.140

Answers

8

find 14 characters (12 on GNU)

find . -type f

Run this in the directory you want to list. Add -printf '%f\n' to only print names without the path.

Non-portable: find -type f.

orion

Posted 2014-03-14T13:53:41.183

Reputation: 3 095

2GNU find seems to let you omit the . for a 2 char saving. Doesn't seem to work with the BSD find in OSX though. – Digital Trauma – 2014-03-14T18:16:47.000

'-printf' is a GNU find extension. GNU find has assimilated parts of the functionality of 'stat' and 'xargs'. These extensions are not available everywhere in the Unix world (as the name GNU kindly warns: GNU's Not Unix!). – None – 2014-03-20T14:16:24.807

I sometimes wonder if non-GNU tools have any functionality at all. Everything that I use seems to be GNU-specific :) – orion – 2014-03-20T14:28:58.063

6

Pure Bash, 7

"ls -R" and "find" are the obvious, already mentioned common shell utilities. For completeness, we also have this pure bash option (no utility programs):

echo **

This requires the "globstar" shell option to be set with shopt -s globstar.

Note:

This works only for Bash 4.0 and greater.
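For comparison, here is a rough Python sketch of the same recursive-glob idea. The function name glob_files is made up for illustration. Note that ** also matches directories, so the sketch filters them out with os.path.isfile to satisfy the "no directories" rule.

```python
import glob
import os


def glob_files(root="."):
    """Return every regular file matched by a recursive '**' glob under root."""
    # glob.escape protects any glob metacharacters that appear in root itself.
    pattern = os.path.join(glob.escape(root), "**")
    # recursive=True makes '**' descend into sub-directories, like globstar.
    # The match list also contains directories, so keep only regular files.
    return [p for p in glob.glob(pattern, recursive=True) if os.path.isfile(p)]
```

Calling glob_files(".") in the directory you want to list returns the paths as a Python list.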

Digital Trauma

Posted 2014-03-14T13:53:41.183

Reputation: 64 644

4

Python

Aleksi Torhamo's answer is the correct and idiomatic one. My answer is just an exercise in what one could do if os.walk did not exist.

I was inspired by an elegant recursive function for Emacs Lisp that does the same.

from os import *
from os.path import *
def walk(path):
    return [basename(f) for f in listdir2(path) if not isdir(f)] +\
        sum([walk(f) for f in listdir2(path) if isdir(f)], [])

listdir2 here is a non-existent function that would return the absolute paths of the entries in a directory. Unfortunately, listdir doesn't return absolute paths, so I had to enter and exit directories manually with chdir, and my result wasn't a slick one-liner.

from os import *
from os.path import *
def walk(path, old_path):
    chdir(path)
    result = [f for f in listdir(path) if not isdir(f)] +\
        sum([walk(join(path, f), getcwd()) for f in listdir(path) if isdir(f)], [])
    chdir(old_path)
    return result

To understand it you must read about os.chdir, os.getcwd, os.listdir, os.path.isdir, os.path.basename, os.path.join, sum. The function builds a list of the files in the current directory and recursively repeats this for each sub-directory. The lists returned for the sub-directories are concatenated with sum and appended to the list of files.
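As one of the comments on this answer suggests, the chdir calls can be avoided by joining each name onto path before testing it. A minimal, non-golfed sketch of that variant (walk_files is a name chosen for illustration, not the code from the answer):

```python
import os


def walk_files(path):
    """Return the names of all files below path, without changing directory."""
    # Build full paths once so isdir() works regardless of the current directory.
    entries = [os.path.join(path, name) for name in os.listdir(path)]
    files = [os.path.basename(e) for e in entries if not os.path.isdir(e)]
    subdirs = [e for e in entries if os.path.isdir(e)]
    # sum(list_of_lists, []) concatenates the per-directory results.
    return files + sum([walk_files(d) for d in subdirs], [])
```

Like the original, this follows symbolic links to directories, so a link that points back up the tree would make it recurse forever.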

We iterate through the list twice, which is inefficient. It is better to partition the list into files and directories in one pass. This answer proposes a smart use of reduce:

reduce(lambda x, y: x[not p(y)].append(y) or x, l, ([], []))
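To make the trick easier to follow, here is the same expression wrapped in a small function. The name partition and its parameters are illustrative only. In Python 3, reduce has to be imported from functools.

```python
from functools import reduce


def partition(predicate, items):
    """Split items into (matching, non_matching) in a single pass."""
    # The accumulator x is a pair of lists. `not predicate(y)` is False (index 0)
    # when the predicate holds and True (index 1) otherwise.
    # list.append returns None, so `... or x` hands the accumulator back to reduce.
    return reduce(lambda x, y: x[not predicate(y)].append(y) or x, items, ([], []))
```

For example, partition(os.path.isdir, names) would give the directories first and everything else second.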

Attempt #1 (179 171)

from os import*
def w(p,o):chdir(p);r=[f for f in listdir(p)if not path.isdir(f)]+sum([w(path.join(p,f),getcwd())for f in listdir(p)if path.isdir(f)],[]);chdir(o);return r

Attempt #2 (190 178)

from os import *
def u(p,o):chdir(p);a,b=reduce(lambda x,y:x[path.isdir(y)].append(y)or x,listdir(p),([],[]));r=a+sum((u(path.join(p,c),getcwd())for c in b),[]);chdir(o);return r

Mirzhan Irkegulov

Posted 2014-03-14T13:53:41.183

Reputation: 185

4Hi, and welcome to PPCG! In [tag:code-golf] it's OK to do the inefficient approach if it saves you bytes. :) – Jonathan Van Matre – 2014-03-14T14:20:49.510

I agree with @JonathanVanMatre here, a) Welcome, and b) code golf doesn't care so long as you bring it down to as few bytes as possible... The 179 approach is pretty cool... Very nice SindiKat! – WallyWest – 2014-03-15T07:39:50.330

Small tip for golfing in python that could save you a byte or two: You'd be surprised at some places that whitespace isn't required. For example, from os import* is valid. – undergroundmonorail – 2014-03-15T08:09:23.897

@sindikat: nice. when I wrote that comment, I actually thought changing the walk(f) to walk(path+'/'+f) would've been enough, I didn't realize the isdir() calls used f too. I think it might still be shorter to do the path+'/'+f thing in all three places instead of passing old_dir and doing two chdir()s, though. (Although os.path.join is definitely the correct way to do it :) Changing the strategy a bit and doing the isdir() straight to the path argument would probably be even shorter. – Aleksi Torhamo – 2014-03-17T06:04:49.503

4

DOS (11 characters)

dir/s/b/a-d

I love the simplicity: /s makes dir traverse all subfolders, /b suppresses the header and directory information, and /a-d suppresses the output of directories. After a quick check that the spaces between the switches weren't required, I was happy enough to submit this...

WallyWest

Posted 2014-03-14T13:53:41.183

Reputation: 6 949

+1. Though this breaks the third rule The directories should not be output. Add /a-d to suppress the outputting of directories. – unclemeat – 2014-03-17T02:33:04.067

Or better yet /aa which will only show Files ready for archiving - which should be all files. Might not always be as accurate as /a-d but it saves a character. – unclemeat – 2014-03-17T02:50:17.970

1Hold on a second... "The directories should not be output" is a bit vague... Does that mean that the header information with regards to the directory should not be output, or that the directory path cannot be output?

Big difference... – WallyWest – 2014-03-17T10:12:10.270

@sindikat, can you please clarify? Are directory paths allowed in my case? What's the point of listing all files in a filesystem without a point of reference? Using /a-d strips the output of this reference point? So if anything UncleMeat's edit of my code screws this up, right? – WallyWest – 2014-03-17T10:14:40.313

good point, I took it as 'only enumerate all files in the tree'. Using /a-d doesn't ruin the point of reference, as each file has its entire directory path. It just removes outputting the directories themselves. – unclemeat – 2014-03-17T21:36:15.083

3

Python3 - 53 characters

Two equally long ways for doing it:

import os;print(*sum((x[2]for x in os.walk('.')),[]))

and

import os;print(*sum(list(zip(*os.walk('.')))[2],[]))

These print the filenames separated by spaces on one line; I assumed that counts as a "flat list".
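An ungolfed sketch of what the first one-liner does may help: os.walk yields one (dirpath, dirnames, filenames) tuple per directory, so x[2] is that directory's list of file names, and sum(..., []) concatenates those lists. The function names below are illustrative, and the itertools variant is an alternative added here, not part of the answer.

```python
import os
from itertools import chain


def names_with_sum(root="."):
    """Concatenate the per-directory filename lists, as the one-liner does."""
    return sum((x[2] for x in os.walk(root)), [])


def names_with_chain(root="."):
    """Same result, but avoids re-copying the growing list on every addition."""
    return list(chain.from_iterable(files for _, _, files in os.walk(root)))
```

print(*names_with_sum('.')) then reproduces the space-separated output.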

Aleksi Torhamo

Posted 2014-03-14T13:53:41.183

Reputation: 1 871

3

C - 203 194

This example works when compiled with TCC:

t(char*p){DIR*d;struct dirent*e;if(d=opendir(p))while(e=readdir(d)){char*n=e->d_name,f[PATH_MAX];sprintf(f,"%s/%s",p,n);if((*n++-46|*n&(*n++-46|*n))&&!t(f))printf("%s\n",f);}return!closedir(d);}

This recursive function displays the path of every file in the folder whose path is given as a parameter.

Note: one character had to be added to get the same result with GCC:

int t(char*p){DIR*d;struct dirent*e;if(d=opendir(p))while(e=readdir(d)){char f[PATH_MAX];char*n=e->d_name;sprintf(f,"%s/%s",p,n);if((*n++-46|*n&&(*n++-46|*n))&&!t(f))printf("%s\n",f);}return!closedir(d);}

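I replaced & with && in the condition. I first guessed that TCC and GCC give & and && different priorities, but in standard C the precedence is fixed (& binds tighter than |, which binds tighter than &&), so the two conditions are genuinely different expressions. The differing results more likely come from n being modified by n++ and read again in the same expression without a sequence point, which each compiler is free to evaluate in its own order.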

See the full working example here: http://ideone.com/lGBtJw

C - 220 215 (full program with include)

I liked that Coaumdio used a full program to answer... so I started adapting mine, here is the result:

#include<dirent.h>
main(i,char**p){DIR*d=opendir(*++p);char*n,f[4096],*a=f;struct dirent*e;d||puts(*p);if(d)while(e=readdir(d))sprintf(f,"%s/%s",*p,n=e->d_name),(*n++-46|*n&&(*n++-46|*n))&&main(i,&a-1);closedir(d);}

Expanded version:

#include<dirent.h>

int main(int i,char**p){
    DIR*d=opendir(*++p);
    char f[4096],*n,*a=f;
    struct dirent*e;
    d||puts(*p);
    if(d)while(e=readdir(d))
        sprintf(f,"%s/%s",*p,n=e->d_name),
        (*n++-46|*n&&(*n++-46|*n))&&main(i,&a-1);
    closedir(d);
}

Usage:

gcc list_files.c -o list_files
./list_files /var

Mathieu Rodic

Posted 2014-03-14T13:53:41.183

Reputation: 1 170

2

Bash, 18

ls -R

Windows CMD.

dir /S

Compliant version

ls -R1|grep -v ^./

Clyde Lobo

Posted 2014-03-14T13:53:41.183

Reputation: 1 395

ls -R unfortunately prints the headers for each directory. It may not count as a "flat list". – orion – 2014-03-14T14:40:08.873

As does dir /S, this also displays the header information... – WallyWest – 2014-03-15T07:35:36.013

2

C, 216 228 chars (#include included)

The other C solution contains nice tricks, but as I like complete, compilable C solutions, here is my shot:

#include <dirent.h>
main(int _,char**v){char n[4096],*a=n,*b;DIR*d;struct dirent*e;(d=opendir(*++v))||puts(*v);while(d&&(e=readdir(d)))b=e->d_name,sprintf(n,"%s/%s",*v,b),strcmp(".",b)&&strcmp("..",b)&&main(0,&a-1);}

It's also a recursive traversal algorithm, and it's using the main function as recursive function. The main function tries to open the path contained in argv[1]. If it fails, it prints it, otherwise, it calls main on every sub-element (if it's different from "." or "..").
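The "try to open it as a directory, and print it if that fails" idea can be sketched in Python as follows. This is only an illustration of the technique, not a translation of the C code: list_tree and its out parameter are made-up names, and unlike the C version it treats only "not a directory" failures as files, so other errors such as missing permissions propagate instead of being printed.

```python
import os


def list_tree(path, out):
    """Append every non-directory path below path to the list out."""
    try:
        names = os.listdir(path)  # plays the role of opendir() + readdir()
    except NotADirectoryError:
        out.append(path)          # could not be opened as a directory: a file
        return
    for name in names:            # os.listdir never returns "." or ".."
        list_tree(os.path.join(path, name), out)
```

After files = [] and list_tree("/var/log", files), the list holds one path per file.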

Usage

It's a standalone, working C program:

$ gcc dir.c -o dir
$ ./dir /var/log
/var/log/kern.log.1
/var/log/alternatives.log.1
/var/log/user.log.2.gz
/var/log/nvidia-installer.log
/var/log/debug.4.gz
/var/log/wtmp.1
/var/log/lastlog
/var/log/messages.2.gz
/var/log/vmware-installer
/var/log/pm-powersave.log
/var/log/syslog.4.gz
/var/log/news/news.err
/var/log/news/news.notice
/var/log/news/news.crit
/var/log/apt/history.log.3.gz
/var/log/apt/history.log.9.gz
/var/log/apt/history.log.4.gz
/var/log/apt/term.log.10.gz
/var/log/apt/history.log.10.gz
/var/log/apt/term.log.4.gz

Tips

I used a few (somewhat classic) C tricks:

  • replace if(condition)statement with condition && statement
  • replace if(!condition)statement with condition || statement
  • use the , operator to save { and } in the while code

Here is a readable version:

#include <dirent.h>
main(int _,char**v) {
    char n[4096], *a=n,*b;
    DIR *d; 
    struct dirent *e; 
    (d = opendir(*++v)) || puts(*v);
    while(d && (e = readdir(d)))
        b = e->d_name,
        sprintf(n, "%s/%s", *v, b), 
        strcmp(".", b)&&strcmp("..", b)&&
            main(0, &a-1);
}

Edit

I had to insert a closedir(d); statement at the end, because a limit is reached when too many folders are open (wild guess: the open files limit) and the program stops. This happens only for folders containing many other folders.

The code is now:

#include <dirent.h>
main(int _,char**v){char n[4096],*a=n,*b;DIR*d;struct dirent*e;(d=opendir(*++v))||puts(*v);while(d&&(e=readdir(d)))b=e->d_name,sprintf(n,"%s/%s",*v,b),strcmp(".",b)&&strcmp("..",b)&&main(0,&a-1);closedir(d);}
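The handle problem described in this edit can be illustrated in Python with os.scandir, which also holds an open directory handle while it is being iterated. The sketch below is a variation rather than a port: it closes each handle before descending, so only one directory is open at a time, whereas the C fix closes a handle after the recursion into it returns. The function name is made up for illustration.

```python
import os


def files_closing_handles(path):
    """List file paths below path, holding at most one directory handle open."""
    files, subdirs = [], []
    # The with-block closes the directory handle as soon as iteration is done.
    with os.scandir(path) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                subdirs.append(entry.path)
            else:
                files.append(entry.path)
    # Recurse only after the handle above has been released.
    for sub in subdirs:
        files.extend(files_closing_handles(sub))
    return files
```

Because symbolic links are not followed here, a link to a directory is reported as a file rather than traversed.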

Coaumdio

Posted 2014-03-14T13:53:41.183

Reputation: 141

Without closedir(d), this will abruptly stop if you traverse a big directory (for instance, /)... I also tried to do without the closedir, but had to retract for this reason. – Mathieu Rodic – 2014-03-20T13:19:13.957

Oh c**p, you're absolutely right! I thought not using closedir was only dirty. Well that's 12 more chars then. – Coaumdio – 2014-03-20T13:37:50.830

1

Ruby 18 characters

puts Dir['./**/*']

peter

Posted 2014-03-14T13:53:41.183

Reputation: 111

1

Groovy 58

new File("/").eachFileRecurse(){if(it.isFile())println it}

md_rasler

Posted 2014-03-14T13:53:41.183

Reputation: 201

1

CMD - 32 Bytes

Another CMD / DOS method -

for /r %a in (*)do @echo %~dpnxa

The /r switch recursively looks through a directory (defaulting to current directory), while %~dpnxa outputs %a, the current directory or file in the loop, to the following output format -

d - drive
p - path
n - name
x - extension

Only files are output and directories are ignored, because for without the /d switch matches the wildcard against file names only.
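To show what the four modifiers split out, here is a small Python sketch using the standard ntpath module, which applies Windows path rules on any platform. The function name dpnx and the sample path are invented for illustration. One difference from CMD: %~p keeps a trailing backslash on the path part, while ntpath.split drops it.

```python
import ntpath


def dpnx(path):
    """Split a Windows path into (drive, path, name, extension)."""
    drive, rest = ntpath.splitdrive(path)   # d: "C:"
    folder, filename = ntpath.split(rest)   # p: "\logs\app" (no trailing "\")
    name, ext = ntpath.splitext(filename)   # n: "error", x: ".log"
    return drive, folder, name, ext
```

For a hypothetical file C:\logs\app\error.log this gives the drive C:, the folder \logs\app, the name error and the extension .log.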

unclemeat

Posted 2014-03-14T13:53:41.183

Reputation: 2 302

1

Groovy 45

Now even shorter:

new File(".").traverse{if(it.file)println it}

Uses one of the traverse() methods available since Groovy 1.7.1.

Groovy 46 ;-)

Another nice solution, if you don't count the required import statement, is this one:

import groovy.io.FileType.*
new File(".").traverse type:FILES,{println it}

t0r0X

Posted 2014-03-14T13:53:41.183

Reputation: 153

0

SmileBASIC, 78 bytes

DIM F$[0],D$[0]FILES"//",D$@L
FILES POP(D$),F$WHILE LEN(F$)?POP(F$)WEND
GOTO@L

Displays a * or (please don't remove my spaces) before each filename, depending on whether it's a text file (is this allowed?)

12Me21

Posted 2014-03-14T13:53:41.183

Reputation: 6 110

0

JAVA 87

void t(File f){if(f.isFile())System.out.println(f);else for(File c:f.listFiles())t(c);}

Tobias

Posted 2014-03-14T13:53:41.183

Reputation: 121