Find and sort numerical filenames

1

I am trying to find files that match time=* and then display them sorted numerically.
The resulting file names would be:

first/path/time=001.jpg
first/path/time=002.jpg
second/path/time=001.jpg
...

which I want to see as,

first/path/time=001.jpg
second/path/time=001.jpg
first/path/time=002.jpg
...

sorted numerically with respect to the 3 digits on file name.

For now, I tried find . -name time=* | rev | sort | rev

which does work for single digits, but not for numbers like 019 and 021.

Full path would be something like,

path/to/folder1/alpha=0.1_beta=0.2_gamma=1.0/time=001.jpg
path/to/folder1/alpha=0.1_beta=0.2_gamma=0.1/time=001.jpg
path/to/folder2/alpha=0.1_beta=0.2_gamma=0.1/time=001.jpg
.
.
.

I think it would be easiest if the files could be sorted using only the last 7 characters (001.jpg, 010.jpg, ...). Sadly, sort does not support negative indexing to get the last 7 characters :(

hadi k

Posted 2018-01-23T10:07:47.393

Reputation: 203

1

(1) What OS? Looks like some Unix, still you should make it clear. (2) Quote the pattern, or else...

– Kamil Maciorowski – 2018-01-23T10:23:42.270

hey, Its Linux and what do you mean by quote the pattern? – hadi k – 2018-01-23T10:29:49.153

I mean exactly what my answer says under this link I have already given you.

– Kamil Maciorowski – 2018-01-23T10:31:53.123

hi thanks, but that was not the problem for me. I don't have a time=* file in the current directory to have that problem. But definitely a new thing to learn! thanks again! i will keep this in mind for the future. – hadi k – 2018-01-23T11:45:19.673

@KamilMaciorowski oh no. i just meant time=001.jpg time=002.jpg and so on. – hadi k – 2018-01-23T13:57:14.790

Answers

2

This should do:

find . -name 'time=*' | sort -t= -k3

But this is a safer way to sort, as per Kamil Maciorowski's comment:

perl -C -F= -wnle 'push @a,[$_,split(/\./,$F[-1])]; END {$,="\n"; print map{$$_[0]} sort{$$a[1]<=>$$b[1]} @a}'

Use it in the pipeline after find, in place of the sort command from the first example.
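To see how the -t= -k3 combination picks its key, here is a minimal sketch that feeds a few made-up paths (in the shape the asker describes in the comments, with exactly two = per path) to sort through printf instead of find. The variable names and the third/path entry are illustrative assumptions, and LC_ALL=C is added only to make the ordering predictable.

```shell
# Made-up sample paths, each containing exactly two "=" characters.
paths='first/path/parameter=1/time=002.jpg
second/path/parameter=1/time=001.jpg
first/path/parameter=0.1/time=001.jpg'

# -t= splits each line on "="; -k3 sorts on field 3 to end of line ("001.jpg", ...).
# The comparison is lexical, which is fine for zero-padded numbers.
sorted=$(printf '%s\n' "$paths" | LC_ALL=C sort -t= -k3)
printf '%s\n' "$sorted"

# Limitation: a path with a different number of "=" has a different field 3.
# Here the extra path has only one "=", so its key is empty and it sorts
# ahead of the rest even though its number is the largest.
mixed=$(printf '%s\nthird/path/time=003.jpg\n' "$paths" | LC_ALL=C sort -t= -k3)
printf '%s\n' "$mixed"
```

The field number has to match the count of = characters in the paths, so this only suits regular directory names.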

user534635

Posted 2018-01-23T10:07:47.393

Reputation:

Unless you encounter third/path/a=foo/bar/.... – Kamil Maciorowski – 2018-01-23T12:58:34.100

@KamilMaciorowski I am trying to find files that match time=* says the question. – None – 2018-01-23T13:00:28.197

Add third/path/a=foo/bar/time=001.jpg to the set provided by the OP. It will end up after first/path/time=002.jpg because of the extra =. Your answer may or may not be sufficient for the OP, it's good to know its limitations. Maybe the other answer is flawed as well; I don't know ruby enough to tell. – Kamil Maciorowski – 2018-01-23T13:09:14.907

hi @tomasz, thanks for your reply but as @Kamil says it actually does not work for me as my actual path is something like first/path/parameter=1/time=001.jpg first/path/parameter=0.1/time=001.jpg second/path/parameter=1/time=001.jpg ... :( – hadi k – 2018-01-23T13:51:30.393

@hadik If that's regular and as you present, then change -k2 to -k3. – None – 2018-01-23T14:12:02.700

@tomasz cool solution this actually worked. nice combination of -t and -k, I do have regular names, so this is perfect thanks ! :) – hadi k – 2018-01-23T14:16:45.827

Hah, did not know that option of sort. +1 – mvw – 2018-01-23T14:49:07.763

@hadik Welcome. I added Perl too for the weird cases. – None – 2018-01-23T15:30:42.327

1

A quickie with left-overs I picked from the command line :-)

find . -name time="*" -exec ruby -e "s='{}'; puts s.split('=')[-1].split('.')[0]+s" \; |
sort -n | colrm 1 3

Explanation:

My friend Ruby takes the path string that find substitutes for {} and stores it in a variable s. Then she splits the string on = characters and keeps the last part (index -1 in the result array), e.g. 002.jpg. Then she splits this string on . characters and keeps the first part (index 0 in the result array), assuming the files are named ddd.<ext>, which results in the three-digit number part, e.g. 002.

Finally she prints this number followed by the original path string. This would give:

002./alpha=0.1_beta=0.2_gamma=1.0/path/time=002.jpg
001./alpha=0.1_beta=0.2_gamma=1.0/path/time=001.jpg
021./alpha=0.1_beta=0.2_gamma=0.1/path/time=021.jpg
001./alpha=0.1_beta=0.2_gamma=0.1/path/time=001.jpg
019./alpha=0.1_beta=0.2_gamma=0.1/path/time=019.jpg

The additional pipe commands sort the output numerically (sort -n) and finally remove the first three columns of the output (colrm 1 3).
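For readers who do not know Ruby, the two split steps can be mirrored with POSIX shell parameter expansion. This is only a sketch of the same decoration idea on a single sample path, not the command used in this answer; the variable names are made up.

```shell
# One sample path, as find would hand it over.
s='./alpha=0.1_beta=0.2_gamma=1.0/path/time=002.jpg'

# Drop everything up to and including the last "=" (like split('=')[-1]).
last=${s##*=}

# Drop everything from the first "." onward (like split('.')[0]).
num=${last%%.*}

# Prefix the number to the original path, ready for sort -n | colrm 1 3.
decorated="$num$s"
printf '%s\n' "$decorated"
```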

Example:

test$ find . -name time="*" -exec ruby -e "s='{}'; puts s.split('=')[-1].split('.')[0]+s" \; |
sort -n | colrm 1 3
./alpha=0.1_beta=0.2_gamma=0.1/path/time=001.jpg
./alpha=0.1_beta=0.2_gamma=1.0/path/time=001.jpg
./alpha=0.1_beta=0.2_gamma=1.0/path/time=002.jpg
./alpha=0.1_beta=0.2_gamma=0.1/path/time=019.jpg
./alpha=0.1_beta=0.2_gamma=0.1/path/time=021.jpg

mvw

Posted 2018-01-23T10:07:47.393

Reputation: 721

this one has the same problem as the other solution suggested by @tomasz. as my path contains multiple =, this sorts by the numbers after the first =, then the second =, and so on. So my files are actually not sorted as i would like. – hadi k – 2018-01-23T14:01:32.493

OK, can I assume the last "="? (Note that I updated the index from "1" to "-1") – mvw – 2018-01-23T14:03:49.657

@hadik Note that you added that requirement about multiple "=" after I posted my first version. – mvw – 2018-01-23T14:25:44.797

It looks like Ruby's called for each file found. – None – 2018-01-23T15:47:37.463

Yes, this was a quick hack. It is not optimized. – mvw – 2018-01-23T16:04:43.890

@mvw thanks for the reply. this works now but solution by tomasz was easiest for my problem :) – hadi k – 2018-01-23T20:56:30.873

At least I had f1rst p0st :-) – mvw – 2018-01-23T20:58:31.993

1

Assuming no path is insane enough to require find ... -print0:

find . -type f -name "time=*" | awk -F '=' '{ print $NF "=" $0 }' | sort -n | cut -d "=" -f 2-

I used awk to extract the part after the last =; it outputs each full line with that part prepended, separated by an additional =. E.g.:

001.jpg=path/to/folder1/alpha=0.1_beta=0.2_gamma=1.0/time=001.jpg

These are sorted numerically with sort. Then cut extracts parts after this additional (first) =; these are the original paths.

There are exactly four processes created: find, awk, sort, cut. Alternatives that use the syntax of find ... -exec some_tool ... \; create one some_tool process per matching file.
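The decorate, sort, undecorate steps can be tried without any files on disk. The sketch below replaces find with printf and uses made-up sample paths (the 019 and 021 file names and the variable names are assumptions for illustration); LC_ALL=C is added only to keep the sort behaviour predictable.

```shell
# Made-up sample paths with several "=" characters each.
paths='path/to/folder1/alpha=0.1_beta=0.2_gamma=1.0/time=021.jpg
path/to/folder2/alpha=0.1_beta=0.2_gamma=0.1/time=001.jpg
path/to/folder1/alpha=0.1_beta=0.2_gamma=0.1/time=019.jpg'

# Decorate: prepend the text after the last "=" plus an extra "=".
decorated=$(printf '%s\n' "$paths" | awk -F '=' '{ print $NF "=" $0 }')
printf '%s\n' "$decorated"

# Sort numerically on the leading digits, then undecorate by
# cutting away everything up to the first "=".
sorted=$(printf '%s\n' "$decorated" | LC_ALL=C sort -n | cut -d '=' -f 2-)
printf '%s\n' "$sorted"
```

Because the key is taken from the last =, the number of = characters in the directory names does not matter here.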

Kamil Maciorowski

Posted 2018-01-23T10:07:47.393

Reputation: 38 429

i didnt know about awk, thanks for suggestions :) – hadi k – 2018-01-23T20:57:35.843