How to join files on the command line without creating temp files?

7

5

I have two files in a Linux / Bash environment:

# Example data
$ cat abc
2       a
1       b
3       c

$ cat bcd
5       c
2       b
1       d

I'm trying to join the two files on the first column. The following does not work because join requires both input files to be sorted on the join field.

# Wrong: join on unsorted input does not work
$ join abc bcd

I can get around this by creating two temp files and joining them:

$ sort abc > temp1
$ sort bcd > temp2
$ join temp1 temp2
1 b d
2 a b

But is there a way to do this without creating temp files?

dggoldst

Posted 2009-08-19T04:12:00.053

Reputation: 2 372

Answers

18

The following will work in the bash shell:

# Join two files
$ join <(sort abc) <(sort bcd)
1 b d
2 a b

You can join on any column, as long as you sort the input files on that column:

# Join on the second field
$ join -j2 <(sort -k2 abc) <(sort -k2 bcd)
b 1 2
c 3 5

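The -k2 argument to sort means sort on the second column (strictly, on a key that starts at the second field and runs to the end of the line; -k2,2 restricts the key to the second field alone). The -j2 argument to join means join on the second column of both files. Alternatively, join -1 x -2 y file1 file2 will join on the xth column of file1 and the yth column of file2.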
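
As a rough sketch of the -1 x -2 y form, the snippet below joins the second column of abc against the first column of a second file. The file name "names" and its contents are made up for illustration, and the scratch directory exists only so the sketch can be run on its own. LC_ALL=C is set on the assumption that you want sort and join to agree on collation order.

```bash
#!/usr/bin/env bash
# Use one collation order for both sort and join.
export LC_ALL=C

# Scratch directory, only to hold the sample inputs for this sketch.
workdir=$(mktemp -d)
printf '2 a\n1 b\n3 c\n' > "$workdir/abc"
# Hypothetical second file, keyed on its first column.
printf 'c carol\na alice\n' > "$workdir/names"

# Sort each input on its own join field, then join
# field 2 of the first input with field 1 of the second.
result=$(join -1 2 -2 1 \
    <(sort -k2,2 "$workdir/abc") \
    <(sort -k1,1 "$workdir/names"))
printf '%s\n' "$result"

rm -rf "$workdir"
```

With this sample data the join field is printed first, followed by the remaining fields of each input, so the rows come out as "a 2 alice" and "c 3 carol".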

dggoldst

Posted 2009-08-19T04:12:00.053

Reputation: 2 372

3

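Zsh answer (note that =(...) writes the command's output to a temporary file, which zsh creates and removes for you; zsh also accepts the <(...) form):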

join =(sort abc) =(sort bcd)

Aaron F.

Posted 2009-08-19T04:12:00.053

Reputation: 673

1

This will also work in the bash shell:

# Join two files
$ sort abc | join - <(sort bcd)
1 b d
2 a b

Or, with the pipe feeding the second file instead:

# Join two files
$ sort bcd | join <(sort abc) -
1 b d
2 a b

This works because join reads standard input when a file name is given as '-'.
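
The '-' form can be handy when one input is already the end of a longer pipeline. A minimal sketch, assuming a made-up filter step (dropping the row whose key is 2) ahead of the sort; the scratch directory only holds the sample inputs so the sketch is self-contained:

```bash
#!/usr/bin/env bash
# Use one collation order for both sort and join.
export LC_ALL=C

# Scratch directory, only to hold the sample inputs for this sketch.
workdir=$(mktemp -d)
printf '2 a\n1 b\n3 c\n' > "$workdir/abc"
printf '5 c\n2 b\n1 d\n' > "$workdir/bcd"

# Hypothetical filter: drop the abc row with key 2, sort what is left,
# and hand it to join on standard input ('-').
piped=$(grep -v '^2 ' "$workdir/abc" | sort | join - <(sort "$workdir/bcd"))
printf '%s\n' "$piped"

rm -rf "$workdir"
```

Only key 1 survives the filter and appears in both inputs, so the single output row is "1 b d".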

jianpx

Posted 2009-08-19T04:12:00.053

Reputation: 111