set intersection of two lists

9

Your goal is to compute the set intersection of two lists of integers. The intersection is defined as the unique un-ordered group of integers found at least once in both input list.

Input

The input may be in any format desired (function parameter, stdio, etc.), and consists of two lists of integers. You many not assume anything about each list other than they may contain any non-negative number of integers (i.e. they are unsorted, possibly may contain duplicates, may have different lengths, and may even be empty). It is assumed that each integer will fit in your language's native signed integer type, may be more than 1 decimal digit long, and are signed.

Example input:

1 4 3 9 8 8 3 7 0
10 1 4 4 8 -1

Output

The output is any list-like of integers representing the set intersection of the two lists to any format desired (return value, stdio, etc.). There is no requirement that the output be sorted, though you are welcome to provide an implementation that happens to always be sorted. The output must form a valid un-ordered set (e.g. it must not contain any duplicate values).

Example test cases (note that the order of the output is not important):

First two lines are the input lists, third line is the output. (empty) denotes the empty list.

(empty)
(empty)
(empty)

1000
(empty)
(empty)

3 1 2 4 3 1 1 1 1 3
3 1 -1 0 8 3 3 1
1 3

1 2 1
3 3 4
(empty)

Scoring

This is code golf; shortest answer in bytes wins.

Standard loop-holes are prohibited. You may use any built-in features not designed for set-like operations.

Prohibited built-in features:

  • set creation/removing duplicates
  • set difference/intersection/union
  • Generalized membership testing (e.g. anything similar to the in keyword in Python, indexOf-like functions, etc). Note that the use of "foreach item in list" constructs are allowed (assuming they don't violate any of the other restrictions), despite the fact that Python re-uses the in keyword to create this construct.
  • These prohibited built-ins are "viral", i.e. if there's a larger built-in containing any of these sub-features, it is similarly prohibited (e.g. filtering by membership in a list).

Any built-ins not on the above list are allowed (ex. sorting, integer equality testing, list append/remove by index, filtering, etc.).

For example, take the following two example snippets (Python-like code):

# prohibited: filters by testing if each value in tmpList is a member of listA
result = tmpList.filter(listA)

# ok: filtering by a lambda which manually iterates over listA and checks for equality
def my_in_func(val, slist):
    for a in slist:
        if(val == a):
            return True
    return False
result = filter(lambda v: my_in_func(val, listA), tmpList)

You are welcome to implement any of these set-like features yourself and they will count towards your score.

Your solution should complete in a reasonable amount of time (say, less than a minutes on whatever hardware you have for two lists ~length 1000 each).

helloworld922

Posted 2016-03-05T18:59:41.270

Reputation: 2 503

5

By the way, confusion and miscommunication are common in do X without Y, which is why they are officially one of the thing to avoid when writing challenges.

– Dennis – 2016-03-06T00:06:15.193

2@Dennis Yeah, I guess this problem really has become one of those :( When I first wrote it, I was hoping it might be an interesting problem, but as soon as I started working out a ruleset I should have just killed the challenge. – helloworld922 – 2016-03-06T00:18:05.407

Is a builtin which performs run length encoding allowed? – isaacg – 2016-03-06T19:46:11.020

That should be fine. – helloworld922 – 2016-03-06T20:08:32.330

1May there be duplicates in the output? – Adám – 2016-03-08T21:16:55.247

@Nᴮᶻ No, the output is not allowed to have duplicate values, e.g. the output must be a set. – helloworld922 – 2016-03-08T23:11:22.730

Too restrictive: If you have in your mind one solution... It is not so good.... – RosLuP – 2017-11-04T13:35:50.587

Answers

6

Haskell, 45 42 bytes

y#(e:s)=[e|any(==e)y,all(/=e)s]++y#s
_#x=x

Try it online!

Edit: -2 bytes thanks to @Ørjan Johansen, -1 byte thanks to @dfeuer.

nimi

Posted 2016-03-05T18:59:41.270

Reputation: 34 639

This is 2 bytes shorter with explicit recursion.

– Ørjan Johansen – 2019-04-02T00:44:40.937

@ØrjanJohansen, 1 more.

– dfeuer – 2019-04-02T05:52:50.040

3

MATL, 18 bytes

YY_iti!=Xa)hStdfQ)

Try it online!

This works in two steps. First the intersection is computed, possibly with duplicates. This is based on comparing all elements of one array with all elements of the other, and keeping elements of the first that are present in the second.

Then duplicates are removed. For this, the array from the previous step is sorted, and entries are kept if different from the preceding. A -inf value is prepended so that the first (i.e. lowest) value is not lost.

YY_                 % push -infinity
   it               % take first input. Duplicate
     i!             % take second input. Transpose
        =           % test all combinations of elements of the two inputs for equality
        Xa          % row vector that contains true for elements of first array that 
                    % are present in the second, possibly duplicated
          )         % index into first array to keep only those elements. Now we need
                    % to remove duplicates
           h        % append -infinity
            S       % sort
             tdf    % duplicate. Find entries that differ from the preceding
                Q)  % add 1 and index into array to keep only non-duplicates

Luis Mendo

Posted 2016-03-05T18:59:41.270

Reputation: 87 464

3

Jelly, 13 bytes

=S¥Ðf
ṂrṀ{ç³ç

Try it online!

How it works

ṂrṀ{ç³ç  Main link. Arguments: A (list 1), B (list 2)

Ṃ        Yield m, the minimum of A.
  Ṁ{     Yield M, the maxmimum of A.
 r       Create an inclusive range from m to M.
    f³   Apply the helper link with right argument A.
      f  Apply the helper link with right argument B.


=S¥Ðf    Helper link. Arguments: n (integer in range), L (list, A or B)

=        Test all elements of L for equality with n.
 S       Add the results.
  ¥      Combine the two previous links into a dyadic chain.
   Ðf    Filter by the result of the sums.

Dennis

Posted 2016-03-05T18:59:41.270

Reputation: 196 637

@isaacg Fixed now. – Dennis – 2016-03-06T00:02:42.987

2

golflua, 68 chars

\f(a,b)k={}~@_,v p(a)~@_,w p(b)?w==v k[w]=w$$$~@_,v p(k)I.w(v," ")$$

which is called as

> f({1,2,3,4},{3,4,5})
3 4
> f({3,1,2,4,3,1,1,1,1,3},{3,1,-1,0,8,3,3,1})
3 1

In regular Lua, this would be

function foo(a,b)
   local k={}
   for i,v in pairs(a)
      for j,w in pairs(b)
         if v==w then
            k[v] = v
         end
      end
   end
   for i,v in pairs(k)
      print(v," ")
   end
end

So basically I'm iterating over the each element of the two tables and only storing the equivalent values. By using the value as the key (k[w]=w), I'm eliminating all duplicates. We then output the new table by iterating over the index & value of pairs

Kyle Kanos

Posted 2016-03-05T18:59:41.270

Reputation: 4 270

2

JavaScript (ES6), 66 bytes

(a,b)=>a.filter((e,i)=>b.some(f=>e==f)&a.slice(0,i).every(f=>e-f))

Without using indexOf, as I'm not convinced that it's permitted.

Neil

Posted 2016-03-05T18:59:41.270

Reputation: 95 035

2

Pyth, 12 11 bytes

eMrSsq#RQE8

Demonstration

Explanation:

eMrSsq#RQE8
               Implicit: Q is one of the lists.
     q#RQE     For every element in the first list, filter the second list on
               equality with that element.
    s          Concatenate. We now have the intersection, with duplicates.
  rS      8    Sort and run length encode, giving counts and elements.
eM             Take just the elements.

isaacg

Posted 2016-03-05T18:59:41.270

Reputation: 39 268

Sorting and rle saves one byte. – Jakube – 2016-03-06T18:54:55.020

@Jakube I would say rle is a builtin which removes duplicates. – isaacg – 2016-03-06T19:38:44.763

It only removes duplicates if you sort it before and remove the counts of the rle afterwards. It's a little bit in a grey area, but I think so is using a dictionary. It's basically a set that stores additional data for each element. – Jakube – 2016-03-06T19:42:46.097

@Jakube OP says it's fine. Thanks! – isaacg – 2016-03-08T20:52:22.030

1

Jq 1.5, 99 bytes

def f(a;b):(a+b|min)as$m|[range($m;a+b|max)|[]]|.[a[]-$m][0]=1|.[b[]-$m][1]=1|.[[[1,1]]]|map(.+$m);

Expanded

def f(a;b):
     (a+b|min) as $m         # find smallest value in either array
   | [range($m;a+b|max)|[]]  # create array of [] for indices [min,max]
   | .[ a[]-$m ][0]=1        # store 1 in [0] at corresponding indices of a
   | .[ b[]-$m ][1]=1        # store 1 in [1] at corresponding indices of b
   | .[[[1,1]]]              # find all the indices where we stored a 1 for a and b
   | map(.+$m)               # convert back from indices to values
;

This avoids using {} objects and since jq does not have bit operations this emulates them with an array.

Try it online!

jq170727

Posted 2016-03-05T18:59:41.270

Reputation: 411

1

Axiom, 411 bytes

b(x,v)==(l:=1;h:=#v;repeat(l>h=>break;m:=(l+h)quo 2;x<v.m=>(h:=m-1);x>v.m=>(l:=m+1);return m);0);g(a,b)==(if #a>#b then(v:=a;w:=b)else(v:=b;w:=a);c:=sort(v);for x in w repeat(if binSearch(x,c)~=0 then return 1);0)
f(a:List INT,b:List INT):List INT==(r:List INT:=[];#a*#b=0=>r;x:=sort(a);y:=sort(b);i:=1;repeat(i>#x=>break;v:=x.i;binSearch(v,y)=0=>(i:=i+1);r:=concat(r,v);while i<=#x and x.i=v repeat i:=i+1);r)

ungolf and test

--suppose v.1<=v.2<=....<=v.#v as the default function sort() produce
--   binary serch of x in v, return the index i with v.i==x
--   return 0 if that index not exist
--traslated in Axiom from C  book
--Il Linguaggio C, II Edizione 
--Brian W.Kerninghan, Dennis M.Ritchie
binSearch(x,v)==
    l:=1;h:=#v
    repeat
       l>h=>break
       m:=(l+h)quo 2
       x<v.m=>(h:=m-1) 
       x>v.m=>(l:=m+1)
       return m
    0

--N*log(N)+n*log(n)+N*n*log(n) so it is N*n*log(n) or n*N*log(N)
ListIntersection(a:List INT,b:List INT):List INT==
    r:List INT:=[];#a*#b=0=>r
    x:=sort(a);y:=sort(b)
    i:=1
    repeat
        i>#x=>break
        v:=x.i
        binSearch(v,y)=0=>(i:=i+1)
        r:=concat(r,v)
        while i<=#x and x.i=v repeat i:=i+1
    r

(5) -> f([],[])
   (5)  []
                                                       Type: List Integer
(6) -> f([1000],[])
   (6)  []
                                                       Type: List Integer
(7) -> f([3,1,2,4,3,1,1,1,1,3],[3,1,-1,0,8,3,3,1])
   (7)  [1,3]
                                                       Type: List Integer
(8) -> f([1,2,1],[3,3,4])
   (8)  []
                                                       Type: List Integer

RosLuP

Posted 2016-03-05T18:59:41.270

Reputation: 3 036

1

Axiom, 257 bytes

W(x,y)==>while x repeat y;f(a,b)==(r:List INT:=[];#a*#b=0=>r;x:=sort(a);y:=sort(b);i:=1;j:=1;repeat(j>#y=>break;W(i<=#x and x.i<y.j,i:=i+1);i>#x=>break;W(j<=#y and y.j<x.i,j:=j+1);j>#y=>break;v:=y.j;if x.i=v then(r:=concat(r,v);W(j<=#y and y.j=v,j:=j+1)));r)

This without the use of binsearch... But i don't know the big O... Unglofed and results:

--N*log(N)+n*log(n)+???
ListIntersection(a:List INT,b:List INT):List INT==
    r:List INT:=[];#a*#b=0=>r
    x:=sort(a);y:=sort(b)
    i:=1;j:=1
    repeat
        j>#y=>break
        while i<=#x and x.i<y.j repeat i:=i+1
        i>#x=>break
        while j<=#y and y.j<x.i repeat j:=j+1
        j>#y=>break
        v:=y.j;if x.i=v then 
                        r:=concat(r,v)
                        while j<=#y and y.j=v repeat j:=j+1
    r

(3) -> f([3,1,2,4,3,1,1,1,1,3],[3,1,-1,0,8,3,3,1])
   (3)  [1,3]
                                                       Type: List Integer
(4) -> f([],[])
   (4)  []
                                                       Type: List Integer
(5) -> f([1,2,1],[3,3,4])
   (5)  []
                                                       Type: List Integer

Not executed many tests so could be bugged...

RosLuP

Posted 2016-03-05T18:59:41.270

Reputation: 3 036

1

Jelly, 12 bytes

pEÐfḢ€ĠḢ€$ị$

Try it online!

HyperNeutrino

Posted 2016-03-05T18:59:41.270

Reputation: 26 575

In Tio 3,1,2,4,3,1,1,1,1,3 input and 3 input return the output [1,2,3] instead of [3] – RosLuP – 2017-11-06T16:17:16.327

@RosLuP Try [3] instead of 3 – HyperNeutrino – 2017-11-06T16:50:24.230

It would be ok in my opinion, if the result of intersection of 2 lists return (as other cases) [] if the result is void set, [1] if 2 lists have 1 in common – RosLuP – 2017-11-06T18:33:41.780

@RosLuP I can't help it, that's how Jelly does output. Empty for [] and the element for singleton lists. You can go to the wiki page (atoms) and append the Python Stringify builtin but that makes my answer longer and strict I/O is dumb – HyperNeutrino – 2017-11-07T02:15:22.927

It would be ok for me if it accept only input List in [] way (example [1], [1,2,3] [], [ ], [ ] etc) and output the list output only in List [] way (as its input) . If parenthesis for list are {} or () repeat above speech for the right one. This is only for what I think, the question possibly say otherwise and all is ok – RosLuP – 2017-11-07T07:00:02.530

@RosLuP Question already says any reasonable format is accepted. This format is reasonable and it's the standard for Jelly. Anything else would be an absurd abomination of our standard rules to prevent languages from not being able to complete. – HyperNeutrino – 2017-11-07T13:19:05.100

1

Husk, 9 bytes

mo←←kIfE*

Try it online!

m            Map
 o←←         taking the first element from the first element
    kI       over lists of identical values from
        *    the Cartesian product of the two input lists
      f      filtered by
       E     both elements of a pair being equal.

Looking in Husk's source code on GitHub, k ("keyon") is implemented as the composition of sorting the list and grouping adjacent values, so although I can't actually find the implementation of "groupOn" it's probably safe to assume that it doesn't do anything setty, since Haskell is a functional language and grouping adjacent equal values is a pretty straightforward reduce-over-a-linked-list operation. (I can find the implementation of k's other type signature "keyby", which I could use here by changing I to =, but I don't know Haskell so I can't tell how exactly it works.)

Also, a nice little Brachylog answer I came up with before I realized set operations of all sorts were disallowed: ⟨∋∈⟩ᵘ

Unrelated String

Posted 2016-03-05T18:59:41.270

Reputation: 5 300

1

R, 141 83 bytes

l=sapply(strsplit(readLines(file("stdin"))," "),as.numeric)
r=rle(sort(unlist(l)))[[2]]
r[sapply(r,function(x)any(x==l[[1]])&any(x==l[[2]]))]

Improved by Giuseppe

function(a,b){r=rle(sort(c(a,b)))[[2]];r[sapply(r,function(x)any(x==a)&any(x==b))]}

Try online here here

d.b

Posted 2016-03-05T18:59:41.270

Reputation: 161

This doesn't seem to work. I'm probably trying to use it wrong, though, so maybe you should supply a link to Try It Online! showing how to use it and demonstrating that it fulfills the requirements of the challenge. (An explanation on the answer wouldn't hurt either.)

– Unrelated String – 2019-04-02T05:22:07.060

you can't assume inputs a and b are pre-defined, you must accept input, either by taking them as function arguments or from STDIN. – Giuseppe – 2019-04-02T17:58:34.187

1

I think golfier would be to just make this a function, like this

– Giuseppe – 2019-04-02T18:12:00.823

1@d.b The "header" names the anonymous function defined in the "Code" section (anonymous functions are perfectly acceptable), and the footer then calls it. The Header, Code, and Footer sections are all one piece of code, but only the part in the "code" section counts for bytes :-) – Giuseppe – 2019-04-02T18:16:43.533

1

bash + GNU coreutils, 184 Bytes

[ -z "$1" ] && exit
p='{if(a[$0]++==0)print $0}'
while read A; do
while read B; do
[ $A = $B ] && echo $A
done < <(grep -oP '\d*'<<<$1|awk "$p")
done < <(grep -oP '\d*'<<<$2|awk "$p")

Invocation:

./codegolf.sh '12 4 654 12 3 56' '4 4 56 33 3 3 3'

Output:

4
56
3

No output if intersection is empty. This script does not sort and makes sanity check if first set is empty. Explanation:

[ -z "$1" ] && exit  # Exit if first set is empty
p='{if(a[$0]++==0)print $0}' # The AWK program we will use
while read A; do   # read the list with two
while read B; do   # encapsulated loops
[ $A = $B ] && echo $A   # if they match, then print
done < <(grep -oP '\d*'<<<$1|awk "$p")
done < <(grep -oP '\d*'<<<$2|awk "$p")
# the process substitution greps the numbers and pipes them to awk. Our awk program makes them unique without sorting; it uses associative arrays with index names same as lines (our numbers here).

Bonus to know: You could change to grep -o . to do this with random strings instead of numbers.

rexkogitans

Posted 2016-03-05T18:59:41.270

Reputation: 589

1

Perl 6, 26 37 bytes

{%(@^a.grep(any(@^b)):p.invert).keys}

usage

> my &f = {%(@^a.grep(any(@^b)):p.invert).keys}
-> @a, @b { #`(Block|559823336) ... }
> f([3,1,2,4,3,1,1,1,1,3], [3,1,-1,0,8,3,3,1])
(1 3)

Cheeky non-competing answer

> [3,1,2,4,3,1,1,1,1,3] ∩ [3,1,-1,0,8,3,3,1]
set(3, 1)

or if you like it in a boring ol' f function

> my &f = &infix:<∩>
sub infix:<∩> (|p is raw) { #`(Sub+{<anon|130874592>}+{Precedence}|102325600) ... }
> f([3,1,2,4,3,1,1,1,1,3], [3,1,-1,0,8,3,3,1])
set(3, 1)

Hotkeys

Posted 2016-03-05T18:59:41.270

Reputation: 1 015

I've updated my answer to not use .unique – Hotkeys – 2016-03-06T01:07:47.890

1

You don't really need the invert if you take the values instead. 24 bytes

– Jo King – 2019-04-02T03:28:52.837

1

Retina, 63 bytes

The last two lines remove duplicates. The input is two space-delimited lists separated by a comma. The output is whitespace-delimited.

+`( -?\d+)\b(.*,.*)\1\b
$1_$2
-?\d+\b|_|,

+`(-?\d+)(.*)\1
$1$2

Try it online

If duplicates in the output are allowed, my program would be 42 bytes.

mbomb007

Posted 2016-03-05T18:59:41.270

Reputation: 21 944

0

Python3, 51 bytes

lambda a,b:[v for v in a if{n:1 for n in b}.get(v)]

If the input lists can contain duplicates:

Python3, 67 bytes

lambda a,b:list({v:1 for v in a if {n:1 for n in b}.get(v)}.keys())

PieCot

Posted 2016-03-05T18:59:41.270

Reputation: 1 039

0

PHP, 78, 77 bytes

function($a,$b){foreach($a as$x)foreach($b as$y)$x==$y?$c[$x]=$x:0;return$c;}

Try it online!

No frills, but rule-compliant (I think).

Output

[], []
[]

[1000], []
[]

[3,1,2,4,3,1,1,1,1,3], [3,1,-1,0,8,3,3,1]
[3,1]

[1,2,1], [3,3,4]
[]

640KB

Posted 2016-03-05T18:59:41.270

Reputation: 7 149