Make sets disjoint without emptying them

8

Suppose you have a set of sets of integers. It's possible that some of the sets will overlap (i.e. sharing elements). You could get rid of the overlaps by deleting elements from the sets, but then some of them might end up empty; that would be a shame. Can we make all the sets disjoint without emptying any of them?

Note that in this situation, there's never any reason to leave multiple elements in a set, so this problem can always be solved by reducing each set to just one element. That's the version of the problem we're solving here.

The task

Write a program or function, as follows:

Input: A list of sets of integers.

Output: A list of integers, of the same length as the input, for which:

  • All integers in the output are distinct; and
  • Each integer in the output is an element of the corresponding set of the input.

Clarifications

  • You can represent a set as a list if you wish (or whatever's appropriate for your language), disregarding the order of elements.
  • You don't have to handle the case where no solution exists (i.e. there will always be at least one solution).
  • There might be more than one solution. Your algorithm must always produce a valid solution, but is allowed to be nondeterministic (i.e. it's OK if it picks a different valid solution each time it runs).
  • The number of distinct integers appearing in the input, n, will be equal to the number of sets in the input, and for simplicity, will be the integers from 1 to n inclusive (as their actual values don't matter). It's up to you whether you wish to exploit this fact or not.

Testcases

[{1,2},{1,3},{1,4},{3,4}] -> [2,3,1,4] or [2,1,4,3]
[{1,3},{1,2,4},{2,3},{3},{2,3,4,5}] -> [1,4,2,3,5]
[{1,3,4},{2,3,5},{1,2},{4,5},{4,5}] -> [1,3,2,4,5] or [3,2,1,4,5] or [1,3,2,5,4] or [3,2,1,5,4]

Victory condition

A program requires an optimal time complexity to win, i.e. if an algorithm with a better time complexity is found, it disqualifies all slower entries. (You can assume that your language's builtins run as fast as possible, e.g. you can assume that a sorting builtin runs in time O(n log n). Likewise, assume that all integers of comparable size to n can be added, multiplied, etc. in constant time.)

Because an optimal time complexity is likely fairly easy to obtain in most languages, the winner will therefore be the shortest program among those with the winning time complexity, measured in bytes.

fred russell

Posted 2017-12-05T13:23:20.143

Reputation: 123

If it makes no sense then why is it that they understood it over here http://www.dreamincode.net/forums/topic/408035-has-this-algorithm-been-written-yet/

– fred russell – 2017-12-05T13:43:42.157

@fredrussell maybe, just maybe, it's about how you explain it, and not about, e. g. the notation on the image? – NieDzejkob – 2017-12-05T13:44:53.107

@fredrussell This is the wrong stackexchange for this question. On our site we host programming competitions. You may be able to get help on Computer Science Stackexchange, but be sure to read their help center first. – user202729 – 2017-12-05T13:46:27.097

7@fredrussell your explanation of the "challenge" is unclear and not formatted. On this site you'll usually find properly formatted and ordered questions, for example following a layout like "Input; Output; Rules; Testcases", but you aren't providing anything of that. Additionally, you don't have a winning criteria which could determine a winner. And after your insult I don't think anyone is willing to solve the question now. Even on SO, you should always keep in mind that the people answering are doing this in their free time, so you should not be rude like that. – Luca H – 2017-12-05T13:49:34.597

Open question: is an optimal time complexity (i.e. without actually giving an upper bound) really covered by the restricted-complexity tag? Right now, it looks more like a fastest-code challenge with code-golf as the 2nd winning criterion. – Arnauld – 2017-12-05T17:30:00.247

1@Arnauld [tag:fastest-code] would imply that if we both write O(n) algorithms, but yours is 100 times faster, then you win. If we only require an optimal time complexity, then it's okay if my code is 100 times slower, as long as it's one byte smaller. But this challenge might well count as [tag:fastest-algorithm]. – Misha Lavrov – 2017-12-05T17:32:26.617

Wikipedia leads me to think that the optimal known time complexity for this problem is O(n^ω), the speed of fast matrix multiplication. I'm looking forward to seeing solutions based on that. – Misha Lavrov – 2017-12-05T17:49:18.603

2

This is exactly the problem of finding a full matching in a bipartite graph. The time complexities of the best known algorithms depend on how large the sets are compared to the number of sets. Related challenge.

– Zgarb – 2017-12-05T20:38:19.717

I don't like that because perfect matching is a famous problem, likely the thing to do is to copy published algorithms. – xnor – 2017-12-05T23:49:55.620

@fredrussell Do you want the question to be formed this way? 1 If you do, then just keep it this way, and keep in mind how to write a challenge next time. (Sandbox) 2 If you don't like "obscure answers written in obscure languages", and what you do is purely asking for homework/algorithm/problem, then this is not the correct SE. Go to Computer-science SE for that. 3 Be nice when comment.

– user202729 – 2017-12-06T01:23:14.903

Answers

2

Jelly, 8 bytes

Œp⁼Q$ÐfḢ

Try it online!

Explanation

Œp⁼Q$ÐfḢ  Main Link
Œp        Cartesian Product of the elements; all possible lists
     Ðf   Filter; keep elements that are
  ⁼Q$     Equal to themselves uniquified
      Ḣ   Take the first one

Extremely inefficient. Asymptotic to ϴ(n^(n+1)) according to Misha Lavrov; I think it's right.

HyperNeutrino

Posted 2017-12-05T13:23:20.143

Reputation: 26 575

I think this is at least Ω(n^n): the maximum possible size of the Cartesian product. – Misha Lavrov – 2017-12-05T17:58:27.757

More precisely, ϴ(n^(n+1)), since the "equal to itself uniquified" check should be ϴ(n) time. – Misha Lavrov – 2017-12-05T18:20:10.903

@MishaLavrov Ah okay, thanks. – HyperNeutrino – 2017-12-05T18:20:37.280

@HyperNeutrino unique function in Jelly use O(n) containment (x in s) check, each one should take O(n) according to this page, so Q should take O(n^2) worst case / average case time complexity. Therefore the algorithm is O(n^(n+2)). (unique can be O(n) in case all elements are equal, where each containment check runs in O(1)) --- On an unrelated note, it is possible to implement unique in O(n) using Python builtin set data structure which is hash-set. Anyway Jelly is not designed to be efficient.

– user202729 – 2017-12-07T12:10:58.383

2

Haskell, 48 bytes

-1 byte thanks to nimi.

import Data.List
head.filter((==)<*>nub).mapM id

Try it online!

totallyhuman

Posted 2017-12-05T13:23:20.143

Reputation: 15 378

2mapM id instead of sequence – nimi – 2017-12-05T19:58:54.700

2

Wolfram Language (Mathematica), 87 bytes and ϴ(n3)

-Range[n=Length@#]/.Rule@@@FindIndependentEdgeSet[Join@@Table[-i->j,{i,n},{j,#[[i]]}]]&

Try it online!

Builds a bipartite graph whose vertices on one side are the sets (indexed by -1,-2,...,-n) and whose vertices on the other side are the elements 1,2,...,n, with an edge from -i to j when j is contained in the i-th set. Finds a perfect matching in this graph using a built-in. Then lists out the elements corresponding to -1,-2,...,-n in that order in the perfect matching.

Mathematica's FindIndependentEdgeSet is the bottleneck here; everything else takes O(n2) operations to do. Mathematica probably uses the Hungarian algorithm, and so I'll assume that it runs in ϴ(n3) time, though it's possible that Mathematica has a naive implementation with O(n4) complexity.

Misha Lavrov

Posted 2017-12-05T13:23:20.143

Reputation: 4 846

Well... Mathematica is bad at [tag:restricted-complexity], not for the first time. – user202729 – 2017-12-06T01:22:28.073

1

Mathematica 39 Bytes

Last@Union[DeleteDuplicates/@Tuples@#]&

On the issue of complexity, I think it is very dependent upon the length of each sublist, as well as a measure of how disjoint the sublists are.

So, I think this algorithm is O(n Log n + n^2 Log m) where m is roughly the average length of each sublist.

Something like this would have complexity O(a^n) where a>1 is a measure of the overlap in sublists:

(For[x={1,1},!DuplicateFreeQ@x,x=RandomChoice/@#];x)&

Hard to say which is really faster without knowing the properties of possible inputs.

Kelly Lowder

Posted 2017-12-05T13:23:20.143

Reputation: 3 225

2The DeleteDuplicates /@ Tuples@# step takes ϴ(n^(n+1)) time by the same arguments as in the other solutions. Then Union has a list of length n^n to sort, which takes O(n^(n+1) log(n)) time - but maybe it's faster since at most 2^n n! elements in that list are distinct. Either way, the complexity is ϴ(n^(n+1)) up to a log(n) factor. – Misha Lavrov – 2017-12-05T20:33:45.963

I think this problem requires a multivariate definition of big O notation to have any practical meaning. Obviously the length of the sublists is incredibly important. – Kelly Lowder – 2017-12-05T21:46:22.570

1My interpretation of the problem statement is that only the dependence on n (the number of sublists) matters, and we assume the worst case about the length of the sublists. In any case, even if every sublist has length 2, Tuples@# has size 2^n, so your first asymptotic estimate can't possibly be true. – Misha Lavrov – 2017-12-05T21:57:56.077

Huh? Why is multivariate BigO problematic? It's done all the time. – user202729 – 2017-12-06T01:21:45.280

1

05AB1E, 13 bytes, O(n!*n)

āœʒ‚øε`QO}P}н

Try it online! Explanation:

ā               Make a range 1..n
 œ              Generate all permutations
  ʒ        }    Filter
   ‚ø           Zip with input
     ε   }      Loop over sets
      `QO       Check each element for membership
          P     All sets must match
            н   Take the first result

Neil

Posted 2017-12-05T13:23:20.143

Reputation: 95 035

1

Husk, 5 bytes

►oLuΠ

Try it online!

Explanation

Just saw the thing about complexity: As so often it is the case with golfing languages' solution they are not very efficient - this one has complexity O(n·nⁿ).

►(Lu)Π  -- input as a list of lists, for example: [[1,2],[1,3],[1,4],[3,4]]
     Π  -- cartesian product: [[1,1,1,3],...,[2,3,4,4]]
►(  )   -- maximum by the following function (eg. on [1,1,1,3]):
   u    --   deduplicate: [1,3]
  L     --   length: 2

ბიმო

Posted 2017-12-05T13:23:20.143

Reputation: 15 345

0

Pyth, 9 bytes (ϴ(nn+1))

Since this works exactly like the Jelly solution, it most likely has the same complexity.

h{I#.nM*F

Try it here!

How?

h{I#.nM*F | Full program.

       *F | Fold Cartesian product.
    .nM   | Flatten each.
   #      | Keep those:
 {I         That are invariant over duplicate elements removal.
h         | Take the first element.

Mr. Xcoder

Posted 2017-12-05T13:23:20.143

Reputation: 39 774

0

JavaScript (ES6), 74 73 bytes

1 byte saved, thanks to @Neil.

f=([a,...A],s=[],S)=>a?a.map(c=>s.includes(c)||(S=S||f(A,[...s,c])))&&S:s

Recursively iterates through the array looking for a solution.

Ungolfed:

f=(
   [a, ...A],                        //a is the first array, A is the rest
   s = [],                           //s is the current trial solution
   S                                 //S holds the solution (if one exists) at this branch
  )=>
     a ? a.map(                      //if we're not done, iterate through a
           c => s.includes(c) ||     //  if this element already exists, short-circuit
                (S = S ||            //  else if S has a solution, keep it
                 f(A, [...s, c])     //  else look for a solution down this branch
                )
         ) && S                      //return S

Test Cases:

f=([a,...A],s=[],S)=>a?a.map(c=>s.includes(c)||(S=S||f(A,[...s,c])))&&S:s

console.log(f([[1,2],[1,3],[1,4],[3,4]])) // [2,3,1,4] or [2,1,4,3]
console.log(f([[1,3],[1,2,4],[2,3],[3],[2,3,4,5]])) // [1,4,2,3,5]
console.log(f([[1,3,4],[2,3,5],[1,2],[4,5],[4,5]])) // [1,3,2,4,5] or [3,2,1,4,5] or [1,3,2,5,4] or [3,2,1,5,4],4]

Rick Hitchcock

Posted 2017-12-05T13:23:20.143

Reputation: 2 461

Why not a?a.map(...)&&S:s? – Neil – 2017-12-06T15:44:22.790

@Neil, because Duh. Thanks! – Rick Hitchcock – 2017-12-06T16:01:58.373

0

Python3, 93 84 bytes

-9 bytes thanks to caird coinheringaahing

lambda I:next(filter(lambda x:len(x)==len({*x}),product(*I)))
from itertools import*

Try it online!

Setop

Posted 2017-12-05T13:23:20.143

Reputation: 188

184 bytes – caird coinheringaahing – 2017-12-06T15:04:54.063