Find the sum of closest distances

10

1

For this task your code should take two sorted arrays of integers X and Y as input. It should compute the sum of the absolute distances between each integer in X and its closest number in Y.

Examples:

X = (1 5,9)
Y = (3,4,7)

The distance is 2 + 1 + 2.

X = (1,2,3)
Y = (0,8)

The distance is 1 + 2 + 3.

Your code can take input in any way that is convenient.

The main restriction is that your code must run in linear time in the sum of the length of the two arrays.. (You can assume that adding two integers takes constant time.)

Anush

Posted 2018-07-01T17:24:37.220

Reputation: 3 202

Can we use lists or streams instead of arrays? – Post Rock Garf Hunter – 2018-07-01T17:59:11.880

@CatWizard Yes you can! – Anush – 2018-07-01T17:59:33.497

1How is 1 + 2 + 3 derived from X = (1,2,3) and Y = (0,8)? – guest271314 – 2018-07-01T20:47:26.207

1@guest271314 the closest number two each of 1, 2, and 3 in Y is 0. Thus the differences are 1-0, 2-0, 3-0. – dylnan – 2018-07-01T21:03:35.427

@user202729 I clarified this in the question. In general, it is standard to assume you can add two O(\log n) bit integers in constant time unless you specify otherwise. – Anush – 2018-07-02T06:24:35.167

I'm not quite sure this is possible. Taking 2 arrays of numbers and finding the least differences will always take O(n) time, where n is the product of the length of the arrays. – FreezePhoenix – 2018-07-02T13:21:40.570

1@FreezePhoenix since both lists are sorted you can do it in O(n+m), because you iterate over list $X$, visiting each element once, and as long as you keep track of the element $Y_j$ closest to $X_i$, you can check against $Y_j$ and $Y_{j+1}$ since one of those is closest to $X_{i+1}$ – Giuseppe – 2018-07-02T13:51:49.897

Are the arrays always non-negative integers? – Giuseppe – 2018-07-02T13:53:59.487

@Giuseppe No, sorry. – Anush – 2018-07-02T19:41:06.413

Answers

6

Haskell, 70 64 bytes

a%b=abs$a-b
x@(a:b)#y@(c:d)|e:_<-d,a%c>a%e=x#d|1>0=a%c+b#y
_#_=0

Try it online!

Explanation

First we define (%) to be the absolute difference between two numbers. Then we define (#) to be the interesting function. In the first line we match when both lists are non-empty:

x@(a:b)#(c:d:e)

On our first case from here we bind d to e:_ with e:_<-d. This ensures that d is non-empty and sets it's first element to e.

Then if the second element of \$Y\$ (e) is closer than the first (c) to the first element of \$X\$ (a), we return x#d removing the first element of \$Y\$ and calling again with the same \$X\$.

If we match the pattern but don't pass the condition we do:

a%c+b#y

Which, removes the first item of \$X\$ and adds the absolute difference of it from the first element of \$X\$ to the remaining result.

Lastly if we don't match the pattern we return \$0\$. Not matching the pattern means that \$X\$ must be empty because \$Y\$ cannot be empty.

This algorithm has order notation \$O(|X|+|Y|)\$.

Haskell, 34 bytes

Here's how I would do it in \$O(\left|X\right|\times\left|Y\right|)\$ time:

x#y=sum[minimum$abs.(z-)<$>y|z<-x]

Try it online!

Post Rock Garf Hunter

Posted 2018-07-01T17:24:37.220

Reputation: 55 382

I clarified in the question that we can assume that adding two integers takes constant time. – Anush – 2018-07-02T06:19:57.257

2

Python 2, 124 120 bytes

X,Y=input()
i=j=s=0
while i<len(X):
 v=abs(Y[j]-X[i])
 if j+1<len(Y)and v>=abs(Y[j+1]-X[i]):j+=1
 else:s+=v;i+=1
print s

Try it online!

Saved 4 bytes by moving to program versus function.

Meeting the time-complexity constraint is possible because both lists are sorted. Note that each time around the loop, either i is incremented or j is incremented. Thus the loop is executed at most len(X)+len(Y) times.

Chas Brown

Posted 2018-07-01T17:24:37.220

Reputation: 8 959

I clarified in the question that we can assume adding two integers takes constant time. – Anush – 2018-07-02T06:18:47.487

1

C (gcc), 82 bytes

n;f(x,y,a,b)int*x,*y;{for(n=0;a;)--b&&*x*2-*y>y[1]?++y:(++b,--a,n+=abs(*x++-*y));}

This takes input as two integer arrays and their lengths (since C has no way to get their length otherwise). This can be shown to run in O(a+b) because either a or b is decremented on each iteration of the loop, which terminates when a reaches 0 (and b cannot be decremented below 0).

Try it online!

n;                     // define sum as an integer
f(x,y,a,b)             // function taking two arrays and two lengths
int*x,*y;              // use k&r style definitions to shorten function declaration
{
 for(n=0;              // initialize sum to 0
 a;)                   // keep looping until x (the first array) runs out
                       // we'll decrement a/b every time we increment x/y respectively
 --b&&                 // if y has ≥1 elements left (b>1, but decrements in-place)...
 *x*2-*y>y[1]?         // ... and x - y > [next y] - x, but rearranged for brevity...
 ++y:                  // increment y (we already decremented b earlier);
 (++b,                 // otherwise, undo the in-place decrement of b from before...
 --a,n+=abs(*x++-*y))  // decrement a instead, add |x-y| to n, and then increment x
;}

Some notes:

  • Instead of indexing into the arrays, incrementing the pointers and dereferencing directly saves enough bytes for it to be worth it (*x vs x[a] and y[1] vs y[b+1]).

  • The --b&& condition checks for b>1 in a roundabout way - if b is 1, it will evaluate to zero. Since this modifies b, we don't need to change it in the first branch of the ternary (which advances y), but we do need to change it back in the second (which advances x).

  • No return statement is needed, because black magic. (I think it's because the last statement to be evaluated will always be the n+=... expression, which uses the same register as the one used for return values.)

Doorknob

Posted 2018-07-01T17:24:37.220

Reputation: 68 138

0

Ruby, 88 bytes

->(p,q){a=q.each_cons(2).map{|a|a.sum/a.size}
x=a[0]
p.sum{|n|x=a.pop if n>x
(n-x).abs}}

Try it online!

Also, for fun, a shorter anonymous function that doesn't quite meet the complexity restrictions:

->(a,b){a.map{|x|x-b.min_by{|y|(x-y).abs}}.sum}

Tango

Posted 2018-07-01T17:24:37.220

Reputation: 121

Could you explain in simple terms how this code works? I can't tell if it runs in linear time. – Anush – 2018-07-01T19:50:34.733

2This fails the first test case in the question, as well as inputs such as [5, 6], [0, 1, 5]. – Doorknob – 2018-07-01T23:18:11.733

0

JavaScript (Node.js), 80 bytes

x=>g=(y,i=0,j=0,v=x[i],d=v-y[j],t=d>y[j+1]-v)=>1/v?g(y,i+!t,j+t)+!t*(d>0?d:-d):0
  • It run in O(|X|+|Y|): Every recursion run in O(1), and it recursive |X|+|Y| times.
    • x, y are passed by reference, which do not copy the content
  • 1/v is falsy if x[i] is out of range, truthy otherwise
  • t -> d>y[j+1]-v -> v+v>y[j]+y[j+1] is false as long as following conditions meet. And which means y[j] is the number closest to v in y
    • v is less than (y[j]+y[j+1])/2, or
    • y[j+1] is out of range, which would convert to NaN, and compare to NaN yield false
      • that's why we cannot flip the > sign to save 1 more byte
  • t is always a boolean value, and * convert it to 0/1 before calculating

Try it online!

tsh

Posted 2018-07-01T17:24:37.220

Reputation: 13 072

0

Mathematica, 40 bytes

x = {1, 5, 9};
y = {3, 4, 7};

Norm[Flatten[Nearest[y] /@ x] - x]

If you must create a full program, with inputs:

f[x_,y_]:= Norm[Flatten[Nearest[y] /@ x] - x]

Here's the timing for up to 1,000,000 points (sampled every 10,000) for y:

enter image description here

Close to linear.

David G. Stork

Posted 2018-07-01T17:24:37.220

Reputation: 213

1This answer is a code-snippet since your input is taken as pre-existing variables. You should reformat it to either be a sub-routine or a full program. – Post Rock Garf Hunter – 2018-07-04T01:54:16.273

I'm also a bit suspicious that this works in linear time, do you have any justification as to why it should? Mathematica tends to be rather opaque in the complexity of it's builtins. – Post Rock Garf Hunter – 2018-07-04T02:06:36.033