The distance between circles

6

1

Imagine you are given two lists of numbers which are to be interpreted as points on a circle of circumference 1000. Let us say

circleA = [10, 24, 44, 175, 321, 579, 618, 841, 871, 979]
circleB = [270, 283, 389, 406, 435, 457, 612, 684, 692, 732]

To measure the distance between these two circles we do the following.

For the first point in circleA, starting from the left, we find the next biggest point in circleB. So 10 finds 270 and is matched up with it. Now, using 270, we skip to the next biggest point in circle A which is 321. We then use 321 to find 389 and match up 321 and 389 and repeat until the end of circleA or we can't find a matching pair.

This gives us a list of pairs as follows.

[(10,270), (321, 389), (579,612), (618,684)]

We say the distance between circleA and circleB is just the average difference between pairs. That is (270-10 + 389-321+612-579+684-618)/4 = 106.75

Now comes the tricky part. As the point are on a circle, we can rotate circleB and recompute this distance. The task is to compute the following value.

For arbitrary inputs circleA and circleB, the task is to compute the average distance between circleA and all rotations of circleB.

In order to make it interesting I am setting this as fastest-code challenge. I will test with increasing size circles created using (in python)

import random
circumference = 1024
numberpoints = circumference/64
circleA = sorted(random.sample(range(circumference),numberpoints))
circleB = sorted(random.sample(range(circumference),numberpoints))

I will just double the size of circumference until the code takes more than 1 minute to run on my computer. You output should be accurate to machine precision.

Please provide full instructions for how to compile and run your code on a linux machine.


A rotation simply means adding an offset modulo circumference and then resorting. In python

rotatedcircleB = sorted([(c+offset)%circumference for c in circleB])

user9206

Posted 2014-08-10T06:32:51.930

Reputation:

You should add an example of what rotation does. Increment all points modulo 1000 and then resort? – John Dvorak – 2014-08-10T06:49:02.520

4Is rotation continuous or discrete? – Peter Taylor – 2014-08-10T06:49:35.957

@PeterTaylor I meant it to be continuous. If this causes a problem then I will rethink of course. – None – 2014-08-10T08:00:04.330

@JanDvorak Yes you are right. I updated the question. – None – 2014-08-10T08:04:05.677

@QuadrExAtt The difference can't be negative. Note that in my example 979 just isn't paired up with anything. – None – 2014-08-13T19:58:03.503

Answers

1

This algorithm is basically the same as @fredtantini's. The main difference is I keep a rolling sum instead of creating a list to hold the differences. I also forgo the list slicing as it is an O(k) operation.

from __future__ import division
from bisect import bisect_left
import timeit

def circular_distance(circle_a, circle_b):
    total = 0
    count = 0
    value_on_a = circle_a[0]
    value_on_b = 0
    while 1: 
        try:
            insertion_point = bisect_left(circle_b, value_on_a)           
            value_on_b = circle_b[insertion_point]

            total += value_on_b - value_on_a
            count += 1  

            value_on_a = circle_a[bisect_left(circle_a, value_on_b+1)]       
        except IndexError:
            break
    return total/count

def rotate(circle, offset, circumference=1024):
    return sorted((point + offset) % circumference for point in circle)

def get_average_distance(circle_a, circle_b, circumference=1024):
    total = 0
    for iteration_num in xrange(circumference):
        total += circular_distance(circle_a, rotate(circle_b, iteration_num))        
    return total/circumference

These changes resulted in about a 64% improvement for 10,000 iterations on my machine:

                     +------------------+-------------------------+
                     | List and slicing | Rolling sum and average |
+--------------------+------------------+-------------------------+
| Per Iteration (ms) |  0.0128661600274 |    0.00821646809725     |
+--------------------+------------------+-------------------------+
|          Total (s) |    128.661600274 |       82.1646809725     |
+--------------------+------------------+-------------------------+

BeetDemGuise

Posted 2014-08-10T06:32:51.930

Reputation: 442

1

My first try at a fatest-code question. Problem is, I didn't try any optimization…

If the rotation part isn't correct, please tell me.

Also, I don't know how to time test the code, so I tried to use timeit module to run 1000 times in order to have a correct timing. I am not sure if I have to do that, or if it is your part :þ

Anyway, here is my code:

import random
circumference = 1024
numberpoints = circumference/64
circleA = sorted(random.sample(range(circumference),numberpoints))
circleB = sorted(random.sample(range(circumference),numberpoints))

####
from  bisect import bisect_left

def find_next(a, x):
    i = bisect_left(a, x)
    if i != len(a):
        return i, a[i]
    raise ValueError


def dist1(cA, cB):
    l = []
    p = cA[0]
    cA = cA[1:]
    while True:
        try:
            (i, n) = find_next(cB, p)
            l.append(n-p)
            cB = cB[(i+1):]
            (i, p) = find_next(cA, n)
            cA = cA[(i+1):]
        except:
            break
    return (sum(l)/len(l))

def rotate(circleB,i):
    return sorted([(c+i)%circumference for c in circleB])

def distAll(circleA, circleB):
    l = [dist1(circleA, rotate(circleB,i)) for i in xrange(circumference)]
    return sum(l)/circumference

if __name__ == '__main__':
    import timeit
    print(timeit.timeit("distAll(circleA,circleB)", setup="from __main__ import *", number=1000))

fredtantini

Posted 2014-08-10T06:32:51.930

Reputation: 169

1

C

The first argument is a filename which the program expects to read for input. The first line of the file contains the circumference and the number of points each of circle a and circle b. Then the values for circle a follow, one per line; then the values of circle b, one per line.

The time spent reading the input is a huge portion of the overall running time, but I wasn't sure how to get so much input into the program without resorting to file I/O. If anyone has a better solution to that (or any suggestions in general) please feel free to chime in.

Compile it with -O3 for extra speed!

#include <stdint.h>
#include <stdlib.h>
#include <stdio.h>    

static inline void rotateB(int_fast32_t* const, int_fast32_t, int_fast32_t, int_fast32_t);    

static inline void getDistances(const int_fast32_t* const, const int_fast32_t* const, int_fast32_t, int_fast32_t, int_fast32_t* const, int_fast32_t* const);    

static inline int comparePoints(const void*, const void*);    

static inline int_fast32_t getNextIndex(const int_fast32_t* const, int_fast32_t, int_fast32_t, int_fast32_t);    

int main(int argc, char* argv[]) {
    int_fast32_t counter;    

    int_fast32_t *circleA;
    int_fast32_t *circleB;    

    int_fast32_t CIRCUMFERENCE;
    int_fast32_t lengthOfA;
    int_fast32_t lengthOfB;    

    FILE *fp = fopen(argv[1], "r");
    fscanf(fp, "%d %d %d\n", &CIRCUMFERENCE, &lengthOfA, &lengthOfB);    

    circleA = malloc(sizeof(int_fast32_t)*lengthOfA);
    circleB = malloc(sizeof(int_fast32_t)*lengthOfB);    

    for(counter = 0; counter < lengthOfA; counter++)
        fscanf(fp, "%d\n", &(circleA[counter]));
    for(counter = 0; counter < lengthOfB; counter++)
        fscanf(fp, "%d\n", &(circleB[counter]));    

    fclose(fp);    

    int_fast32_t totalDistance = 0;
    int_fast32_t numberOfPairs = 0;    

    for(counter = 0; counter < CIRCUMFERENCE; counter++) {
        rotateB(circleB, counter, lengthOfB, CIRCUMFERENCE);
        getDistances(circleA, circleB, lengthOfA, lengthOfB, &totalDistance, &numberOfPairs);
    }    

    printf("%f\n", ((double)totalDistance)/((double)numberOfPairs));    

    free(circleA);
    free(circleB);    

    return 0;
}    

static inline void rotateB(int_fast32_t* const circleB, int_fast32_t offset, int_fast32_t length, int_fast32_t CIRCUMFERENCE) {
    int_fast32_t counter;
    for(counter = 0; counter < length; counter++)
        circleB[counter] = (circleB[counter] + offset) % CIRCUMFERENCE;
    qsort(circleB, length, sizeof(int_fast32_t), comparePoints);
}    

static inline int comparePoints(const void* a, const void* b) {
    return (*((int_fast32_t*)a)) - (*((int_fast32_t*)b));
}    

static inline void getDistances(const int_fast32_t* const circleA, const int_fast32_t* const circleB, int_fast32_t lengthOfA, int_fast32_t lengthOfB, int_fast32_t* const totalDistance, int_fast32_t* const numberOfPairs) {
    int_fast32_t currentAIndex = 0;
    int_fast32_t currentBIndex = getNextIndex(circleB, circleA[currentAIndex], 0, lengthOfB);
    while((currentAIndex < lengthOfA) && (currentBIndex < lengthOfB)) {
        *totalDistance += circleB[currentBIndex] - circleA[currentAIndex];
        *numberOfPairs += 1;
        currentAIndex = getNextIndex(circleA, circleB[currentBIndex], currentAIndex, lengthOfA);
        currentBIndex = getNextIndex(circleB, circleA[currentAIndex], currentBIndex, lengthOfB);
    }
}    

static inline int_fast32_t getNextIndex(const int_fast32_t* const targetCircle, int_fast32_t otherValue, int_fast32_t currentIndex, int_fast32_t length) {
    int_fast32_t lowerBound = currentIndex;
    int_fast32_t upperBound = length-1;
    int_fast32_t guess;
    if(targetCircle[upperBound] <= otherValue)
        return length;
    while(1) {
        if(upperBound - lowerBound < 4) {
            while(targetCircle[lowerBound] <= otherValue)
                lowerBound++;
            return lowerBound;
        }
        guess = (lowerBound + upperBound) >> 1;
        if(targetCircle[guess] > otherValue)
            upperBound = guess;
        else
            lowerBound = guess;
    }
    return lowerBound;
}

R.T.

Posted 2014-08-10T06:32:51.930

Reputation: 501