Autoscale complex function

5

1

Context: We are working on a complex image drawing program to produce mugs, t-shirts, and so on. Here is an example of such function

However, we are facing the following problem: Random generated functions are mostly white and black (where black is close to zero and white close to infinity). The idea of this contest is to find a number to scale all numbers so that corresponding colors are in the visible spectrum.

Problem

The input is, formatted the way you want:

  • An array of double precision A[1..n] randomly sorted of positive numbers (>0).
  • An expected inclusive lower bound L > 0
  • An expected inclusive upper bound U > 0

The expected output is

  • A number f so if we multiply all elements of A by f, it maximizes the number of the new elements between L and U. If there are multiple solutions, f should be as close to 1 as possible.

Write a program computing f in a few operations as possible.

To check, post the result of your program on the following (sorted for convenience) input:

A = [0.01, 0.02, 0.5, 1 , 5, 9.5, 9.8, 15, 18, 20, 50, 100]
L = 1
U = 5

The result should be f = 0.25

Winning conditions: The winner will be the one with the best time complexity. If there are multiple algorithms of the same complexity, the fastest one will win. If your program is obfuscated or hard to read, please also attach an unobfuscated version with explanations.

Mikaël Mayer

Posted 2014-03-09T20:59:27.257

Reputation: 1 765

I looked at the picture and thought you were asking a creative fractal drawing question! Unfortunately your actual question is somewhat less interesting. – Level River St – 2014-03-09T21:07:51.477

Well, that could be the next challenge! But not for the moment. – Mikaël Mayer – 2014-03-09T21:16:24.503

I would have expected the result to be 0.25. Then the same numbers of A (5 up to 20) are inside the bounds and 0.25 is closer to 1 than 0.2. – Heiko Oberdiek – 2014-03-09T21:20:25.633

How accurate need it be? I calculated it to be .2499. – None – 2014-03-09T21:48:15.723

Answers

2

Perl O(n log n)

#!/usr/bin/env perl
use strict;
$^W=1;

# calculate_f ( <lower bound>, <upper bound>, <array reference> )
sub calculate_f ($$$) {
    my $L = shift;
    my $U = shift;
    my $A = shift;
    my $n = @$A;

    # Catch trivial cases:
    # * Upper bound is lower than lower bound,
    # * array is empty.
    # In both cases the result array is empty and the value
    # of f does not matter. Therefore we can return the
    # "optimum" value 1.
    return 1 if $U < $L or $n == 0;

    # The array is sorted. accordign to the manual page, Perl since 5.7
    # uses a "stable mergesort algorithm whose worst-case behaviour
    # is O(N log N).
    my @A = sort {$a <=> $b} @$A;

    # Array @R stores the length of ranges:
    # $R[$i] is the number of elements that form a valid
    # range of input numbers ($A[$i], ..., $A[$i + $R[$i] - 1])
    my @R;
    my $lu = $L / $U;
    my $ul = $U / $L;
    # find ( <upper value>, <left position>, <right position> )
    # The function looks for the largest input value below or equal
    # <upper value>. Invariant: The input value at <left position>
    # in the sorted array belongs to the range below <upper value>,
    # the input value at <right position> is larger than <upper value>.
    # It is a binary search: O(log n)
    sub find ($$$) {
        my $u = shift;
        my $left = shift;
        my $right = shift;
        return $left if $right <= $left + 1;
        my $next = int(($left + $right)/2);
        return find($u, $next, $right) if $A[$next] <= $u;
        return find($u, $left, $next);
    }
    for (my $i = 0; $i < $n; $i++) {
        my $a = $A[$i];
        my $u = $a * $ul;
        if ($A[$n-1] <= $u) {
            $R[$i] = $n - $i;
            next;
        }
        $R[$i] = find($u, $i, $n - 1) - $i + 1;
    }
    # Complexity is n times the loop body with the binary search: O(n log n).

    # In the next step the maximum number of a valid range is
    # obtained: O(n)    
    my $max = 0;
    for (my $i = 0; $i < $n; $i++) {
        my $a = $R[$i];
        $max = $a if $a > $max;
    }
    print "(Debug) maximal number of elements in a valid range: $max\n";

    # Now each range is checked to find the optimum f
    my $f = 0;
    for (my $i = 0; $i < $n; $i++) {
        # ignore ranges with fewer elements
        next unless $R[$i] == $max;

        # print current range:
        print sprintf "(Debug) A[%d .. %d] = (%g .. %g)\n",
                $i, $i + $max - 1,
                $A[$i], $A[$i + $max - 1];

        # calculate f values based on the smallest/largest element        
        my $f_lower = $L / $A[$i];
        my $f_upper = $U / $A[$i + $max - 1];
        # if 1 is in the middle of the f values, then return
        # optimal value
        return 1 if $f_lower <= 1 and $f_upper >= 1;

        # Now $f_lower and $f_upper are both smaller or both greater than 1.
        # A candidate for $f is selected from the current range
        # by using the value that is closer to 1.
        my $f_candidate = $f_lower < 1 ? $f_upper : $f_lower;
        print sprintf "(Debug) f_candidate = %g from (%g .. %g)\n",
                $f_candidate, $f_lower, $f_upper;

        # Compare the candidate with the current f
        if (not $f or abs($f_candidate - 1) < abs($f - 1)) {
            $f = $f_candidate;
        }
    }
    # Complexity is O(n), because the loop body is O(1).

    return $f;
}

my $f = calculate_f(1, 5, [
    0.01, 0.02, 0.5, 1 , 5, 9.5, 9.8, 15, 18, 20, 50, 100
]);

print "(Result) f = $f\n";

__END__

Example output:

(Debug) maximal number of elements in a valid range: 6
(Debug) A[4 .. 9] = (5 .. 20)
(Debug) f_candidate = 0.25 from (0.2 .. 0.25)
(Result) f = 0.25

Algorithm is explained in the comments. The input numbers are sorted with O(n log n). Then for each number its valid range is obtained, O(n log n), then the maximum number of elements in the range is found, O(n) and the best fitting f value is calculated, O(n).

Complexity: O(n log n)


Older version with O(n2)

#!/usr/bin/env perl
use strict;
$^W=1;

# calculate_f ( <lower bound>, <upper bound>, <array reference> )
sub calculate_f ($$$) {
    my $L = shift;
    my $U = shift;
    my $A = shift;
    my $n = @$A;

    # Catch trivial cases:
    # * Upper bound is lower than lower bound,
    # * array is empty.
    # In both cases the result array is empty and the value
    # of f does not matter. Therefore we can return the
    # "optimum" value 1.
    return 1 if $U < $L or $n == 0;

    # The array is sorted. according to the manual page, Perl since 5.7
    # uses a "stable mergesort algorithm whose worst-case behaviour
    # is O(N log N)".
    my @A = sort {$a <=> $b} @$A;

    # Array @R stores the length of ranges for each element with
    # position $i in the sorted array, if the range starts with this
    # element:
    # $R[$i] is the number of elements that form a valid
    # range of input numbers ($A[$i], ..., $A[$i + $R[$i] - 1])
    my @R;
    my $lu = $L / $U;
    for (my $i = 0; $i < $n; $i++) {
        my $a = $A[$i];
        # Each input number is part of its own range.
        $R[$i]++;
        # Now we assume, that the current input number $A[$i] is
        # the largest element of the range. The minimum lower bound in
        # the input number space is calculated and the previous
        # input number are checked, if the current input number
        # can be added to their ranges.
        my $l = $a * $lu;
        for (my $j = $i - 1; $j >= 0; $j--) {
            last if $A[$j] < $l;
            $R[$j]++;
        }
    }
    # In the worst case, all input numbers are part of the  final
    # output range. Then we would have to look up all smaller input
    # numbers for each input number: O(n^2).
    # In the best case, the final output range is 1, e.g. if the lower
    # bound equals the upper bound. Then the going back is limited to
    # 1 and the complexity is O(n).

    # In the next step the maximum number of a valid range is
    # obtained: O(n)    
    my $max = 0;
    for (my $i = 0; $i < $n; $i++) {
        my $a = $R[$i];
        $max = $a if $a > $max;
    }
    print "(Debug) maximal number of elements in a valid range: $max\n";

    # Now each range is checked to find the optimum f
    my $f = 0;
    for (my $i = 0; $i < $n; $i++) {
        # ignore ranges with fewer elements
        next unless $R[$i] == $max;

        # print current range:
        print sprintf "(Debug) A[%d .. %d] = (%g .. %g)\n",
                $i, $i + $max - 1,
                $A[$i], $A[$i + $max - 1];

        # calculate f values based on the smallest/largest element        
        my $f_lower = $L / $A[$i];
        my $f_upper = $U / $A[$i + $max - 1];
        # if 1 is in the middle of the f values, then return
        # optimal value
        return 1 if $f_lower <= 1 and $f_upper >= 1;

        # Now $f_lower and $f_upper are both smaller or both greater than 1.
        # A candidate for $f is selected from the current range
        # by using the value that is closer to 1.
        my $f_candidate = $f_lower < 1 ? $f_upper : $f_lower;
        print sprintf "(Debug) f_candidate = %g from (%g .. %g)\n",
                $f_candidate, $f_lower, $f_upper;

        # Compare the candidate with the current f
        if (not $f or abs($f_candidate - 1) < abs($f - 1)) {
            $f = $f_candidate;
        }
    }
    # Complexity is O(n), because the loop body is O(1).

    return $f;
}

my $f = calculate_f(1, 5, [
    0.01, 0.02, 0.5, 1 , 5, 9.5, 9.8, 15, 18, 20, 50, 100
]);

print "(Result) f = $f\n";

__END__

The algorithm is explained in the comments. The complexity in the worst case is O(n2), in the best case O(n log n) because of the sorting.

Example output:

(Debug) maximal number of elements in a valid range: 6
(Debug) A[4 .. 9] = (5 .. 20)
(Debug) f_candidate = 0.25 from (0.2 .. 0.25)
(Result) f = 0.25

Heiko Oberdiek

Posted 2014-03-09T20:59:27.257

Reputation: 3 841

2

C++

Comments indicate how code works.

#include<iostream>
#include<vector>
using namespace std;

int main(){
    // input stores input, sum_values stores the sum of values in the valid range after multiplying
    vector<double>input;
    vector<double>sum_values;

    // stores input, and ends of range
    double num, lower, higher;

    cout<<"Enter the array of values, use 0 to exit: ";

    // input
    while(1){
        cin>>num;
        if(num==0)break;
        input.push_back(num);
    }

    cout<<"Enter the mimimum and maximum values: ";

    cin>>lower>>higher;

    // if array is empty or lower is greater than higher, print 1 and exit
    if(lower>higher||input.size()==0){cout<<"1";return 0;}

    // iterate through all values between .0001 and .9999
    for(double d=.0001;d<1;d+=.0001){

        // check each d for valid ranges
        for(int i=0,valid=0;i<input.size();i++){

            // check if input * d is in valid range
            if(d*input[i]>=lower&&d*input[i]<=higher)valid++;

            // store final number
            if(i==input.size()-1)sum_values.push_back(valid);
        }
    }

    // iterate through compiled sum of ranges and find highest
    int position=0, biggest=-.1;
    for(int i=0;i<sum_values.size();i++){

        // return value closest to 1 for equal values
        if(sum_values[i]>=biggest){
            biggest=sum_values[i];
            position=i;
        }
    }

    // print value
    cout<<(double)position*.0001;

    return 0;
}

user10766

Posted 2014-03-09T20:59:27.257

Reputation:

Really this works, but this is so sloow. And besides, the factor might be also between 1 and infinity, so you are missing a lot of values. Imagine that the first array was divided by 10... then the factor should be 2.5, which is not covered by your algorithm. – Mikaël Mayer – 2014-04-14T22:05:04.947

And the complexity is O(Big constant*input_size), which in practice is worse than O(n log n). – Mikaël Mayer – 2014-04-14T22:05:54.530

Oh dear. Well, you needn't accept it as the answer. – None – 2014-04-15T18:13:53.027

2

Python

import collections

def pop_ranges(ranges, max_factor, count, dist, factor):
        """Removes ranges that don't overlap with the next range to be added"""
        while ranges and max_factor < ranges[0][0]:
                cnt = -len(ranges)
                maxf = ranges[-1][1]
                minf = ranges.popleft()[0]
                # Finding the minimum from a constant amount of items (3) is O(1)
                count, dist, factor = min([
                        (count, dist, factor),
                        (cnt, abs(minf - 1), minf),
                        (cnt, abs(maxf - 1), maxf)
                ])
        return count, dist, factor

def scaler(vals, min_val, max_val):
        # Sorting is O(n log n)
        vals.sort()

        factor = dist = count = 0
        # Deque has O(1) operations to the beginning and the end
        ranges = collections.deque()
        # This loop is O(n); the outer loop does n passes and adds one item to
        # ranges on every pass. pop_ranges() contains a loop, but since the loop
        # pops items from ranges, it can only do a total of n passes during the
        # whole outer loop.
        for val in vals:
                # val is growing on every pass, so the factors are shrinking.
                min_factor = min_val / float(val)
                max_factor = max_val / float(val)
                count, dist, factor = pop_ranges(ranges, max_factor, count, dist, factor)
                ranges.append((min_factor, max_factor))
        count, dist, factor = pop_ranges(ranges, ranges[-1][0] - 1, count, dist, factor)

        return factor

print scaler([0.01, 0.02, 0.5, 1 , 5, 9.5, 9.8, 15, 18, 20, 50, 100], 1, 5)

This works by sorting the values and keeping track of overlapping factor ranges. New factor ranges are added at the end of the deque and old ranges are removed from the beginning when they no longer overlap with the next range that is about to be inserted. The number of items in the deque corresponds to the number of items on the range and the first and last items give the endpoints of the range. When removing items, the best factor is updated if the current range is better. The complexity should be O(n log n) as explained in the comments.

Aleksi Torhamo

Posted 2014-03-09T20:59:27.257

Reputation: 1 871

Thank you for participating. I don't understand how pop_range works. Normally because the first supplied ranges is empty, the while loop should never be executed. Can you explain more? – Mikaël Mayer – 2014-04-14T22:03:17.467

Yes. pop_ranges() removes all ranges that don't overlap with the next range to be added and updates the best factor seen so far. You are right that on the first call the while does not run at all, but that is expected behaviour. The while loop in pop_ranges() only ever removes ranges, they are added by the for loop in scaler(). pop_ranges() can remove anywhere between 0 to len(ranges) items per call depending on the data. – Aleksi Torhamo – 2014-04-15T21:21:45.887