Minimize Least Used Binary Bits in Fractions

7

0

Introduction

For a given rational number r, find a pair of integers p,q so that p/q=r and the number of less used bits in p and q are minimized (details below).

Challenge

A positive rational number r can be expressed as the ratio of two positive integers, r=p/q. The representation is not unique. For each of these representations, both p and q can be expressed in its binary form and the number of 0s and 1s in the representation can be counted (not including leading zeros).

We count the number of appearance of the less appeared digit for both p and q (denoted g(p) and g(q)) and finally define f(p,q)=max(g(p),g(q)).

For example, the number 35 can be written as 35/1 or 175/5, and we convert both the numerator and the denominator to binary. Then we can count

35 -> 100011, 3 zeros, 3 ones. g(35)=min(3,3)=3
 1 ->      1, 0 zero ,  1 one.  g(1)=min(1,0)=0

f(35,1)=max(g(35),g(1))=3.

And for 175/5

175 -> 10101111. 2 zeros, 6 ones. g(175)=min(2,6)=2
  5 ->      101. 1 zero,  2 ones.   g(5)=min(1,2)=1.

f(175,5)=max(g(175),g(5))=2.

Therefore, if we want to minimize f(p,q) while keeping the rate of p/q constant, p=175,q=5 is a better choice than p=35,q=1. An even better choice is p=1015,q=29 since f(1015,29)=1. It is also easy to prove that there are no valid choice that makes f(p,q) equals 0, so this the optimal choice.

Your task is to write a full program or function that computes a choice of (p,q) for a given rational number r such that f(p,q) is minimized (sub-optimal solutions are OK but there are penalties), while keeping the program as short as possible.

Test cases

The outputs are not unique.

Input
Output

63/1
63/1 0

17/9
119/63 1

35/1
1015/29 1

563/1
2815/5 2

You may also wish to see whether you can get an answer without penalty(see below) for 595/1.

Scoring

Your score will be composed of two parts: the code-length and the penalties. The code-length is measured in bytes.

The penalties are calculated by running your program on all test cases in the given test set. For each r in the test set, if your program outputs p,q, you get a penalty of max(f(p,q)-2,0). That is, you get no penalty (nor bounty) for any answer that gives a f value less than or equal to 2.

The final score is calculated by final_penalty*codelength^(1/4), with lower score better.

Test set

The test set for calculating the penalty can be found in this pastebin.

Basically, it contains all rational numbers which when expressed in reduced fraction p/q has the property 128>p>q (and of course, (p,q) coprime), along with all integers from 128 to 1023(inclusive). The second column in the pastebin are the f(p,q) calculated by a naive method. The final line in the pastebin shows the total penalty of this naive method. It is 2343.

Specs

  • Input can be given in any reasonable form. For example, a rational if your program supports it, or a tuple or a list of the numerators and denominators in any order (and you should assume them to be coprime). Same flexiblility for output.
  • You need to output p and q in a reasonable form and also f(p,q) (To avoid 0-byte programs, thanks to @Lynn).
  • You should be able to score your own program (Thanks for the suggestion by @user202729). So in your answer, please specify all of the following: the final penalty, the code length in bytes, and the score. Only the score needs to be included in the title of your answer.

Weijun Zhou

Posted 2018-03-25T13:41:15.877

Reputation: 3 396

Sandbox post: deleted

– Weijun Zhou – 2018-03-25T13:42:01.443

(which naive method?) – user202729 – 2018-03-25T13:49:22.170

1Enumerating all cases up to a limit. – Weijun Zhou – 2018-03-25T13:54:54.430

4There are languages in which the 0-byte program copies input to output. This is a valid answer that scores a perfect score of 0 by the final_penalty*codelength^(1/4) formula, so you might want to tweak it. – Lynn – 2018-03-25T14:17:18.257

Changed to require that f(p,q) be included in the output. – Weijun Zhou – 2018-03-25T14:32:44.200

Can I output only the multiplier (p'/p) and f(p',q')? – user202729 – 2018-04-03T08:59:34.743

1

If I programmed correctly, this program computes the optimal result, and total penalty is 2330, and the largest multiplier is 261599.

– user202729 – 2018-04-03T10:27:46.993

Answers

3

Node.js, Score = 7873

This is both naive and rather long...

  • Code size = 129 bytes
  • Total penalty = 2336
  • Score = round(1291/4 * 2336) = 7873

Takes input as [p, q]. Returns [p', q', f(p, q)].

f=(a,r=[k=6e3])=>k--?f(a,(i=Math.max(...a.map(n=>(g=n=>n?g(n>>1,n&1?++o:++z):o<z?o:z)(a=(p=a,n*k),o=z=0))))>r[2]?r:[p,a,i],k--):r

Try it online! for the test cases

or compute the total penalty by processing the whole test set


Alternate version, Score = 8426

A modified version to reach a total penalty of 2330. But it's not worth it...

  • Code size = 171 bytes
  • Total penalty = 2330
  • Score = round(1711/4 * 2330) = 8426

Same I/O format as the main version.

f=(a,r=[k=5e3])=>k--?f(a,(i=Math.max(...a.map(n=>(g=n=>n?g(n>>1,n&1?++o:++z):o<z?o:z)(a=(p=a,n*([7007,7901,16239,32695,130527,261599][k]||k-5)),o=z=0))))>r[2]?r:[p,a,i]):r

Try it online!

It includes some hardcoded values for the highest multipliers. They were found by generating all bitmasks m with up to 32 bits and no more than 5 0's, keeping the values for which m mod p = 0, checking the score of q * m / p and finally doing the same operation with the inverted bitmask of m of the same length (no more than 5 1's with all other bits set to 0).

Arnauld

Posted 2018-03-25T13:41:15.877

Reputation: 111 334

2

Julia 0.6, Score = 7702

g(x,b=[bin(x)...])=min(sum(b.=='0'),sum(b.=='1'))
f(a,b)=((r,i)=findmin(map(i->max(g(a*i),g(b*i)),1:10^5));[r,a*i,b*i])

Try it online!

A naive and slow solution. Output of f is [f(p,q),p',q']

  • Length = 119 bytes
  • Penalty = 2332
  • Score = round(119^.25*2332) = 7702

To run this on the test set, I've downloaded the paste bucket as a txt file called results.txt and removed the last row. I used this code to evaulate the penalty using all cores of my machine:

all_res = open("results.txt","r") do f
    map(row->parse.(Int,split(split(row," ")[1], "/")), readlines(f))
end

addprocs(7)
@everywhere g(x,b=[bin(x)...])=min(sum(b.=='0'),sum(b.=='1'))
@everywhere f(a,b)=((r,i)=findmin(map(i->max(g(a*i),g(b*i)),1:10^5));[r,a*i,b*i])

penalty = @parallel (+) for iter in all_res
    max(f(iter[1],iter[2])[1]-2,0)
end

niczky12

Posted 2018-03-25T13:41:15.877

Reputation: 301

1

C++ (GCC), score = 11222

  • Length: 538 (<-- less verbose than Java)
  • Penalty: 2330
#import<bits/stdc++.h>
using namespace std;uint32_t f(uint32_t x){int s=__builtin_popcount(x);return min(s,32-__builtin_clz(x)-s);}int i(vector<int>const&a,vector<int>const&b){int i=0,j=0;while(i<a.size()&j<b.size()){if(a[i]==b[j])return a[i];else++(a[i]<b[j]?i:j);}return 0;}int main(){vector<array<vector<int>,6>>T(1024);int p,q,a;for(a=1;a<1024;++a){for(unsigned m=1;m<500000;++m)for(p=f(a*m);p<6;++p)T[a][p].push_back(m);}while(cin>>p>>q){for(a=1;;++a){unsigned m=i(T[p][a],T[q][a]);if(m){cout<<p*m<<' '<<q*m<<' '<<a<<'\n';break;}}}}

Runs in 5 seconds on my machine.

Input format: something like this

2 1
3 1
3 2
4 1
4 3
5 1
5 2

Output format: something like this

2 1 1
3 1 1
3 2 1
4 1 1
4 3 1
5 1 1
5 2 1

(p', q', f(p', q') on each line)


Ungolfed code:

/* the program will not work without END_ANS and END_INPUT being 
correctly set but it can be easily calculated by a naive algorithm
END_ANS must be greater than all of the answers,and END_INPUT
must be greater than all of the inputs */
int constexpr END_ANS=6,//<-- according to the naive algorithm
END_INPUT=1024,END_M=4194304;
/* END_M is an parameter to the algorithm. As this gets larger,
the algorithm takes more resource but the answer gets better.
NOTE: (END_INPUT-1) * (END_M-1) must not overflow integer.
*/

static_assert((END_INPUT-1)*(long long)(END_M-1)<4294967296LL);

#include<iostream>
#include<vector>
#include<array>
#include<algorithm>

uint32_t f(uint32_t x){
    uint32_t len=32-__builtin_clz(x),
    sum=__builtin_popcount(x);
    return std::min(sum,len-sum);
}

/**Return a common value between two vectors,or 0 if not found.
The input vectors must be sorted.*/
int intersect(std::vector<int>const&a,std::vector<int>const&b){
    int i=0,j=0;
    while(i<a.size()&j<b.size()){
        if(a[i]==b[j])return a[i];
        else++(a[i]<b[j]?i:j);
    }
    return 0;
}

int main() {
    std::vector<std::array<std::vector<int>,END_ANS>>T(END_INPUT);
    //T[i][f1] = all values of m s.t. f(i*m) <= f1
    for(int i=1;i<END_INPUT;++i){
        for(unsigned m=1;m<END_M;++m)
            for(int fi=f(i*m);fi<END_ANS;++fi)
                T[i][fi].push_back(m);
    }

    // THIS PART IS USED FOR CALCULATING SCORE
    int score=0;

    int p,q;while(std::cin>>p>>q){
        for(int ans=1;;++ans){
            unsigned m=intersect(T[p][ans],T[q][ans]);
            if(m){
                std::cout<<p*m<<' '<<q*m<<' '<<ans<<'\n';

                // THIS PART IS USED FOR CALCULATING SCORE
                score+=std::max(ans-2,0);

                break;
            }
        }
    }

    // THIS PART IS USED FOR CALCULATING SCORE
    std::cerr<<score<<'\n';

}

Note: There are some hardcoded constants in the program (maximum input value, maximum ans, etc.) I assume it's allowed because otherwise most programs can work in 32-bit integer range anyway.

For each (p,q) pair, the program tests all pairs (p×m,q×m) for all m in range [1,MAX_M[ and find the optimal value, in a smart way.

Because the result is the same for MAX_M=500000 and MAX_M=4194304 (the largest value such that all intermediate computations still fit in an uint32_t (unsigned 32-bit integer)), I just use MAX_M=500000 so the program runs faster.

Bash+Jelly script to calculate score:

jelly eun 'ỴḲṛ/VƊ€_2»0S' "'''$(cat result.txt)'''" # <- weird input format

tl;dr: Tested all values of m up to 4194304 (exclusive), best penalty so far = 2330.

user202729

Posted 2018-03-25T13:41:15.877

Reputation: 14 620

I may be able to kill a few more bytes by changing between unsigned, uint32_t and int, but it's not very important anyway. – user202729 – 2018-03-29T15:48:51.930

Extended to 60000000 (6e7) now. – user202729 – 2018-03-31T16:11:13.997

I think I have an idea how to find the optimal solution... – user202729 – 2018-04-01T13:48:43.740

Yes, the optimal value is 2330. – user202729 – 2018-04-03T10:23:15.973

1

Jelly, 7437

  • Code size = 10 bytes
  • Total penalty = 4182
  • Score = round(101/4 * 4182) = 7437

Takes input as [p, q], returns [f(p, q), [p', q']].

BC,Bḅ1ZṀṂ,

Try it online!

How?

This code simply computes f(p, q) without trying to apply any multiplier.

BC,Bḅ1ZṀṂ, - main link taking [p, q]              --> e.g. [61, 40]
B          - convert to binary                    --> [[1,1,1,1,0,1], [1,0,1,0,0,0]]
 C         - 1-complement                         --> [[0,0,0,0,1,0], [0,1,0,1,1,1]]
   B       - convert to binary again              --> [[1,1,1,1,0,1], [1,0,1,0,0,0]]
  ,        - pair                                 --> [ [[0,0,0,0,1,0], [0,1,0,1,1,1]],
                                                        [[1,1,1,1,0,1], [1,0,1,0,0,0]] ]
    ḅ1     - convert from unary base,             --> [[1, 4], [5, 2]]
             which effectively sums the lists
      Z    - zip, so that number of 0-bits and    --> [[1, 5], [4, 2]]
             number of 1-bits are paired together
       ṀṂ  - maximum of the minimum               --> max(min(1, 5), min(4, 2)) =
                                                      max(1, 2) =
                                                      2
         , - pair with the input to comply with   --> [2, [61, 40]]
             the expected output format

Arnauld

Posted 2018-03-25T13:41:15.877

Reputation: 111 334

:| TODO convex hull optimization ... – user202729 – 2018-04-03T10:22:58.137

(why don't you S to sum?) – user202729 – 2018-04-03T10:23:59.443

@user202729 There must be a way to have it working with S but I'm not sure how to do it. Applying S on e.g. [[1,1,0,0],[0,0,0,1]] will result in [1,1,0,1], which is not what I want. (Disclaimer: I have little to no experience in Jelly.) – Arnauld – 2018-04-03T10:28:51.093

... Either ZS or S€, which doesn't really help. – user202729 – 2018-04-03T10:55:44.557