Compute the most efficient binary function

13

Today, we'll be computing the most efficient binary function. More specifically, we'll be computing the function which, when an expression is created from applying the function to the constant input 0 or its own output, can represent all positive integers with the shortest possible expressions, placing higher priority on the smaller integers.

This function is built as follows:

For each integer, starting at 1 and going upwards, choose the shortest expression which we have not yet assigned an output to, and make that integer the output of that expression. Ties in expression length will be broken by smaller left argument, and then by smaller right argument. Here's how it works:

  • Initially, 1 is unassigned. The shortest unassigned expression is f(0, 0), so we'll set that to 1.

  • Now, 2 is unassigned. The shortest unassigned expressions are f(f(0, 0), 0) = f(1, 0) and f(0, f(0, 0)) = f(0, 1). Ties are broken towards smaller left argument, so f(0, 1) = 2.

  • The shortest unassigned expression remaining is f(f(0, 0), 0) = f(1, 0), so f(1, 0) = 3.

  • Now, we're out of expressions with only 2fs and 30s, so we'll have to add one more of each. Breaking ties by left argument, then right argument, we get f(0, 2) = 4, since f(0, f(0, f(0, 0))) = f(0, f(0, 1)) = f(0, 2).

  • Continuing, we have f(0, 3) = 5, f(1, 1) = 6, f(2, 0) = 7, f(3, 0) = 8, f(0, 4) = 9, ...

Here's a table I filled out for the first few values:

    0  1  2  3  4  5  6  7  8
 /---------------------------
0|  1  2  4  5  9 10 11 12 13
1|  3  6 14 15 37 38 39 40 41
2|  7 16 42 43
3|  8 17 44 45
4| 18 46
5| 19 47
6| 20 48
7| 21 49
8| 22 50

Another way to look at it is that each output has a size, equal to the sum of the sizes of its inputs plus one. The table is filled out in order of increasing size of output, ties broken by minimizing left input then right input.

Your challenge is, given two non-negative integers as input, compute and output the value of this function. This is code golf. Shortest solution, in bytes, wins. Standard loopholes are banned.

isaacg

Posted 2017-03-20T04:39:00.947

Reputation: 39 268

Looks similar to A072766, but differs starting from f(3, 1).

– kennytm – 2017-03-20T17:22:22.727

2This is the first challenge in a while that puzzles me somewhat to calculate efficiently. I believe something is possible with Catalan numbers, but can't immediately think of a solution. Hmm... – orlp – 2017-03-20T18:36:27.860

2

Alright, so I don't think it'd make a good golfed answer, but what you can do to make it reasonably efficient is repeatedly subtract Catalan numbers from the function arguments until they're smaller than the next Catalan number. Then you've found the length of their expressions. Then you can use the ranking/unranking functions from this paper, with modification, to calculate the result. Perhaps after doing all that it's possible to 'cancel out' bits of code in the middle and find a reasonably elegant solution.

– orlp – 2017-03-21T07:37:57.657

Actually, the approach from my previous comment doesn't work. ((0, (0, (0, 0))), 0) is lexicographically smaller than (((0, 0), 0), (0, 0)), however the latter has a smaller left hand side. – orlp – 2017-03-21T21:18:57.613

Answers

6

Haskell, 110 bytes

f q=head[i|let c=[(-1,0)]:[[(f a,f b)|n<-[0..k],a<-c!!n,b<-c!!(k-n)]|k<-[0..]],(p,i)<-zip(concat c)[0..],p==q]

The argument here is taken to be the tuple (x,y). Pretty similar to the answer above, but the lookup list holds just the pairs of left and right indices instead of the trees.

halfflat

Posted 2017-03-20T04:39:00.947

Reputation: 261

1Nice answer! head[...] is [...]!!0 and (p,i)<-zip(concat c)[0..] can be shortened to (i,p)<-zip[0..]$id=<<c. – Laikoni – 2017-03-22T07:10:27.467

Thanks for the improvements! Definitely adding id=<< to the repertoire :) – halfflat – 2017-03-22T09:55:51.013

5

Python 3, 154 bytes

b=lambda n:[(l,r)for k in range(1,n)for l in b(k)for r in b(n-k)]+[0]*(n<2)
def f(x,y):r=sum((b(n)for n in range(1,x+y+3)),[]);return r.index((r[x],r[y]))

It's not very fast nor very golfy, but it's a start.

orlp

Posted 2017-03-20T04:39:00.947

Reputation: 37 067

5

Wow! I actually managed to make an efficient computation algorithm. I did not expect this at first. The solution is quite elegant. It repeatedly deduces more and more, then recurses all the way down to the base case of 0. In this answer the C(n) function denotes the Catalan numbers.

The crucial first step is acknowledging that there are C(0) = 1 values of length zero (namely 0 itself), C(1) = 1 values of length one (namely f(0, 0)), C(2) = 2 values of length two (f(0, f(0, 0)) and f(f(0, 0), 0)).

This means that if we're looking for the nth expression, and we find the biggest k such that C(0) + C(1) + ... + C(k) <= n then we know that n has length k.

But now we can continue! Because the expression we look for is the n - C(0) - C(1) - ... - C(k) th expression in its length class.

Now we can use a similar trick to find the length of the left segment, and then the rank within that subsection. And then recurse on those ranks we found!

It found that f(5030, 3749) = 1542317211 in the blink of an eye.

Python, noncompeting

def C(n):
    r = 1
    for i in range(n):
        r *= 2*n - i
        r //= i + 1
    return r//(n+1)

def unrank(n):
    if n == 0: return 0

    l = 0
    while C(l) <= n:
        n -= C(l)
        l += 1

    right_l = l - 1
    while right_l and n >= C(l - 1 - right_l) * C(right_l):
        n -= C(l - 1 - right_l) * C(right_l)
        right_l -= 1

    right_num = C(right_l)

    r_rank = n % right_num
    l_rank = n // right_num

    for sz in range(l - 1 - right_l): l_rank += C(sz)
    for sz in range(right_l): r_rank += C(sz)

    return (unrank(l_rank), unrank(r_rank))

def rank(e):
    if e == 0: return 0
    left, right = e

    l = str(e).count("(")
    left_l = str(left).count("(")
    right_l = str(right).count("(")
    right_num = C(right_l)

    n = sum(C(sz) for sz in range(l))
    n += sum(C(sz)*C(l - 1 - sz) for sz in range(left_l))

    n += (rank(left) - sum(C(sz) for sz in range(left_l))) * C(right_l)
    n += rank(right) - sum(C(sz) for sz in range(right_l))

    return n

def f(x, y):
    return rank((unrank(x), unrank(y)))

I'm fairly certain I'm doing a bunch of unnecessary computation and a lot of middle steps could be removed.

orlp

Posted 2017-03-20T04:39:00.947

Reputation: 37 067