Probabilities - how high can you go?

10

3

I previously asked a question for how to compute a probability quickly and accurately. However, evidently it was too easy as a closed form solution was given! Here is a more difficult version.

This task is about writing code to compute a probability exactly and quickly. The output should be a precise probability written as a fraction in its most reduced form. That is it should never output 4/8 but rather 1/2.

For some positive integer n, consider a uniformly random string of 1s and -1s of length n and call it A. Now concatenate to A a copy of itself. That is A[1] = A[n+1] if indexing from 1, A[2] = A[n+2] and so on. A now has length 2n. Now also consider a second random string of length n whose first n values are -1, 0, or 1 with probability 1/4,1/2, 1/4 each and call it B.

Now consider the inner product of B with A[1+j,...,n+j] for different j =0,1,2,....

For example, consider n=3. Possible values for A and B could be A = [-1,1,1,-1,...] and B=[0,1,-1]. In this case the first two inner products are 0 and 2.

Task

For each j, starting with j=1, your code should output the probability that all the first j+1 inner products are zero for every n=j,...,50.

Copying the table produced by Martin Büttner for j=1 we have the following sample results.

n   P(n)
1   1/2
2   3/8
3   7/32
4   89/512
5   269/2048
6   903/8192
7   3035/32768
8   169801/2097152

Score

Your score is the largest j your code completes in 1 minute on my computer. To clarify a little, each j gets one minute. Note that the dynamic programming code in the previous linked question will do this easily for j=1.

Tie breaker

If two entries get the same j score then the winning entry will be the one that gets to the highest n in one minute on my machine for that j. If the two best entries are equal on this criterion too then the winner will be the answer submitted first.

Languages and libraries

You can use any freely available language and libraries you like. I must be able to run your code so please include a full explanation for how to run/compile your code in linux if at all possible.

My Machine The timings will be run on my machine. This is a standard ubuntu install on an AMD FX-8350 Eight-Core Processor. This also means I need to be able to run your code.

Winning entries

  • j=2 in Python by Mitch Schwartz.
  • j=2 in Python by feersum. Currently the fastest entry.

user9206

Posted 2015-06-12T08:49:04.337

Reputation:

If the question is unclear in any way, please let me know so I can fix it quickly. – None – 2015-06-12T09:04:18.567

2You are by far my favorite question asker. Then again, I do have a thing for computing values exactly and quickly. – primo – 2015-06-12T12:14:12.197

@primo Thank you! Does this mean we can expect an answer in RPython? :) – None – 2015-06-12T12:51:11.843

Could you put the difference between this question and the other one? – kirbyfan64sos – 2015-06-12T18:34:47.660

@kirbyfan64sos The other one is essentially the same question for j=1 only. – None – 2015-06-12T18:50:04.367

Ok, so, the program calculates the first 50 probabilities above j? Kind of like for n in range(j, 50+j):? – kirbyfan64sos – 2015-06-12T19:59:08.000

@kirbyfan64sos Actually just up to 50. So from j to 50. They don't make much sense below j. – None – 2015-06-14T14:20:14.620

I can't wait until someone asks casually a question whose solution would mean solution to to P=NP, and someone else casually solves it (in less than 100 bytes.) – rr- – 2015-06-14T15:30:23.860

I still don't think I understand the question. So it's j=1, n=1...50, j=2, n=2...50, etc? – qwr – 2015-06-16T08:20:38.453

@qwr Yes exactly. I am often in chat too if that could help. – None – 2015-06-16T08:21:30.837

Answers

3

Python 2, j=2

I tried to find a sort of 'closed form' for j=2. Perhaps I could make a MathJax image of it, although it would be really ugly with all the index fiddling. I wrote this unoptimized code only to test the formula. It takes about 1 second to complete. The results match with Mitch Schwartz's code.

ch = lambda n, k: n>=k>=0 and fac[n]/fac[k]/fac[n-k]
W = lambda l, d: ch(2*l, l+d)
P = lambda n, p: n==0 or ch(n-1, p-1)
ir = lambda a,b: xrange(a,b+1)

N = 50
fac = [1]
for i in ir(1,4*N): fac += [i * fac[-1]]

for n in ir(2, N):
    s = 0
    for i in ir(0,n+1):
     for j in ir(0,min(i,n+1-i)):
      for k in ir(0,n+i%2-i-j):
       t=(2-(k==0))*W(n+i%2-i-j,k)*W(i-(j+i%2),k)*W(j,k)**2*P(i,j+i%2)*P(n+1-i,j+1-i%2)
       s+=t
    denp = 3 * n - 1
    while denp and not s&1: denp -= 1; s>>=1
    print n, '%d/%d'%(s,1<<denp)

Consider a sequence where the ith member is e if A[i] == A[i+1] or n if A[i] != A[i+1]. i in the program is the number of ns. If i is even, the sequence must begin in and end with e. If i is odd, the sequence must begin and end with n. Sequences are further classified by the number of runs of consecutive es or ns. There are j + 1 of one and j of the other.

When the random walk idea is extended to 3 dimensions, there is an unfortunate problem that there are 4 possible directions to walk in (one for each of ee,en,ne, or nn) which means they are not linearly dependent. So the k index sums over the number of steps taken in one of the directions (1, 1, 1). Then there will be an exact number of steps which must taken in the other 3 directions to cancel it.

P(n,p) gives the number of ordered partitions of n objects into p parts. W(l,d) gives the number of ways for a random walk of l steps to travel a distance of exactly d. As before there is 1 chance to move left, 1 chance to move right and 2 to stay put.

feersum

Posted 2015-06-12T08:49:04.337

Reputation: 29 566

Thank you! An image of the formula would be really great. – None – 2015-06-16T05:26:55.327

Thank you for the explanation. You make it look simple! I just saw your comment that you could make a solution for j=3. That would be amazing! – None – 2015-06-18T07:05:27.187

3

Python, j=2

The dynamic programming approach for j = 1 in my answer to the previous question does not need many modifications to work for higher j, but gets slow quickly. Table for reference:

n   p(n)

2   3/8
3   11/64
4   71/512
5   323/4096
6   501/8192
7   2927/65536
8   76519/2097152
9   490655/16777216
10  207313/8388608

And the code:

from time import*
from fractions import*
from collections import*

def main():
    N_MAX=50

    T=time()

    n=2
    Y=defaultdict(lambda:0)
    numer=0

    for a1 in [1]:
        for b1 in (1,0):
            for a2 in (1,-1):
                for b2 in (1,0,0,-1):
                    if not a1*b1+a2*b2 and not a2*b1+a1*b2:
                        numer+=1
                    Y[(a1,a2,b1,b2,a1*b1+a2*b2,a2*b1,0)]+=1

    thresh=N_MAX-1

    while time() <= T+60:
        print('%d %s'%(n,Fraction(numer,8**n/4)))

        if thresh<2:
            print('reached N_MAX with %.2f seconds remaining'%(T+60-time()))
            return

        n+=1
        X=Y
        Y=defaultdict(lambda:0)
        numer=0

        for a1,a2,b1,b2,s,t,u in X:
            if not ( abs(s)<thresh and abs(t)<thresh+1 and abs(u)<thresh+2 ):
                continue

            c=X[(a1,a2,b1,b2,s,t,u)]

            # 1,1

            if not s+1 and not t+b2+a1 and not u+b1+a1*b2+a2: numer+=c
            Y[(a1,a2,b2,1,s+1,t+b2,u+b1)]+=c

            # -1,1

            if not s-1 and not t-b2+a1 and not u-b1+a1*b2+a2: numer+=c
            Y[(a1,a2,b2,1,s-1,t-b2,u-b1)]+=c

            # 1,-1

            if not s-1 and not t+b2-a1 and not u+b1+a1*b2-a2: numer+=c
            Y[(a1,a2,b2,-1,s-1,t+b2,u+b1)]+=c

            # -1,-1

            if not s+1 and not t-b2-a1 and not u-b1+a1*b2-a2: numer+=c
            Y[(a1,a2,b2,-1,s+1,t-b2,u-b1)]+=c

            # 1,0

            c+=c

            if not s and not t+b2 and not u+b1+a1*b2: numer+=c
            Y[(a1,a2,b2,0,s,t+b2,u+b1)]+=c

            # -1,0

            if not s and not t-b2 and not u-b1+a1*b2: numer+=c
            Y[(a1,a2,b2,0,s,t-b2,u-b1)]+=c

        thresh-=1

main()

Here we are keeping track of the first two elements of A, the last two elements of B (where b2 is the last element), and the inner products of (A[:n], B), (A[1:n], B[:-1]), and (A[2:n], B[:-2]).

Mitch Schwartz

Posted 2015-06-12T08:49:04.337

Reputation: 4 899

....reached N_MAX with 21.20 seconds remaining – None – 2015-06-12T10:52:23.987