Calculate a probability exactly and quickly

10

1

[This is a partner question to Calculate a probability exactly ]

This task is about writing code to compute a probability exactly and quickly. The output should be a precise probability written as a fraction in its most reduced form. That is it should never output 4/8 but rather 1/2.

For some positive integer n, consider a uniformly random string of 1s and -1s of length n and call it A. Now concatenate to A its first value. That is A[1] = A[n+1] if indexing from 1. A now has length n+1. Now also consider a second random string of length n whose first n values are -1, 0, or 1 with probability 1/4,1/2, 1/4 each and call it B.

Now consider the inner product of A[1,...,n] and B and the inner product of A[2,...,n+1] and B.

For example, consider n=3. Possible values for A and B could be A = [-1,1,1,-1] and B=[0,1,-1]. In this case the two inner products are 0 and 2.

Your code must output the probability that both inner products are zero.

Copying the table produced by Martin Büttner we have the following sample results.

n   P(n)
1   1/2
2   3/8
3   7/32
4   89/512
5   269/2048
6   903/8192
7   3035/32768
8   169801/2097152

Languages and libraries

You can use any freely available language and libraries you like. I must be able to run your code so please include a full explanation for how to run/compile your code in linux if at all possible.

The task

Your code must start with n=1 and give the correct output for each increasing n on a separate line. It should stop after 10 seconds.

The score

The score is simply the highest n reached before your code stops after 10 seconds when run on my computer. If there is a tie, the winner be the one to get to the highest score quickest.

Table of entries

  • n = 64 in Python. Version 1 by Mitch Schwartz
  • n = 106 in Python. Version June 11 2015 by Mitch Schwartz
  • n = 151 in C++. Port of Mitch Schwartz's answer by kirbyfan64sos
  • n = 165 in Python. Version June 11 2015 the "pruning" version by Mitch Schwartz with N_MAX = 165.
  • n = 945 in Python by Min_25 using an exact formula. Amazing!
  • n = 1228 in Python by Mitch Schwartz using another exact formula (based on Min_25's previous answer).
  • n = 2761 in Python by Mitch Schwartz using a faster implementation of the same exact formula.
  • n = 3250 in Python using Pypy by Mitch Schwartz using the same implementation. This score needs pypy MitchSchwartz-faster.py |tail to avoid console scrolling overhead.

user9206

Posted 2015-06-08T16:39:33.403

Reputation:

I wonder if a numpy solution would run faster than Boost C++? – qwr – 2015-06-10T21:45:04.770

@qwr I think numpy, numba and cython would all be interesting as they keep within the Python family. – None – 2015-06-11T07:57:41.387

2I'd love to see more of these fastest-code problems – qwr – 2015-06-11T18:09:22.170

@qwr it's my favourite sort of question... Thank you! The challenge is to find one that doesn't just involve coding exactly the same algorithm in the lowest level language you can find. – None – 2015-06-11T18:10:40.707

Are you writing the results to the console or to a file? Using pypy and writing to a file seems the fastest for me. The console slows the process considerably. – gnibbler – 2015-06-15T21:13:13.400

@gnibbler Thank you! I updated the fastest score following your suggestion. I did find that pypy slowed down the dynamic programming approaches however. – None – 2015-06-16T05:58:27.097

Answers

24

Python

A closed-form formula of p(n) is

enter image description here

An exponential generating function of p(n) is

enter image description here

where I_0(x) is the modified Bessel function of the first kind.

Edit on 2015-06-11:
- updated the Python code.

Edit on 2015-06-13:
- added a proof of the above formula.
- fixed the time_limit.
- added a PARI/GP code.

Python

def solve():
  # straightforward implementation

  from time import time
  from itertools import count

  def binom(n, r):
    return facts[n] // (facts[r] * facts[n - r])

  def p(N):
    ans = 0
    for i in range(1 + N // 2):
      t = binom(2 * (N - 2 * i), N - 2 * i)
      t *= binom(N, 2 * i)
      t *= binom(4 * i, 2 * i)
      ans += t
    e = (ans & -ans).bit_length() - 1
    numer = ans >> e
    denom = 1 << (3 * N - 1 - e)
    return numer, denom

  facts = [1]
  time_limit = 10.0 + time()

  for i in count(1):
    facts.append(facts[-1] * (2 * i - 1))
    facts.append(facts[-1] * (2 * i))

    n, d = p(i)

    if time() > time_limit:
      break

    print("%d %d/%d" % (i, n, d))

solve()

PARI/GP

p(n) = polcoeff( (exp(x/2) + 1) * besseli(0, x/4) ^ 2, n) * n!;

Proof:
This problem is similar to a 2-dimensional (restricted) random walk problem.

If A[i] = A[i+1], we can move from (x, y) to (x+1, y+1) [1 way], (x, y) [2 ways] or (x-1, y-1) [1 way].

If A[i] != A[i+1], we can move from (x, y) to (x-1, y+1) [1 way], (x, y) [2 ways] or (x+1, y-1) [1 way].

Let a(n, m) = [x^m]((x+1)^n + (x-1)^n), b(n) = [x^n](1+x)^{2n} and c(n) be the number of ways to move from (0, 0) to (0, 0) with n steps.

Then, c(n) = \sum_{i=0}^n a(n, i) * b(i) * b(n-i).

Since p(n) = c(n) / 8^n, we can get the above closed-form formula.

Min_25

Posted 2015-06-08T16:39:33.403

Reputation: 356

1This is .. well.. amazing! How on earth did you compute the exact formula? – None – 2015-06-11T11:48:26.187

1Wow! Closed form is always neat! – qwr – 2015-06-11T18:08:17.287

I bet if you port this answer to C++, you can get a higher value of n. – mbomb007 – 2015-06-12T20:48:32.357

1@Lembik : I added a (rough) proof. – Min_25 – 2015-06-12T22:07:45.813

1@qwr : Thanks. I think so, too ! – Min_25 – 2015-06-12T22:08:48.710

1@mbomb007 : Yes. But, it is an implementation task rather than a computing task. So, I would not code it in C++. – Min_25 – 2015-06-12T22:14:03.873

Thank you for the proof! I look forward to fully understanding it. – None – 2015-06-13T05:32:51.840

I think I understand your proof now except I don't see where the "wrap-around" part is dealt with. Would you mind explaining that please? Also, do you think the same method can get a closed form solution for j =2 at http://codegolf.stackexchange.com/questions/51624/probabilities-how-high-can-you-go ? I just added a bounty there as well in case it's of interest.

– None – 2015-06-14T14:08:59.867

@Lembik Sorry, what does [the "wrap-around" part] mean ? About j >= 2, I think it is difficult to obtain a simple formula. For j = 1, a random walk is linearly independent, but not for j >= 2. – Min_25 – 2015-06-14T17:01:31.437

I just worked out how you solve the "wrap around" part. What I meant by that was that A[1] = A[n+1] . I see your point about the lack of independence as well for j >= 2. (Do you see my chat posts when I add @Min_25?) – None – 2015-06-14T17:01:34.817

@Lembik Thank you. Yes, I can see them. – Min_25 – 2015-06-14T17:13:02.137

9

Python

Note: Congratulations to Min_25 for finding a closed-form solution!

Thanks for the interesting problem! It can be solved using DP, although I am not currently feeling very motivated to optimise for speed to get a higher score. It could be nice for golf.

The code reached N=39 within 10 seconds on this old laptop running Python 2.7.5.

from time import*
from fractions import*
from collections import*

X={(1,0,0,0):1,(-1,0,0,0):1}

T=time()
N=0

while 1:
    Y=defaultdict(lambda:0)
    n=d=0
    for a,b,s,t in X:
        c=X[(a,b,s,t)]
        for A in ( (1,-1) if N else [a] ):
            for B in 1,0,0,-1:
                n+=c*(s+A*B==0==t+A*b+a*B)
                d+=c
                Y[(a,B,s+A*B,t+A*b)]+=c
    if time()>T+10: break
    N+=1
    print N,Fraction(n,d)
    X=Y

For tuple (a,b,s,t): a is the first element of A, b is the last element of B, s is the inner product of A[:-1] and B, and t is the inner product of A[1:-1] and B[:-1], using Python slice notation. My code does not store the arrays A or B anywhere, so I use those letters to refer to the next elements to be added to A and B, respectively. This variable naming choice makes the explanation a little awkward, but allows for a nice-looking A*b+a*B in the code itself. Note that the element being added to A is the penultimate one, since the last element is always the same as the first. I have used Martin Büttner's trick of including 0 twice in B candidates in order to get the proper probability distribution. The dictionary X (which is named Y for N+1) keeps track of the count of all possible arrays according to the value of the tuple. The variables n and d stand for numerator and denominator, which is why I renamed the n of the problem statement as N.

The key part of the logic is that you can update from N to N+1 using just the values in the tuple. The two inner products specified in the question are given by s+A*B and t+A*b+a*B. This is clear if you examine the definitions a bit; note that [A,a] and [b,B] are the last two elements of arrays A and B respectively.

Note that s and t are small and bounded according to N, and for a fast implementation in a fast language we could avoid dictionaries in favor of arrays.

It may be possible to take advantage of symmetry considering values differing only in sign; I have not looked into that.

Remark 1: The size of the dictionary grows quadratically in N, where size means number of key-value pairs.

Remark 2: If we set an upper bound on N, then we can prune tuples for which N_MAX - N <= |s| and similarly for t. This could be done by specifying an absorbing state, or implicitly with a variable to hold the count of pruned states (which would need to be multiplied by 8 at each iteration).

Update: This version is faster:

from time import*
from fractions import*
from collections import*

N_MAX=115

def main():
    T=time()

    N=1
    Y={(1,0,0,0):1,(1,1,1,0):1}
    n=1
    thresh=N_MAX

    while time() <= T+10:
        print('%d %s'%(N,Fraction(n,8**N/4)))

        N+=1
        X=Y
        Y=defaultdict(lambda:0)
        n=0

        if thresh<2:
            print('reached MAX_N with %.2f seconds remaining'%(T+10-time()))
            return

        for a,b,s,t in X:
            if not abs(s)<thresh>=abs(t):
                continue

            c=X[(a,b,s,t)]

            # 1,1

            if not s+1 and not t+b+a: n+=c
            Y[(a,1,s+1,t+b)]+=c

            # -1,1

            if not s-1 and not t-b+a: n+=c
            Y[(a,1,s-1,t-b)]+=c

            # 1,-1

            if not s-1 and not t+b-a: n+=c
            Y[(a,-1,s-1,t+b)]+=c

            # -1,-1

            if not s+1 and not t-b-a: n+=c
            Y[(a,-1,s+1,t-b)]+=c

            # 1,0

            c+=c

            if not s and not t+b: n+=c
            Y[(a,0,s,t+b)]+=c

            # -1,0

            if not s and not t-b: n+=c
            Y[(a,0,s,t-b)]+=c

        thresh-=1

main()

Optimisations implemented:

  • put everything into main() -- local variable access is faster than global
  • handle N=1 explicitly to avoid the check (1,-1) if N else [a] (which enforces that the first element in the tuple is consistent, when adding elements to A starting from the empty list)
  • unroll the inner loops, which also eliminates multiplication
  • double the count c for adding a 0 to B instead of doing those operations twice
  • the denominator is always 8^N so we do not need to keep track of it
  • now taking symmetry into account: we can fix the first element of A as 1 and divide the denominator by 2, since the valid pairs (A,B) with A[1]=1 and those with A[1]=-1 can be put into one-to-one correspondence by negating A. Similarly, we can fix the first element of B as non-negative.
  • now with pruning. You would need to fiddle with N_MAX to see what score it can get on your machine. It could be rewritten to find an appropriate N_MAX automatically by binary search, but seems unnecessary? Note: We do not need to check the pruning condition until reaching around N_MAX / 2, so we could get a little speedup by iterating in two phases, but I decided not to for simplicity and code cleanliness.

Mitch Schwartz

Posted 2015-06-08T16:39:33.403

Reputation: 4 899

1This is a really great answer! Could you explain what you did in your speed up? – None – 2015-06-10T13:06:05.053

@Lembik Thank you :) Added an explanation, plus another small optimisation, and made it Python3 compliant. – Mitch Schwartz – 2015-06-10T13:48:40.783

On my computer, I got N=57 for the first version and N=75 for the second. – kirbyfan64sos – 2015-06-10T15:41:42.620

Your answers have been awesome. It's just that Min_25's answer was even more so :) – None – 2015-06-11T17:39:41.320

5

Python

Using Min_25's idea of random walk, I was able to arrive at a different formula:

p(n) = \begin{cases} \frac{\sum _{i=0}^{\lfloor n/2 \rfloor} \binom{2i}{i}^2 \binom{n}{2i} 4^{n-2i}}{8^n} & n \text{ odd} \ \frac{\binom{n}{n/2}^2 + \sum _{i=0}^{\lfloor n/2 \rfloor} \binom{2i}{i}^2 \binom{n}{2i} 4^{n-2i}}{8^n} & n \text{ even} \ \end{cases}

Here is a Python implementation based on Min_25's:

from time import*
from itertools import*

def main():
    def binom(n, k):
        return facts[n]/(facts[k]*facts[n-k])

    def p(n):
        numer=0
        for i in range(n/2+1):
            t=binom(2*i,i)
            t*=t
            t*=binom(n,2*i)
            t<<=2*(n-2*i)
            numer+=t
        if not n&1:
            numer+=t
        e=(numer&-numer).bit_length()-1
        numer>>=e
        denom=1<<(3*n-e)
        return numer, denom

    facts=[1]
    time_limit=time()+10

    for i in count(1):
        facts.append(facts[-1]*i)

        n,d=p(i)

        if time()>time_limit:
            break

        print("%d %d/%d"%(i,n,d))

main()

Explanation / proof:

First we solve a related counting problem, where we allow A[n+1] = -A[1]; that is, the additional element concatenated to A can be 1 or -1 regardless of the first element. So we do not need to keep track of how many times A[i] = A[i+1] occurs. We have the following random walk:

From (x,y) we can move to (x+1,y+1) [1 way], (x+1,y-1) [1 way], (x-1,y+1) [1 way], (x-1,y-1) [1 way], (x,y) [4 ways]

where x and y stand for the two dot products, and we are counting the number of ways to move from (0,0) to (0,0) in n steps. That count would then be multiplied by 2 to account for the fact that A can start with 1 or -1.

We refer to staying at (x,y) as a zero move.

We iterate over the number of non-zero moves i, which has to be even in order to get back to (0,0). Horizontal and vertical movements make up two independent one-dimensional random walks, which can be counted by C(i,i/2)^2, where C(n,k) is binomial coefficient. (For a walk with k steps left and k steps right, there are C(2k,k) ways to choose the order of steps.) Additionally there are C(n,i) ways to place the moves, and 4^(n-i) ways to choose the zero moves. So we get:

a(n) = 2 * sum_{i in (0,2,4,...,n)} C(i/2,i)^2 * C(n,i) * 4^(n-i)

Now, we need to turn back to the original problem. Define an allowable pair (A,B) to be convertible if B contains a zero. Define a pair (A,B) to be almost allowable if A[n+1] = -A[1] and the two dot products are both zero.

Lemma: For a given n, the almost allowable pairs are in one-to-one correspondence with the convertible pairs.

We can (reversibly) convert a convertible pair (A,B) into an almost allowable pair (A',B') by negating A[m+1:] and B[m+1:], where m is the index of the last zero in B. The check for this is straightforward: If the last element of B is zero, we do not need to do anything. Otherwise, when we negate the last element of A, we can negate the last element of B in order to preserve the last term of the shifted dot product. But this negates the last value of the non-shifted dot product, so we fix this by negating the second-to-last element of A. But then this throws off the second-to-last value of the shifted product, so we negate the second-to-last element of B. And so on, until reaching a zero element in B.

Now, we just need to show that there are no almost allowable pairs for which B does not contain a zero. For a dot product to equal zero, we must have an equal number of 1 and -1 terms to cancel out. Each -1 term is composed of (1,-1) or (-1,1). So the parity of the number of -1 that occur is fixed according to n. If the first and last elements of A have different signs, we change the parity, so this is impossible.

So we get

c(n) = a(n)/2 if n is odd, else a(n)/2 + C(n,n/2)^2

p(n) = c(n) / 8^n

which gives the above formula (re-indexing with i' = i/2).

Update: Here is a faster version using the same formula:

from time import*
from itertools import*

def main():
    time_limit=time()+10

    binoms=[1]
    cb2s=[1]
    cb=1

    for n in count(1):
        if n&1:
            binoms=[a+b for a,b in zip([0]+binoms,binoms)]
        else:
            binoms=[a+b for a,b in zip([0]+binoms,binoms+[binoms[-1]])]
            cb=(cb<<2)-(cb+cb)/(n/2)
            cb2s.append(cb*cb)

        numer=0
        for i in xrange(n/2+1):
            t=cb2s[i]*binoms[min(2*i,n-2*i)]
            t<<=2*(n-2*i)
            numer+=t
        if not n&1:
            numer+=t
        e=(numer&-numer).bit_length()-1
        numer>>=e
        denom=1<<(3*n-e)

        if time()>time_limit:
            break

        print("%d %d/%d"%(n,numer,denom))

main()

Optimisations implemented:

  • inline function p(n)
  • use recurrence for binomial coefficients C(n,k) with k <= n/2
  • use recurrence for central binomial coefficients

Mitch Schwartz

Posted 2015-06-08T16:39:33.403

Reputation: 4 899

Just so you know, p(n) doesn't need to be a piecewise function. In general, if f(n) == {g(n) : n is odd; h(n) : n is even} then you can write f(n) == (n-2*floor(n/2))*g(n) + ((n+1)-2*(floor((n+1)/2)))*h(n) or use n mod 2 instead of (n-2*floor(n/2)). See here

– mbomb007 – 2015-06-17T23:19:55.747

1@mbomb007 That is obvious and uninteresting. – Mitch Schwartz – 2015-06-17T23:48:33.393

3

Explication of Min_25's formula

Min_25 posted a great proof but it took a while to follow. This is a bit of explanation to fill in between the lines.

a(n, m) represents the number of ways to choose A such that A[i] = A[i+1] m times. The formula for a(n,m) is equivalent to a(n,m) = {2 * (n choose m) for n-m even; 0 for n-m odd.} Only one parity is allowed because A[i] != A[i+1] must occur an even number of times so that A[0] = A[n]. The factor of 2 is due to the initial choice A[0] = 1 or A[0] = -1.

Once the number of (A[i] != A[i+1]) is fixed to be q (named i in the c(n) formula), it separates into two 1D random walks of length q and n-q. b(m) is the number of ways to take a one-dimensional random walk of m steps that ends at the same place it started, and has 25% chance of moving left, 50% chance of staying still, and 25% chance of moving right. A more obvious way to state the generating function is [x^m](1 + 2x + x^2)^n, where 1, 2x and x^2 represent left, no move, and right respectively. But then 1+2x+x^2 = (x+1)^2.

feersum

Posted 2015-06-08T16:39:33.403

Reputation: 29 566

Yet another reason to love PPCG! Thank you. – None – 2015-06-13T20:19:39.580

2

C++

Just a port of the (excellent) Python answer by Mitch Schwartz. The primary difference is that I used 2 to represent -1 for the a variable and did something similar for b, which allowed me to use an array. Using Intel C++ with -O3, I got N=141! My first version got N=140.

This uses Boost. I tried a parallel version but ran into some trouble.

#include <boost/multiprecision/gmp.hpp>
#include <boost/typeof/typeof.hpp>
#include <boost/rational.hpp>
#include <boost/chrono.hpp>
#include <boost/array.hpp>
#include <iostream>
#include <utility>
#include <map>

typedef boost::multiprecision::mpz_int integer;
typedef boost::array<boost::array<std::map<int, std::map<int, integer> >, 3>, 2> array;
typedef boost::rational<integer> rational;

int main() {
    BOOST_AUTO(T, boost::chrono::high_resolution_clock::now());

    int N = 1;
    integer n = 1;
    array* Y = new array, *X = NULL;
    (*Y)[1][0][0][0] = 1;
    (*Y)[1][1][1][0] = 1;

    while (boost::chrono::high_resolution_clock::now() < T+boost::chrono::seconds(10)) {
        std::cout << N << " " << rational(n, boost::multiprecision::pow(integer(8), N)/4) << std::endl;
        ++N;
        delete X;
        X = Y;
        Y = new array;
        n = 0;

        for (int a=0; a<2; ++a)
            for (int b=0; b<3; ++b)
                for (BOOST_AUTO(s, (*X)[a][b].begin()); s != (*X)[a][b].end(); ++s)
                    for (BOOST_AUTO(t, s->second.begin()); t != s->second.end(); ++t) {
                        integer c = t->second;
                        int d = b&2 ? -1 : b, e = a == 0 ? -1 : a;

                        if (s->first == -1 && t->first+d+e == 0) n += c;
                        (*Y)[a][1][s->first+1][t->first+d] += c;

                        if (s->first == 1 && t->first-d+e == 0) n += c;
                        (*Y)[a][1][s->first-1][t->first-d] += c;

                        if (s->first == 1 && t->first+d-e == 0) n += c;
                        (*Y)[a][2][s->first-1][t->first+d] += c;

                        if (s->first == -1 && t->first-d-e == 0) n += c;
                        (*Y)[a][2][s->first+1][t->first-d] += c;

                        c *= 2;

                        if (s->first == 0 && t->first+d == 0) n += c;
                        (*Y)[a][0][s->first][t->first+d] += c;

                        if (s->first == 0 && t->first-d == 0) n += c;
                        (*Y)[a][0][s->first][t->first-d] += c;
                    }
    }

    delete X;
    delete Y;
}

kirbyfan64sos

Posted 2015-06-08T16:39:33.403

Reputation: 8 730

This needs g++ -O3 kirbyfan64sos.cpp -o kirbyfan64sos -lboost_system -lboost_timer -lboost_chrono -lrt -lgmp to compile. (Thanks to aditsu.) – None – 2015-06-11T10:26:35.083