Poisson binomial distribution

Poisson binomial
Parameters	$\mathbf {p} \in [0,1]^{n}$: success probabilities for each of the n trials
Support	k ∈ { 0, …, n }

In probability theory and statistics, the Poisson binomial distribution is the discrete probability distribution of a sum of independent Bernoulli trials that are not necessarily identically distributed. The concept is named after Siméon Denis Poisson.

In other words, it is the probability distribution of the number of successes in a sequence of n independent yes/no experiments with success probabilities $p_{1},p_{2},\dots ,p_{n}$. The ordinary binomial distribution is the special case of the Poisson binomial distribution in which all success probabilities are the same, that is, $p_{1}=p_{2}=\cdots =p_{n}$.

Mean and Variance

Since a Poisson binomial distributed variable is a sum of n independent Bernoulli distributed variables, its mean and variance are simply the sums of the means and variances of the n Bernoulli distributions:

\mu =\sum \limits _{i=1}^{n}p_{i}

\sigma ^{2}=\sum \limits _{i=1}^{n}(1-p_{i})p_{i}

For fixed values of the mean ( $\mu$ ) and size (n), the variance is maximal when all success probabilities are equal, in which case the distribution is binomial. When only the mean is fixed, the variance is bounded from above by the variance of the Poisson distribution with the same mean, a bound that is attained asymptotically as n tends to infinity.
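As a minimal sketch of these formulas, the two sums can be computed directly and compared with the binomial and Poisson variances for the same mean. The function name `poisson_binomial_moments` and the example probabilities are illustrative assumptions, not taken from the cited sources.

```python
def poisson_binomial_moments(probs):
    """Return (mean, variance) of a sum of independent Bernoulli(p_i) variables."""
    mean = sum(probs)
    variance = sum(p * (1.0 - p) for p in probs)
    return mean, variance


# Illustrative success probabilities (an assumption made for this example).
probs = [0.1, 0.4, 0.7]
mean, variance = poisson_binomial_moments(probs)

# Binomial with the same n and the same mean: every p_i equals mean / n.
n = len(probs)
p_bar = mean / n
binomial_variance = n * p_bar * (1.0 - p_bar)

# A Poisson distribution with the same mean has variance equal to that mean.
poisson_variance = mean
```

For these inputs the three variances are ordered as the paragraph above describes: Poisson binomial, then binomial, then Poisson.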

Probability Mass Function

The probability of having k successful trials out of a total of n can be written as the sum [1]

\Pr(K=k)=\sum \limits _{A\in F_{k}}\prod \limits _{i\in A}p_{i}\prod \limits _{j\in A^{c}}(1-p_{j})

where $F_{k}$ is the set of all subsets of k integers that can be selected from {1,2,3,...,n}. For example, if n = 3, then $F_{2}=\left\{\{1,2\},\{1,3\},\{2,3\}\right\}$. $A^{c}$ is the complement of $A$, i.e. $A^{c}=\{1,2,3,\dots ,n\}\setminus A$.

$F_{k}$ contains $n!/((n-k)!k!)$ elements, so the sum is infeasible to compute in practice unless the number of trials n is small (e.g. if n = 30, $F_{15}$ contains over 10⁸ elements, and if n = 100, $F_{50}$ contains over 10²⁹). However, there are other, more efficient ways to calculate $\Pr(K=k)$.
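For small n, the subset sum above can be evaluated literally. The following sketch enumerates every k-subset with `itertools.combinations`; the function name `pmf_by_enumeration` and the example probabilities are assumptions made for illustration, and the approach is only practical for a handful of trials.

```python
from itertools import combinations
from math import prod


def pmf_by_enumeration(probs, k):
    """Pr(K = k), summing over every k-subset A of the trial indices."""
    n = len(probs)
    total = 0.0
    for subset in combinations(range(n), k):
        chosen = set(subset)
        # Trials in A succeed (factor p_i); trials in the complement fail.
        total += prod(
            probs[i] if i in chosen else 1.0 - probs[i] for i in range(n)
        )
    return total


# Illustrative success probabilities (an assumption made for this example).
probs = [0.1, 0.4, 0.7]
pmf = [pmf_by_enumeration(probs, k) for k in range(len(probs) + 1)]
```

Here `pmf` holds the probabilities of 0, 1, 2 and 3 successes, which sum to one up to rounding.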

As long as none of the success probabilities are equal to one, one can calculate the probability of k successes using the recursive formula [2] [3]

\Pr(K=k)={\begin{cases}\prod \limits _{i=1}^{n}(1-p_{i})&k=0\\{\frac {1}{k}}\sum \limits _{i=1}^{k}(-1)^{i-1}\Pr(K=k-i)T(i)&k>0\\\end{cases}}

where

T(i)=\sum \limits _{j=1}^{n}\left({\frac {p_{j}}{1-p_{j}}}\right)^{i}.
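The recursion can be sketched in a few lines of Python. The function name `pmf_by_recursion` is a hypothetical choice, and the code is a direct transcription of the formula rather than a numerically robust implementation.

```python
from math import prod


def pmf_by_recursion(probs):
    """Return [Pr(K=0), ..., Pr(K=n)] using the recursive formula."""
    if any(p >= 1.0 for p in probs):
        raise ValueError("the recursion requires every p_i < 1")
    n = len(probs)
    odds = [p / (1.0 - p) for p in probs]
    # T[i] = sum_j (p_j / (1 - p_j))**i for i = 1..n (T[0] is unused).
    T = [0.0] + [sum(w**i for w in odds) for i in range(1, n + 1)]
    # Base case k = 0: every trial fails.
    pmf = [prod(1.0 - p for p in probs)]
    for k in range(1, n + 1):
        total = sum(
            (-1) ** (i - 1) * pmf[k - i] * T[i] for i in range(1, k + 1)
        )
        pmf.append(total / k)
    return pmf


# Illustrative success probabilities (an assumption made for this example).
pmf = pmf_by_recursion([0.1, 0.4, 0.7])
```

The alternating signs mean that terms of large magnitude can cancel, so rounding error grows with the number of trials.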

The recursive formula is not numerically stable and should be avoided if $n$ is greater than approximately 20. Another possibility is to use the discrete Fourier transform:[4]

\Pr(K=k)={\frac {1}{n+1}}\sum \limits _{l=0}^{n}C^{-lk}\prod \limits _{m=1}^{n}\left(1+(C^{l}-1)p_{m}\right)

where $C=\exp \left({\frac {2i\pi }{n+1}}\right)$ and $i={\sqrt {-1}}$.
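A minimal sketch of the Fourier formula, using only the standard-library `cmath` module, is given below. It evaluates the double sum directly in O(n²) operations instead of calling a fast Fourier transform routine, and the function name `pmf_by_dft` is an assumption made for illustration.

```python
import cmath


def pmf_by_dft(probs):
    """Return [Pr(K=0), ..., Pr(K=n)] from the discrete Fourier formula."""
    n = len(probs)
    C = cmath.exp(2j * cmath.pi / (n + 1))
    # values[l] = prod_m (1 + (C**l - 1) * p_m) for l = 0..n
    values = []
    for l in range(n + 1):
        z = C**l
        x = 1 + 0j
        for p in probs:
            x *= 1 + (z - 1) * p
        values.append(x)
    pmf = []
    for k in range(n + 1):
        total = sum(C ** (-l * k) * values[l] for l in range(n + 1))
        # The imaginary part is only rounding noise, so keep the real part.
        pmf.append((total / (n + 1)).real)
    return pmf


# Illustrative success probabilities (an assumption made for this example).
pmf = pmf_by_dft([0.1, 0.4, 0.7])
```

Unlike the recursion, this form places no restriction on success probabilities equal to one.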

Still other methods are described by Chen and Liu.[5]

Entropy

There is no simple formula for the entropy of a Poisson binomial distribution, but the entropy is bounded above by the entropy of a binomial distribution with the same number parameter and the same mean. Therefore, the entropy is also bounded above by the entropy of a Poisson distribution with the same mean.[6]
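The two bounds can be checked numerically on a small example. In the sketch below, the pmf is built by adding one Bernoulli trial at a time, which is a convenience chosen for this example and not a method attributed to the cited sources. Entropies are in nats, and the Poisson entropy is approximated by truncating its infinite sum at 60 terms.

```python
from math import comb, exp, factorial, log


def pmf_by_convolution(probs):
    """Exact pmf, built by adding one Bernoulli trial at a time."""
    pmf = [1.0]
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)  # this trial fails
            nxt[k + 1] += mass * p      # this trial succeeds
        pmf = nxt
    return pmf


def entropy(pmf):
    """Shannon entropy in nats, skipping zero-probability outcomes."""
    return -sum(q * log(q) for q in pmf if q > 0.0)


# Illustrative success probabilities (an assumption made for this example).
probs = [0.1, 0.4, 0.7]
n, mean = len(probs), sum(probs)
p_bar = mean / n

h_poisson_binomial = entropy(pmf_by_convolution(probs))
h_binomial = entropy(
    [comb(n, k) * p_bar**k * (1 - p_bar) ** (n - k) for k in range(n + 1)]
)
# Truncated Poisson entropy; the neglected tail is negligible for this mean.
h_poisson = entropy([exp(-mean) * mean**k / factorial(k) for k in range(60)])
```

For these inputs the entropies are ordered as stated: Poisson binomial, then binomial, then Poisson.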

The Shepp–Olkin concavity conjecture, due to Lawrence Shepp and Ingram Olkin in 1981, states that the entropy of a Poisson binomial distribution is a concave function of the success probabilities $p_{1},p_{2},\dots ,p_{n}$.[7] This conjecture was proved by Erwan Hillion and Oliver Johnson in 2015.[8] The Shepp–Olkin monotonicity conjecture, from the same 1981 paper, states that the entropy is monotone increasing in $p_{i}$ if all $p_{i}\leq 1/2$. This conjecture was also proved by Hillion and Johnson, in 2019.[9]

Chernoff bound

The probability that a Poisson binomial random variable $S=\sum _{i}X_{i}$ is large can be bounded using its moment generating function as follows (valid when $s\geq \mu$ ):

{\begin{aligned}\Pr[S\geq s]&\leq \exp(-st)\operatorname {E} \left[\exp \left[t\sum _{i}X_{i}\right]\right]\\&=\exp(-st)\prod _{i}(1-p_{i}+e^{t}p_{i})\\&=\exp \left(-st+\sum _{i}\log \left(p_{i}(e^{t}-1)+1\right)\right)\\&\leq \exp \left(-st+\sum _{i}\log \left(\exp(p_{i}(e^{t}-1))\right)\right)\\&=\exp \left(-st+\sum _{i}p_{i}(e^{t}-1)\right)\\&=\exp \left(s-\mu -s\log {\frac {s}{\mu }}\right),\end{aligned}}

where we took ${\textstyle t=\log \left(s\left/\sum _{i}p_{i}\right.\right)}$ . This is similar to the tail bounds of a binomial distribution.
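To illustrate how loose or tight the final expression is, the sketch below compares it with the exact upper tail for integer thresholds. The function names `chernoff_upper_tail` and `exact_upper_tail` are hypothetical, the example probabilities are illustrative, and the exact tail is obtained by building the pmf one trial at a time.

```python
from math import exp, log


def chernoff_upper_tail(probs, s):
    """Bound exp(s - mu - s*log(s/mu)) on Pr[S >= s], for s >= mu > 0."""
    mu = sum(probs)
    if mu <= 0.0 or s < mu:
        raise ValueError("the bound is stated for s >= mu > 0")
    return exp(s - mu - s * log(s / mu))


def exact_upper_tail(probs, s):
    """Exact Pr[S >= s] for a non-negative integer threshold s."""
    pmf = [1.0]
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)  # this trial fails
            nxt[k + 1] += mass * p      # this trial succeeds
        pmf = nxt
    return sum(pmf[s:])


# Illustrative success probabilities (an assumption made for this example).
probs = [0.1, 0.4, 0.7]
bounds = {s: chernoff_upper_tail(probs, s) for s in (2, 3)}
exact = {s: exact_upper_tail(probs, s) for s in (2, 3)}
```

In both cases the exact tail lies below the bound, and at s equal to the mean the bound reduces to the trivial value 1.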

References

[1] Wang, Y. H. (1993). "On the number of successes in independent trials". Statistica Sinica. 3 (2): 295–312.
[2] Shah, B. K. (1973). "On the distribution of the sum of independent integer valued random variables". American Statistician. 27 (3): 123–124. JSTOR 2683639.
[3] Chen, X. H.; Dempster, A. P.; Liu, J. S. (1994). "Weighted finite population sampling to maximize entropy". Biometrika. 81 (3): 457. doi:10.1093/biomet/81.3.457.
[4] Fernandez, M.; Williams, S. (2010). "Closed-Form Expression for the Poisson-Binomial Probability Density Function". IEEE Transactions on Aerospace and Electronic Systems. 46 (2): 803–817. Bibcode:2010ITAES..46..803F. doi:10.1109/TAES.2010.5461658.
[5] Chen, S. X.; Liu, J. S. (1997). "Statistical Applications of the Poisson-Binomial and conditional Bernoulli distributions". Statistica Sinica. 7: 875–892.
[6] Harremoës, P. (2001). "Binomial and Poisson distributions as maximum entropy distributions". IEEE Transactions on Information Theory. 47 (5): 2039–2041. doi:10.1109/18.930936.
[7] Shepp, Lawrence; Olkin, Ingram (1981). "Entropy of the sum of independent Bernoulli random variables and of the multinomial distribution". In Gani, J.; Rohatgi, V.K. (eds.). Contributions to probability: A collection of papers dedicated to Eugene Lukacs. New York: Academic Press. pp. 201–206. ISBN 0-12-274460-8. MR 0618689.
[8] Hillion, Erwan; Johnson, Oliver (2015-03-05). "A proof of the Shepp-Olkin entropy concavity conjecture". Bernoulli. 23: 3638–3649. arXiv:1503.01570. doi:10.3150/16-BEJ860.
[9] Hillion, Erwan; Johnson, Oliver (2019-11-09). "A proof of the Shepp-Olkin entropy monotonicity conjecture". Electronic Journal of Probability. 24 (126): 1–14. doi:10.1214/19-EJP380.

This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.
