In probability theory and statistics, the binomial distribution is the discrete probability distribution of the number of successes in a sequence of n independent yes/no experiments, each of which yields success with probability p. Such a success/failure experiment is also called a Bernoulli experiment or Bernoulli trial. In fact, when n = 1, then the binomial distribution is the Bernoulli distribution. The binomial distribution is the basis for the popular binomial test of statistical significance.
A typical example is the following: assume 5% of the population is green-eyed. You pick 500 people randomly. How likely is it that you get 30 or more green-eyed people? The number of green-eyed people you pick is a random variable X which follows a binomial distribution with n = 500 and p = 0.05 (when picking the people with replacement). We are interested in the probability Pr[X ≥ 30].
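The green-eyed example can be evaluated numerically. The following is a minimal sketch in Python, assuming the standard binomial mass function C(n, k) p^k (1 − p)^(n − k); the function names `binomial_pmf` and `upper_tail` are illustrative, not taken from any library.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def upper_tail(k_min: int, n: int, p: float) -> float:
    """Pr[X >= k_min], computed as 1 minus the sum over k < k_min."""
    lower = sum(binomial_pmf(k, n, p) for k in range(k_min))
    return 1.0 - lower


# Parameters from the example: 500 people, 5% green-eyed.
prob = upper_tail(30, 500, 0.05)
print(f"Pr[X >= 30] = {prob:.4f}")
```

Summing the 30 terms below the threshold and subtracting from 1 avoids adding up the 471 terms from 30 to 500, most of which are negligibly small.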
In general, if the random variable X follows the binomial distribution with parameters n and p, we write X ~ B(n, p). The probability of getting exactly k successes is given by the probability mass function:
$$f(k; n, p) = \Pr(X = k) = \binom{n}{k} p^k (1 - p)^{n - k}$$
for k = 0, 1, 2, ..., n, where
$$\binom{n}{k} = \frac{n!}{k!\,(n - k)!}$$
is the binomial coefficient "n choose k" (also denoted C(n, k) or nCk), hence the name of the distribution. The formula can be understood as follows: we want k successes (p^k) and n − k failures ((1 − p)^(n − k)). However, the k successes can occur anywhere among the n trials, and there are C(n, k) different ways of distributing k successes in a sequence of n trials.
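The counting argument can be checked by brute force for small n. The sketch below (illustrative names, not a library API) lists every sequence of n yes/no outcomes, adds up the probability of those with exactly k successes, and compares the result with C(n, k) p^k (1 − p)^(n − k).

```python
from itertools import product
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Closed form: C(n, k) * p^k * (1 - p)^(n - k)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def pmf_by_enumeration(k: int, n: int, p: float) -> float:
    """Sum the probabilities of all 0/1 sequences of length n with k ones."""
    total = 0.0
    for outcome in product((0, 1), repeat=n):
        if sum(outcome) == k:
            # Each such sequence has k successes and n - k failures.
            total += p**k * (1 - p) ** (n - k)
    return total


n, p = 6, 0.3
for k in range(n + 1):
    print(k, binomial_pmf(k, n, p), pmf_by_enumeration(k, n, p))
```

The enumeration visits 2^n sequences, so it is only practical for small n; it is meant as a check of the reasoning, not as a way to compute probabilities.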
Reference tables of binomial probabilities are usually filled in only up to k = n/2. This is because for k > n/2, the probability can be calculated from its complement as
$$f(k; n, p) = f(n - k; n, 1 - p),$$
where f(k; n, p) denotes the probability of exactly k successes. So, one must look up a different k (namely n − k) and a different p (namely 1 − p); the binomial distribution is not symmetrical in general.
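The table-halving trick can be sketched as a lookup that only ever evaluates entries with k ≤ n/2. This is an illustration under the assumption that the table entry for k successes with success probability p equals the entry for n − k successes with success probability 1 − p; the name `pmf_via_half_table` is hypothetical.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def pmf_via_half_table(k: int, n: int, p: float) -> float:
    """Evaluate the mass function using only entries with k <= n/2."""
    if 2 * k <= n:
        return binomial_pmf(k, n, p)
    # Count failures instead of successes: n - k failures, each with
    # probability 1 - p.
    return binomial_pmf(n - k, n, 1 - p)


n, p = 9, 0.2
for k in range(n + 1):
    print(k, binomial_pmf(k, n, p), pmf_via_half_table(k, n, p))
```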
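The cumulative distribution function can be expressed in terms of the regularized incomplete beta function, as follows:
$$F(k; n, p) = \Pr(X \le k) = I_{1-p}(n - k, k + 1)$$
provided k is an integer and 0 ≤ k < n (for k = n the value is simply 1). If x is not necessarily an integer or not necessarily positive, one can express it thus:
$$F(x; n, p) = \Pr(X \le x) = \sum_{j=0}^{\lfloor x \rfloor} \binom{n}{j} p^j (1 - p)^{n - j}$$
where ⌊x⌋ is the greatest integer less than or equal to x.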
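Both forms of the cumulative distribution function can be compared numerically. The sketch below assumes the identity Pr(X ≤ k) = I_{1−p}(n − k, k + 1) for integer k with 0 ≤ k < n, so that both beta arguments are positive. Python's standard library has no incomplete beta function, so `regularized_incomplete_beta` here is a hypothetical helper that integrates with Simpson's rule and handles positive integer arguments only.

```python
from math import comb, factorial, floor


def binomial_cdf(x: float, n: int, p: float) -> float:
    """Pr[X <= x] as the sum of the mass function up to floor(x)."""
    top = floor(x)
    if top < 0:
        return 0.0
    top = min(top, n)
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(top + 1))


def regularized_incomplete_beta(x: float, a: int, b: int, steps: int = 2000) -> float:
    """I_x(a, b) for positive integers a and b (steps must be even)."""
    # For integer arguments, B(a, b) = (a-1)! (b-1)! / (a+b-1)!.
    beta = factorial(a - 1) * factorial(b - 1) / factorial(a + b - 1)

    def integrand(t: float) -> float:
        return t ** (a - 1) * (1 - t) ** (b - 1)

    # Composite Simpson's rule on [0, x].
    h = x / steps
    total = integrand(0.0) + integrand(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3 / beta


n, p = 10, 0.3
for k in range(n):
    print(k, binomial_cdf(k, n, p), regularized_incomplete_beta(1 - p, n - k, k + 1))
```

The floor in `binomial_cdf` is what lets it accept any real x: negative values give 0, values of n or more give 1, and non-integers such as 2.7 give the same result as 2.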
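For k ≤ np, upper bounds for the lower tail F(k; n, p) = Pr(X ≤ k) of the distribution function can be derived. In particular, Hoeffding's inequality yields the bound
$$F(k; n, p) \le \exp\left(-2 \frac{(np - k)^2}{n}\right),$$
and Chernoff's inequality can be used to derive the bound
$$F(k; n, p) \le \exp\left(-\frac{1}{2p} \frac{(np - k)^2}{n}\right).$$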
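To see how tight these bounds are, one can compare them with the exact lower tail. The sketch below assumes the bounds take their usual forms, exp(−2(np − k)²/n) for Hoeffding and exp(−(np − k)²/(2pn)) for Chernoff, and applies them only for k ≤ np; the function names are illustrative.

```python
from math import comb, exp


def lower_tail(k: int, n: int, p: float) -> float:
    """Exact Pr[X <= k] for X ~ B(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k + 1))


def hoeffding_bound(k: int, n: int, p: float) -> float:
    """Assumed form of the Hoeffding bound on Pr[X <= k], for k <= np."""
    return exp(-2 * (n * p - k) ** 2 / n)


def chernoff_bound(k: int, n: int, p: float) -> float:
    """Assumed form of the Chernoff bound on Pr[X <= k], for k <= np."""
    return exp(-((n * p - k) ** 2) / (2 * p * n))


n, p = 100, 0.3
for k in (10, 20, 30):
    print(k, lower_tail(k, n, p), hoeffding_bound(k, n, p), chernoff_bound(k, n, p))
```

Comparing the exponents shows that the Chernoff form is the smaller (tighter) of the two exactly when 1/(2p) > 2, that is, when p < 1/4.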
If X ~ B(n, p) (that is, X is a binomially distributed random variate), then the expected value of X is
$$E[X] = np$$
and the variance is
$$\operatorname{Var}(X) = np(1 - p).$$
This fact is easily proven as follows. Suppose first that we have exactly one Bernoulli trial. We have two possible outcomes, 1 and 0, with the first having probability p and the second having probability 1 − p; the mean for this trial is given by μ = p. Using the definition of variance, we have
$$\sigma^2 = (1 - p)^2 p + (0 - p)^2 (1 - p) = p(1 - p).$$
Now suppose that we want the variance for n such trials (i.e. for the general binomial distribution). Since the trials are independent, we may add the variances for each trial, giving
$$\sigma_n^2 = \sum_{k=1}^{n} \sigma^2 = np(1 - p).$$
The mean follows in the same way: expected values always add, so n trials with mean p each give a total mean of np.
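The closed forms np and np(1 − p) can be checked against the definitions of mean and variance applied directly to the mass function. This is a minimal sketch with illustrative names, assuming the mass function C(n, k) p^k (1 − p)^(n − k).

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def mean_and_variance(n: int, p: float) -> tuple:
    """Mean and variance computed from the definitions, not the closed forms."""
    probs = [binomial_pmf(k, n, p) for k in range(n + 1)]
    mean = sum(k * q for k, q in enumerate(probs))
    variance = sum((k - mean) ** 2 * q for k, q in enumerate(probs))
    return mean, variance


n, p = 20, 0.35
mean, variance = mean_and_variance(n, p)
print(mean, n * p)
print(variance, n * p * (1 - p))
```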
The most likely value or mode of X is given by the largest integer less than or equal to (n + 1)p; if m = (n + 1)p is itself an integer, then m − 1 and m are both modes.
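The mode rule can be compared with a direct search for the largest value of the mass function. The sketch below assumes 0 < p < 1 and uses exact rational arithmetic (`fractions.Fraction`) so that the test "is (n + 1)p an integer" and the comparison of tied probabilities are not disturbed by floating-point rounding; the function names are illustrative.

```python
from fractions import Fraction
from math import comb, floor


def binomial_modes(n: int, p: Fraction) -> list:
    """Modes according to the (n + 1)p rule, assuming 0 < p < 1."""
    m = (n + 1) * p
    if m.denominator == 1:
        # (n + 1)p is an integer: m - 1 and m are both modes.
        return [int(m) - 1, int(m)]
    return [floor(m)]


def modes_by_search(n: int, p: Fraction) -> list:
    """All k at which the mass function attains its maximum."""
    probs = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    best = max(probs)
    return [k for k, q in enumerate(probs) if q == best]


print(binomial_modes(10, Fraction(3, 10)), modes_by_search(10, Fraction(3, 10)))
print(binomial_modes(9, Fraction(1, 2)), modes_by_search(9, Fraction(1, 2)))
```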
The formula for Bézier curves was inspired by the binomial distribution.
This article is licensed under the GNU Free Documentation License. It uses material from the article "Binomial distribution".