Geometric Distribution
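Geometric (summary for Y, the number of failures before the first success)

| Property | Value |
| --- | --- |
| Parameters | $0 < p \le 1$ success probability (real) |
| Support | $k \in \{0, 1, 2, 3, \dots\}$ |
| PMF | $(1-p)^{k}\,p$ |
| CDF | $1-(1-p)^{k+1}$ |
| Mean | $\frac{1-p}{p}$ |
| Median | $\left\lceil \frac{-1}{\log_2(1-p)} \right\rceil - 1$ (not unique if $-1/\log_2(1-p)$ is an integer) |
| Mode | $0$ |
| Variance | $\frac{1-p}{p^{2}}$ |
| Skewness | $\frac{2-p}{\sqrt{1-p}}$ |
| Ex. kurtosis | $6+\frac{p^{2}}{1-p}$ |
| Entropy | $\frac{-(1-p)\log_2(1-p)-p\log_2 p}{p}$ |
| MGF | $\frac{p}{1-(1-p)e^{t}}$, for $t < -\ln(1-p)$ |
| CF | $\frac{p}{1-(1-p)e^{it}}$ |
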
There are two similar distributions with the name "Geometric Distribution".
- The probability distribution of the number X of Bernoulli trials needed to get one success, supported on the set { 1, 2, 3, ...}
- The probability distribution of the number Y = X − 1 of failures before the first success, supported on the set { 0, 1, 2, 3, ... }
These two geometric distributions should not be confused with each other. The name shifted geometric distribution is often adopted for the former. We will use X and Y to distinguish the two.
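The two counting conventions can be illustrated with a small simulation. This is a minimal sketch, not part of the original text: the function names `trials_until_success` and `failures_before_success`, the seed, and the value p = 0.25 are all assumptions chosen for illustration.

```python
import random

def trials_until_success(p, rng):
    """Return X: the number of Bernoulli(p) trials up to and including the first success."""
    trials = 1
    while rng.random() >= p:  # a failure happens with probability 1 - p
        trials += 1
    return trials

def failures_before_success(p, rng):
    """Return Y = X - 1: the number of failures before the first success."""
    return trials_until_success(p, rng) - 1

rng = random.Random(0)  # fixed seed so the run is repeatable
xs = [trials_until_success(0.25, rng) for _ in range(10_000)]
ys = [x - 1 for x in xs]  # the same experiments, counted the other way

print("smallest X:", min(xs), "smallest Y:", min(ys))
print("sample mean of X:", sum(xs) / len(xs))
print("sample mean of Y:", sum(ys) / len(ys))
```

Every simulated X is at least 1 and every Y is at least 0, and the two sample means differ by exactly 1, which is the only difference between the two conventions.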
Shifted
The shifted geometric distribution gives the probability of the number of attempts needed until a desired result occurs for the first time. For example:
- How many times will I throw a coin until it lands on heads?
- How many children will I have until I get a girl?
- How many cards will I draw from a pack until I get a Joker?
Just like the Bernoulli distribution, the geometric distribution has a single parameter: the probability p of success on each independent trial.
If a random variable X follows the shifted geometric distribution with parameter p, its probability mass function is:

$$P(X = i) = (1-p)^{i-1}\,p, \qquad i = 1, 2, 3, \dots$$
With a geometric distribution it is also easy to calculate the probability of a "more than n times" case. X exceeds n exactly when the first n trials all fail, so the probability of still not having achieved the wanted result after n trials is

$$P(X > n) = (1-p)^{n}.$$
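As a quick numerical check of the shifted case, the probabilities P(X = i) = (1 − p)^(i−1) p for i = 1, …, n and the tail probability P(X > n) = (1 − p)^n should add up to 1. The sketch below assumes the helper names `pmf_shifted` and `tail_shifted` and the values p = 0.3, n = 5; none of these come from the original text.

```python
def pmf_shifted(i, p):
    """P(X = i) for i = 1, 2, 3, ...: i - 1 failures followed by one success."""
    if i < 1:
        return 0.0
    return (1.0 - p) ** (i - 1) * p

def tail_shifted(n, p):
    """P(X > n): the first n trials all fail."""
    return (1.0 - p) ** n

p, n = 0.3, 5
head = sum(pmf_shifted(i, p) for i in range(1, n + 1))  # P(X <= n)
print("P(X <= n) + P(X > n) =", head + tail_shifted(n, p))  # 1 up to rounding
```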
Example: a student comes home from a party in the forest at which interesting substances were consumed. The student is trying to find the key to his front door on a keychain with 10 different keys. Assuming each attempt picks one of the 10 keys at random, independently of earlier attempts, what is the probability that the student finds the right key on the 4th attempt?
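One way to work the key example is sketched below. It rests on an assumption that the question leaves implicit: the student picks a key at random on every attempt and does not remember which keys were already tried, so each attempt succeeds independently with p = 1/10. Under that assumption the answer is P(X = 4) = (9/10)^3 · (1/10).

```python
from fractions import Fraction

p = Fraction(1, 10)  # assumed: one right key among 10, chosen at random each time
attempt = 4

# Three failed attempts followed by one success.
prob_fourth = (1 - p) ** (attempt - 1) * p

print(prob_fourth, float(prob_fourth))  # 729/10000 0.0729
```

If the student instead set aside each wrong key, the attempts would no longer be independent and the geometric distribution would not apply.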
Unshifted
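The probability mass function is defined as:

$$f(x) = P(X = x) = p(1-p)^{x} \qquad \text{for } x \in \{0, 1, 2, 3, \dots\}$$

In the derivations below, X denotes this unshifted variable, the number of failures before the first success (called Y in the introduction).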
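The unshifted probability mass function p(1 − p)^x on x = 0, 1, 2, … is the shifted one moved down by one step: x failures before the first success is the same event as needing x + 1 trials. The sketch below checks this and confirms that the probabilities sum to 1. The name `pmf_unshifted`, the value p = 0.2, and the truncation at 500 terms are assumptions made for illustration.

```python
def pmf_unshifted(x, p):
    """Probability of exactly x failures before the first success, x = 0, 1, 2, ..."""
    if x < 0:
        return 0.0
    return p * (1.0 - p) ** x

p = 0.2

# x failures before the first success <=> the first success comes on trial x + 1.
for x in range(5):
    shifted = p * (1.0 - p) ** ((x + 1) - 1)  # shifted pmf evaluated at x + 1
    assert abs(pmf_unshifted(x, p) - shifted) < 1e-15

# The truncated sum is 1 - (1 - p)**500, which is 1 up to rounding.
total = sum(pmf_unshifted(x, p) for x in range(500))
print("sum of pmf:", total)
```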
Mean
![{\displaystyle \operatorname {E} [X]=\sum _{i}f(x_{i})x_{i}=\sum _{0}^{\infty }p(1-p)^{x}x}](../../_assets_/eb734a37dd21ce173a46342d1cc64c92/14772fe3fb1aac3aeda043b8844d8e44ea9f4beb.svg)
Let q=1-p
![{\displaystyle \operatorname {E} [X]=\sum _{0}^{\infty }(1-q)q^{x}x}](../../_assets_/eb734a37dd21ce173a46342d1cc64c92/f9ad822d6cb59a023236e1c2dba4ea8ce8fe0bb0.svg)
![{\displaystyle \operatorname {E} [X]=\sum _{0}^{\infty }(1-q)qq^{x-1}x}](../../_assets_/eb734a37dd21ce173a46342d1cc64c92/b173d51fc1397b9d8290c8e8a58dbac31d80e0cd.svg)
![{\displaystyle \operatorname {E} [X]=(1-q)q\sum _{0}^{\infty }q^{x-1}x}](../../_assets_/eb734a37dd21ce173a46342d1cc64c92/3694fabe53f333614812d1e6b7a75577a8b00d2c.svg)
![{\displaystyle \operatorname {E} [X]=(1-q)q\sum _{0}^{\infty }{\frac {d}{dq}}q^{x}}](../../_assets_/eb734a37dd21ce173a46342d1cc64c92/00ecb1f87eadc5d1bec6706a38d2e526254149b0.svg)
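Because a power series can be differentiated term by term inside its radius of convergence (here |q| < 1), we can now interchange the derivative and the sum.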
![{\displaystyle \operatorname {E} [X]=(1-q)q{\frac {d}{dq}}\sum _{0}^{\infty }q^{x}}](../../_assets_/eb734a37dd21ce173a46342d1cc64c92/29126ac9a1e1420eb4b6f0b292b11542191314ac.svg)
![{\displaystyle \operatorname {E} [X]=(1-q)q{\frac {d}{dq}}{1 \over 1-q}}](../../_assets_/eb734a37dd21ce173a46342d1cc64c92/c28cabe4165678e022964b678cb3bed82f7d0a7f.svg)
![{\displaystyle \operatorname {E} [X]=(1-q)q{1 \over (1-q)^{2}}}](../../_assets_/eb734a37dd21ce173a46342d1cc64c92/0088bfe78733dbdd446cdcee8839d8b06334e2b5.svg)
![{\displaystyle \operatorname {E} [X]=q{1 \over (1-q)}}](../../_assets_/eb734a37dd21ce173a46342d1cc64c92/d22bd1347ab70303be5bafa8d54e6b05a41fd008.svg)
![{\displaystyle \operatorname {E} [X]={(1-p) \over p}}](../../_assets_/eb734a37dd21ce173a46342d1cc64c92/005e463e9bbf3557dcc3ad6930b3321e09013809.svg)
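The closed form E[X] = (1 − p)/p can be compared against a direct, truncated evaluation of the defining sum. This is a rough numerical sketch: the function name `truncated_mean`, the cut-off of 2000 terms, and the sample values of p are assumptions, and the check does not replace the derivation.

```python
def truncated_mean(p, terms=2000):
    """Approximate the sum over x >= 0 of x * p * (1 - p)**x with finitely many terms."""
    q = 1.0 - p
    total = 0.0
    weight = p  # weight holds p * q**x, starting at x = 0
    for x in range(terms):
        total += x * weight
        weight *= q  # move on to p * q**(x + 1)
    return total

for p in (0.1, 0.5, 0.9):
    print(p, truncated_mean(p), (1.0 - p) / p)  # the two values should agree closely
```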
Variance
We derive the variance using the following formula:
![{\displaystyle \operatorname {Var} [X]=\operatorname {E} [X^{2}]-(\operatorname {E} [X])^{2}}](../../_assets_/eb734a37dd21ce173a46342d1cc64c92/d3a927c2af9ce191701ce464bf66fe7a6fe2190c.svg)
We have already calculated E[X] above, so now we will calculate E[X^2] and then return to this variance formula:
![{\displaystyle \operatorname {E} [X^{2}]=\sum _{i}f(x_{i})\cdot x^{2}}](../../_assets_/eb734a37dd21ce173a46342d1cc64c92/9b457462143b3c9a9d2d1c9de47dd57fae3dbec2.svg)
![{\displaystyle \operatorname {E} [X^{2}]=\sum _{0}^{\infty }p(1-p)^{x}x^{2}}](../../_assets_/eb734a37dd21ce173a46342d1cc64c92/5683649108ad3852d33b3f0f8ca642e237678fd5.svg)
Let q=1-p
![{\displaystyle \operatorname {E} [X^{2}]=\sum _{0}^{\infty }(1-q)q^{x}x^{2}}](../../_assets_/eb734a37dd21ce173a46342d1cc64c92/3d5c3ee343acd07e0bb8c579be8dc55ca22fc589.svg)
We now rewrite x^2 as (x^2 − x) + x, so that each part can be handled by the differentiation technique used when deriving the mean.
![{\displaystyle \operatorname {E} [X^{2}]=(1-q)\sum _{0}^{\infty }q^{x}[(x^{2}-x)+x]}](../../_assets_/eb734a37dd21ce173a46342d1cc64c92/75d3eafe380f83410881e368fe533b827ca74f92.svg)
![{\displaystyle \operatorname {E} [X^{2}]=(1-q)\left[\sum _{0}^{\infty }q^{x}(x^{2}-x)+\sum _{0}^{\infty }q^{x}x\right]}](../../_assets_/eb734a37dd21ce173a46342d1cc64c92/24f93fd82a11d90cda7b6d70be13b8244e3a7d05.svg)
![{\displaystyle \operatorname {E} [X^{2}]=(1-q)\left[q^{2}\sum _{0}^{\infty }q^{x-2}x(x-1)+q\sum _{0}^{\infty }q^{x-1}x\right]}](../../_assets_/eb734a37dd21ce173a46342d1cc64c92/e6b1637a45a96e3347eac070309b680a9d4ff66c.svg)
![{\displaystyle \operatorname {E} [X^{2}]=(1-q)q\left[q\sum _{0}^{\infty }{\frac {d^{2}}{(dq)^{2}}}q^{x}+\sum _{0}^{\infty }{\frac {d}{dq}}q^{x}\right]}](../../_assets_/eb734a37dd21ce173a46342d1cc64c92/a670c354fefd15e8e9c92ae0e84f11f2aaa924ea.svg)
![{\displaystyle \operatorname {E} [X^{2}]=(1-q)q\left[q{\frac {d^{2}}{(dq)^{2}}}\sum _{0}^{\infty }q^{x}+{\frac {d}{dq}}\sum _{0}^{\infty }q^{x}\right]}](../../_assets_/eb734a37dd21ce173a46342d1cc64c92/eb989d8fec7d63001d1753487542e9c66ee35688.svg)
![{\displaystyle \operatorname {E} [X^{2}]=(1-q)q\left[q{\frac {d^{2}}{(dq)^{2}}}{1 \over 1-q}+{\frac {d}{dq}}{1 \over 1-q}\right]}](../../_assets_/eb734a37dd21ce173a46342d1cc64c92/dc28f609f6a73d90dd592e92f471e1f9c892f962.svg)
![{\displaystyle \operatorname {E} [X^{2}]=(1-q)q\left[q{2 \over (1-q)^{3}}+{1 \over (1-q)^{2}}\right]}](../../_assets_/eb734a37dd21ce173a46342d1cc64c92/569b086d793e05931e2f107f0faa3669e3a2e38e.svg)
![{\displaystyle \operatorname {E} [X^{2}]={2q^{2} \over (1-q)^{2}}+{q \over (1-q)}}](../../_assets_/eb734a37dd21ce173a46342d1cc64c92/f1d4242ae0316880299a80f3b86026827724baa2.svg)
![{\displaystyle \operatorname {E} [X^{2}]={2q^{2}+q(1-q) \over (1-q)^{2}}}](../../_assets_/eb734a37dd21ce173a46342d1cc64c92/5fa45c19cb12088654ff064a34fe60e5d01145e3.svg)
![{\displaystyle \operatorname {E} [X^{2}]={q(q+1) \over (1-q)^{2}}}](../../_assets_/eb734a37dd21ce173a46342d1cc64c92/0439f27f91edf1d341788b23bebb1f52f7aed08e.svg)
![{\displaystyle \operatorname {E} [X^{2}]={(1-p)(2-p) \over p^{2}}}](../../_assets_/eb734a37dd21ce173a46342d1cc64c92/a65e09f63d237df831cf41227667ef09b8663043.svg)
We then return to the variance formula:
![{\displaystyle \operatorname {Var} [X]=\left[{(1-p)(2-p) \over p^{2}}\right]-\left({1-p \over p}\right)^{2}}](../../_assets_/eb734a37dd21ce173a46342d1cc64c92/5083e58db627dd9b8b3c9538630be47a30f52318.svg)
![{\displaystyle \operatorname {Var} [X]={(1-p) \over p^{2}}}](../../_assets_/eb734a37dd21ce173a46342d1cc64c92/425211a2e0d5b31cb138f0691612a7009fa16060.svg)
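The result Var[X] = (1 − p)/p^2 can be sanity-checked the same way the derivation proceeds, by computing E[X] and E[X^2] from truncated sums and then taking E[X^2] − (E[X])^2. The helper name `truncated_moments`, the 2000-term cut-off, and the sample values of p are assumptions for this sketch.

```python
def truncated_moments(p, terms=2000):
    """Return finite-sum approximations of E[X] and E[X^2] for the pmf p * (1 - p)**x, x >= 0."""
    q = 1.0 - p
    first = 0.0
    second = 0.0
    weight = p  # weight holds p * q**x, starting at x = 0
    for x in range(terms):
        first += x * weight
        second += x * x * weight
        weight *= q
    return first, second

for p in (0.1, 0.5, 0.9):
    m1, m2 = truncated_moments(p)
    variance = m2 - m1 ** 2  # Var[X] = E[X^2] - (E[X])^2
    print(p, variance, (1.0 - p) / p ** 2)  # the two values should agree closely
```

For p = 0.1, for instance, the formulas give E[X] = 9 and E[X^2] = (0.9)(1.9)/0.01 = 171, so the variance is 171 − 81 = 90 = 0.9/0.01.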