Zipf-Mandelbrot law
| Parameters | $N \in \{1, 2, 3, \ldots\}$ (integer), $q \in [0, \infty)$ (real), $s > 0$ (real) |
| Support | $k \in \{1, 2, \ldots, N\}$ |
| pmf | $\frac{1/(k+q)^s}{H_{N,q,s}}$ |
| cdf | $\frac{H_{k,q,s}}{H_{N,q,s}}$ |
| Mean | $\frac{H_{N,q,s-1}}{H_{N,q,s}} - q$ |
| Median | N/A |
| Mode | $1$ |
In probability theory and statistics, the Zipf-Mandelbrot law is a discrete probability distribution. Also known as the Pareto-Zipf law, it is a power-law distribution on ranked data, named after the Harvard linguistics professor George Kingsley Zipf (1902-1950), who proposed a simpler distribution called Zipf's law, and the mathematician Benoit Mandelbrot (1924-2010), who subsequently generalized it.
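The probability mass function is given by:
$$f(k; N, q, s) = \frac{1/(k+q)^s}{H_{N,q,s}}$$
where k is the rank of the data, q and s are parameters of the distribution, and $H_{N,q,s}$ is given by:
$$H_{N,q,s} = \sum_{i=1}^{N} \frac{1}{(i+q)^s}$$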
which may be thought of as a generalization of a harmonic number. In the limit as N approaches infinity, this becomes the Hurwitz zeta function, written ζ(s, q+1) in the usual convention where ζ(s, a) is the sum of 1/(n+a)^s over n = 0, 1, 2, ... (the sum converges only for s > 1). For finite N and q = 0 the Zipf-Mandelbrot law becomes Zipf's law. For infinite N and q = 0 it becomes a zeta distribution.
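As a minimal sketch of the definitions above, the following Python computes the normalizing constant and the probability mass function by direct summation. The function names `generalized_harmonic` and `zipf_mandelbrot_pmf` are illustrative choices for this example, not part of any library, and the parameter values in the usage section are arbitrary.

```python
def generalized_harmonic(N, q, s):
    """Return H_{N,q,s}, the sum over i = 1..N of 1 / (i + q)**s."""
    return sum(1.0 / (i + q) ** s for i in range(1, N + 1))


def zipf_mandelbrot_pmf(k, N, q, s):
    """Probability of rank k under the Zipf-Mandelbrot law.

    Ranks outside the support 1..N get probability zero.
    """
    if not 1 <= k <= N:
        return 0.0
    return (1.0 / (k + q) ** s) / generalized_harmonic(N, q, s)


if __name__ == "__main__":
    # Arbitrary illustrative parameters (assumed, not from the article).
    N, q, s = 10, 2.0, 1.5
    probs = [zipf_mandelbrot_pmf(k, N, q, s) for k in range(1, N + 1)]
    # The probabilities should sum to 1 up to floating-point rounding.
    print(sum(probs))
```

Setting q = 0 in this sketch reduces it to Zipf's law: with N = 3 and s = 1 the normalizer is 1 + 1/2 + 1/3 = 11/6, so rank 1 has probability 6/11.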
Applications
The distribution of words ranked by their frequency in a corpus of text is generally well approximated by a power-law distribution, known as Zipf's law.
If one plots the frequency rank of the words in a large corpus of text against their number of occurrences (or their relative frequencies), one obtains a power-law distribution with exponent close to one (but see Gelbukh and Sidorov 2001).
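The rank-frequency plot described above can be approximated numerically. The sketch below counts tokens, sorts the counts by rank, and fits a straight line to log(frequency) against log(rank) by ordinary least squares. The helper names `rank_frequency` and `loglog_slope` are hypothetical, and a least-squares fit on log-log axes is only a rough diagnostic, not a rigorous estimator of the exponent.

```python
import math
from collections import Counter


def rank_frequency(tokens):
    """Return the word counts sorted from most to least frequent."""
    return sorted(Counter(tokens).values(), reverse=True)


def loglog_slope(freqs):
    """Least-squares slope of log(frequency) against log(rank).

    `freqs` must hold at least two positive values, ordered by rank.
    """
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


if __name__ == "__main__":
    # Synthetic frequencies proportional to 1/rank (assumed data, for illustration).
    freqs = [1000.0 / rank for rank in range(1, 51)]
    print(loglog_slope(freqs))
```

For frequencies exactly proportional to 1/rank, the fitted slope is -1 up to rounding, corresponding to an exponent of one. Real corpora would be passed through `rank_frequency` first, and their fitted slope will generally deviate from this idealized value.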
External links
- Z. K. Silagadze: Citations and the Zipf-Mandelbrot's law
- NIST: Zipf's law
- W. Li's References on Zipf's law
