It is common, when discussing random processes, to assume that they have well-defined mean and finite variance; and there are indeed plenty of random processes of which this is true. None the less, it is not a universal property of random variates; and the exceptions matter because, by violating that assumption, they stand outside the reach of certain standard statistical tools, notably Chebyshev's bound. In particular, the Central Limit Theorem does not apply to them, so taking a large number of samples of such a variate and averaging need not give you a nicely-behaved average variate.
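To see this in action, here is a minimal sketch (python with numpy; not part of the study package mentioned below) comparing running averages of a well-behaved variate with those of the standard Cauchy distribution, which has no mean at all and so makes the failure easy to see:

    import numpy as np

    rng = np.random.default_rng(42)
    n = 100_000
    # Running average of the first k samples, for k = 1 .. n:
    for name, samples in (("normal", rng.standard_normal(n)),
                          ("cauchy", rng.standard_cauchy(n))):
        running = np.cumsum(samples) / np.arange(1, n + 1)
        print(name, [round(float(running[k - 1]), 3)
                     for k in (100, 1_000, 10_000, n)])

The normal's running average settles towards zero as the sample grows; the Cauchy's keeps lurching about however many samples are taken, since an average of Cauchy samples is itself Cauchy-distributed.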
Where variates do have well-defined finite mean and variance, it is well established (thanks to the central limit theorem, given that many such variates arise as sums of many small contributions) that they behave roughly like the Gaussian distribution, with probability density varying exponentially with the square of the difference from the mean. However, that is
only roughly true and, in particular, the Gaussian is a poor model for the many
variates which are intrinsically positive, such as the heights of human beings:
although the Gaussian may give only a vanishingly small probability of finding
someone with a negative height, it does none the less give a non-zero
probability to this unreal possibility. Fortunately, there are some similarly
well-behaved distributions that do respect a boundary value: so I shall use one
of these as my model of well-behaved
randomness, specifically
the gamma distribution, with density at (positive)
variate-value t proportional to exp(−t/b).power(a−1, t/b), for some
positive constants b and a. This has mean b.a and variance b.b.a, both handily
finite.
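As a quick numeric sanity check of those moments (a sketch only; numpy's gamma sampler takes shape and scale parameters that correspond to this a and b):

    import numpy as np

    a, b = 2.5, 1.5  # arbitrary positive constants, for illustration
    rng = np.random.default_rng(0)
    samples = rng.gamma(shape=a, scale=b, size=1_000_000)

    print("sample mean:", samples.mean(), "vs b.a =", b * a)
    print("sample variance:", samples.var(), "vs b.b.a =", b * b * a)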
For the unruly part of randomness, I'll take a distribution that has power-law tails, that is, its density near zero and at large variate-value varies as powers of the value. A simple model of this kind is power(a −1)/(1 +power(a +c)), which behaves as power(a −1) near zero and as power(−(1 +c)) for large variate value; this is normalisable
for any positive a and c. Let the normalisation factor for this be N(a, c) =
integral(: power(a −1)/(1 +power(a +c)) :{positives}); then, since multiplying the integrand by the variate-value just replaces a with a +1 and c with c −1, the mean of such a variate is N(a +1, c −1)/N(a, c) and the mean of its square is likewise N(a +2, c −2)/N(a, c), saving only that the former is well defined
only for c > 1 and the latter only for c > 2; lower values of c give
infinite values for the mean or mean-square, respectively. This means that, for
c between 1 and 2, the variate has finite mean but infinite variance; and for c
between 0 and 1 it doesn't even have a well-defined mean. I'll call this a
power-tail distribution.
For actual computation of N, see study.maths.powertails.PowerTails in my pythonic study package.
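For readers without that package, a rough stand-in (not the PowerTails class itself) can evaluate N by numerical quadrature; this sketch uses scipy, with the illustrative values a = 1.5, c = 1.5, for which the mean is finite but the mean-square is not:

    from scipy.integrate import quad

    def N(a, c):
        # integral over the positives of power(a - 1, t)/(1 + power(a + c, t))
        value, _error = quad(lambda t: t**(a - 1) / (1 + t**(a + c)),
                             0, float("inf"))
        return value

    a, c = 1.5, 1.5  # 1 < c < 2: finite mean, infinite variance
    print("mean:", N(a + 1, c - 1) / N(a, c))
    # The mean-square, N(a + 2, c - 2)/N(a, c), would need c > 2; here it diverges.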
Since the gamma distribution also has power-law behaviour near zero, I'll give both distributions the same power-law behaviour there, hence the use of the same name a for the relevant power. If we have a random variate whose density is a weighted average of gamma and power-tail distributions, with more weight on the gamma than on the power tail, it may be very hard for an observer, even given a large number of samples from the distribution, to notice that it is not simply gamma-distributed; and yet, if the power-tail's c is between 1 and 2, the variate will still have infinite variance, despite appearing to follow one of the best-behaved distributions there is. Such a distribution has, at (positive) variate-value t, density (h.exp(−t/b) +k/(1 +power(a+c, t/b))).power(a−1, t/b)/(h.Γ(a) +k.N(a, c))/b for some positive weights h and k;
and, when h.Γ(a) is much bigger than k.N(a, c), the gamma part of the distribution shall dominate wherever the density is not tiny, so that any evidence of the power-tail in a sample of the random variate is apt to be mistaken for a rare outlier of the gamma distribution; yet that power-tail cannot safely be ignored, since it's what makes the variance infinite.
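To illustrate how well the power-tail can hide, here is a sketch of such a mixture; since the power-tail density above has no closed-form sampler, this substitutes a Lomax distribution (numpy's pareto), whose density falls off as power(−(1 +c)) at large values, as a stand-in for the heavy-tailed part:

    import numpy as np

    rng = np.random.default_rng(1)
    a, b, c = 2.5, 1.5, 1.5   # 1 < c < 2: finite mean, infinite variance
    p_tail = 0.05             # modest weight on the heavy-tailed part

    def mixture(n):
        # Draw from the gamma, then replace a small fraction of the
        # samples with draws from the heavy-tailed stand-in.
        out = rng.gamma(shape=a, scale=b, size=n)
        tail = rng.random(n) < p_tail
        out[tail] = rng.pareto(c, size=tail.sum())
        return out

    for n in (10**4, 10**5, 10**6, 10**7):
        print(f"n = {n:>8}: sample variance = {mixture(n).var():.2f}")

A pure gamma's sample variance would settle near b.b.a = 5.625; the mixture's grows without bound as the sample grows, and rerunning with different seeds gives wildly varying answers at any fixed sample size, which is itself the infinite variance making itself felt.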
Written by Eddy.