Ratio distribution

A ratio distribution (or quotient distribution) is a probability distribution constructed as the distribution of the ratio of random variables having two other known distributions. Given two (usually independent) random variables X and Y, the distribution of the random variable Z that is formed as the ratio

$$Z = X/Y$$

is a ratio distribution. (See also Relationships among probability distributions.) The Cauchy distribution is an example of a ratio distribution. The random variable associated with this distribution comes about as the ratio of two independent Gaussian (normal) distributed variables with zero mean. Thus the Cauchy distribution is also called the normal ratio distribution. A number of researchers have considered more general ratio distributions.[1][2][3][4][5][6][7][8][9] Two distributions often used in test statistics, the t-distribution and the F-distribution, are also ratio distributions: the t-distributed random variable is the ratio of a Gaussian random variable to an independent chi-distributed random variable (i.e., the square root of a chi-squared variable) scaled by the square root of its degrees of freedom, while the F-distributed random variable is the ratio of two independent chi-squared distributed random variables, each divided by its degrees of freedom.

Often the ratio distributions are heavy-tailed, and it may be difficult to work with such distributions and develop an associated statistical test. A method based on the median has been suggested as a "work-around".[10]

Algebra of random variables

The ratio is one type of algebra for random variables: Related to the ratio distribution are the product distribution, sum distribution and difference distribution. More generally, one may talk of combinations of sums, differences, products and ratios. Many of these distributions are described in Melvin D. Springer's book from 1979 The Algebra of Random Variables.[8]

The algebraic rules known with ordinary numbers do not apply for the algebra of random variables. For example, if a product is C = AB and a ratio is D = C/A it does not necessarily mean that the distributions of D and B are the same. Indeed, a peculiar effect is seen for the Cauchy distribution: the product and the ratio of two independent Cauchy distributions (with unit scale parameter and the location parameter set to zero) will give the same distribution.[8] This becomes evident when regarding the Cauchy distribution as itself a ratio distribution of two Gaussian distributions: consider two Cauchy random variables, $C_1$ and $C_2$, each constructed from two independent standard Gaussian distributions, $C_1 = G_1/G_2$ and $C_2 = G_3/G_4$; then

$$\frac{C_1}{C_2} = \frac{G_1/G_2}{G_3/G_4} = \frac{G_1 G_4}{G_2 G_3} = \frac{G_1}{G_2} \times \frac{G_4}{G_3} = C_1 \times C_3,$$

where $C_3 = G_4/G_3$. The first term is the ratio of two Cauchy distributions while the last term is the product of two such distributions.
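The identity can be checked by simulation. The following is a minimal sketch, not taken from the cited sources: it draws standard Cauchy variates as ratios of standard Gaussians and compares the empirical distribution functions of the ratio and of the product. The function names, sample size, seed and evaluation points are illustrative assumptions.

```python
import random


def standard_cauchy(rng):
    """Draw a standard Cauchy variate as the ratio of two standard Gaussians."""
    return rng.gauss(0.0, 1.0) / rng.gauss(0.0, 1.0)


def empirical_cdf(samples, t):
    """Fraction of samples that are less than or equal to t."""
    return sum(1 for s in samples if s <= t) / len(samples)


def compare_ratio_and_product(n=100_000, seed=1):
    """Return (t, CDF of C1/C2 at t, CDF of C1*C2 at t) for a few points t."""
    rng = random.Random(seed)
    ratios, products = [], []
    for _ in range(n):
        c1, c2 = standard_cauchy(rng), standard_cauchy(rng)
        ratios.append(c1 / c2)
        products.append(c1 * c2)
    points = (-5.0, -1.0, -0.2, 0.5, 2.0, 10.0)
    return [(t, empirical_cdf(ratios, t), empirical_cdf(products, t))
            for t in points]


CDF_TABLE = compare_ratio_and_product()
```

Under these assumptions the two empirical distribution functions should agree to within sampling error at every point in `CDF_TABLE`.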

Derivation

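A way of deriving the ratio distribution of Z = X/Y from the joint distribution of the two other random variables, X and Y, with joint pdf $p_{X,Y}(x, y)$, is by integration of the following form[3]

$$p_Z(z) = \int_{-\infty}^{+\infty} |y| \, p_{X,Y}(zy, y) \, dy.$$

Evaluating this integral is not always straightforward.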
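When no closed form is available the integral can be evaluated numerically. The sketch below is an illustration only: it applies a plain trapezoidal rule to the integral of |y| p(zy, y) over y for two independent standard normal variables, where the result is known to be the standard Cauchy density. The helper names, the truncation limit `y_max` and the step count are assumptions chosen for this example.

```python
import math


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of the normal distribution N(mu, sigma^2)."""
    u = (x - mu) / sigma
    return math.exp(-0.5 * u * u) / (sigma * math.sqrt(2.0 * math.pi))


def ratio_pdf(z, joint_pdf, y_max=12.0, steps=24_000):
    """Trapezoidal estimate of p_Z(z) = integral of |y| * p_XY(z*y, y) dy.

    The integration range [-y_max, y_max] is an assumption that suits
    light-tailed joint densities such as the Gaussian one used below.
    """
    h = 2.0 * y_max / steps
    total = 0.0
    for i in range(steps + 1):
        y = -y_max + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * abs(y) * joint_pdf(z * y, y)
    return total * h


def independent_standard_normals(x, y):
    """Joint density of two independent N(0, 1) variables."""
    return normal_pdf(x) * normal_pdf(y)


def cauchy_pdf(z):
    """Standard Cauchy density, the known ratio of two N(0, 1) variables."""
    return 1.0 / (math.pi * (1.0 + z * z))
```

For example, `ratio_pdf(0.5, independent_standard_normals)` can be compared with `cauchy_pdf(0.5)`.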

The Mellin transform has also been suggested for derivation of ratio distributions.[8]

Moments of random ratios

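From Mellin transform theory, for distributions existing only on the positive half-line $x \ge 0$, we have the product identity $E[(UV)^p] = E[U^p]\,E[V^p]$ provided $U, V$ are independent. For the case of a ratio of samples like $E[(X/Y)^p]$, in order to make use of this identity it is necessary to use moments of the inverse distribution. Set $1/Y = Z$ such that $E[(XZ)^p] = E[X^p]\,E[Y^{-p}]$. Thus, if the moments $E[X^p]$ and $E[Y^{-p}]$ can be determined separately, then the moments of $X/Y$ can be found. The moments of $Y^{-p}$ are determined from the inverse pdf of $Y$, often a tractable exercise. At simplest, $E[Y^{-p}] = \int_0^\infty y^{-p} f_Y(y)\,dy$.

To illustrate, let $X$ be sampled from a standard Gamma distribution $x^{\alpha-1}e^{-x}/\Gamma(\alpha)$, whose $p$-th moment is $\Gamma(\alpha+p)/\Gamma(\alpha)$.

$Z = Y^{-1}$ is sampled from an inverse Gamma distribution with parameter $\beta$ and has pdf $\Gamma^{-1}(\beta)\, z^{-(1+\beta)} e^{-1/z}$. The moments of this pdf are $E[Z^p] = E[Y^{-p}] = \Gamma(\beta-p)/\Gamma(\beta)$, $p < \beta$.

Multiplying the corresponding moments gives

$$E[(X/Y)^p] = E[X^p]\,E[Y^{-p}] = \frac{\Gamma(\alpha+p)}{\Gamma(\alpha)}\,\frac{\Gamma(\beta-p)}{\Gamma(\beta)}, \quad p < \beta.$$

Independently, it is known that the ratio of the two Gamma samples $R = X/Y$ follows the Beta Prime distribution:

$$f_{\beta'}(r; \alpha, \beta) = B(\alpha,\beta)^{-1}\, r^{\alpha-1}(1+r)^{-(\alpha+\beta)},$$

whose moments are

$$E[R^p] = \frac{B(\alpha+p, \beta-p)}{B(\alpha,\beta)}.$$

Substituting $B(\alpha,\beta) = \Gamma(\alpha)\Gamma(\beta)/\Gamma(\alpha+\beta)$ we have

$$E[R^p] = \frac{\Gamma(\alpha+p)\Gamma(\beta-p)}{\Gamma(\alpha+\beta)} \Big/ \frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha+\beta)} = \frac{\Gamma(\alpha+p)\Gamma(\beta-p)}{\Gamma(\alpha)\Gamma(\beta)},$$

which is consistent with the product of moments above.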
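The agreement between the two routes can be confirmed numerically. This is a small sketch using only the gamma function from the Python standard library; the function names are hypothetical and chosen for this illustration.

```python
import math


def gamma_moment(alpha, p):
    """E[X^p] for X ~ Gamma(alpha, 1)."""
    return math.gamma(alpha + p) / math.gamma(alpha)


def inverse_gamma_moment(beta, p):
    """E[Y^-p] for Y ~ Gamma(beta, 1); the moment exists only for p < beta."""
    if p >= beta:
        raise ValueError("moment exists only for p < beta")
    return math.gamma(beta - p) / math.gamma(beta)


def beta_function(a, b):
    """Euler beta function B(a, b) expressed through gamma functions."""
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)


def ratio_moment_from_product(alpha, beta, p):
    """E[(X/Y)^p] obtained by multiplying the two separate moments."""
    return gamma_moment(alpha, p) * inverse_gamma_moment(beta, p)


def ratio_moment_from_beta_prime(alpha, beta, p):
    """E[R^p] obtained from the beta prime moment formula."""
    return beta_function(alpha + p, beta - p) / beta_function(alpha, beta)
```

With `alpha = 3`, `beta = 5` and `p = 1` both functions return the beta prime mean alpha/(beta - 1).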

Gaussian ratio distribution

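When X and Y are independent and have a Gaussian distribution with zero mean, the form of their ratio distribution is fairly simple: it is a Cauchy distribution. However, when the two distributions have non-zero means, the form of the distribution of the ratio is much more complicated. In 1932 Fieller[2] removed all approximation from Geary's earlier result, but his algorithm, as published, is not quite computer-ready: the Gaussian integral in the final result (eqns 23-24) may run backward along the axis, a case that has to be trapped. Here it is given in the more succinct form presented by David Hinkley.[6] In the absence of correlation (cor(X,Y) = 0), the probability density function of the ratio Z = X/Y of two normal variables X = N(μX, σX²) and Y = N(μY, σY²) is given by the following expression:

$$p_Z(z) = \frac{b(z)\,d(z)}{a^3(z)}\,\frac{1}{\sqrt{2\pi}\,\sigma_X\sigma_Y}\left[\Phi\left(\frac{b(z)}{a(z)}\right) - \Phi\left(-\frac{b(z)}{a(z)}\right)\right] + \frac{1}{a^2(z)\,\pi\,\sigma_X\sigma_Y}\,e^{-c/2}$$

where

$$a(z) = \sqrt{\frac{z^2}{\sigma_X^2} + \frac{1}{\sigma_Y^2}}, \qquad b(z) = \frac{\mu_X}{\sigma_X^2}\,z + \frac{\mu_Y}{\sigma_Y^2},$$

$$c = \frac{\mu_X^2}{\sigma_X^2} + \frac{\mu_Y^2}{\sigma_Y^2}, \qquad d(z) = \exp\left(\frac{b^2(z) - c\,a^2(z)}{2a^2(z)}\right),$$

and $\Phi$ is the cumulative distribution function of the standard Normal distribution

$$\Phi(t) = \int_{-\infty}^{t} \frac{1}{\sqrt{2\pi}}\,e^{-u^2/2}\,du.$$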
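The uncorrelated expression translates directly into code. The following is a hedged sketch rather than a reference implementation: `hinkley_pdf` evaluates the closed form in terms of a(z), b(z), c and d(z), and `direct_pdf` is an assumed cross-check that integrates |y| p_X(zy) p_Y(y) by the trapezoidal rule. All function names and the integration settings are illustrative.

```python
import math


def std_normal_cdf(t):
    """Standard normal CDF expressed through the error function."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def hinkley_pdf(z, mu_x, mu_y, sigma_x, sigma_y):
    """Density of Z = X/Y for independent X ~ N(mu_x, sigma_x^2), Y ~ N(mu_y, sigma_y^2)."""
    a = math.sqrt(z * z / sigma_x**2 + 1.0 / sigma_y**2)
    b = mu_x * z / sigma_x**2 + mu_y / sigma_y**2
    c = mu_x**2 / sigma_x**2 + mu_y**2 / sigma_y**2
    d = math.exp((b * b - c * a * a) / (2.0 * a * a))
    first = (b * d / a**3) / (math.sqrt(2.0 * math.pi) * sigma_x * sigma_y)
    first *= std_normal_cdf(b / a) - std_normal_cdf(-b / a)
    second = math.exp(-0.5 * c) / (a * a * math.pi * sigma_x * sigma_y)
    return first + second


def direct_pdf(z, mu_x, mu_y, sigma_x, sigma_y, steps=40_000):
    """Cross-check: trapezoidal integral of |y| * p_X(z*y) * p_Y(y) over y.

    The range mu_y +/- 10 sigma_y is an assumption; the integrand is
    negligible outside it because of the Gaussian factor in y.
    """
    lo, hi = mu_y - 10.0 * sigma_y, mu_y + 10.0 * sigma_y
    h = (hi - lo) / steps
    norm = 1.0 / (2.0 * math.pi * sigma_x * sigma_y)
    total = 0.0
    for i in range(steps + 1):
        y = lo + i * h
        e = ((z * y - mu_x) / sigma_x) ** 2 + ((y - mu_y) / sigma_y) ** 2
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * abs(y) * math.exp(-0.5 * e)
    return total * h * norm
```

With both means set to zero, `hinkley_pdf` reduces to a Cauchy density with scale `sigma_x / sigma_y`, which gives a quick sanity check.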

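The above expression becomes even more complicated if the variables X and Y are correlated. In the case that $\mu_X = \mu_Y = 0$ and $\sigma_X = \sigma_Y = 1$ we have the standard Cauchy distribution. This is most easily derived by a change of variable. Since the polar angle $\theta$ of the point $(X, Y)$ is uniformly distributed on $[-\pi, \pi]$ for the bivariate Normal distribution, then in the right hand semicircle $-\pi/2 < \theta < \pi/2$ we have $p(\theta) = 1/\pi$. Defining $z = \tan\theta$, the ratio of the two coordinates, we have $p(z)\,|dz| = p(\theta)\,|d\theta|$. Finally set $dz/d\theta = 1 + \tan^2\theta = 1 + z^2$ to get $p(z) = \frac{1}{\pi(1+z^2)}$ and, by circular symmetry, the left hand semicircle gives the same density.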


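If $\sigma_X \neq 1$, $\sigma_Y \neq 1$ or $\rho \neq 0$ (with both means still zero) the more general Cauchy distribution is obtained

$$p_Z(z) = \frac{1}{\pi}\,\frac{\beta}{(z-\alpha)^2 + \beta^2},$$

where ρ is the correlation coefficient between X and Y and

$$\alpha = \rho\,\frac{\sigma_X}{\sigma_Y}, \qquad \beta = \frac{\sigma_X}{\sigma_Y}\sqrt{1-\rho^2}.$$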

The complex distribution has also been expressed with Kummer's confluent hypergeometric function or the Hermite function.[9]

A transformation to Gaussianity

A transformation has been suggested so that, under certain assumptions, the transformed variable T would approximately have a standard Gaussian distribution:[1]

$$T = \frac{\mu_Y Z - \mu_X}{\sqrt{\sigma_Y^2 Z^2 - 2\rho\,\sigma_X\sigma_Y Z + \sigma_X^2}}$$

The transformation has been called the Geary–Hinkley transformation,[7] and the approximation is good if Y is unlikely to assume negative values.
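A simulation makes the claim concrete. The sketch below is illustrative only: it draws correlated normal pairs with a denominator mean ten standard deviations above zero, applies the transformation to each ratio, and returns the transformed values. The function names, parameter values, sample size and seed are assumptions made for this example.

```python
import math
import random
import statistics


def geary_hinkley(z, mu_x, mu_y, sigma_x, sigma_y, rho=0.0):
    """Transform a ratio z = x/y into an approximately standard normal value."""
    numerator = mu_y * z - mu_x
    variance = sigma_y**2 * z * z - 2.0 * rho * sigma_x * sigma_y * z + sigma_x**2
    return numerator / math.sqrt(variance)


def simulate_transformed(n, mu_x, mu_y, sigma_x, sigma_y, rho, seed=0):
    """Draw n correlated normal pairs (X, Y) and transform each ratio X/Y."""
    rng = random.Random(seed)
    values = []
    for _ in range(n):
        g1, g2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        y = mu_y + sigma_y * g1
        # Mixing g1 into x gives corr(X, Y) = rho.
        x = mu_x + sigma_x * (rho * g1 + math.sqrt(1.0 - rho * rho) * g2)
        values.append(geary_hinkley(x / y, mu_x, mu_y, sigma_x, sigma_y, rho))
    return values


T_SAMPLES = simulate_transformed(50_000, mu_x=5.0, mu_y=10.0,
                                 sigma_x=1.0, sigma_y=1.0, rho=0.5)
T_MEAN = statistics.fmean(T_SAMPLES)
T_STDEV = statistics.pstdev(T_SAMPLES)
```

Under these settings `T_MEAN` should be close to 0 and `T_STDEV` close to 1; moving `mu_y` toward zero, so that negative denominators become likely, would be expected to degrade the approximation.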

Correlated normal ratio

Geary showed how the correlated ratio could be transformed into a near-Gaussian form and developed an approximation for the transformed variable T that depends on the probability of negative denominator values being vanishingly small. Fieller's later correlated ratio analysis is exact but cumbersome, and it is incompatible with modern math packages unless care is taken manually to ensure that the Normal integral is always evaluated in a positive direction. The latter problem can also be identified in some of Marsaglia's equations. Hinkley's correlated results are exact, but it is shown below that the correlated ratio condition can be transformed simply into an uncorrelated one, so only the simplified Hinkley equations above are required, not the full correlated ratio version.

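Let the ratio be

$$z = \frac{x + \mu_x}{y + \mu_y}$$

in which $x, y$ are zero-mean correlated normal variables with variances $\sigma_x^2, \sigma_y^2$ and $X = x + \mu_x$, $Y = y + \mu_y$ have means $\mu_x, \mu_y$.
We can in general write $x' = x - \rho y \sigma_x/\sigma_y$ such that $x', y$ become uncorrelated and $x'$ has standard deviation

$$\sigma_x' = \sigma_x\sqrt{1-\rho^2}.$$

The ratio

$$z = \frac{x' + \rho y \sigma_x/\sigma_y + \mu_x}{y + \mu_y}$$

is invariant under this transformation and retains the same pdf.

The $y$ term in the numerator is made separable by expanding

$$x' + \rho y \frac{\sigma_x}{\sigma_y} + \mu_x = x' + \mu_x - \rho\mu_y\frac{\sigma_x}{\sigma_y} + \rho\,(y + \mu_y)\frac{\sigma_x}{\sigma_y}$$

to get

$$z = \frac{x' + \mu_x'}{y + \mu_y} + \rho\frac{\sigma_x}{\sigma_y}$$

in which
$\mu_x' = \mu_x - \rho\mu_y\sigma_x/\sigma_y$, so that $z$ has become a ratio of uncorrelated non-central normal samples plus a constant offset.

Finally, to be explicit, the pdf of the ratio $z$ for correlated variables is found by inputting the modified parameters $\sigma_x', \mu_x', \sigma_y, \mu_y$ (with zero correlation) into the Hinkley equation above, which returns the pdf for the correlated ratio with a constant offset $-\rho\sigma_x/\sigma_y$ on $z$; that is, the uncorrelated pdf is evaluated at $z - \rho\sigma_x/\sigma_y$. In retrospect this transformation will be recognized as the same one used by Geary as a partial result in his eqn viii, where it is not well explained; it shows that part of Geary's transformation does not depend on the positivity of Y.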
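The decorrelation step can be sketched in a few lines. This example is an illustration under stated assumptions, not the authors' code: `correlated_ratio_pdf` feeds the modified mean and standard deviation into the uncorrelated Hinkley form and shifts the argument by rho sigma_x / sigma_y, while `direct_correlated_pdf` is an assumed cross-check that integrates |y| times the bivariate normal density along the line x = zy. All names and numerical settings are hypothetical.

```python
import math


def std_normal_cdf(t):
    """Standard normal CDF expressed through the error function."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def hinkley_pdf(z, mu_x, mu_y, sigma_x, sigma_y):
    """Uncorrelated normal ratio density in Hinkley's form."""
    a = math.sqrt(z * z / sigma_x**2 + 1.0 / sigma_y**2)
    b = mu_x * z / sigma_x**2 + mu_y / sigma_y**2
    c = mu_x**2 / sigma_x**2 + mu_y**2 / sigma_y**2
    d = math.exp((b * b - c * a * a) / (2.0 * a * a))
    first = (b * d / a**3) / (math.sqrt(2.0 * math.pi) * sigma_x * sigma_y)
    first *= std_normal_cdf(b / a) - std_normal_cdf(-b / a)
    return first + math.exp(-0.5 * c) / (a * a * math.pi * sigma_x * sigma_y)


def correlated_ratio_pdf(z, mu_x, mu_y, sigma_x, sigma_y, rho):
    """Correlated ratio density via the shift to an uncorrelated problem."""
    shift = rho * sigma_x / sigma_y
    sigma_x_mod = sigma_x * math.sqrt(1.0 - rho * rho)
    mu_x_mod = mu_x - shift * mu_y
    return hinkley_pdf(z - shift, mu_x_mod, mu_y, sigma_x_mod, sigma_y)


def direct_correlated_pdf(z, mu_x, mu_y, sigma_x, sigma_y, rho, steps=40_000):
    """Cross-check: trapezoidal integral of |y| * f_XY(z*y, y) over y."""
    lo, hi = mu_y - 10.0 * sigma_y, mu_y + 10.0 * sigma_y
    h = (hi - lo) / steps
    one_minus = 1.0 - rho * rho
    norm = 1.0 / (2.0 * math.pi * sigma_x * sigma_y * math.sqrt(one_minus))
    total = 0.0
    for i in range(steps + 1):
        y = lo + i * h
        u = (z * y - mu_x) / sigma_x
        v = (y - mu_y) / sigma_y
        q = (u * u - 2.0 * rho * u * v + v * v) / one_minus
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * abs(y) * math.exp(-0.5 * q)
    return total * h * norm
```

With zero means the result should coincide with a Cauchy density centred at rho sigma_x / sigma_y with scale sigma_x sqrt(1 - rho^2) / sigma_y.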

The figures below show an example of a positively correlated ratio, in which the shaded areas represent the increment of area selected by a given ratio and accumulate probability from the distribution. The theoretical distribution below, derived from the equations under discussion combined with Hinkley's equations, is highly consistent with a simulation result using 5,000 samples. In the top figure it is easily understood that for some values of the ratio the line almost bypasses the distribution mass altogether, and this coincides with a near-zero region in the theoretical pdf. Conversely, as the ratio reduces toward zero the line collects a higher probability.

[Figure: Example of a correlated normal ratio. Top panel: contours of the bivariate Gaussian distribution (not to scale). Bottom panel: pdf of the ratio z with a simulation shown as points.]

Uniform ratio distribution

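With two independent random variables following a uniform distribution, e.g.,

$$p_X(x) = \begin{cases} 1 & 0 < x < 1 \\ 0 & \text{otherwise} \end{cases}$$

(and likewise for Y), the ratio distribution becomes

$$p_Z(z) = \begin{cases} 1/2 & 0 < z < 1 \\ \dfrac{1}{2z^2} & z \ge 1 \\ 0 & \text{otherwise.} \end{cases}$$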
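As a quick check of the piecewise density, the sketch below compares its integral with the empirical distribution of simulated ratios. The cumulative distribution function used here is obtained by integrating the density and is not stated in the source; the function names, seed and sample size are likewise assumptions.

```python
import random


def uniform_ratio_pdf(z):
    """Density of Z = X/Y for independent X, Y ~ Uniform(0, 1)."""
    if z <= 0.0:
        return 0.0
    if z < 1.0:
        return 0.5
    return 0.5 / (z * z)


def uniform_ratio_cdf(z):
    """CDF obtained by integrating the density above (derived, not from the source)."""
    if z <= 0.0:
        return 0.0
    if z < 1.0:
        return 0.5 * z
    return 1.0 - 0.5 / z


def empirical_cdf_of_ratio(points, n=200_000, seed=5):
    """Estimate P(X/Y <= t) for each t in points by simulation."""
    rng = random.Random(seed)
    # 1 - random() lies in (0, 1], which avoids a zero denominator.
    samples = [rng.random() / (1.0 - rng.random()) for _ in range(n)]
    return [sum(1 for s in samples if s <= t) / n for t in points]


CHECK_POINTS = (0.25, 1.0, 2.0, 8.0)
EMPIRICAL = empirical_cdf_of_ratio(CHECK_POINTS)
```

Each entry of `EMPIRICAL` can be compared with `uniform_ratio_cdf` at the corresponding point.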

Cauchy ratio distribution

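If two independent random variables, X and Y, each follow a Cauchy distribution with median equal to zero and shape factor $a$

$$p_X(x \mid a) = \frac{a}{\pi(a^2 + x^2)}$$

then the ratio distribution for the random variable $Z = X/Y$ is [11]

$$p_Z(z \mid a) = \frac{1}{\pi^2(z^2-1)}\ln(z^2).$$

This distribution does not depend on $a$ and the result stated by Springer [8] (p158 Question 4.6) is not correct. The ratio distribution is similar to but not the same as the product distribution of the random variable $W = XY$:

$$p_W(w \mid a) = \frac{a^2}{\pi^2(w^2 - a^4)}\ln\left(\frac{w^2}{a^4}\right).$$ [8]

More generally, if two independent random variables X and Y each follow a Cauchy distribution with median equal to zero and shape factor $a$ and $b$ respectively, then:

1. The ratio distribution for the random variable $Z = X/Y$ is [11]

$$p_Z(z \mid a, b) = \frac{ab}{\pi^2(b^2z^2 - a^2)}\ln\left(\frac{b^2z^2}{a^2}\right).$$

2. The product distribution for the random variable $W = XY$ is [11]

$$p_W(w \mid a, b) = \frac{ab}{\pi^2(w^2 - a^2b^2)}\ln\left(\frac{w^2}{a^2b^2}\right).$$

The result for the ratio distribution can be obtained from the product distribution by replacing $b$ with $1/b$.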
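The two general densities and the substitution b to 1/b can be sketched as follows. This is an illustrative example with hypothetical function names; the removable singularity where the argument of the logarithm equals one is handled by its limiting value, and the simulation helper (inverse-CDF sampling of Cauchy variates) is an assumption added only to provide a cross-check.

```python
import math
import random


def _log_ratio_term(u):
    """Return ln(u) / (u - 1), using the limiting value 1 at u = 1."""
    if abs(u - 1.0) < 1e-9:
        return 1.0
    return math.log(u) / (u - 1.0)


def cauchy_ratio_pdf(z, a, b):
    """Density of Z = X/Y for X ~ Cauchy(0, a) and Y ~ Cauchy(0, b)."""
    if z == 0.0:
        return math.inf  # logarithmic singularity at the origin
    u = (b * z / a) ** 2
    return (b / a) * _log_ratio_term(u) / math.pi**2


def cauchy_product_pdf(w, a, b):
    """Density of W = X*Y for X ~ Cauchy(0, a) and Y ~ Cauchy(0, b)."""
    if w == 0.0:
        return math.inf
    u = (w / (a * b)) ** 2
    return _log_ratio_term(u) / (a * b * math.pi**2)


def trapezoid(f, lo, hi, steps=2000):
    """Plain trapezoidal rule on [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, steps):
        total += f(lo + i * h)
    return total * h


def simulated_probability(lo, hi, a, b, n=200_000, seed=7):
    """Estimate P(lo <= X/Y <= hi) with Cauchy variates drawn by inverse CDF."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        x = a * math.tan(math.pi * (rng.random() - 0.5))
        y = b * math.tan(math.pi * (rng.random() - 0.5))
        if y != 0.0 and lo <= x / y <= hi:
            hits += 1
    return hits / n
```

For instance, `cauchy_ratio_pdf(z, a, b)` and `cauchy_product_pdf(z, a, 1 / b)` should agree for any non-zero `z`.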

Ratio of standard normal to standard uniform

If X has a standard normal distribution and Y has a standard uniform distribution, independent of X, then Z = X / Y has a distribution known as the slash distribution, with probability density function

$$p_Z(z) = \begin{cases} \left[\varphi(0) - \varphi(z)\right]/z^2 & z \ne 0 \\ \varphi(0)/2 & z = 0, \end{cases}$$

where φ(z) is the probability density function of the standard normal distribution.[12]
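A short sketch of the slash density follows, together with a simulation of the defining ratio. The names, the use of `math.expm1` to avoid cancellation near zero, and the simulation settings are assumptions made for this illustration.

```python
import math
import random

PHI_0 = 1.0 / math.sqrt(2.0 * math.pi)  # standard normal density at zero


def slash_pdf(z):
    """Density of Z = X/Y with X ~ N(0, 1) and Y ~ Uniform(0, 1)."""
    if abs(z) < 1e-8:
        return 0.5 * PHI_0  # limiting value at the origin
    # phi(0) - phi(z) equals -phi(0) * expm1(-z^2 / 2), stable for small |z|.
    return -PHI_0 * math.expm1(-0.5 * z * z) / (z * z)


def integrate_pdf(lo, hi, steps=4000):
    """Trapezoidal integral of the slash density over [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.5 * (slash_pdf(lo) + slash_pdf(hi))
    for i in range(1, steps):
        total += slash_pdf(lo + i * h)
    return total * h


def simulated_central_probability(n=200_000, seed=3):
    """Estimate P(|Z| <= 1) by simulating the ratio directly."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        # 1 - random() lies in (0, 1], which avoids a zero denominator.
        z = rng.gauss(0.0, 1.0) / (1.0 - rng.random())
        if abs(z) <= 1.0:
            hits += 1
    return hits / n
```

Comparing `integrate_pdf(-1.0, 1.0)` with `simulated_central_probability()` gives a simple consistency check of the formula.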

Other ratio distributions

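Let X be a normal(0,1) distribution, Y and Z be chi square distributions with m and n degrees of freedom respectively, all independent, with $f_\chi(x, k) = \frac{x^{k/2-1} e^{-x/2}}{2^{k/2}\,\Gamma(k/2)}$. Then

$$\frac{X}{\sqrt{Y/m}} \sim t_m$$

$$\frac{Y/m}{Z/n} \sim F_{m,n}$$

$$\frac{Y}{Y+Z} \sim \beta\left(\tfrac{m}{2}, \tfrac{n}{2}\right)$$

If U is gamma(α1, 1) and V is gamma(α2, 1) distributed and independent, where $\Gamma(x; \alpha, 1) = \frac{x^{\alpha-1}e^{-x}}{\Gamma(\alpha)}$, then

$$\frac{U}{U+V} \sim \beta(\alpha_1, \alpha_2)$$

$$\frac{U}{V} \sim \beta'(\alpha_1, \alpha_2)$$

$$\frac{V}{U} \sim \beta'(\alpha_2, \alpha_1)$$

where $t_m$ is Student's t distribution, $F_{m,n}$ is the F distribution, $\beta(\alpha_1, \alpha_2)$ is the beta distribution, $\beta'(\alpha_1, \alpha_2)$ is the beta prime distribution and $\Gamma(x; \alpha, 1)$ is the gamma distribution.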
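The gamma relations can be spot-checked by simulation. The sketch below is an assumption-laden illustration: it compares sample means of U/(U+V) and U/V with the standard means of the beta and beta prime distributions, alpha1/(alpha1+alpha2) and alpha1/(alpha2-1). Function names, shape values, seed and sample size are arbitrary choices for the example.

```python
import random
import statistics


def simulate_gamma_ratios(alpha1, alpha2, n=100_000, seed=11):
    """Return sample means of U/(U+V) and U/V for independent unit-scale gammas."""
    rng = random.Random(seed)
    beta_like, beta_prime_like = [], []
    for _ in range(n):
        u = rng.gammavariate(alpha1, 1.0)
        v = rng.gammavariate(alpha2, 1.0)
        beta_like.append(u / (u + v))
        beta_prime_like.append(u / v)
    return statistics.fmean(beta_like), statistics.fmean(beta_prime_like)


def beta_mean(a, b):
    """Mean of the beta(a, b) distribution."""
    return a / (a + b)


def beta_prime_mean(a, b):
    """Mean of the beta prime(a, b) distribution; it exists only for b > 1."""
    if b <= 1.0:
        raise ValueError("mean exists only for b > 1")
    return a / (b - 1.0)


MEAN_BETA, MEAN_BETA_PRIME = simulate_gamma_ratios(3.0, 6.0)
```

Matching means do not prove the distributional identities, but a mismatch would signal an error in the parameter mapping.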

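Scaling: if U is a sample from $\Gamma(\alpha, 1)$ then $\theta U$ is a sample from $\Gamma(\alpha, \theta)$ where $\Gamma(x; \alpha, \theta) = \frac{x^{\alpha-1}e^{-x/\theta}}{\theta^{\alpha}\,\Gamma(\alpha)}$

thus, if U is $\Gamma(\alpha_1, \theta_1)$ and V is $\Gamma(\alpha_2, \theta_2)$ distributed, then by rescaling the $\theta$ parameter to unity we have, trivially

$$\frac{U/\theta_1}{U/\theta_1 + V/\theta_2} = \frac{\theta_2 U}{\theta_2 U + \theta_1 V} \sim \beta(\alpha_1, \alpha_2)$$

$$\frac{U/\theta_1}{V/\theta_2} = \frac{\theta_2}{\theta_1}\,\frac{U}{V} \sim \beta'(\alpha_1, \alpha_2)$$

thus $\frac{U}{V} \sim \beta'\left(\alpha_1, \alpha_2, 1, \frac{\theta_1}{\theta_2}\right)$
where $\beta'(\alpha, \beta, p, q)$ is the generalized Beta prime distribution with shape parameter $p$ and scale parameter $q$
i.e. if $X \sim \beta'(\alpha_1, \alpha_2, 1, 1)$ then $\theta X \sim \beta'(\alpha_1, \alpha_2, 1, \theta)$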

Other gamma distributions

Generalized gamma distribution

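The Gamma distribution can be generalized to

$$f(x; a, d, r) = \frac{r}{\Gamma(d/r)\,a^{d}}\; x^{d-1} e^{-(x/a)^r}, \qquad x \ge 0;\ a, d, r > 0,$$

which includes the regular gamma, chi, chi-squared, exponential and Weibull distributions.

If $U \sim f(x; a_1, d_1, r)$ and $V \sim f(x; a_2, d_2, r)$ are independent, then [13]

$$\frac{U}{V} \sim \beta'\left(\frac{d_1}{r}, \frac{d_2}{r}, r, \frac{a_1}{a_2}\right),$$

where $\beta'(\alpha, \beta, p, q)$ denotes the generalized beta prime distribution with shape parameter $p$ and scale parameter $q$.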


Modelling a mixture of different scaling factors


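In the ratios above, Gamma samples, U, V may have differing sample sizes $\alpha_1, \alpha_2$ but must be drawn from the same distribution $\frac{x^{\alpha-1}e^{-x/\theta}}{\theta^{\alpha}\,\Gamma(\alpha)}$ with equal scaling $\theta$.
In situations where U and V are differently scaled, a variables transformation allows the modified random ratio pdf to be determined. Let $X = \frac{U}{U+V} = \frac{1}{1+B}$ where $U \sim \Gamma(\alpha_1, \theta)$, $V \sim \Gamma(\alpha_2, \theta)$, $\theta$ arbitrary
and, from above, $X \sim \beta(\alpha_1, \alpha_2)$, $B = V/U \sim \beta'(\alpha_2, \alpha_1)$.
Rescale V arbitrarily, defining

$$Y = \frac{U}{U + \varphi V} = \frac{1}{1 + \varphi B}, \qquad 0 \le \varphi \le \infty.$$

We have $B = \frac{1-X}{X}$ and substitution into Y gives $Y = \frac{X}{\varphi + (1-\varphi)X}$, with $\frac{dY}{dX} = \frac{\varphi}{(\varphi + (1-\varphi)X)^2}$.
Transforming X to Y gives $f_Y(Y) = \frac{f_X(X)}{|dY/dX|} = \frac{\beta(X, \alpha_1, \alpha_2)}{\varphi / [\varphi + (1-\varphi)X]^2}$.
Noting $X = \frac{\varphi Y}{1 - (1-\varphi)Y}$ we finally have

$$f_Y(Y, \varphi) = \frac{\varphi}{[1 - (1-\varphi)Y]^2}\; \beta\left(\frac{\varphi Y}{1 - (1-\varphi)Y}, \alpha_1, \alpha_2\right), \qquad 0 \le Y \le 1.$$

Thus, if $U \sim \Gamma(\alpha_1, \theta_1)$ and $V \sim \Gamma(\alpha_2, \theta_2)$
then $Y = \frac{U}{U+V}$ is distributed as $f_Y(Y, \varphi)$ with $\varphi = \frac{\theta_2}{\theta_1}$.

The distribution of Y is limited here to the interval [0,1]. It can be generalized by scaling such that if $Y \sim f_Y(Y, \varphi)$ then

$$\Theta Y \sim f_Y(Y, \varphi, \Theta)$$

where

$$f_Y(Y, \varphi, \Theta) = \frac{\varphi/\Theta}{[1 - (1-\varphi)Y/\Theta]^2}\; \beta\left(\frac{\varphi Y/\Theta}{1 - (1-\varphi)Y/\Theta}, \alpha_1, \alpha_2\right), \qquad 0 \le Y \le \Theta.$$

$\Theta Y$ is then a sample from $\frac{\Theta U}{U + \varphi V}$.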
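The transformed density can be sanity-checked numerically. The following is a minimal sketch under stated assumptions: `scaled_ratio_pdf` implements the density of Y = U/(U + phi V) on [0, 1] in terms of a beta density, the integration helper is a plain trapezoidal rule, and the simulation draws U and V with different scales theta1 and theta2 so that phi = theta2/theta1. All names and parameter values are hypothetical, and the beta density helper is written for shape parameters of at least one.

```python
import math
import random
import statistics


def beta_pdf(x, a, b):
    """Beta(a, b) density on (0, 1); endpoints are set to zero (assumes a, b >= 1)."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1.0) * math.log(x) + (b - 1.0) * math.log1p(-x))


def scaled_ratio_pdf(y, phi, a1, a2):
    """Density of Y = U / (U + phi * V) for U ~ Gamma(a1, 1), V ~ Gamma(a2, 1)."""
    if y <= 0.0 or y >= 1.0:
        return 0.0
    denom = 1.0 - (1.0 - phi) * y
    return phi / denom**2 * beta_pdf(phi * y / denom, a1, a2)


def trapezoid(f, lo, hi, steps=2000):
    """Plain trapezoidal rule on [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, steps):
        total += f(lo + i * h)
    return total * h


def simulated_mean(a1, theta1, a2, theta2, n=100_000, seed=13):
    """Sample mean of U / (U + V) with differently scaled gamma variates."""
    rng = random.Random(seed)
    values = []
    for _ in range(n):
        u = rng.gammavariate(a1, theta1)
        v = rng.gammavariate(a2, theta2)
        values.append(u / (u + v))
    return statistics.fmean(values)
```

With `phi = 1` the density reduces to the ordinary beta density, which is a convenient special case to verify first.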



Though not ratio distributions of two variables, the following identities are useful:

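If $X \sim \beta(\alpha, \beta)$ then $1 - X \sim \beta(\beta, \alpha)$
If $Y \sim \beta'(\alpha, \beta)$ then $\frac{1}{Y} \sim \beta'(\beta, \alpha)$
If $X \sim \beta(\alpha, \beta)$ then $\frac{X}{1-X} \sim \beta'(\alpha, \beta)$
If $Y \sim \beta'(\alpha, \beta)$ then $\frac{Y}{1+Y} \sim \beta(\alpha, \beta)$
thus, from above, $\frac{1}{1+Y} = \frac{Y^{-1}}{Y^{-1}+1} \sim \beta(\beta, \alpha)$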

  • If X and Y are independent exponential random variables with mean μ, then X − Y is a double exponential random variable with mean 0 and scale μ.



Binomial distribution

This result was first derived by Katz et al. in 1978.[14]

Let X and Y be independent binomial random variables, X with n trials and success probability p1 and Y with m trials and success probability p2. Let T = (X/n)/(Y/m).

Then log(T) is approximately normally distributed with mean log(p1/p2) and variance (1/x) - (1/n) + (1/y) - (1/m), where x and y are the observed numbers of successes.
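In practice this approximation is typically used to build an interval for the risk ratio on the log scale. The sketch below is an illustration with hypothetical names: it computes T, the variance of log(T) from the expression above, and an approximate two-sided interval. The interval construction and the default critical value 1.959964 (the 97.5% point of the standard normal, giving a nominal 95% interval) are assumptions of this example rather than details taken from the text.

```python
import math


def katz_log_risk_ratio(x, n, y, m, z_crit=1.959964):
    """Risk ratio, variance of its logarithm, and an approximate interval.

    x successes out of n trials in the first group and y successes out of
    m trials in the second; both x and y must be positive.
    """
    if x <= 0 or y <= 0:
        raise ValueError("the log method needs at least one success per group")
    t = (x / n) / (y / m)
    var_log_t = 1.0 / x - 1.0 / n + 1.0 / y - 1.0 / m
    half_width = z_crit * math.sqrt(var_log_t)
    interval = (t * math.exp(-half_width), t * math.exp(half_width))
    return t, var_log_t, interval


RATIO, VAR_LOG, INTERVAL = katz_log_risk_ratio(15, 100, 10, 100)
```

Because the interval is symmetric on the log scale, it is asymmetric around `RATIO` on the original scale.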

Ratio distributions in multivariate analysis

Ratio distributions also appear in multivariate analysis. If the random matrices X and Y follow a Wishart distribution then the ratio of the determinants

$$\varphi = |\mathbf{X}| / |\mathbf{Y}|$$

is proportional to the product of independent F random variables. In the case where X and Y are from independent standardized Wishart distributions then the ratio

$$\Lambda = |\mathbf{X}| / |\mathbf{X} + \mathbf{Y}|$$

has a Wilks' lambda distribution.

References

  1. Geary, R. C. (1930). "The Frequency Distribution of the Quotient of Two Normal Variates". Journal of the Royal Statistical Society. 93 (3): 442–446. doi:10.2307/2342070. JSTOR 2342070.
  2. Fieller, E. C. (November 1932). "The Distribution of the Index in a Normal Bivariate Population". Biometrika. 24 (3/4): 428–440. doi:10.2307/2331976. JSTOR 2331976.
  3. Curtiss, J. H. (December 1941). "On the Distribution of the Quotient of Two Chance Variables". The Annals of Mathematical Statistics. 12 (4): 409–421. doi:10.1214/aoms/1177731679. JSTOR 2235953.
  4. Marsaglia, George (April 1964). Ratios of Normal Variables and Ratios of Sums of Uniform Variables. Defense Technical Information Center.
  5. Marsaglia, George (March 1965). "Ratios of Normal Variables and Ratios of Sums of Uniform Variables". Journal of the American Statistical Association. 60 (309): 193–204. doi:10.2307/2283145. JSTOR 2283145.
  6. Hinkley, D. V. (December 1969). "On the Ratio of Two Correlated Normal Random Variables". Biometrika. 56 (3): 635–639. doi:10.2307/2334671. JSTOR 2334671.
  7. Hayya, Jack; Armstrong, Donald; Gressis, Nicolas (July 1975). "A Note on the Ratio of Two Normally Distributed Variables". Management Science. 21 (11): 1338–1341. doi:10.1287/mnsc.21.11.1338. JSTOR 2629897.
  8. Springer, Melvin Dale (1979). The Algebra of Random Variables. Wiley. ISBN 0-471-01406-0.
  9. Pham-Gia, T.; Turkkan, N.; Marchand, E. (2006). "Density of the Ratio of Two Normal Random Variables and Applications". Communications in Statistics - Theory and Methods. Taylor & Francis. 35 (9): 1569–1591. doi:10.1080/03610920600683689.
  10. Brody, James P.; Williams, Brian A.; Wold, Barbara J.; Quake, Stephen R. (October 2002). "Significance and statistical errors in the analysis of DNA microarray data". Proc Natl Acad Sci U S A. 99 (20): 12975–12978. doi:10.1073/pnas.162468199. PMC 130571. PMID 12235357.
  11. Kermond, John (2010). "An Introduction to the Algebra of Random Variables". Mathematical Association of Victoria 47th Annual Conference Proceedings - New Curriculum. New Opportunities. The Mathematical Association of Victoria: 1–16. ISBN 978-1-876949-50-1.
  12. "SLAPPF". Statistical Engineering Division, National Institute of Standards and Technology. Retrieved 2009-07-02.
  13. Rao, B. Raja; Garg, M. L. (1969). "A note on the generalized (positive) Cauchy distribution". Canadian Mathematical Bulletin. 12: 865–868.
  14. Katz, D.; et al. (1978). "Obtaining confidence intervals for the risk ratio in cohort studies". Biometrics. 34: 469–474.
This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.