Probability integral transform

In probability theory, the probability integral transform (also known as universality of the uniform) is the result that data values modeled as random variables from any given continuous distribution can be converted to random variables having a standard uniform distribution.[1] This holds exactly provided that the distribution used is the true distribution of the random variables; if the distribution is one fitted to the data, the result holds approximately in large samples.

The result is sometimes modified or extended so that the result of the transformation is a standard distribution other than the uniform distribution, such as the exponential distribution.

Applications

One use for the probability integral transform in statistical data analysis is to provide the basis for testing whether a set of observations can reasonably be modelled as arising from a specified distribution. Specifically, the probability integral transform is applied to construct an equivalent set of values, and a test is then made of whether a uniform distribution is appropriate for the constructed dataset. Examples of this are P-P plots and Kolmogorov-Smirnov tests.
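As a rough illustration of this testing idea, the sketch below applies the transform to simulated data under two candidate models and computes the Kolmogorov-Smirnov distance of the transformed values from the uniform distribution. The helper name `ks_statistic_uniform`, the sample size, the seed, and the misspecified model N(0.5, 1) are assumptions made for the example, and the critical value uses the usual large-sample approximation 1.36/√n at the 5% level.

```python
import math
import random
from statistics import NormalDist

def ks_statistic_uniform(u):
    """Kolmogorov-Smirnov distance between the sample u and Uniform(0, 1)."""
    u = sorted(u)
    n = len(u)
    # Compare the empirical CDF just after and just before each order statistic.
    d_plus = max((i + 1) / n - x for i, x in enumerate(u))
    d_minus = max(x - i / n for i, x in enumerate(u))
    return max(d_plus, d_minus)

rng = random.Random(0)  # fixed seed, chosen only for reproducibility
n = 2000
sample = [rng.gauss(0.0, 1.0) for _ in range(n)]

# Probability integral transform under the correct model N(0, 1)
# and under a deliberately misspecified model N(0.5, 1).
u_good = [NormalDist(0.0, 1.0).cdf(x) for x in sample]
u_bad = [NormalDist(0.5, 1.0).cdf(x) for x in sample]

d_good = ks_statistic_uniform(u_good)
d_bad = ks_statistic_uniform(u_bad)

# Approximate 5% critical value for large n.
critical = 1.36 / math.sqrt(n)
print(f"correct model: D = {d_good:.4f}")
print(f"misspecified model: D = {d_bad:.4f}")
print(f"approximate critical value: {critical:.4f}")
```

Under the misspecified model the transformed values are visibly non-uniform, so the distance should exceed the critical value by a wide margin, while under the correct model it is typically below it.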

A second use for the transformation is in the theory of copulas, which are a means of both defining and working with distributions for statistically dependent multivariate data. Here the problem of defining or manipulating a joint probability distribution for a set of random variables is simplified by applying the probability integral transform to each of the components and then working with a joint distribution whose marginal variables have uniform distributions.
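The component-wise transform can be sketched as follows, assuming for illustration a pair of correlated standard normal variables (a Gaussian dependence structure that is not taken from the source). The correlation `rho`, the seed, and the helper `pearson` are hypothetical choices made for the example.

```python
import math
import random
from statistics import NormalDist

def pearson(a, b):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

rng = random.Random(1)  # fixed seed, chosen only for reproducibility
rho = 0.8               # assumed dependence parameter for the illustration
std_normal = NormalDist()

u1, u2 = [], []
for _ in range(5000):
    z1 = rng.gauss(0.0, 1.0)
    z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
    # Transform each component with its own marginal CDF:
    # the marginals become Uniform(0, 1) while the dependence is retained.
    u1.append(std_normal.cdf(z1))
    u2.append(std_normal.cdf(z2))

mean_u1 = sum(u1) / len(u1)
mean_u2 = sum(u2) / len(u2)
r = pearson(u1, u2)
print(f"marginal means: {mean_u1:.3f}, {mean_u2:.3f}")
print(f"correlation of transformed pair: {r:.3f}")
```

Each transformed component should have mean near 1/2, as a uniform variable does, while the pair remains strongly correlated; the joint distribution of `(u1, u2)` is what the copula describes.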

A third use is based on applying the inverse of the probability integral transform to convert random variables from a uniform distribution to have a selected distribution: this is known as inverse transform sampling.
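A minimal sketch of inverse transform sampling, assuming an exponential target distribution because its CDF F(x) = 1 − exp(−rate·x) inverts in closed form. The function name `sample_exponential`, the rate, and the seed are illustrative choices, not part of the source.

```python
import math
import random

def sample_exponential(rate, rng):
    """Draw one exponential variate by inverting F(x) = 1 - exp(-rate * x)."""
    u = rng.random()  # uniform on [0, 1), so 1 - u lies in (0, 1]
    return -math.log(1.0 - u) / rate

rng = random.Random(42)  # fixed seed, chosen only for reproducibility
rate = 2.0
draws = [sample_exponential(rate, rng) for _ in range(20000)]

sample_mean = sum(draws) / len(draws)
print(f"sample mean: {sample_mean:.4f} (theoretical mean 1/rate = {1.0 / rate})")
```

The same pattern applies to any distribution whose quantile function can be evaluated, whether in closed form or numerically.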

Statement

Suppose that a random variable $X$ has a continuous distribution for which the cumulative distribution function (CDF) is $F_X$. Then the random variable $Y$ defined as

$$Y = F_X(X)$$

has a standard uniform distribution.[1]

Proof

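Given any continuous random variable $X$, define $Y = F_X(X)$. Then, for $y \in [0, 1]$, and assuming that $F_X$ is strictly increasing so that $F_X^{-1}$ exists (when it is not, a generalized inverse can be used in its place):

$$F_Y(y) = \operatorname{P}(Y \le y) = \operatorname{P}(F_X(X) \le y) = \operatorname{P}(X \le F_X^{-1}(y)) = F_X(F_X^{-1}(y)) = y.$$

$F_Y$ is just the CDF of a $\mathrm{Uniform}(0, 1)$ random variable. Thus, $Y$ has a uniform distribution on the interval $[0, 1]$.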

Examples

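For an illustrative example, let $X$ be a random variable with a standard normal distribution $\mathcal{N}(0, 1)$. Then its CDF is

$$\Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-t^2/2}\,dt = \frac{1}{2}\left[1 + \operatorname{erf}\left(\frac{x}{\sqrt{2}}\right)\right], \quad x \in \mathbb{R},$$

where $\operatorname{erf}$ is the error function. Then the new random variable $Y$, defined by $Y = \Phi(X)$, is uniformly distributed.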
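The normal example can be checked numerically. The sketch below evaluates the standard normal CDF through the error function, using the standard identity Φ(x) = ½[1 + erf(x/√2)], and then bins the transformed values Y = Φ(X) into ten equal-width cells. The function name `phi`, the seed, the sample size, and the number of bins are assumptions made for the example.

```python
import math
import random
from statistics import NormalDist

def phi(x):
    """Standard normal CDF written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

rng = random.Random(7)  # fixed seed, chosen only for reproducibility
n = 10000
y = [phi(rng.gauss(0.0, 1.0)) for _ in range(n)]

# Histogram of Y over ten equal-width bins on [0, 1];
# a uniform Y puts roughly n / 10 values into each bin.
counts = [0] * 10
for value in y:
    counts[min(int(value * 10), 9)] += 1

print("bin counts:", counts)
print("phi(1.0) =", phi(1.0), "library value =", NormalDist().cdf(1.0))
```

Roughly equal bin counts are what a standard uniform distribution for Y would produce.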

If $X$ has an exponential distribution with unit mean, then its CDF is

$$F(x) = 1 - \exp(-x), \quad x \ge 0,$$

and the immediate result of the probability integral transform is that

$$Y = 1 - \exp(-X)$$

has a uniform distribution. The symmetry of the uniform distribution can then be used to show that

$$Y' = \exp(-X)$$

also has a uniform distribution.
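A short simulation can illustrate the exponential example, assuming unit-mean exponential draws from Python's `random.expovariate`. It compares the sample mean and variance of 1 − exp(−X) and of exp(−X) with the values 1/2 and 1/12 of a standard uniform variable. The helper `mean_and_variance`, the seed, and the sample size are illustrative choices.

```python
import math
import random

def mean_and_variance(values):
    """Sample mean and (population-style) variance of a sequence."""
    n = len(values)
    m = sum(values) / n
    v = sum((x - m) ** 2 for x in values) / n
    return m, v

rng = random.Random(3)  # fixed seed, chosen only for reproducibility
x = [rng.expovariate(1.0) for _ in range(10000)]  # unit-mean exponential

y = [1.0 - math.exp(-v) for v in x]        # probability integral transform
y_reflected = [math.exp(-v) for v in x]    # reflection 1 - Y

m_y, v_y = mean_and_variance(y)
m_r, v_r = mean_and_variance(y_reflected)
print(f"1 - exp(-X): mean = {m_y:.4f}, variance = {v_y:.4f}")
print(f"exp(-X):     mean = {m_r:.4f}, variance = {v_r:.4f}")
print(f"uniform:     mean = 0.5000, variance = {1.0 / 12.0:.4f}")
```

Because exp(−X) = 1 − Y and the uniform distribution on [0, 1] is symmetric about 1/2, both transformed samples should match the uniform moments to within sampling error.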

References

  1. Dodge, Y. (2006) The Oxford Dictionary of Statistical Terms, Oxford University Press.
This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.