Notation in probability and statistics

Probability theory and statistics have some commonly used conventions, in addition to standard mathematical notation and mathematical symbols.

Probability theory

  • Random variables are usually written in upper case roman letters: X, Y, etc.
  • Particular realizations of a random variable are written in corresponding lower case letters. For example, x_1, x_2, …, x_n could be a sample corresponding to the random variable X. A cumulative probability is formally written P(X ≤ x) to differentiate the random variable from its realization.
  • The probability is sometimes written ℙ to distinguish it from other functions and from a generic measure P, which avoids having to state “P is a probability”. ℙ(X ∈ A) is short for P({ω ∈ Ω : X(ω) ∈ A}), where Ω is the sample space and X is a random variable. The notation Pr(A) is used alternatively.
  • P(A ∩ B) or P[B ∩ A] indicates the probability that events A and B both occur. The joint probability distribution of random variables X and Y is denoted P(X, Y), the joint probability mass function or probability density function f(x, y), and the joint cumulative distribution function F(x, y).
  • P(A ∪ B) or P[B ∪ A] indicates the probability of either event A or event B occurring (“or” in this case means one or the other or both).
  • σ-algebras are usually written with uppercase calligraphic letters (e.g. ℱ for the set of sets on which we define the probability P).
  • Probability density functions (pdfs) and probability mass functions are denoted by lowercase letters, e.g. f(x), or f_X(x).
  • Cumulative distribution functions (cdfs) are denoted by uppercase letters, e.g. F(x), or F_X(x).
  • Survival functions or complementary cumulative distribution functions are often denoted by placing an overbar over the symbol for the cumulative: F̄(x) = 1 − F(x), or denoted as S(x).
  • In particular, the pdf of the standard normal distribution is denoted by φ(z), and its cdf by Φ(z).
  • Some common operators:
    • E[X]: expected value of X
    • var[X]: variance of X
    • cov[X, Y]: covariance of X and Y
  • X is independent of Y is often written X ⊥ Y or X ⫫ Y, and X is independent of Y given W is often written X ⊥ Y | W or X ⫫ Y | W.
  • P(A | B), the conditional probability, is the probability of A given B, i.e., the probability of A after B is observed.
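
The event notation above can be illustrated on a small finite sample space. The following is a minimal sketch, assuming two fair six-sided dice and a uniform probability measure; the names omega, prob, A, B and X are illustrative and not part of any standard notation or library.

```python
from fractions import Fraction
from itertools import product

# Sample space Ω: all ordered outcomes of two fair six-sided dice (assumed example).
omega = set(product(range(1, 7), repeat=2))

def prob(event):
    """P(event) under the uniform measure on omega."""
    return Fraction(len(event & omega), len(omega))

A = {w for w in omega if w[0] % 2 == 0}       # event A: first die is even
B = {w for w in omega if w[0] + w[1] == 7}    # event B: the dice sum to 7

p_and = prob(A & B)                           # P(A ∩ B)
p_or = prob(A | B)                            # P(A ∪ B)
p_cond = p_and / prob(B)                      # P(A | B) = P(A ∩ B) / P(B)
independent = p_and == prob(A) * prob(B)      # A ⊥ B iff P(A ∩ B) = P(A) P(B)

def X(w):
    """Random variable X(ω): the sum of the two dice."""
    return w[0] + w[1]

# P(X ≤ 4) is short for P({ω ∈ Ω : X(ω) ≤ 4}), i.e. the cdf value F_X(4).
p_X_le_4 = prob({w for w in omega if X(w) <= 4})

print(p_and, p_or, p_cond, independent, p_X_le_4)
```

Exact fractions are used here so that the independence check compares equal values rather than rounded floats.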

Statistics

  • Greek letters (e.g. θ, β) are commonly used to denote unknown parameters (population parameters).
  • A tilde (~) denotes "has the probability distribution of".
  • Placing a hat, or caret, over a true parameter denotes an estimator of it, e.g., θ̂ is an estimator for θ.
  • The arithmetic mean of a series of values x_1, x_2, ..., x_n is often denoted by placing an "overbar" over the symbol, e.g. x̄, pronounced "x bar".
  • Some commonly used symbols for sample statistics are given below:
    • the sample mean x̄,
    • the sample variance s^2,
    • the sample standard deviation s,
    • the sample correlation coefficient r,
    • the sample cumulants k_r.
  • Some commonly used symbols for population parameters are given below:
    • the population mean μ,
    • the population variance σ^2,
    • the population standard deviation σ,
    • the population correlation ρ,
    • the population cumulants κ_r.
  • x_(k) is used for the k-th order statistic, where x_(1) is the sample minimum and x_(n) is the sample maximum from a total sample size n.
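
As a rough illustration of the sample statistics listed above, the sketch below computes x̄, s^2, s, r and the order statistics for two small made-up data sets. The data values and variable names are assumptions chosen for the example; the correlation is computed by hand from the usual product-moment formula rather than through a library call.

```python
import math
import statistics

# Made-up paired sample (illustrative values only).
x = [2.0, 4.0, 4.0, 5.0, 7.0, 8.0]
y = [1.0, 3.0, 2.0, 5.0, 6.0, 9.0]
n = len(x)

x_bar = statistics.mean(x)        # sample mean x̄
s2 = statistics.variance(x)       # sample variance s^2 (divides by n - 1)
s = statistics.stdev(x)           # sample standard deviation s

# Sample correlation coefficient r from sums of centred products.
y_bar = statistics.mean(y)
sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
sxx = sum((a - x_bar) ** 2 for a in x)
syy = sum((b - y_bar) ** 2 for b in y)
r = sxy / math.sqrt(sxx * syy)

# Order statistics: x_(1) ≤ x_(2) ≤ ... ≤ x_(n).
order = sorted(x)
x_min, x_max = order[0], order[n - 1]   # x_(1) and x_(n)

print(x_bar, s2, s, r, x_min, x_max)
```

Note that x̄, s^2 and r are estimates computed from data, whereas μ, σ^2 and ρ refer to the corresponding unknown population parameters.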

Critical values

The α-level upper critical value of a probability distribution is the value exceeded with probability α, that is, the value x_α such that F(x_α) = 1 − α, where F is the cumulative distribution function. There are standard notations for the upper critical values of some commonly used distributions in statistics:

  • z_α or z(α) for the standard normal distribution
  • t_{α,ν} or t(α,ν) for the t-distribution with ν degrees of freedom
  • χ^2_{α,ν} or χ^2(α,ν) for the chi-squared distribution with ν degrees of freedom
  • F_{α,ν_1,ν_2} or F(α,ν_1,ν_2) for the F-distribution with ν_1 and ν_2 degrees of freedom
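
The defining relation F(x_α) = 1 − α can be checked numerically for the standard normal case. This is a minimal sketch using Python's standard-library NormalDist; the helper name upper_critical_value and the choice α = 0.05 are assumptions made for illustration. The standard library has no t, chi-squared or F distributions, so those cases are not shown.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0.0, sigma=1.0)

def upper_critical_value(alpha, dist=std_normal):
    """Return x_α with F(x_α) = 1 − α, i.e. the value exceeded with probability α."""
    return dist.inv_cdf(1.0 - alpha)

alpha = 0.05                                # illustrative level
z_alpha = upper_critical_value(alpha)       # z_α, approximately 1.645

phi_at_z = std_normal.pdf(z_alpha)          # φ(z_α): pdf of the standard normal
Phi_at_z = std_normal.cdf(z_alpha)          # Φ(z_α): cdf, should be close to 1 − α
tail = 1.0 - Phi_at_z                       # survival function at z_α, close to α

print(z_alpha, Phi_at_z, tail)
```

The same pattern, inverting the cdf at 1 − α, applies to any continuous distribution for which an inverse cdf is available.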

Linear algebra

  • Matrices are usually denoted by boldface capital letters, e.g. A.
  • Column vectors are usually denoted by boldface lowercase letters, e.g. x.
  • The transpose operator is denoted by either a superscript T (e.g. A^T) or a prime symbol (e.g. A′).
  • A row vector is written as the transpose of a column vector, e.g. x^T or x′.
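
The relationship between column vectors, row vectors and the transpose can be sketched with plain nested lists. This is only an illustration: the helper names transpose and matmul and the numeric values are assumptions, and a numerical library would normally be used in practice.

```python
def transpose(M):
    """Return M^T: the rows of M become the columns of the result."""
    return [list(col) for col in zip(*M)]

def matmul(M, N):
    """Matrix product of M (m × k) and N (k × n), both given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*N)] for row in M]

# Matrix A (2 × 3), stored as a list of rows.
A = [[1, 2, 3],
     [4, 5, 6]]
A_T = transpose(A)            # A^T is 3 × 2

# Column vector x as a 3 × 1 matrix; x^T is the corresponding 1 × 3 row vector.
x = [[7], [8], [9]]
x_T = transpose(x)

Ax = matmul(A, x)             # 2 × 1 column vector
xTx = matmul(x_T, x)          # 1 × 1 matrix holding the squared length of x

print(A_T, x_T, Ax, xTx)
```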

Abbreviations

Common abbreviations include:

  • cdf: cumulative distribution function
  • pdf: probability density function

References

  • Halperin, Max; Hartley, H. O.; Hoel, P. G. (1965), "Recommended Standards for Statistical Symbols and Notation. COPSS Committee on Symbols and Notation", The American Statistician, 19 (3): 12–14, doi:10.2307/2681417, JSTOR 2681417
This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.