The following is a glossary of terms used in the mathematical sciences of statistics and probability.
C
causal study
A statistical study in which the objective is to measure the effect of some variable on the outcome of a different variable. For example, how will my headache feel if I take aspirin, versus if I do not take aspirin? Causal studies may be either experimental or observational.[1]
central limit theorem
central moment
characteristic function
chi-squared distribution
chi-squared test
cluster analysis
cluster sampling
complementary event
completely randomized design
computational statistics
concomitants
In a statistical study, concomitants are any variables whose values are unaffected by treatments, such as a unit’s age, gender, and cholesterol level before starting a diet (treatment).[1]
conditional distribution
Given two jointly distributed random variables X and Y, the conditional probability distribution of Y given X (written "Y | X") is the probability distribution of Y when X is known to be a particular value
conditional probability
The probability of some event A, assuming event B. Conditional probability is written P(A|B), and is read "the probability of A, given B"
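As a small worked example, conditional probability can be computed by counting outcomes. The sketch below assumes a single roll of a fair six-sided die and uses the standard ratio P(A|B) = P(A and B) / P(B); the events and the helper names `prob` and `conditional_prob` are illustrative only.

```python
from fractions import Fraction

# Sample space of one roll of a fair six-sided die (assumed for illustration).
sample_space = {1, 2, 3, 4, 5, 6}

def prob(event):
    """Probability of an event (a set of outcomes) when all outcomes are equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))

def conditional_prob(a, b):
    """P(A|B) = P(A and B) / P(B), defined only when P(B) > 0."""
    if prob(b) == 0:
        raise ValueError("P(B) must be positive")
    return prob(a & b) / prob(b)

a = {2, 4, 6}  # A: the roll is even
b = {4, 5, 6}  # B: the roll is greater than 3

# Two of the three outcomes in B are even, so P(A|B) = 2/3.
p_a_given_b = conditional_prob(a, b)
```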
conditional probability distribution
confidence interval
In inferential statistics, a confidence interval (CI) is a range of plausible values for some parameter, such as the population mean.[2] For example, based on a study of sleep habits among 100 people, a researcher may estimate that the overall population sleeps somewhere between 5 and 9 hours per night. This is different from the sample mean, which can be measured directly.
confidence level
Also known as a confidence coefficient, the confidence level indicates the probability that the confidence interval (range) captures the true population mean. For example, a confidence interval with a 95 percent confidence level has a 95 percent chance of capturing the population mean. Technically, this means that, if the experiment were repeated many times, 95 percent of the CIs would contain the true population mean.[2]
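The repeated-experiment reading of a confidence level can be checked with a small simulation. This is a sketch under stated assumptions, not a general recipe: the population is taken to be normal with a known standard deviation, so the interval is the textbook z-interval with z = 1.96, and every parameter value (mean of 7 hours, 50 people per sample, 2000 repetitions) is made up for illustration.

```python
import random
import statistics

def coverage(n_experiments=2000, sample_size=50, mu=7.0, sigma=1.5, z=1.96, seed=0):
    """Fraction of simulated 95% z-intervals that contain the true mean mu."""
    rng = random.Random(seed)
    half_width = z * sigma / sample_size ** 0.5  # sigma is assumed known
    hits = 0
    for _ in range(n_experiments):
        sample = [rng.gauss(mu, sigma) for _ in range(sample_size)]
        sample_mean = statistics.fmean(sample)
        if sample_mean - half_width <= mu <= sample_mean + half_width:
            hits += 1
    return hits / n_experiments

# The result should land close to 0.95, though not exactly on it.
rate = coverage()
```

Each simulated interval either contains the true mean or it does not; the 95 percent figure describes how often the procedure succeeds over many repetitions.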
confounding
conjugate prior
continuous variable
convenience sampling
correlation
Also called correlation coefficient, a numeric measure of the strength of the linear relationship between two random variables (one can use it to quantify, for example, how shoe size and height are correlated in the population). An example is the Pearson product-moment correlation coefficient, which is found by dividing the covariance of the two variables by the product of their standard deviations. Independent variables have a correlation of 0, although a correlation of 0 does not by itself imply independence
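The Pearson coefficient described above can be sketched directly from its definition: covariance divided by the product of the standard deviations. The shoe-size and height figures below are invented for illustration, and the population (divide by n) form of the covariance is an assumption; dividing by n - 1 throughout gives the same coefficient.

```python
import math

def mean(values):
    return sum(values) / len(values)

def covariance(xs, ys):
    """Population covariance: the average of (x - mean_x) * (y - mean_y)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

def pearson(xs, ys):
    """Covariance divided by the product of the two standard deviations."""
    sx = math.sqrt(covariance(xs, xs))  # the variance is the covariance of x with itself
    sy = math.sqrt(covariance(ys, ys))
    return covariance(xs, ys) / (sx * sy)

# Hypothetical measurements, not real data.
shoe_size = [38, 40, 41, 43, 45]
height_cm = [162, 167, 174, 179, 188]

r = pearson(shoe_size, height_cm)  # close to 1: a strong positive linear relationship
```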
count data
Data arising from counting that can take only non-negative integer values
covariance
Given two random variables X and Y, with expected values $E(X) = \mu$ and $E(Y) = \nu$, covariance is defined as the expected value of the random variable $(X - \mu)(Y - \nu)$, and is written $\operatorname{cov}(X, Y)$. It is used for measuring correlation
E
elementary event
An event with only one element. For example, when pulling a card out of a deck, "getting the jack of spades" is an elementary event, while "getting a king or an ace" is not
estimation theory
estimator
A function of the known data that is used to estimate an unknown parameter; an estimate is the result from the actual application of the function to a particular set of data. The mean can be used as an estimator
expected value
The sum of the probability of each possible outcome of the experiment multiplied by its payoff ("value"). Thus, it represents the average amount one "expects" to win per bet if bets with identical odds are repeated many times. For example, the expected value of a six-sided die roll is 3.5. The concept is similar to the mean. The expected value of random variable X is typically written E(X) for the operator and $\mu$ (mu) for the parameter
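The die example can be verified in a few lines. The sketch assumes a fair die, so each face has probability 1/6; the helper name `expected_value` is illustrative.

```python
from fractions import Fraction

def expected_value(distribution):
    """Sum of value * probability over a dict mapping each value to its probability."""
    return sum(value * p for value, p in distribution.items())

# A fair six-sided die: each face has probability 1/6 (assumed).
fair_die = {face: Fraction(1, 6) for face in range(1, 7)}

ev = expected_value(fair_die)  # Fraction(7, 2), i.e. 3.5
```

Note that 3.5 is not itself a possible roll; the expected value is a long-run average, not a most likely outcome.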
experiment
Any procedure that can be infinitely repeated and has a well-defined set of outcomes
exponential family
event
A subset of the sample space (that is, a set of possible outcomes of an experiment), to which a probability can be assigned. For example, on rolling a die, "getting a five or a six" is an event (with a probability of one third if the die is fair)
J
joint distribution
Given two random variables X and Y, the joint distribution of X and Y is the probability distribution of X and Y together
joint probability
The probability of two events occurring together. The joint probability of A and B is written $P(A \cap B)$ or $P(A, B)$
K
Kalman filter
kernel
kernel density estimation
kurtosis
A measure of the infrequent extreme observations (outliers) of the probability distribution of a real-valued random variable. Higher kurtosis means more of the variance is due to infrequent extreme deviations, as opposed to frequent modestly sized deviations
L
L-moment
law of large numbers
likelihood function
A conditional probability function considered as a function of its second argument with its first argument held fixed. For example, imagine pulling a ball with the number k from a bag of n balls, numbered 1 to n. Then you could describe a likelihood function for the random variable N as the probability of getting k given that there are n balls: the likelihood will be 1/n for n greater than or equal to k, and 0 for n smaller than k. Unlike a probability distribution function, this likelihood function will not sum to 1 over the sample space
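The numbered-ball example translates into a short function. In this sketch the observed ball k = 3 and the range of candidate bag sizes (1 to 10) are arbitrary choices made for illustration.

```python
from fractions import Fraction

def likelihood(n, k):
    """Probability of drawing ball k from a bag of n balls numbered 1..n,
    read as a function of n with the observed k held fixed."""
    if n < 1 or k < 1:
        raise ValueError("n and k must be positive integers")
    return Fraction(1, n) if n >= k else Fraction(0)

k = 3  # the observed ball number (hypothetical)
values = {n: likelihood(n, k) for n in range(1, 11)}

# Bag sizes below k are impossible; from k upward the likelihood is 1/n.
total = sum(values.values())  # not equal to 1, unlike a probability distribution
```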
likelihood-ratio test
S
sample
That part of a population which is actually observed
sample mean
The arithmetic mean of a sample of values drawn from the population. It is denoted by $\bar{x}$. An example is the average test score of a subset of 10 students from a class. Sample mean is used as an estimator of the population mean, which in this example would be the average test score of all of the students in the class.
sample space
The set of possible outcomes of an experiment. For example, the sample space for rolling a six-sided die will be {1, 2, 3, 4, 5, 6}
sampling
A process of selecting observations to obtain knowledge about a population. There are many methods to choose on which sample to do the observations
sampling bias
sampling distribution
The probability distribution, under repeated sampling of the population, of a given statistic
sampling error
scatter plot
significance level
simple random sample
Simpson's paradox
skewness
A measure of the asymmetry of the probability distribution of a real-valued random variable. Roughly speaking, a distribution has positive skew (right-skewed) if the higher tail is longer and negative skew (left-skewed) if the lower tail is longer (confusing the two is a common error)
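Because the sign convention is easy to mix up, a quick numerical check can help. The sketch below uses the moment-based coefficient (third central moment divided by the 1.5 power of the second), which is one common definition among several; the data sets are invented.

```python
def skewness(values):
    """Moment coefficient of skewness: m3 / m2**1.5, using population moments."""
    n = len(values)
    m = sum(values) / n
    m2 = sum((v - m) ** 2 for v in values) / n  # second central moment (variance)
    m3 = sum((v - m) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5

right_skewed = [1, 2, 2, 3, 3, 3, 4, 20]     # one large value stretches the higher tail
left_skewed = [-v for v in right_skewed]     # mirror image: the lower tail is longer
symmetric = [1, 2, 3, 4, 5]

signs = (skewness(right_skewed) > 0, skewness(left_skewed) < 0)  # both hold
```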
spaghetti plot
spectrum bias
standard deviation
The most commonly used measure of statistical dispersion. It is the square root of the variance, and is generally written $\sigma$ (sigma)
standard error
standard score
statistic
The result of applying a statistical algorithm to a data set. It can also be described as an observable random variable
statistical graphics
statistical hypothesis testing
statistical independence
Two events are independent if the outcome of one does not affect that of the other (for example, getting a 1 on one die roll does not affect the probability of getting a 1 on a second roll). Similarly, when we assert that two random variables are independent, we intuitively mean that knowing something about the value of one of them does not yield any information about the value of the other
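The intuitive description above corresponds to the standard product rule: events A and B are independent exactly when P(A and B) = P(A) · P(B). The sketch below checks this by enumeration for two rolls of a fair die (an assumption), and contrasts it with a pair of events that are clearly dependent; all event names are illustrative.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of two rolls of a fair die (assumed).
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event given as a predicate on (first_roll, second_roll)."""
    favorable = sum(1 for o in outcomes if event(o))
    return Fraction(favorable, len(outcomes))

def independent(a, b):
    """Product rule: P(A and B) == P(A) * P(B)."""
    return prob(lambda o: a(o) and b(o)) == prob(a) * prob(b)

def first_is_one(o):
    return o[0] == 1

def second_is_one(o):
    return o[1] == 1

def sum_is_two(o):
    return o[0] + o[1] == 2

rolls_independent = independent(first_is_one, second_is_one)  # True
sum_depends_on_first = not independent(first_is_one, sum_is_two)  # True
```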
statistical inference
Inference about a population from a random sample drawn from it or, more generally, about a random process from its observed behavior during a finite period of time
statistical interference
statistical model
statistical population
A set of entities about which statistical inferences are to be drawn, often based on random sampling. One can also talk about a population of measurements or values
statistical dispersion
Statistical dispersion, also called statistical variability, is a measure of how spread out a set of data is. It can be expressed by the variance or the standard deviation
statistical parameter
A parameter that indexes a family of probability distributions
statistical significance
statistics
stem-and-leaf display
stratified sampling
survey methodology
survival function
survivorship bias
symmetric probability distribution
systematic sampling
U
unimodal probability distribution
units
In a statistical study, the objects to which treatments are assigned. For example, in a study examining the effects of smoking cigarettes, the units would be people.[1]
V
variance
A measure of the statistical dispersion of a random variable, indicating how far from the expected value its values typically are. The variance of random variable X is typically designated as $\operatorname{var}(X)$, $\sigma_X^2$, or simply $\sigma^2$
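As a worked example, the variance of a fair six-sided die roll can be computed as the expected squared distance from the expected value, with the standard deviation as its square root. The fair die is an assumption chosen to match the expected value entry.

```python
import math
from fractions import Fraction

# A fair six-sided die: each face has probability 1/6 (assumed).
fair_die = {face: Fraction(1, 6) for face in range(1, 7)}

# Expected value: 7/2.
mu = sum(x * p for x, p in fair_die.items())

# Variance: expected value of (X - mu)^2, which comes to 35/12.
variance = sum((x - mu) ** 2 * p for x, p in fair_die.items())

# Standard deviation: square root of the variance, about 1.708.
std_dev = math.sqrt(variance)
```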