Controlling for a variable

In causal models, controlling for a variable means binning data according to measured values of the variable. This is typically done so that the variable can no longer act as a confounder in, for example, an observational study or experiment.
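The binning described above can be illustrated with a minimal Python sketch. The records, the `age_group` confounder and the helper names are hypothetical and chosen only for illustration; the outcome is constructed so that the true treatment effect is 2 in every bin.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (treated, age_group, outcome).
# Outcome is built as 2*treated + 10*age_group, so the true effect is 2.
# Treatment is more common in age_group 1, which makes age_group a confounder.
records = (
    [(0, 0, 0.0)] * 8 + [(1, 0, 2.0)] * 2
    + [(0, 1, 10.0)] * 2 + [(1, 1, 12.0)] * 8
)

def diff_in_means(rows):
    """Mean outcome of treated rows minus mean outcome of untreated rows."""
    treated = [y for t, _, y in rows if t == 1]
    control = [y for t, _, y in rows if t == 0]
    return mean(treated) - mean(control)

def stratified_effect(rows):
    """Bin rows by the confounder, then average the per-bin differences,
    weighting each bin by its share of the data."""
    strata = defaultdict(list)
    for row in rows:
        strata[row[1]].append(row)
    total = len(rows)
    return sum(len(s) / total * diff_in_means(s) for s in strata.values())

naive = diff_in_means(records)         # mixes the effect of age_group into the estimate
adjusted = stratified_effect(records)  # compares like with like inside each bin
print(naive, adjusted)
```

In this constructed data the unadjusted comparison overstates the effect, while the comparison within bins recovers the value that was built in.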

When estimating the effect of explanatory variables on an outcome by regression, controlled-for variables are included as inputs in order to separate their effects from those of the explanatory variables.[1]

A limitation of controlling for variables is that it can open back-door paths to unknown confounders. Counterfactual reasoning mitigates the influence of confounders without this drawback.

Experiments

Experiments attempt to assess the effect of manipulating one or more independent variables on one or more dependent variables. To ensure the measured effect is not influenced by external factors, other variables must be held constant. The variables made to remain constant during an experiment are referred to as control variables.

For example, in an outdoor experiment comparing how different wing designs of a paper airplane (the independent variable) affect how far it can fly (the dependent variable), one would want to run every trial under the same weather conditions so that weather does not affect the results. In this case, the control variables may be wind speed, wind direction and precipitation. If the experiment began when it was sunny with no wind and the weather then changed, one would want to postpone the remaining trials until the control variables (the wind and precipitation level) were the same as when the experiment began.

In controlled experiments of medical treatment options on humans, researchers randomly assign individuals to a treatment group or control group. This is done to reduce the confounding effect of irrelevant variables that are not being studied, such as the placebo effect.
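Random assignment can be sketched in a few lines of Python. The cohort, the `ages` covariate and the fixed seed are assumptions made for illustration; the point is only that a random split tends to leave a covariate with similar averages in both groups, even though the covariate is never used to form the groups.

```python
import random
from statistics import mean

# Hypothetical cohort: 1000 participants whose only recorded covariate is age.
ages = [20 + (i % 61) for i in range(1000)]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
ids = list(range(len(ages)))
rng.shuffle(ids)        # random order, independent of age

half = len(ids) // 2
treatment_ids = ids[:half]  # first half receives the treatment
control_ids = ids[half:]    # second half serves as the control group

# Difference in mean age between the two groups; randomization tends to
# keep this small relative to the spread of ages in the cohort.
age_gap = mean(ages[i] for i in treatment_ids) - mean(ages[i] for i in control_ids)
print(len(treatment_ids), len(control_ids), age_gap)
```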

Observational studies

In an observational study, researchers have no control over the values of the independent variables, such as who receives the treatment. Instead, they must control for variables using statistics.

Observational studies are used when controlled experiments may be unethical or impractical. For instance, if a researcher wished to study the effect of unemployment (the independent variable) on health (the dependent variable), it would be considered unethical by most institutional review boards to randomly assign some participants to have jobs and some not to. Instead, the researcher will have to create a sample in which some people are employed and some are unemployed. However, there could be factors that affect both whether someone is employed and how healthy they are. Any observed association between the independent variable and the dependent variable could be due to these outside, spurious factors rather than indicating a true link between them. This can be problematic even in a true random sample. By controlling for the extraneous variables, the researcher can come closer to understanding the true effect of the independent variable on the dependent variable.

In this context the extraneous variables can be controlled for by using multiple regression. The regression uses as independent variables not only the variable or variables whose effects on the dependent variable are being studied, but also any potential confounding variables, thus avoiding omitted variable bias.
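Omitted variable bias can be demonstrated with a small, self-contained Python sketch. The data are synthetic and noise-free, the names `exposure`, `confounder` and `ols` are hypothetical, and the least-squares fit is a plain normal-equations solver written for this illustration rather than the routine of any particular statistics package. The outcome is constructed as 1 + 2·exposure + 3·confounder, so the true coefficient on the exposure is 2.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination
    with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(predictors, y):
    """Least-squares coefficients [intercept, b1, b2, ...] from the
    normal equations (X'X) b = X'y."""
    rows = [[1.0, *vals] for vals in zip(*predictors)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

# Synthetic data: the confounder drives both the exposure and the outcome.
confounder = [float(i % 5) for i in range(30)]
exposure = [c + float(i % 3) for i, c in enumerate(confounder)]
outcome = [1.0 + 2.0 * e + 3.0 * c for e, c in zip(exposure, confounder)]

short = ols([exposure], outcome)             # confounder omitted
full = ols([exposure, confounder], outcome)  # confounder controlled for
print(short[1], full[1])
```

With the confounder omitted, the exposure coefficient absorbs part of the confounder's effect; including the confounder as an additional regressor recovers the coefficient that was built into the data.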

References

  1. Frost, Jim. "A Tribute to Regression Analysis | Minitab". Retrieved 2015-08-04.
  • Freedman, David; Pisani, Robert; Purves, Roger (2007). Statistics. W. W. Norton & Company. ISBN 0393929728.


This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.