
Review article

Collider Bias in Observational Studies: Consequences for Medical Research

Part 30 of a Series on Evaluation of Scientific Publications

Dtsch Arztebl Int 2022; 119: 107-12. DOI: 10.3238/arztebl.m2022.0076

Tönnies, T; Kahl, S; Kuss, O

Background: The findings of observational studies can be distorted by a number of factors. So-called confounders are well known, but distortion by collider bias (CB) has received little attention in medical research to date. The goal of this article is to present the principle of CB, and measures that can be taken to avoid it, by way of a few illustrative examples.

Methods: The findings of a selective review of the literature on CB are explained with illustrative examples.

Results: In the simplest case, a collider is a variable that is caused by at least two other variables. An example of CB is the observation that, among persons with diabetes, obesity is associated with lower mortality, even though it is associated with higher mortality in the general population. The false protective association between obesity and mortality arises from the restriction of the study population to persons with diabetes.

Conclusion: CB is a distortion that arises through restriction on or stratification by a collider variable, or through statistical adjustment for a collider variable in a regression model. CB can arise in many ways. The graphic representation of causal structures helps to identify potential sources of CB. It is important to distinguish confounders from colliders, as methods that serve to correct for confounding can themselves cause bias when applied to colliders. There is no generally applicable method for correcting CB.


The question of how to prove causality in observational studies has preoccupied researchers for centuries. In medical research, various approaches are used for identifying causal relationships, most of them relying on a probabilistic understanding of causality (e.g., A increases the probability of B) (1). According to this, an exposure (e.g., smoking) or a treatment causes an outcome (e.g., lung cancer) if the exposure or treatment affects the probability that the outcome will occur. In contrast to this are deterministic theories of causality based on logical conditions (e.g., A is necessarily and always followed by B) (1). In medical research, mathematical methods of causal inference are being used increasingly. With this approach, the conditions that must be fulfilled if causal inferences are to be drawn from observational studies are determined by mathematical reasoning.

Over the past few decades, the theory and methods of causal inference have contributed significantly to the understanding and avoidance of bias in observational studies. Major advances have been made in methods for adjusting for confounding, i.e., for the “mixing together” of an exposure effect with the effect of a confounding factor (a “confounder”) that affects both the exposure and the outcome. For example, age is a confounder for the association between smoking and lung cancer risk because age affects both the probability of smoking and the risk of lung cancer. Some statistical methods that can at least partially correct for confounding have been presented previously in this journal (e.g., [2–4]).

Causal inference theory also clarifies the important distinction between a confounder and a collider. A confounder is a variable that causes both the exposure and the outcome. By contrast, a “collider” is a variable that is caused by at least two other variables (the causing variables “collide” in the collider). For example, if quality of life is affected by both smoking (exposure) and lung cancer (outcome), quality of life would be a collider and not a confounder for the association between smoking and lung cancer. This distinction is important, because methods designed to correct for confounding (e.g., regression analysis) can introduce bias if they are applied to colliders. For this reason, bias of this kind is termed “collider bias” (CB).

Directed acyclic graphs (DAGs) have contributed to better understanding of CB as they offer a simple graphical representation of causal associations (5, 6). This allows potential sources of CB to be identified graphically without the need to penetrate the underlying mathematics. DAGs also make it easier to distinguish between colliders and confounders (5, 6).

Confounding bias is a well-known problem in medical research and is usually taken into account in study design and analysis. By comparison, the potential for distortion due to CB has been largely overlooked in medical research. The aim of this paper, therefore, is to explain the principle of CB using examples and DAGs, and to present measures that can be taken to avoid it.

Methods

Based on a selective literature review, and using examples and DAGs, this paper describes situations in which CB can distort the estimation of exposure effects.

Results

Directed acyclic graphs and collider bias

In a DAG, a causal relationship is represented as a directional arrow from the causing variable to the affected variable (Figure 1). A collider is a variable where at least two arrowheads meet, i.e., the arrows “collide” in this variable. A confounder, by contrast, is a variable that causes both the exposure and the outcome, i.e., a confounder has (at least) two arrows pointing away from it – one toward the exposure and one toward the outcome.
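These graphical definitions can be checked mechanically. The following sketch (the code and helper names are ours, the variable names come from the hypothetical example of Figure 1) scans a DAG's edge list for simplest-case colliders, i.e., nodes in which at least two arrowheads meet, and for confounder candidates, i.e., nodes with arrows pointing at both the exposure and the outcome:

```python
# Sketch: identify simplest-case colliders (>= 2 incoming arrows) and
# confounder candidates from a DAG given as a list of (cause, effect)
# edges. Variable names follow the hypothetical example of Figure 1.
edges = [
    ("social isolation", "depression"),       # confounder -> exposure
    ("social isolation", "health status"),    # confounder -> outcome
    ("depression", "health status"),          # exposure -> outcome
    ("depression", "study participation"),    # exposure -> collider
    ("health status", "study participation"), # outcome -> collider
]

def colliders(edges):
    """Nodes in which at least two arrowheads meet."""
    indegree = {}
    for cause, effect in edges:
        indegree[effect] = indegree.get(effect, 0) + 1
    return {node for node, k in indegree.items() if k >= 2}

def confounders(edges, exposure, outcome):
    """Nodes with arrows pointing at both the exposure and the outcome."""
    parents = lambda v: {c for c, e in edges if e == v}
    return (parents(exposure) & parents(outcome)) - {exposure, outcome}

# Note: "health status" also qualifies as a collider here, because the
# arrows from "social isolation" and "depression" meet in it; whether
# that matters depends on the path considered (see below).
print(sorted(colliders(edges)))                            # ['health status', 'study participation']
print(sorted(confounders(edges, "depression", "health status")))  # ['social isolation']
```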

Figure 1: Directed acyclic graph depicting collider bias caused by selective study participation in relation to the research question whether depression affects overall health status

Figure 1 shows a hypothetical study situation in which the question to be investigated is whether depression influences a person’s general state of health. For this question, the variable “social isolation” is a confounder, as arrows point from it toward both the exposure and the outcome: one could assume that social isolation increases the risk of depression and also worsens health status. The variable “study participation,” on the other hand, is a collider. The underlying assumption is that willingness to take part in a scientific study is reduced by depression (represented in Figure 1 as an arrow from “depression” to “study participation”), and that it is mainly healthier individuals who participate in studies (represented as an arrow from “health status” to “study participation”). Thus, two arrows collide in the variable “study participation.” If the association between “depression” and “health status” is now estimated in the study population (rather than in the general population!), the estimate will be biased, since people with depression are more likely to participate if they are in good health. Study participants with depression will therefore appear healthier on average than individuals with depression in the general population. This CB thus arises from restricting the study population on the collider “study participation.” “Restriction on the collider” means that only persons with the characteristic “study participation = yes” are included in the analysis. This restriction occurs necessarily, because no information is available for persons who do not take part in the study.
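This selection mechanism can be made concrete in a small simulation. All effect sizes below are invented for illustration; only the causal structure of Figure 1 is taken from the text. Depression is assumed to lower a continuous health score by 0.5, and both variables affect the probability of study participation:

```python
# Simulation sketch of collider bias through selective study
# participation (Figure 1). All numeric effect sizes are invented.
import math
import random

random.seed(1)
population = []
for _ in range(200_000):
    depressed = random.random() < 0.2
    health = -0.5 * depressed + random.gauss(0, 1)  # true effect: -0.5
    # Collider: participation is more likely for healthy people and
    # less likely for people with depression (both arrows of Figure 1).
    p_part = 1 / (1 + math.exp(-(health - 1.5 * depressed)))
    population.append((depressed, health, random.random() < p_part))

def effect_estimate(people):
    """Mean health score of depressed minus non-depressed persons."""
    dep = [h for d, h, _ in people if d]
    non = [h for d, h, _ in people if not d]
    return sum(dep) / len(dep) - sum(non) / len(non)

print(effect_estimate(population))                       # close to the true -0.5
print(effect_estimate([p for p in population if p[2]]))  # biased toward zero
```

In the whole (simulated) population the estimate recovers the true effect; restricted to participants, it is biased toward zero, because depressed participants are a health-selected subgroup.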

In principle, CB can result from:

1. Restriction on a collider (e.g., through selective study participation, Figure 1);

2. Stratification of the analysis on a collider;

3. Adjustment for a collider in a regression model.

The example in Figure 1 makes clear why it is important to distinguish between confounders and colliders. Confounding by “social isolation” could be eliminated by restricting the study population, i.e., by studying only persons who are socially isolated (“social isolation = yes”). Within this group, confounding by social isolation cannot occur, because the confounder takes the same value in all the individuals in the study. Restriction on the confounder thus eliminates the bias that the confounder would otherwise have caused. By contrast, as explained above, restriction on the collider “study participation” does not eliminate bias but causes it, i.e., it causes CB. The same is true of points 2 and 3 above: applied to confounders, these methods can remove bias; applied to colliders, they introduce it.

The methods listed at points 1 to 3 above all belong to the group of methods that “condition” on a collider. Thus, CB arises whenever conditioning is applied to a collider, irrespective of which conditioning method is used. In the case of stratification, it can be assumed that an artificial association will arise in at least one level of the collider. In the remainder of this article, the term “conditioning” always refers to all three methods.
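A minimal, exactly calculable example (ours, not from the article) shows how stratification on a collider creates an artificial association in at least one stratum: let X and Y be independent fair coin flips, and let the collider C equal 1 if at least one of them shows 1 (both arrows point into C):

```python
# Exact illustration of point 2: stratifying on a collider.
# X and Y are independent fair coins; collider C = max(X, Y).
from fractions import Fraction
from itertools import product

outcomes = list(product([0, 1], repeat=2))  # four outcomes, each with prob 1/4

def p_y1_given(x_value, stratum=None):
    """P(Y = 1 | X = x_value), optionally within a stratum of C = max(X, Y)."""
    cell = [(x, y) for x, y in outcomes
            if x == x_value and (stratum is None or max(x, y) == stratum)]
    return Fraction(sum(y for _, y in cell), len(cell))

# Unstratified: X carries no information about Y.
print(p_y1_given(0), p_y1_given(1))                        # 1/2 1/2
# Within the stratum C = 1, an artificial association appears:
print(p_y1_given(0, stratum=1), p_y1_given(1, stratum=1))  # 1 1/2
```

Within the stratum C = 1, observing X = 0 forces Y = 1, so X and Y are (non-causally) associated even though they are independent overall; in the stratum C = 0 there is no variation at all.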

The example of a CB shown in Figure 1 would probably have been identifiable even without being set out in a DAG. However, other examples of possible CB are more complicated and harder to grasp intuitively: e.g., the birth weight paradox (7) and the obesity paradox (8). In the next section, therefore, we shall take a closer look at the latter.

The obesity paradox

The obesity paradox describes the apparently paradoxical observation that for people with a chronic disease (in this case diabetes), being obese is associated with reduced mortality, even though in the general population obesity is associated with increased mortality (8). One possible explanation for this seemingly paradoxical finding is a particular form of CB.

To illustrate this, Figure 2 shows a highly simplified study situation in which the effect of obesity on mortality is investigated. The figure is based on the following assumptions.

Figure 2: Hypothetical study population consisting of 1000 study participants, divided into four groups by obesity and smoking status and showing correspondingly different mortality risks

Exposure: obesity

  • Out of 1000 study participants, 500 are obese.
  • Obesity increases mortality by 2.5 percentage points.
  • Obesity increases the risk of diabetes by 16 percentage points.

Collider: diabetes

  • Diabetes increases mortality by 5 percentage points.
  • Individuals who do not smoke and are not obese have a 4% risk of diabetes.

Risk factor: smoking

  • Out of 1000 study participants, 500 smoke.
  • Smoking increases mortality by 15 percentage points.
  • Smoking increases the risk of diabetes by 12 percentage points.
  • Smoking does not affect the probability of becoming obese.

Outcome: mortality

  • Nonsmokers who are not obese and do not have diabetes have a 5% mortality risk.

Now, if all the study participants who are obese and all those who are not obese are compared with each other, the obese group shows increased mortality, as would be expected on the basis of the assumptions outlined above (Figure 2a). By contrast, however, if only participants with diabetes are selected, obesity is associated with reduced mortality (Figure 2b), even though in the baseline data the presence of obesity consistently led to an increase in mortality. It follows that this association is solely due to the restriction of the study population and must not be interpreted as causal. CB is present.
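The reversal can be verified directly from the stated assumptions. The following sketch computes the expected mortality risks (cf. Figure 2) from the article's hypothetical numbers alone, without simulating individuals; only the code and its helper names are ours:

```python
# Expected mortality in the four equal groups of 250 participants,
# using the article's hypothetical numbers (Figure 2).
groups = [(obese, smoker) for obese in (0, 1) for smoker in (0, 1)]

def p_diabetes(obese, smoker):
    return 0.04 + 0.16 * obese + 0.12 * smoker

def p_death(obese, smoker, diabetes):
    return 0.05 + 0.025 * obese + 0.15 * smoker + 0.05 * diabetes

def mortality(obese, restrict_to_diabetes=False):
    """Expected mortality, optionally restricted to persons with diabetes."""
    deaths = total = 0.0
    for o, s in groups:
        if o != obese:
            continue
        for d in (0, 1):
            if restrict_to_diabetes and d == 0:
                continue
            w = 250 * (p_diabetes(o, s) if d else 1 - p_diabetes(o, s))
            deaths += w * p_death(o, s, d)
            total += w
    return deaths / total

# Whole study population: obesity increases mortality, as assumed.
print(f"{mortality(0):.3f} vs {mortality(1):.3f}")              # 0.130 vs 0.163
# Restricted to diabetes: the association reverses (collider bias).
print(f"{mortality(0, True):.3f} vs {mortality(1, True):.3f}")  # 0.220 vs 0.217
```

In the whole population, mortality is 13.0% without obesity versus 16.3% with obesity; among persons with diabetes it is 22.0% versus 21.7%, i.e., the sign of the association flips although obesity increases mortality in every underlying group.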

To be specific, selection on diabetes status has led to underrepresentation of nonsmokers who are not obese, since diabetes is more common among those who smoke and/or are obese. However, individuals with diabetes who do not smoke are more likely to be obese than individuals with diabetes who do smoke. It follows that restricting the study to individuals with diabetes has led to a statistical association between obesity and smoking, even though in the overall population this association does not exist. Because smoking is associated with significantly increased mortality, obesity appears to reduce mortality.

The DAG that underlies this example is shown in Figure 3. There is both a direct causal relation between obesity and mortality and an indirect causal relation mediated by the increased risk of diabetes that is due to obesity. Smoking increases diabetes risk and increases mortality. As can also be seen in Figure 2a, there is no association between smoking and obesity. However, this association is brought into being when conditioning is introduced on the collider “diabetes” (in this case by restricting the analysis to individuals with diabetes), since both smoking and obesity increase diabetes risk. This observation can be generalized: if no causal relationship exists between two variables, conditioning on a third variable caused by these two variables will bring about a non-causal (i.e., false) association between the two causing variables.

Figure 3: Directed acyclic graph (DAG) depicting the collider bias associated with the obesity paradox

Unlike in Figure 1, this CB is not instantly identifiable. The matter is further complicated by the fact that the definition of a variable as a collider is “path-dependent.” “Paths” are all the possible routes from exposure to outcome. On the path “obesity → diabetes → mortality,” diabetes is not a collider because no arrows meet at it. On the path “obesity → diabetes ← smoking → mortality,” however, diabetes is a collider because two arrows meet at it. Although this example includes only four variables, identifying the bias is already quite complicated.
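Path-by-path reasoning of this kind can be automated; tools such as DAGitty perform exactly this analysis. As a sketch of what happens under the hood, the following implements the standard moralization check for d-separation and confirms, for the DAG of Figure 3, that conditioning on “diabetes” creates a dependence between the otherwise independent “obesity” and “smoking” (the implementation is ours, not DAGitty's):

```python
# Compact d-separation check (moralization algorithm) for the DAG of
# Figure 3: obesity and smoking are independent, but conditioning on
# the collider "diabetes" opens a path between them.
edges = [("obesity", "diabetes"), ("obesity", "mortality"),
         ("smoking", "diabetes"), ("smoking", "mortality"),
         ("diabetes", "mortality")]

def d_separated(edges, x, y, given):
    # 1. Keep only x, y, the conditioning set, and their ancestors.
    relevant = set(given) | {x, y}
    changed = True
    while changed:
        changed = False
        for c, e in edges:
            if e in relevant and c not in relevant:
                relevant.add(c)
                changed = True
    sub = [(c, e) for c, e in edges if c in relevant and e in relevant]
    # 2. Moralize: connect parents of a common child, drop arrow directions.
    undirected = {frozenset((c, e)) for c, e in sub}
    for node in relevant:
        parents = [c for c, e in sub if e == node]
        undirected |= {frozenset((a, b)) for a in parents for b in parents if a != b}
    # 3. Block conditioned nodes; x, y are d-separated iff now disconnected.
    reachable, frontier = {x}, [x]
    while frontier:
        node = frontier.pop()
        for pair in undirected:
            if node in pair:
                (other,) = pair - {node}
                if other not in reachable and other not in given:
                    reachable.add(other)
                    frontier.append(other)
    return y not in reachable

print(d_separated(edges, "obesity", "smoking", set()))         # True: independent
print(d_separated(edges, "obesity", "smoking", {"diabetes"}))  # False: path opened
```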

Detecting and avoiding collider bias

The above examples make clear how different causal structures can lead to biased effect estimates in the form of CB. Bias arises when there is any form of conditioning on colliders, since this produces a statistical association between the causing variables (i.e., the variables whose arrows meet in the collider), or else it distorts (biases) an existing association between these variables. For example, in the obesity paradox, restriction on the collider “diabetes” caused a statistical association between obesity and smoking that was not present in the overall population.

Unfortunately, there is no single, generally applicable method of easily correcting bias due to colliders. Producing an explicit representation of the causal structure of the research question in a DAG is helpful. As shown in the example of the obesity paradox, this enables direct visual identification of variables that may cause a CB. However, as the number of variables included rises, identifying possible bias by visual inspection of a DAG becomes increasingly complex. For such cases, software is available (9, 10) that can be used to test which variables could cause CB (one example is DAGitty, available free at www.dagitty.net).

Explicit representation in this way requires that knowledge-based causal assumptions for the relevant research question can be made in the form of a DAG. The DAG should be developed a priori, i.e., before the study begins and without knowledge of the data, and should relate to the population from which the data are to be collected. There should also be close coordination from an early stage between the experts who are posing the research question and those whose expertise is methodological. If a DAG is developed before the study begins, potential sources of CB can be taken into account at the design stage and during data collection: for example, when an assumption can be made that both the exposure and the outcome will affect participation in the study (Figure 1).

In the data analysis, a DAG and appropriate software can be helpful in selecting appropriate adjustment variables for regression analysis. This can minimize the risk that a regression model will include variables that distort through CB rather than adjust for confounding.

To assess the risk of CB when reading studies, attention should be paid to whether a distinction has been made between confounders and colliders, for example if a reasoned explanation is provided of why the variables for the statistical analysis were chosen. If, by contrast, variables were selected based on the data (e.g., all variables statistically associated with exposure), there is an increased risk of conditioning not only on confounders but also, erroneously, on colliders. Particular attention should be paid to whether conditioning has been applied on variables caused by exposure, as in the obesity paradox (restriction on “diabetes,” caused by the exposure “obesity”), since this increases the risk of CB (11).

Furthermore, only associations between the exposure and the outcome defined at the outset of the study should be interpreted, because the confounders, too, were selected only for this association, at least if the confounders were selected on the basis of subject-matter reasoning rather than data. The associations between other variables from the regression model and the outcome are usually difficult to interpret, e.g., because variables that are confounders for the association of interest may be colliders for other relations (12).

Summary

As the examples provided make clear, CB can bias the estimation of exposure effects. That this form of bias is still comparatively unknown is shown, among other things, by the obesity paradox: although it has often been pointed out that the paradox may have a methodological cause, some authors nonetheless advise obese patients with chronic disease against the weight loss usually regarded as desirable (13).

CB can lead to just as much bias as confounding (11). The causal structures underlying this are multifarious, making it difficult to offer a simple universal solution. Compared with confounding, CB is more difficult to comprehend intuitively, with the result that it often shows up in apparently paradoxical associations. Systematic representation of causal structures in the form of DAGs helps to pool expert knowledge and identify potential sources of CB. These insights can be used to counter CB in study design, data collection, and data analysis.

Conflict of interest statement
The authors declare that no conflict of interest exists.

Manuscript received on 26 August 2021, revised version accepted on 2 December 2021

Translated from the original German by Kersti Wagstaff, M.A.

Corresponding author
Dr. PH Thaddäus Tönnies
Deutsches Diabetes Zentrum (DDZ)
Institut für Biometrie und Epidemiologie
Auf’m Hennekamp 65, 40225 Düsseldorf, Germany
thaddaeus.toennies@ddz.de

Cite this as:
Tönnies T, Kahl S, Kuss O: Collider bias in observational studies: consequences for medical research. Part 30 of a series on evaluation of scientific publications. Dtsch Arztebl Int 2022; 119: 107–12. DOI: 10.3238/arztebl.m2022.0076

1. Gianicolo EAL, Eichler M, Muensterer O, Strauch K, Blettner M: Methods for evaluating causality in observational studies—part 27 of a series on evaluation of scientific publications. Dtsch Arztebl Int 2020; 117: 101–7.
2. Kuss O, Blettner M, Börgermann J: Propensity score: an alternative method of analyzing treatment effects—part 23 of a series on evaluation of scientific publications. Dtsch Arztebl Int 2016; 113: 597–603.
3. Zwiener I, Blettner M, Hommel G: Survival analysis—part 15 of a series on evaluation of scientific publications. Dtsch Arztebl Int 2011; 108: 163–9.
4. Schneider A, Hommel G, Blettner M: Linear regression analysis—part 14 of a series on evaluation of scientific publications. Dtsch Arztebl Int 2010; 107: 776–82.
5. Schipf S, Knüppel S, Hardt J, Stang A: Directed Acyclic Graphs (DAGs) – Die Anwendung kausaler Graphen in der Epidemiologie. Gesundheitswesen 2011; 73: 888–92.
6. Greenland S, Pearl J, Robins JM: Causal diagrams for epidemiologic research. Epidemiology 1999; 10: 37–48.
7. Hernández-Díaz S, Schisterman EF, Hernán MA: The birth weight “paradox” uncovered? Am J Epidemiol 2006; 164: 1115–20.
8. Banack HR, Kaufman JS: The “Obesity paradox” explained. Epidemiology 2013; 24: 461–2.
9. Textor J, Hardt J, Knüppel S: DAGitty: a graphical tool for analyzing causal diagrams. Epidemiology 2011; 22: 745.
10. Barrett M: ggdag: analyze and create elegant directed acyclic graphs. R package version 0.2.0. www.CRAN.R-project.org/package=ggdag (last accessed on 24 August 2021).
11. Greenland S: Quantifying biases in causal models: classical confounding vs collider-stratification bias. Epidemiology 2003; 14: 300–6.
12. Westreich D, Greenland S: The table 2 fallacy: presenting and interpreting confounder and modifier coefficients. Am J Epidemiol 2013; 177: 292–8.
13. Anker S, von Haehling S: The obesity paradox in heart failure: accepting reality and making rational decisions. Clin Pharmacol Ther 2011; 90: 188–90.
German Diabetes Center (DDZ), Leibniz Center for Diabetes Research at Heinrich-Heine-University Düsseldorf, Institute for Biometrics and Epidemiology, Düsseldorf: Dr. PH Thaddäus Tönnies, Prof. Dr. sc. hum. Oliver Kuss
German Diabetes Center (DDZ), Leibniz Center for Diabetes Research at Heinrich-Heine-University Düsseldorf, Institute for Clinical Diabetology, Düsseldorf: Dr. med. Sabine Kahl
German Center for Diabetes Research, Partner Düsseldorf, München-Neuherberg: Dr. med. Sabine Kahl
Division of Endocrinology and Diabetology, Medical Faculty and University Hospital Düsseldorf, Heinrich-Heine-University Düsseldorf: Dr. med. Sabine Kahl
Centre for Health and Society, Medical Faculty and University Hospital Düsseldorf, Heinrich-Heine-University Düsseldorf: Prof. Dr. sc. hum. Oliver Kuss