# Causality, Confounding, and Simpson’s Paradox 1

This is the first in a sequence of six pedagogical posts on Simpson’s Paradox.

Bitter fighting among Christian factions and immoral behavior among Church leaders led to a transition to secular thought in Europe (see Zaman (2018) for details). One consequence of the rejection of religion was the rejection of all unobservables. Empiricists like David Hume rejected all knowledge that was not based on observations and logic. He famously stated: “If we take in our hand any volume; of divinity or school metaphysics, for instance; let us ask, Does it contain any abstract reasoning concerning quantity or number? No. Does it contain any experimental reasoning concerning matter of fact and existence? No. Commit it then to the flames: for it can contain nothing but sophistry and illusion.”

Hume further realized that causality is not observable. We can observe that event Y happened after event X, but we cannot observe that Y happened because of X. The underlying mechanisms which connect X to Y are not observable. In the early 20th century, the philosophy of logical positivism, which says that human knowledge is based solely on observations and logic, became widely accepted and wildly popular. Disciplines like Statistics and Econometrics, which evolved during the 20th century, were built on positivist foundations. They deal only with what is measurable and observable (numbers) and ignore immeasurable concepts like causality. A much more detailed discussion of the philosophical background which led to these widespread misconceptions about human knowledge is given in Zaman (2012).

Pearl et al. (2016, Chapter 2) provide a history of how mistakes by the founders of the discipline led to the replacement of causality by correlation in statistics. Pearl (2018, Chapter 5) provides the history of how causal information was dropped from econometric models. Current econometric techniques do not allow us to distinguish between real and spurious relationships. Excellent and robust fits can be seen between totally unrelated variables, like the log of the number of newspapers published and life expectancy, and there is no way to tell from the regression itself whether it is real or spurious. How do we differentiate between a regression of Turkish Consumption on Turkish GDP, which has a strong causal basis, and one of GNP on Newspapers, which is purely correlation without causation? Many examples like these are discussed in Zaman (2010); they show serious confusions about causality in conventional econometrics, and the resulting defects in analysis.

This introductory article explains why causality must be considered and modeled explicitly, contrary to current econometric practice, if we want to extract meaningful information from a data set. One of the easiest ways to approach causality is via Simpson’s Paradox. We will use this paradox, framed in different real-world contexts, to introduce the basic concepts of causality.

# The Berkeley Admission Case

Suppose there are two departments, Engineering (ENG) and Humanities (HUM), which have differing admissions policies. Due to these policies, 80% of female applicants to ENG are admitted, while only 40% of female applicants to HUM are admitted. To understand Simpson’s Paradox, it is essential to understand the relation between these departmental admit rates and the overall admit rate for females at Berkeley. Assuming, for simplicity, that these are the only two departments, we ask: what is the OVERALL admit rate for female applicants at Berkeley? The answer is that the overall admit rate is the weighted average of the two departmental rates (80% and 40%), where the weights are the proportions of female applicants to each department. Table 1 shows the overall admit rate for females under different distributions of applicants across the two departments.

**Table 1: Overall admit rate for female applicants**

| Situation | ENG applied | ENG admitted | ENG % admitted | HUM applied | HUM admitted | HUM % admitted | Overall applied | Overall admitted | Overall % admitted |
|---|---|---|---|---|---|---|---|---|---|
| A | 1800 | 1440 | 80% | 200 | 80 | 40% | 2000 | 1520 | 76% |
| B | 1500 | 1200 | 80% | 500 | 200 | 40% | 2000 | 1400 | 70% |
| C | 1000 | 800 | 80% | 1000 | 400 | 40% | 2000 | 1200 | 60% |
| D | 500 | 400 | 80% | 1500 | 600 | 40% | 2000 | 1000 | 50% |
| E | 200 | 160 | 80% | 1800 | 720 | 40% | 2000 | 880 | 44% |

If all females apply to HUM and none to ENG, then the overall admit rate is 40%. If all females apply to ENG, then the overall admit rate for females is 80%. The table shows that the overall admit rate for females can vary from 40% to 80%, depending on the proportions of females who apply to the two departments.
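The weighted-average calculation behind Table 1 can be sketched in a few lines of Python. The function name `overall_admit_rate` and its parameter names are illustrative choices, not part of the original analysis; the applicant counts and departmental rates are the ones given in Table 1.

```python
def overall_admit_rate(applied_eng, applied_hum, rate_eng, rate_hum):
    """Overall admit rate as a weighted average of the departmental rates.

    The weights are the shares of applicants going to each department.
    (Function and parameter names are hypothetical, for illustration only.)
    """
    total = applied_eng + applied_hum
    weight_eng = applied_eng / total
    weight_hum = applied_hum / total
    return weight_eng * rate_eng + weight_hum * rate_hum


# Situations A-E from Table 1: (applicants to ENG, applicants to HUM)
female_situations = {
    "A": (1800, 200),
    "B": (1500, 500),
    "C": (1000, 1000),
    "D": (500, 1500),
    "E": (200, 1800),
}

# Female admit rates are 80% in ENG and 40% in HUM in every situation
female_rates = {
    label: overall_admit_rate(eng, hum, 0.80, 0.40)
    for label, (eng, hum) in female_situations.items()
}

for label, rate in female_rates.items():
    print(f"Situation {label}: overall admit rate = {rate:.0%}")
```

Only the applicant shares change from one situation to the next; the departmental rates stay fixed, yet the overall rate moves across the whole range between them.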

Now suppose Berkeley systematically discriminates against males. For male applicants to ENG, the admit ratio is only 60%, much lower than the 80% ratio for females. For male applicants to HUM, the admit ratio is only 20%, much lower than the 40% for females. What will the overall admit rate for males be? As before, this will be a weighted average of the two rates 20% and 60%, where the weights will be the proportion of male applicants to the two departments. The table below shows how the overall admissions ratio varies depending on how many males apply to which department:

**Table 2: Overall admit rate for male applicants**

| Situation | ENG applied | ENG admitted | ENG % admitted | HUM applied | HUM admitted | HUM % admitted | Overall applied | Overall admitted | Overall % admitted |
|---|---|---|---|---|---|---|---|---|---|
| A | 1800 | 1080 | 60% | 200 | 40 | 20% | 2000 | 1120 | 56% |
| B | 1500 | 900 | 60% | 500 | 100 | 20% | 2000 | 1000 | 50% |
| C | 1000 | 600 | 60% | 1000 | 200 | 20% | 2000 | 800 | 40% |
| D | 500 | 300 | 60% | 1500 | 300 | 20% | 2000 | 600 | 30% |
| E | 200 | 120 | 60% | 1800 | 360 | 20% | 2000 | 480 | 24% |

The table shows that the overall admit rate for males can vary between 20% and 60% according to how the applicants are distributed between ENG and HUM. We have already seen that the overall admit rate for females can vary between 40% and 80%. Now consider the scenario created by combining row E of Table 1 with row A of Table 2. If 90% of the females apply to HUM, then the female admit rate will be 44%, close to the 40% admit rate for females in HUM. If 90% of the males apply to ENG, then the admit rate for males will be 56%, close to the 60% admit rate for males in ENG. Despite the fact that females are heavily favored in both ENG and HUM, the overall admit rate for females (44%) will be much lower than the overall admit rate for males (56%). Someone who looks only at the overall admit rates for males and females will conclude that Berkeley discriminates against females, which is the opposite of the picture that emerges from the departmental admit rates. This reversal is known as Simpson’s Paradox.
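The reversal can be checked directly. The sketch below uses the hypothetical numbers of this example (90% of females applying to HUM, 90% of males applying to ENG); the dictionary layout and the helper name `aggregate_rate` are assumptions made for illustration.

```python
# Departmental admit rates from the hypothetical example
rates = {
    "female": {"ENG": 0.80, "HUM": 0.40},
    "male": {"ENG": 0.60, "HUM": 0.20},
}

# Applicants per department: 90% of females choose HUM, 90% of males choose ENG
applicants = {
    "female": {"ENG": 200, "HUM": 1800},
    "male": {"ENG": 1800, "HUM": 200},
}


def aggregate_rate(group):
    """Overall admit rate for one group: total admitted / total applied."""
    admitted = sum(applicants[group][dept] * rates[group][dept]
                   for dept in rates[group])
    applied = sum(applicants[group].values())
    return admitted / applied


female_overall = aggregate_rate("female")
male_overall = aggregate_rate("male")

# Within every department, females have the higher admit rate ...
females_favored_in_each_dept = all(
    rates["female"][dept] > rates["male"][dept] for dept in ("ENG", "HUM")
)
# ... yet in the aggregate, males have the higher admit rate.
males_favored_overall = male_overall > female_overall

print(f"Female overall: {female_overall:.0%}, male overall: {male_overall:.0%}")
print("Females favored in each department:", females_favored_in_each_dept)
print("Males favored in the aggregate:", males_favored_overall)
```

Both conditions hold at once, which is exactly the paradox: the direction of the comparison flips when the departments are pooled.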

Interestingly, this is not a purely hypothetical example. I have simplified the numbers to make the analysis easier to follow, but the actual data for Berkeley admissions follow a similar pattern. The overall admit rates appear to show bias against females. Bickel et al. (1975) carry out a standard statistical analysis of aggregate admissions data. They test the hypothesis of equality of admit rates for males and females and conclude that males have a significantly higher admit rate than females. A causal analysis of data attempts to answer the “WHY” question. Why is the admit rate for males higher? To try to learn why, Bickel et al. (1975) looked at the breakdown by department. Note that the data themselves furnish us with no clue as to what else we need to look at. It is our real-world knowledge about colleges, the admissions process, and departments which suggests that a department-wise analysis might lead to deeper insights. This shows how real-world knowledge, which goes beyond the data, matters for data analysis.

Doing the analysis at the departmental level leads to an unexpected finding: each department discriminates in favor of women. Philosophers call this “counter-phenomenal”. The phenomenon, that is, the observation at the aggregate level, suggests that Berkeley discriminates against women. But a deeper probe into reality reveals that the opposite is true. This shows the necessity of going beyond surface appearances, the observations, to the deeper structures of reality in order to understand the phenomena. This is in conflict with Kantian and Empiricist ideas that observations by themselves are sufficient, and that we do not need to probe deeper.
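To illustrate what a test of equal aggregate admit rates looks like, here is a minimal sketch using a pooled two-proportion z-test. This is a common textbook procedure chosen for illustration; it is not necessarily the procedure used by Bickel et al. (1975), and the inputs are the simplified aggregate numbers of this post (1120 of 2000 males admitted, 880 of 2000 females admitted), not the actual Berkeley data. The function name `two_proportion_z_test` is hypothetical.

```python
import math


def two_proportion_z_test(admitted_1, applied_1, admitted_2, applied_2):
    """Pooled two-proportion z-test for H0: the two admit rates are equal.

    Returns the z statistic and the two-sided p-value under the
    normal approximation.
    """
    p1 = admitted_1 / applied_1
    p2 = admitted_2 / applied_2
    # Pooled admit rate under the null hypothesis of equal rates
    pooled = (admitted_1 + admitted_2) / (applied_1 + applied_2)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / applied_1 + 1 / applied_2))
    z = (p1 - p2) / std_err
    # Two-sided p-value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Simplified aggregate numbers from this post: males first, females second
z, p_value = two_proportion_z_test(1120, 2000, 880, 2000)
print(f"z = {z:.2f}, two-sided p-value = {p_value:.2e}")
```

With these numbers the test rejects equality decisively, yet it says nothing about why the rates differ. The same aggregate counts are compatible with departments that favor females, which is why the departmental breakdown is needed.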

When we discover a conflict between the phenomena and our exploration of the noumena, the deeper and hidden structures of reality, we are faced with the necessity of explaining this conflict. Because both departments discriminate against males, the explanation that the Berkeley admissions process discriminates against females is no longer acceptable. Bickel et al. (1975) do the data analysis and come up with a deeper explanation. ENG is easier to get into, and HUM is more difficult. Females choose to apply to the more difficult department and hence end up with a lower admit rate. Males choose to apply to the easier department and hence have a higher admit rate. The search for causal explanations does not stop here. We can then ask: WHY do females choose humanities? We can also ask: WHY is ENG easier to get into, and WHY is HUM more difficult to get into? For each of these questions, there are several possible hypotheses which could be true, and which could be explored using data or qualitative techniques. In the next post, we consider some other causal structures for admissions, which lead to radically different answers to the WHY questions, even though the observed data remain exactly the same.

For the other parts in this sequence, see:

- Changing causal structures: Simpson’s Paradox 2
- Policy depends upon unobservable causal relations: Simpson’s Paradox 3
- Baseball scores: Overall Average or Stratified?: Simpson’s Paradox 4
- Effect of Drugs on Recovery 1: Simpson’s Paradox 5
- Effect of Drugs on Recovery 2: Simpson’s Paradox 5 (continued)

**Exercises:**

- Construct an example with two hospitals, A and B. The overall recovery rate for patients admitted to hospital A is higher than the overall recovery rate in hospital B. Yet hospital B is a much better hospital than hospital A. Make up numbers to create Simpson’s Paradox, and then EXPLAIN the paradox in simple words, easily understandable by a non-technical audience. (Hint: suppose hospital B takes all the extremely sick patients, who have low chances of recovery.)

For solution, see: