A note on format. I'm experimenting with shorter, more frequent posts to work through a backlog of ideas and observations. Some of them will come with an artifact/tool. With AI code generation, it's often faster and more fun for me to build an explanation tool than to write a long-form post.
A couple of inverse concepts that I've been interested in are:
- Operationalization: methods that encode human intuition or concepts so they can be applied at scale. For example, measures of "disagreement."
- Fast and Frugal Heuristics: simple rules of thumb that work well in practice, with a mathematical/statistical correspondence that explains their effectiveness.
One Fast and Frugal tool that's simple and surprisingly effective is the Analysis of Competing Hypotheses (ACH), a CIA tradecraft method from Richards Heuer's Psychology of Intelligence Analysis. To perform ACH, you list candidate hypotheses across the top of a table, evidence down the side, and rate each cell as strongly consistent / consistent / neutral / inconsistent / strongly inconsistent with the hypothesis.
You can get fancy with the analysis. As a pre-LLM side project, I built a tool to collaborate on ACH. By tracking individual assessments and evidence source quality, the tool makes it possible to observe how disagreement and provenance influence the conclusion.
The most commonly cited reason for ACH's effectiveness is that it forces you to weigh each piece of evidence against every hypothesis at once, rather than building a case for a single favorite. That combats multiple cognitive biases, e.g., anchoring, confirmation bias, and availability bias.
But the deeper reason for ACH's effectiveness is that it's Bayesian reasoning in disguise. In the Bayesian formulation, each consistency rating encodes the likelihood of the evidence given that the hypothesis is true: P(Eᵢ | Hⱼ). IMPORTANT: consistency is the likelihood of the evidence given the hypothesis, not the joint probability of the evidence and the hypothesis, and not the probability of the hypothesis given the evidence. If you assume the pieces of evidence are conditionally independent given the hypothesis (the Naive Bayes assumption), you can combine the rows:
P(H | E₁, …, Eₙ) ∝ P(H) · P(E₁ | H) · P(E₂ | H) · … · P(Eₙ | H)
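Here's a minimal sketch of that product rule in Python. The rating-to-likelihood values, the `posterior` helper, and the two-row wet-sidewalk board are all assumptions I made up for illustration; they are not the numbers the explorer uses.

```python
import math

# Hypothetical mapping from ACH ratings to P(E | H). The values are illustrative only.
LIKELIHOOD = {"++": 0.9, "+": 0.7, "0": 0.5, "-": 0.3, "--": 0.1}

def posterior(ratings, priors):
    """Naive Bayes posterior over hypotheses.

    ratings: dict mapping hypothesis -> list of rating symbols, one per evidence row.
    priors:  dict mapping hypothesis -> prior probability P(H).
    """
    # Work in log space: log P(H) + sum of log P(E_i | H).
    log_scores = {
        h: math.log(priors[h]) + sum(math.log(LIKELIHOOD[r]) for r in row)
        for h, row in ratings.items()
    }
    # Normalize with the log-sum-exp trick so the posteriors sum to 1.
    top = max(log_scores.values())
    total = sum(math.exp(s - top) for s in log_scores.values())
    return {h: math.exp(s - top) / total for h, s in log_scores.items()}

# Made-up board: row 1 = "sidewalk is wet", row 2 = "street is dry".
ratings = {"rain": ["++", "-"], "sprinkler": ["++", "+"]}
priors = {"rain": 0.5, "sprinkler": 0.5}

post = posterior(ratings, priors)
print({h: round(p, 3) for h, p in post.items()})  # {'rain': 0.3, 'sprinkler': 0.7}
```

Note that the likelihoods in a row don't need to sum to 1 across hypotheses: both hypotheses rate the wet sidewalk as "++". Only the final posterior is normalized.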
Next comes evaluating the board. If you're evaluating by hand, you might simply count inconsistencies and select the hypothesis with the fewest. That's a simple approach, and it operationalizes the Popperian principle of falsification. However, it doesn't operationalize the concept of "the most likely hypothesis given the evidence." In the Bayesian formulation, you'd select the MAP (maximum a posteriori) hypothesis, the one with the highest posterior probability. You can reconcile the two through model choice: deciding whether a consistent rating contributes ~0 to the log-likelihood, so that only inconsistencies move the score, or whether it contributes positively.
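To see how that model choice plays out, here's a small sketch comparing inconsistency counting with two log-likelihood tables under equal priors. Both tables, the helper names, and the board are hypothetical; I picked them so the two approaches disagree.

```python
import math

# Falsification-style model (assumed): consistent and neutral ratings add 0,
# and every inconsistency costs the same fixed penalty.
FALSIFICATION_LOGLIK = {"++": 0.0, "+": 0.0, "0": 0.0, "-": -1.0, "--": -1.0}

# Graded model (assumed): every rating maps to a likelihood, so consistency
# earns more than neutrality. The probabilities are illustrative only.
GRADED_LOGLIK = {
    r: math.log(p)
    for r, p in {"++": 0.9, "+": 0.7, "0": 0.5, "-": 0.3, "--": 0.1}.items()
}

def count_inconsistencies(row):
    """Hand-evaluation score: number of '-' or '--' cells for one hypothesis."""
    return sum(r in ("-", "--") for r in row)

def map_hypothesis(board, loglik):
    """MAP hypothesis under equal priors: the highest summed log-likelihood."""
    return max(board, key=lambda h: sum(loglik[r] for r in board[h]))

# Made-up board: H1 is strongly supported but has one inconsistency;
# H2 is neutral on everything.
board = {
    "H1": ["++", "++", "++", "-"],
    "H2": ["0", "0", "0", "0"],
}

fewest = min(board, key=lambda h: count_inconsistencies(board[h]))
falsification_pick = map_hypothesis(board, FALSIFICATION_LOGLIK)
graded_pick = map_hypothesis(board, GRADED_LOGLIK)
print(fewest, falsification_pick, graded_pick)  # H2 H2 H1
```

With the falsification table, the log score is just the negated inconsistency count, so MAP and counting agree. With the graded table, three strong consistencies outweigh one inconsistency and the pick flips.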
Once you write down the mapping, the Bayesian formulation makes explicit what ACH leaves implicit:
- Priors: ACH treats hypotheses as equally likely going in. Bayes lets you encode unequal priors, which is especially important when evidence is weak.
- Diagnosticity: ACH calls evidence with similar ratings across hypotheses "non-diagnostic." In the Bayesian formulation, diagnosticity can be measured as the variance of likelihoods across hypotheses: zero variance means the evidence can't shift the posterior.
- Magnitude, not just direction: Two "++" ratings aren't twice as strong as one; their likelihoods multiply. Naive Bayes makes the compounding explicit.
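The three points above can be sketched in a few lines. As before, the likelihood values and helper names (`diagnosticity`, `posterior_odds`) are assumptions for illustration, not the explorer's implementation.

```python
from statistics import pvariance

# Hypothetical mapping from ACH ratings to P(E | H). The values are illustrative only.
LIKELIHOOD = {"++": 0.9, "+": 0.7, "0": 0.5, "-": 0.3, "--": 0.1}

def diagnosticity(evidence_row):
    """Variance of P(E | H) across hypotheses for a single evidence row."""
    return pvariance([LIKELIHOOD[r] for r in evidence_row])

def posterior_odds(prior_odds, ratings_a, ratings_b):
    """Odds of hypothesis A over B: prior odds times each row's likelihood ratio."""
    odds = prior_odds
    for ra, rb in zip(ratings_a, ratings_b):
        odds *= LIKELIHOOD[ra] / LIKELIHOOD[rb]
    return odds

# Diagnosticity: identical ratings across hypotheses carry no information.
flat = diagnosticity(["+", "+", "+"])      # zero variance
spread = diagnosticity(["++", "0", "--"])  # clearly positive variance

# Magnitude: two "++" rows (against neutral) square the likelihood ratio.
one_row = posterior_odds(1.0, ["++"], ["0"])
two_rows = posterior_odds(1.0, ["++", "++"], ["0", "0"])

# Priors: one weak "+" row doesn't overcome 1:4 prior odds against A.
weak_evidence = posterior_odds(0.25, ["+"], ["0"])

print(round(one_row, 2), round(two_rows, 2), round(weak_evidence, 2))  # 1.8 3.24 0.35
```

In log space the two "++" rows do add, which is why the explorer's log-likelihood view and the multiplication here describe the same compounding.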
To make the correspondence tangible, I built an interactive explorer with Claude that lets you fill in an ACH table and watch the Bayesian posterior update live:
ACH ↔ Bayesian Reasoning Explorer
The explorer includes a few worked examples (e.g., wet sidewalk) and a toggle to expose the likelihood matrix, log-likelihood contributions, and the joint distribution.
Want to learn more about ACH?
I gave a workshop at Hackers on Planet Earth (HOPE) 2020. Or, see my open-source Open Synthesis repository.