Bayes’ Rule calculation for Cellulite DX:
Given:
Prevalence = 65%
Sensitivity = 84%
Specificity = 26%
Method 1: Out of 100 women
Assume 100 women.
65 have cellulite
35 do not have cellulite
Among the 65 with cellulite:
84% test positive → 55 true positives
16% test negative → 10 false negatives
Among the 35 without cellulite:
26% test negative → 9 true negatives
74% test positive → 26 false positives
Totals:
Positive tests = 55 + 26 = 81
Negative tests = 9 + 10 = 19
Positive Predictive Value (PPV):
55 / 81 ≈ 68%
Negative Predictive Value (NPV):
9 / 19 ≈ 47%
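The "out of 100 women" bookkeeping above can be sketched in a few lines of Python. This is only an illustration: the function name natural_frequency_table and its n parameter are made up for this example, and it keeps fractional counts (54.6 rather than 55), so the results differ very slightly from the rounded hand calculation while still coming out at about 68% and 47%.

```python
def natural_frequency_table(prevalence, sensitivity, specificity, n=100):
    """Split a hypothetical group of n people into the four test outcomes."""
    with_condition = n * prevalence
    without_condition = n - with_condition
    true_pos = with_condition * sensitivity      # have it, test positive
    false_neg = with_condition - true_pos        # have it, test negative
    true_neg = without_condition * specificity   # don't have it, test negative
    false_pos = without_condition - true_neg     # don't have it, test positive
    return {"tp": true_pos, "fn": false_neg, "tn": true_neg, "fp": false_pos}


# Numbers from the "Given" block: prevalence 65%, sensitivity 84%, specificity 26%.
counts = natural_frequency_table(prevalence=0.65, sensitivity=0.84, specificity=0.26)

# PPV = true positives / all positive tests; NPV = true negatives / all negative tests.
ppv = counts["tp"] / (counts["tp"] + counts["fp"])
npv = counts["tn"] / (counts["tn"] + counts["fn"])

print(f"PPV = {ppv:.1%}, NPV = {npv:.1%}")
```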
Method 2: Formal Bayes’ Rule formula
PPV = [Sensitivity × Prevalence] / [(Sensitivity × Prevalence) + ((1 − Specificity) × (1 − Prevalence))]
PPV = (0.84 × 0.65) / [(0.84 × 0.65) + (0.74 × 0.35)]
PPV = 0.546 / (0.546 + 0.259)
PPV ≈ 0.678 ≈ 68%
NPV = [Specificity × (1 − Prevalence)] / [(Specificity × (1 − Prevalence)) + ((1 − Sensitivity) × Prevalence)]
NPV = (0.26 × 0.35) / [(0.26 × 0.35) + (0.16 × 0.65)]
NPV = 0.091 / (0.091 + 0.104)
NPV ≈ 0.467 ≈ 47%
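The two formulas in Method 2 translate directly into code. The sketch below is a minimal version; the function names positive_predictive_value and negative_predictive_value are illustrative choices, and all inputs are assumed to be probabilities between 0 and 1.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """P(condition | positive test) via Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


def negative_predictive_value(sensitivity, specificity, prevalence):
    """P(no condition | negative test) via Bayes' rule."""
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    return true_neg / (true_neg + false_neg)


# Same inputs as the worked example: sensitivity 84%, specificity 26%, prevalence 65%.
ppv_formula = positive_predictive_value(0.84, 0.26, 0.65)
npv_formula = negative_predictive_value(0.84, 0.26, 0.65)

print(f"PPV = {ppv_formula:.3f}, NPV = {npv_formula:.3f}")
```

Because prevalence is an explicit argument, the same functions can be rerun with a different base rate to see how strongly PPV and NPV depend on it.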
Likelihood ratios
Likelihood ratios tell you how a test result changes the odds of having a condition. The positive likelihood ratio is the factor by which the odds are multiplied after a positive test, and the negative likelihood ratio is the factor by which they are multiplied after a negative test.
Start by converting probabilities to odds:
Baseline (65%):
0.65 / 0.35 = 1.86
After a positive test (68%):
0.68 / 0.32 = 2.13
After a negative test (53% still have cellulite, i.e. 100% − NPV of 47%):
0.53 / 0.47 = 1.13
Now compare to baseline:
Positive likelihood ratio = 2.13 / 1.86 ≈ 1.14
Negative likelihood ratio = 1.13 / 1.86 ≈ 0.61
So a positive test barely increases the odds (multiplies them by about 1.14), and a negative test reduces them a bit (to about 0.61 times what they were). That’s why this test doesn’t provide much useful information.
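The likelihood ratios can also be computed directly from sensitivity and specificity using the standard identities LR+ = sensitivity / (1 − specificity) and LR− = (1 − sensitivity) / specificity, which avoids the rounding in the odds comparison above. The sketch below (helper names are illustrative, not from any library) computes them that way and then multiplies the baseline odds to recover the post-test probabilities.

```python
def to_odds(probability):
    """Convert a probability to odds."""
    return probability / (1 - probability)


def to_probability(odds):
    """Convert odds back to a probability."""
    return odds / (1 + odds)


def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) from sensitivity and specificity."""
    lr_pos = sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity
    return lr_pos, lr_neg


prevalence, sensitivity, specificity = 0.65, 0.84, 0.26
lr_pos, lr_neg = likelihood_ratios(sensitivity, specificity)

# Post-test odds = pre-test odds x likelihood ratio.
pre_odds = to_odds(prevalence)
prob_after_positive = to_probability(pre_odds * lr_pos)
prob_after_negative = to_probability(pre_odds * lr_neg)

print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")
print(f"P(cellulite | positive) = {prob_after_positive:.1%}")
print(f"P(cellulite | negative) = {prob_after_negative:.1%}")
```

The two post-test probabilities should land on the same roughly 68% and 53% used in the odds calculation, which is a handy cross-check that the ratios and the predictive values agree.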
The Biological Psychiatry paper split the data into a training set and a test set to identify biomarker signatures. That’s good practice: it helps reduce overfitting (i.e., chasing noise in the data). But it doesn’t eliminate the problem entirely, especially if the initial signal is weak or the model is complex.