June 30, 2025

Stats Reunion: What have we learned so far?


It’s our first stats reunion! In this special review episode, we revisit favorite concepts from past episodes—p-values, multiple testing, regression adjustment—and give them fresh personalities as characters. Meet the seductive false positive, the clingy post hoc ex, and Charlotte, the well-meaning but overfitting idealist.


Statistical topics

  • Bar charts vs. box plots
  • Bonferroni correction
  • Confounding
  • False positives 
  • Multiple testing
  • Multivariable regression
  • Outcome switching
  • Over-adjustment
  • Post hoc analysis
  • Pre-registration
  • Residual confounding
  • Statistical adjustment using regression
  • Subgroup analysis 
  • Unmeasured confounding


Review Sheet


References


Kristin and Regina’s online courses: 

Demystifying Data: A Modern Approach to Statistical Understanding  

Clinical Trials: Design, Strategy, and Analysis 

Medical Statistics Certificate Program  

Writing in the Sciences 

Epidemiology and Clinical Research Graduate Certificate Program 


Programs that we teach in:

Epidemiology and Clinical Research Graduate Certificate Program 


Find us on:

Kristin -  LinkedIn & Twitter/X

Regina - LinkedIn & ReginaNuzzo.com

  • (00:00) - Intro
  • (02:26) - Mailbag
  • (06:42) - P-values
  • (12:43) - Multiple Testing Guy
  • (16:05) - Bonferroni solution
  • (17:11) - Post hoc analysis ex
  • (22:22) - Subgroup analysis person
  • (29:34) - Statistical adjustment idealist
  • (43:00) - Unmeasured confounding
  • (44:25) - Residual confounding
  • (48:31) - Over-adjustment
  • (53:48) - Wrap-up


[Regina] (0:00 - 0:18)
If you cannot get him to pre-commit or be transparent, then I say you got to adjust your expectations. Do not be so easily impressed by multiple testing dude and his little declarations of love. You got to raise the bar for significance.


You gotta Bonferroni the guy.


[Kristin] (0:23 - 0:47)
Welcome to Normal Curves. This is a podcast for anyone who wants to learn about scientific studies and the statistics behind them. It's like a journal club, except we pick topics that are fun, relevant, and sometimes a little spicy.


We evaluate the evidence, and we also give you the tools that you need to evaluate scientific studies on your own. I'm Kristin Sainani. I'm a professor at Stanford University.


[Regina] (0:47 - 0:53)
And I'm Regina Nuzzo. I'm a professor at Gallaudet University and part-time lecturer at Stanford.


[Kristin] (0:54 - 0:59)
We are not medical doctors. We are PhDs. So nothing in this podcast should be construed as medical advice.


[Regina] (0:59 - 1:04)
Also, this podcast is separate from our day jobs at Stanford and Gallaudet University.


[Kristin] (1:05 - 1:23)
Regina, today we're doing something different. This is a review episode where we look back over the past 10 episodes and highlight a few recurring statistical tools and pitfalls. It allows us to connect the dots and to go a little deeper.


It's kind of like reviewing for a test, but that makes it sound boring.


[Regina] (1:23 - 1:53)
It is not going to be boring because we are never boring. But here's my thing, Kristin. I confess, I always think of these statistical tools and concepts as actual characters, like characters on a sitcom or familiar characters in our own lives.


So I'm thinking of this episode less like a review before a test and maybe more like a cast reunion of all of our favorite statistical characters.


[Kristin] (1:54 - 1:57)
Clearly, you're more fun to study with for a test, Regina.


[Regina] (1:58 - 2:11)
We're more likely to get an A with you, though, so I think it balances out. This episode will not have a claim, methodological morals, or smooches at the end, but it will have goodies along the way.


[Kristin] (2:11 - 2:25)
We are shaking things up, Regina, keeping it fresh. Every 10 episodes or so, we're going to throw in one of these review episodes, if this one is not a total flop, by the way, because we are trying to teach in this podcast, and so the teaching might be a bit more overt in these review episodes.


[Regina] (2:26 - 2:52)
It's not just us wanting to teach, I want to point out. We have had listeners request that we go into these concepts in a bit more depth, so that's what we're doing now. And before we get started, Kristin, let's address a particular question from a listener.


[Kristin]
Oh, yes, from the mailbag. Right. Okay.


[Regina]
Here's what they write. The first few episodes of your podcast are great, and I'm enjoying listening.


[Kristin] (2:52 - 2:56)
Regina, I love how you slipped in that compliment. Good job. Pat on the back.


[Regina] (2:58 - 3:24)
They go on to write, I have a question after listening to the first one and your thoughts on bar graphs. What are good alternative visual representations for numerical data for showing treatment differences in a presentation? Tables feel hard for audiences to grasp and digest within the constraints of a presentation, especially at meetings.


Looking forward to hearing your thoughts.


[Kristin] (3:24 - 3:44)
Oh, great question. This goes back to the pheromones episode, our first full episode where we talked about how bar graphs are inappropriate for numeric data. They are meant for categorical data.


But it's an interesting question because you want some kind of visual in a presentation, not a table. So, Regina, what do you like to use instead for numeric data?


[Regina] (3:44 - 3:55)
I prefer box plots. I actually have a whole paper on box plots. That's how much I love them.


For that stats column for the medical journal that you invited me to do with you.


[Kristin] (3:55 - 4:01)
That is a great paper, and we're going to put a link to it in the show notes. Regina, these are also called box and whisker plots.


[Regina] (4:02 - 4:11)
Yeah, which is so cute, I think, because it reminds me of cats and how cats like to sit in boxes and cats have whiskers. So, cats and whiskers in a box plot.


[Kristin] (4:13 - 4:25)
That's a good image. You know, Regina, I also like to superimpose the actual data points on top of that box plot because I love to see individual data points. It gives you an honest feel for the data.


Nothing is hidden.


[Regina] (4:26 - 4:34)
Only if it's a small data set, though, or else that cloud of data starts to look like, I don't know, smog or a cloud of insects or something.


[Kristin] (4:34 - 4:35)
Particulate matter, yes.


[Regina] (4:35 - 4:54)
A box plot I love because it shows you nice summaries of the data. It gives you the min, the max, median, the quartiles, outliers, so you can get a nice picture of the shape. Like maybe it's really right skewed with some high outliers and you can say, hmm, something weird is going on here.


[Kristin] (4:55 - 4:57)
You don't get any of that from a bar chart. Right.
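
[Editor's note: for readers following along at a computer, here is a minimal sketch in Python of the plot Kristin and Regina describe: a box plot with the raw data points jittered on top. The data are fake and the styling choices are ours, not from the episode.]

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(3)
# Fake right-skewed data for two groups, the kind of shape Regina mentions
treatment = rng.lognormal(mean=3.0, sigma=0.4, size=30)
control = rng.lognormal(mean=3.2, sigma=0.4, size=30)

fig, ax = plt.subplots()
ax.boxplot([treatment, control], labels=["Treatment", "Control"])
for i, group in enumerate([treatment, control], start=1):
    jitter = rng.uniform(-0.08, 0.08, size=group.size)  # spread points sideways
    ax.scatter(np.full(group.size, i) + jitter, group, alpha=0.5, s=15)
ax.set_ylabel("Outcome")
plt.show()
```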


[Regina] (4:57 - 5:05)
And we love listener questions, so please keep them coming. You can submit them at normalcurves.com.


[Kristin] (5:05 - 5:17)
All right, Regina, I want to cover two major statistical topics today. First, I want to talk about the problem of multiple testing. Second, about statistical adjustment using regression.


[Regina] (5:17 - 5:57)
These are big topics that are likely to keep coming up in future episodes, so it's great if we tackle them now in a special episode to make them easier to understand. I think that's our sneaky goal here. And as I mentioned, I like to think of them as people, so I have little personalities and stories worked out for all of the concepts.


For example, the bad research practices, I am thinking of them as the bad boyfriends. Oh, that's good. Because you know you shouldn't, right?


But ooh, they're so seductive in the moment.


[Kristin] (5:57 - 6:29)
I love how your brain works, Regina. That should be really interesting. And I can't wait to hear what you came up with. But first, a little actual statistics.


Multiple testing. We touched on that in the pheromones episode, in vitamin D2, and in alcohol. And multiple testing comes in many flavors.


But the basic idea is that if you dig around in your data enough, if you slice it and dice it enough ways, torture it enough ways, fish around enough, you're going to find some quote, positive results that are actually just false positives.


[Regina] (6:30 - 6:41)
And I have a good personality for this. Let's just say he's the bad boyfriend who is not entirely monogamous and not entirely transparent about that.


[Kristin] (6:42 - 6:58)
That sounds right on the money, and I want to hear more about that. But first, to really understand the concept of multiple testing, you need to understand something about p-values and significance testing. We've mentioned these in previous episodes, but I want to unpack them a little bit more now.


[Regina] (6:59 - 7:13)
So much to say about p-values, Kristin. So much. You and I are working on an entire episode coming up just on p-values.


But I'm thinking maybe for now we can just touch on statistical significance. It's a slightly simpler concept.


[Kristin] (7:13 - 7:45)
That's a good idea, Regina. Yeah, that p-value episode is going to be fun because you and I are going to talk about some of our own papers for a change. But yeah, p-values are a little complicated and subtle, and we can't explain them in, you know, 30 seconds.


So we're going to defer that conversation. Good call, Regina. But Regina, do you remember back to the alcohol episode, you gave me an ESP test?


You were thinking of a number between 1 and 20, and you asked me to read your mind and find the number. And of course, I failed badly. I did not get it on the first try.


I think it took me, what, six tries?


[Regina] (7:46 - 7:56)
I would not call you a failure for that performance, by the way, because I did not actually think you were trying to be a professional psychic. I assumed you would be guessing.


[Kristin] (7:57 - 8:20)
Right. I mean, you've known me for more than a quarter century, so you probably would know by now if I were actually psychic. But imagine, Regina, it actually would have been so cool if I had gotten it right on the first try.


It would have ruined the episode because then it wouldn't have made the point we were trying to illustrate. True. But it would have been totally surprising, and I would have kept it in the episode just for the shock value.


[Regina] (8:20 - 8:21)
And that would have been worth it.


[Kristin] (8:22 - 8:40)
Absolutely. All right. So let's imagine that I did get it on the first try.


Just how surprising would that result have been if I'm not psychic? That is actually what a p-value tells us. In this case, the p-value is the probability that I would get the number in one try if I'm not psychic.


[Regina] (8:41 - 8:53)
You can see why this is difficult to convey in 30 seconds or less. And p-values, I want to point out, are at the heart of a debate that has been raging for 100 years.


[Kristin] (8:54 - 9:04)
We're going to defer that conversation. We're just going to calculate this one p-value here, and we'll talk about what it is more later. At least, though, I've made it easy to calculate this probability here.


[Regina] (9:04 - 9:13)
Right. I gave you 20 numbers to choose from, 1 to 20. So the probability that you would guess correctly is 1 in 20, or 5%.


[Kristin] (9:13 - 9:15)
And that 5%, that's the p-value for this experiment.
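
[Editor's note: a minimal sketch in Python of the arithmetic in this exchange; the 21-choice variation Kristin mentions a moment later is included too.]

```python
from fractions import Fraction

n_choices = 20                     # Regina's numbers ran from 1 to 20
p_value = Fraction(1, n_choices)   # P(correct on the first try | not psychic)
print(f"p-value = {p_value} = {float(p_value):.0%}")  # p-value = 1/20 = 5%

# With 21 choices (0 through 20), the p-value slips just under the usual bar:
print(f"1/21 = {1 / 21:.4f}; less than 0.05? {1 / 21 < 0.05}")  # 0.0476; True
```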


[Regina] (9:16 - 9:21)
P is for probability, by the way. Probability value, not penis value, as much as we wish it were.


[Kristin] (9:22 - 9:30)
Well, we've probably talked more about penis value than p-value so far in this podcast, so I don't know. Maybe we should co-opt that term for penises.


[Regina] (9:32 - 9:47)
Kristin, yes, if you got it right on the first try. This would have been surprising, but not overwhelmingly surprising because you had a 5% chance of getting lucky. So probably not enough to convince me that you're psychic.


[Kristin] (9:48 - 10:11)
But in medical studies, the threshold that we typically use to declare statistical significance is less than 5%. So if you had asked me to guess a number from 0 to 20, and I got that right on the first try, the p-value would be 1 in 21, which is less than 5%. And in most medical studies, that result would be, quote, statistically significant and enough for people to declare that I'm psychic.


[Regina] (10:12 - 10:16)
Which would be an example of a false positive, because you're not really psychic.


[Kristin] (10:17 - 10:44)
Right. If we are using this typical p-value less than 5%, or sometimes you hear it as p less than 0.05, to declare statistical significance, this means we are accepting some false positives. When there is no real effect, when I'm not psychic, we are accepting a false positive rate of 1 in 20.


Exactly. Let's bring this back around to medical studies now. Remember in the alcohol episode, we talked about the Cascade Trial?


Regina, do you remember the Cascade Trial?


[Regina] (10:44 - 10:51)
That was the one in Israel where they managed to actually pull off a randomized trial of wine drinking, right?


[Kristin] (10:51 - 11:14)
Exactly. So, Regina, let's imagine for a minute that you and I pulled off a randomized trial of red wine versus water, and one of the outcomes was cholesterol. And what if we found, at the end of the trial, the red wine group was six units lower in cholesterol than the water group, even though, because it's a randomized trial, they started out about the same? Is that a big enough difference to convince you that wine lowers cholesterol, Regina?


[Regina] (11:14 - 11:25)
Hmm. Just six units difference. It's really not obvious if that's just a fluke difference between the groups that we happen to have in that study, or whether it's something real.


[Kristin] (11:25 - 11:44)
Right. But I mean, if the red wine group was 30 units lower, like they had a cholesterol of 170 versus the water group was at 200, then we probably wouldn't need a p-value to be convinced that red wine works. But if it's 194 versus 200, is that enough to convince you, Regina, that wine is psychic?


[Regina] (11:44 - 11:49)
Wine is definitely psychic, or psychedelic, maybe. Psychedelic. Different things.


[Kristin] (11:49 - 12:07)
Different things. Makes you feel like you're psychic. It might make you feel like that, but that would be a false positive because you're not actually psychic.


Yes. All right. So, let's say we have the six-unit difference, and we're not sure.


Does it reflect a real effect, or is it just a chance fluctuation? This is where significance testing comes in handy.


[Regina] (12:07 - 12:24)
Right. Significance testing helps us distinguish between signal and noise. It's the heart of stats right there.


Am I seeing random patterns in the clouds, or is that really God up there? Or have I had too much wine?


[Kristin] (12:25 - 12:34)
Exactly. So, let's say we found this difference in cholesterol was statistically significant. In many studies, this would allow us to conclude that wine lowers cholesterol.


[Regina] (12:34 - 12:43)
But we have to keep in mind, just like me concluding you're psychic, based on one lucky guess, this cholesterol finding could also be a false positive.


[Kristin] (12:43 - 13:04)
Right. And here's the problem. In that Cascade study, they didn't just look at cholesterol.


They looked at blood pressure, weight, glucose, insulin resistance, triglycerides, a whole host of outcomes. And if wine does nothing at all, every time you run a new statistical test, like looking at an additional outcome, you have a 5% chance of a false positive if you are using this traditional threshold for statistical significance.


[Regina] (13:05 - 13:15)
And if wine does nothing, and you run 100 tests, then you'd expect about 5%, that is 5 out of 100, to come up significant, just by chance.


[Kristin] (13:16 - 13:22)
Right. So if you run those 100 tests and get 5 significant results, this is actually consistent with wine doing absolutely nothing.
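
[Editor's note: a minimal simulation in Python of this point. The data are pure noise standing in for "wine does nothing"; the group sizes and seed are our choices.]

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n_per_group, n_tests = 100, 100

false_positives = 0
for _ in range(n_tests):
    # No true effect: both groups drawn from the same distribution
    wine = rng.normal(0, 1, n_per_group)
    water = rng.normal(0, 1, n_per_group)
    _, p = stats.ttest_ind(wine, water)
    false_positives += p < 0.05

# Expect roughly 5 "significant" results, consistent with wine doing nothing
print(f"{false_positives} of {n_tests} tests significant by chance alone")
```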


[Regina] (13:22 - 13:31)
And if you highlight just those five findings and pretend you never ran the other 95 tests, you are misleading people. It is cherry-picking.


[Kristin] (13:32 - 13:39)
Right. I mean, running a lot of tests per se is not wrong. The problem is when you are not transparent, when you don't tell people the context.


[Regina] (13:39 - 14:01)
And that is exactly, Kristin, in line with my multiple testing bad boyfriend guy. So I want to just preface it by saying that, of course, these could be bad girlfriends or bad lovers. I'm just drawing on, let's say, my own experience, which is mostly with bad boyfriends.


[Kristin] (14:02 - 14:04)
And you've had a few of those, Regina.


[Regina] (14:05 - 14:39)
You have seen them, yes. Okay. Ready?


[Kristin]
Yep.


[Regina]
Multiple testing guy is that charming dude at the bar or the guy that you're dating, and he says to you, of course, it's you. You're my primary variable of interest, baby.


And you're the only one that I'm seeing. But really? Come on.


He is hedging his bets, and he is saying the same thing to the 50 other variables in his life. He has not committed. He's going to go with whatever catches his eye the most in the moment.


[Kristin] (14:40 - 14:44)
All right. So when he's telling you that you're the one, it's really just a false positive.


[Regina] (14:46 - 15:07)
The thing is, there is no law against this, right, in dating or in data analysis. The problem is, if you start to believe his story and you get all, like, personally flattered if he says, oh, this is significant and real, and he shows you off to all his friends, you know, like publishes you in the medical journal.


[Kristin] (15:09 - 15:18)
Right. We can get swept into the story just like we can in research studies. But really, at the end of the day, it's probably not a real connection.


[Regina] (15:20 - 15:32)
And so you have to think about what is the fix here. You ask for transparency. How many other variables are you seeing right now?


Have you committed? Did you pre-register our dates?


[Kristin] (15:34 - 15:43)
That would be interesting, pre-registering dates. Declare your intentions ahead of time. I'm not sure that he's going to agree to that.


[Regina] (15:44 - 15:50)
Exactly. But of course, the parallels to statistical data analysis practices are obvious.


[Kristin] (15:50 - 15:53)
It is hard to get people to pre-register studies as well, yes.


[Regina] (15:53 - 16:01)
It is, because it's more fun to not commit and play the field and date 50 variables all at the same time.


[Kristin] (16:02 - 16:04)
And get surprising and exciting results that you can publish.


[Regina] (16:05 - 16:28)
Right. They don't really mean anything. They're just a flash in the pan.


OK, if you cannot get him to pre-commit or be transparent, then I say you got to adjust your expectations. Do not be so easily impressed by multiple testing dude and his little declarations of love. You got to raise the bar for significance.


You got to Bonferroni the guy.


[Kristin] (16:30 - 16:52)
We haven't actually talked about Bonferroni yet on this podcast, so let me say a little bit about what that is. It just means that you can make your threshold for declaring statistical significance more stringent. You are raising the bar for what's considered signal versus noise.


So instead of needing that p-value to be under 5 percent, maybe you say it has to be under 1 percent or 0.5 percent.
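
[Editor's note: a minimal sketch of the Bonferroni correction Kristin describes, with 10 tests as our example count.]

```python
alpha = 0.05   # the usual bar for significance
n_tests = 10   # our example: 10 outcomes tested

bonferroni_threshold = alpha / n_tests  # each p-value must now beat 0.005
print(f"Raise the bar: p must be < {bonferroni_threshold}")

# Payoff: the chance of *any* false positive across all 10 tests stays near 5%
# (exact under independence of the tests)
family_wise_error = 1 - (1 - bonferroni_threshold) ** n_tests
print(f"Family-wise error rate ~ {family_wise_error:.3f}")  # ~0.049
```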


[Regina] (16:52 - 17:02)
Mm hmm. It's like you're saying you got to impress me more, right? I'm not just going to take your little stories here.


You want to be more impressed.


[Kristin] (17:02 - 17:04)
You've got to be skeptical with multiple testing guy.


[Regina] (17:05 - 17:11)
Oh, I like it. I feel like this is good advice, not just for statistics, but actually dating.


[Kristin] (17:11 - 17:42)
The Bonferroni solution. I think we just copyrighted that, Regina. Regina, I love this analogy, multiple testing guy.


And the thing is that just like bad boyfriends come in all sorts of sizes, shapes and flavors, so does multiple testing. So remember in our pheromones episode, we looked at the sweaty t-shirt study. In that study, it appeared that the researchers engaged in what we call post hoc analyses.


This is a type of multiple testing. And post hoc literally means after the fact.


[Regina] (17:44 - 18:11)
We don't know for sure what happened in that pheromone study, but it does appear that the researchers had to twist the data into some knots in order to find a significant result. Remember, they did this weird thing where they shifted from the woman's perspective to the man's perspective. And then they also divided the women into those taking birth control pills and those who were not.


And only then did they find significant results.


[Kristin] (18:12 - 18:39)
And we call this post hoc because it appears that the researchers decided all these weird ways to analyze the data after they already had the data in front of them, which means they could have tried 10, 20, or even 100 different ways of analyzing the data. And of course, every time you analyze the data in a different way, it's an additional test that can yield a false positive. So it's a recipe for dredging up false positives.


And it looks like their findings were a false positive because they were not repeated in subsequent studies.


[Regina] (18:41 - 19:14)
Now, I am picturing post hoc analysis as the ex who cannot accept the fact that it is over. He cannot move on. I mean, the relationship did not work out.


It happens sometimes. There was no connection, no significance. But he is still texting you months later.


And he's like, whoa, wait, wait, wait, wait, let's try something else. Let's go on vacation. What if we just chilled as friends?


Let's be biking buddies. He will not give up, but not necessarily in a good way.


[Kristin] (19:14 - 19:23)
Regina, this sounds oddly specific. And I know that you just had dinner with an ex who resurfaced out of the blue. Could you be thinking of that?


[Regina] (19:23 - 19:30)
I might have drawn inspiration from recent real-life events, perhaps. Yeah.


[Kristin] (19:31 - 19:44)
Maybe he does want to be biking buddies. Yeah, Regina, he just wants to be biking buddies. Uh-huh.


He does not want to be biking buddies. He wants to get back together. That is quite obvious to me.


[Regina] (19:44 - 20:10)
OK, so you are suggesting he's doing a little post hoc rebound data analysis. You know, like it's over, and he's feeling rejected or sad. But he had a lot invested in this relationship.


You know, like researchers have a lot invested in their study. And maybe he's going to find a way to keep it alive and get significance out of this, no matter what.


[Kristin] (20:10 - 20:25)
Yeah, because you're a catch, Regina. But I am suggesting that this kind of post hoc relationship rarely works out. Because there's a reason that the relationship failed in the first place.


And I probably know some of those reasons, because we've talked it out a few times.


[Regina] (20:28 - 20:38)
And this might just be forcing something that is not really there, just like researchers cannot accept that sometimes the study doesn't work out, and you just got to move on.


[Kristin] (20:39 - 21:07)
It's hard to let go, though. It really is. Especially when you've collected all that precious data, and you really thought it was going to pan out.


It's hard to let go of it. Yeah. All right, Regina, another flavor of multiple testing that we've encountered on our podcast is outcome switching.


And that happened in that Cascade wine trial. So remember, they ran a lot of tests. They found five significant results on five different outcomes.


And they labeled those five outcomes as their quote, primary outcomes.


[Regina] (21:08 - 21:17)
But we knew from your statistical sleuthing, looking at the published protocol, that those were not the primary outcomes they had in mind when they designed the study.


[Kristin] (21:18 - 21:34)
Exactly. It looks like they labeled them after the fact, which is a type of post hoc manipulation. It's like the boyfriend who is rewriting history.


Oh, but honey, we got along so well. We never fought. I wasn't mean all the time.


I paid all the bills.


[Regina] (21:34 - 21:53)
I love that you're getting inspired. I knew you were going to get into this. This is good.


Yes, outcome switching, bad ex-boyfriend is the guy who is rewriting history. I love it. I never got angry at you.


We hardly ever fought. I never flirted with her. I made dinner all the time.


I was an angel.


[Kristin] (21:53 - 22:02)
Yeah. Husbands are like that, too. They also rewrite history.


That is why it's really good to keep records of things. And thank goodness, I am particularly good at record keeping.


[Regina] (22:04 - 22:21)
Not the sexiest thing, perhaps, but very useful. Very useful. Again, I feel like we have stumbled upon some general truths about dating and stats.


I like it. Keep a record.


[Kristin] (22:22 - 23:02)
Yes, keep a record. We also saw some post hoc analyses in the vitamin D2 episode. We were talking about that VITAL trial, which compared vitamin D to placebo.


But the difference is that VITAL trial was a good study, and they were transparent about the fact that these were post hoc analyses. They labeled them. They called it out.


They let the reader know, hey, these are post hoc, and you should take them with a grain of salt. Another thing they did in that study, which is a flavor of multiple testing, is they did some subgroup analyses. Subgroup analyses are when you start looking for an effect in different subgroups of the population.


Only people who are normal weight. Only women not on the pill. Only in men.


Only in people under 30.


[Regina] (23:02 - 23:13)
Right. And if you're splitting the data, the entire group of participants, in 50 different ways, it's the same thing as testing a bunch of outcomes. You get 50 more chances to find a false positive.


[Kristin] (23:14 - 23:32)
Now, the nice thing about VITAL is the subgroup analyses were preplanned. They were not post hoc. You can have both post hoc and preplanned subgroup analyses.


But here, again, it was a good study. So they planned some subgroup analyses ahead of time. They didn't just split the data in every possible way.


But they still ran close to 20 subgroup analyses and found one significant result.


[Regina] (23:33 - 23:43)
One out of 20. Just what you'd expect by chance if vitamin D is doing nothing. Right.
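
[Editor's note: the quick arithmetic behind "one out of 20," sketched in Python; the independence assumption in the second calculation is ours.]

```python
n_subgroups = 20
alpha = 0.05

# Expected false positives if vitamin D does nothing in every subgroup
print(f"Expected by chance: {n_subgroups * alpha:.0f} of {n_subgroups}")  # 1 of 20

# Chance of at least one "significant" subgroup by luck alone
# (treating the 20 tests as independent, a simplification)
p_at_least_one = 1 - (1 - alpha) ** n_subgroups
print(f"P(at least one false positive) = {p_at_least_one:.2f}")  # about 0.64
```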


So we shouldn't read too much into it. Just like you should not read too much into subgroup analysis dude.


[Kristin] (23:45 - 23:48)
Uh, you gotta describe him for me. What are you thinking of here, Regina?


[Regina] (23:48 - 24:25)
Yeah, yeah. I am picturing him as the guy who's only into you under very specific conditions. Like, like everyday life, he's kind of like, you know, medium into you, whatever.


But when you're on vacation together, he really likes you. Like, it is working because you're all carefree and you're not dealing with real life together. Or maybe he only likes you when you got your makeup and your hair all cute.


It's like he likes little subsets of you, but not the real whole you. Right. And maybe you do totally work together when you're this tiny slice of yourself, but that's not robust.


Not being true to yourself.


[Kristin] (24:25 - 24:50)
It's not being your whole full self. Yeah. But, Regina, this is what researchers are doing when they carve up a data set and cherry pick the one subgroup where an effect shows up.


Right. It's probably not real. And, Regina, all these bad statistical practices we've been talking about, the multiple testing, post hoc analysis, subgroup analysis, outcome switching, all of those things are, of course, overlapping.


They make results look more impressive than they actually really are.


[Regina] (24:50 - 25:20)
Right. Just like with the boyfriends. Or ourselves.


You know, I'm a feminist. I'm making it seem like we're just at the mercy of all of these bad dudes. I admit I have been some of these people myself.


I am the subgroup analysis girl, often. I convince myself that even if it's not working overall, I'll just slice and dice myself until I find, you know, the tiny little subgroup that works or I'll slice and dice him and then I'll pretend the whole thing is real.


[Kristin] (25:21 - 25:23)
I think we've all been there, Regina.


[Regina] (25:23 - 25:36)
We have. And now, you know, I'm pushing the analogy pretty far, but it actually does work because I think that transparency and communication are really the keys here, just like in research.


[Kristin]
That is so profound, Regina.


[Regina] (25:37 - 25:50)
Thank you.


[Kristin]
Yes, we need transparency and communication in research and in relationships. That is the key to everything.


And you've got to resist those bad boyfriends who are not giving you that transparency and communication.


[Regina] (25:50 - 25:52)
Ask him about his variables.


[Kristin] (25:53 - 26:06)
Make him define his primary outcome up front. And be skeptical, right? And if you're having to twist yourself into a knot and make yourself smaller, then you need to walk away from that relationship.


[Regina] (26:08 - 26:12)
This actually works. And that kind of brought a little tear to my eye, Kristin.


[Kristin] (26:12 - 26:21)
Or walk away from the data that didn't give you the result you wanted. You have to just let it go and move on. Yes.


[Regina] (26:21 - 26:23)
Really. This is profound. I love it.


[Kristin] (26:24 - 26:28)
We have wisdom between the two of us on both research and dating to offer, Regina.


[Regina] (26:29 - 26:36)
I never thought about stats and dating as being quite so eerily parallel until now.


[Kristin] (26:36 - 26:48)
I know, right? All right, Regina. I think that wraps up our discussion of multiple testing.


Let's take a short break before we move on to regression-based statistical adjustment.


[Regina] (26:56 - 27:07)
Kristin, we've talked about your medical statistics program. It's just a fabulous program available on Stanford Online. Maybe you can tell listeners a little bit more about it.


It's a three-course sequence.


[Kristin] (27:08 - 27:20)
If you really want that deeper dive into statistics, I teach data analysis in R or SAS, probability, and statistical tests, including regression. You can get a Stanford professional certificate as well as CME credit.


[Regina] (27:20 - 27:24)
You can find a link to this program on our website, normalcurves.com.


[Kristin] (27:32 - 27:49)
Welcome back to Normal Curves. Regina, we were about to talk about regression-based statistical adjustment. And this is something we talked about across three different episodes, vitamin D2, alcohol, and sugar sag.


And actually, there was a subtle progression in those, believe it or not.


[Regina] (27:49 - 28:03)
Hmm. So, statistical adjustment character. I am picturing something a little different here, Kristin.


Not a bad boyfriend. I'm picturing a character from your favorite series, though, Sex and the City.


[Kristin] (28:03 - 28:17)
Oh, I do like that show. Yes. And actually, I'm watching the reboot now.


It's just coming out, season three. And it's fun to see them in their 50s because I might be able to relate to that.


[Regina] (28:18 - 28:28)
Just maybe. You were the one who gave me the first season on DVD the first time I watched it. It was you who gave me the DVDs for Christmas one year.


Yeah.


[Kristin] (28:28 - 28:44)
Oh, I don't remember that. I got onto the show, actually, when I was in Antarctica. I was on a ship in Antarctica, and there was no TV reception, obviously.


But we did have little TV and DVD players in the room. And there was a selection of videos. So, I watched the first season of Sex and the City, and I got hooked.


[Regina] (28:45 - 28:54)
I had never seen it before. And remember when you had that party where we were going to try to take a photo of us all dressed as characters from Sex and the City?


[Kristin] (28:55 - 28:58)
You are making me sound dorky, and I'm cutting this part, Regina.


[Regina] (28:58 - 29:04)
No, that was super fun. It was so cute. And you wanted me to be Samantha.


[Kristin] (29:05 - 29:08)
Well, you're my only blonde friend. So, I guess you had to have that role, yes.


[Regina] (29:09 - 29:17)
Well, that was before I had even written about sex for the LA Times. So, Kristin, I'm thinking that maybe you're psychic after all.


[Kristin] (29:17 - 29:31)
Oh, maybe I am psychic.


Or maybe that was the inspiration for your column.


[Regina]
Could have been.


[Kristin]
Yes, I did go through a Sex and the City phase with shoes.


I did have the strappy sandals with lots of colors. Of course, mine were knockoffs because I was, like, just out of grad school.


[Regina] (29:31 - 29:33)
You were super glamorous. I remember it. I loved it.


[Kristin] (29:34 - 30:31)
I was not super glamorous because walking in heels is not a high fashion achievement. But I did have cute heels. All right, wrapping it back to regression-based statistical adjustment.


I'm really curious now to see how you're weaving Sex and the City into this, Regina. But we started talking about this topic in vitamin D2. We talked about how in observational studies, people often are trying to isolate the effect of one exposure on one outcome.


Like, what is the effect of vitamin D on your VO2 max, your fitness level? And we likened the problem to trying to untangle a big ball of string. Because in observational studies, sometimes variables are promiscuous, as you called them, Regina.


Meaning they're tangled up with lots of other variables. Like, vitamin D is affected by obesity, age, diet, outdoor exercise, frailty, kidney function, so many things. Vitamin D, actually, as we talked about, is a particularly promiscuous variable.


[Regina] (30:32 - 30:35)
It gets around. It hooks up with a lot of other variables.


[Kristin] (30:36 - 31:59)
And the statistical tool that we often use to try to untangle this ball of string is statistical adjustment with regression. In vitamin D2, we talked about how this is a useful tool, but not all-powerful. Because if the ball of string is too messy, if your variables are too promiscuous, I do see the tie to Sex and the City now, almost, Regina.


It might be asking too much of the math to untangle all of that. And then, Regina, in the alcohol episode, we went one step further and started to talk about the actual math under the hood. We described that what the model is trying to do is to create a hypothetical world in which we pretend that everyone is at the same level of the confounders.


They all eat the same amount of French fries, or they exercise the same amount. And then we ask, what is the effect of vitamin D on fitness in this imaginary world where everything except vitamin D is held constant? But, Regina, I want to go a bit deeper now and explain more about what's going on mathematically using that example of vitamin D and fitness, VO2max.


So, let's picture now that we have data on these two variables, on a large sample of people, and we're going to make a scatterplot. On the vertical axis, we put VO2max. That is what we call the outcome or dependent variable.


And on the horizontal axis, we put vitamin D, and that's what we call the exposure or independent variable. And we've got this scatterplot of data points, and we draw a line on the plot that best fits the data.


[Regina] (32:00 - 32:09)
Right, the line that minimizes the distance between all of those individual data points and the line, it's a simple regression model called linear regression.


[Kristin] (32:09 - 32:17)
Right, linear for line. And imagine that that line has a positive slope. It tilts upwards.


That would mean that as vitamin D goes up, VO2max also goes up.


[Regina] (32:18 - 32:28)
But correlation, not causation. That does not mean that vitamin D causes VO2max to go up, because there are really important confounders going on.


[Kristin] (32:28 - 33:04)
Right, there is a glaring confounder here, which is exercise, because your VO2max is largely determined by how much you exercise. Where does a lot of exercise take place? Outdoors in the sun.


So if you exercise a lot outdoors, you're going to have a high VO2max, and also probably a good vitamin D. But it's just confounding. But let's say in our imaginary study, Regina, we've measured exercise.


We've measured how many hours per week people exercise. That means we can try to account for it. And here's a simple way to account for it.


We could just divide everybody up into two groups, high exercise and low exercise. And then we could analyze the data separately for each group.


[Regina] (33:04 - 33:17)
I like this because it's a common sense approach. We are comparing people who are mega exercisers, like you, to other people who are mega exercisers, and then couch potatoes to couch potatoes. We're comparing apples to apples.


[Kristin] (33:17 - 33:39)
This removes confounding because we are holding exercise constant. We're saying, let's only look at people who all exercise a lot. And within that group, if we still see a difference in VO2max between people with high and low vitamin D, then that difference cannot be explained by exercise because they're all exercising the same or at least similar amounts.


[Regina] (33:39 - 33:47)
But Kristin, now we've got two lines, right? One for the low exercisers, one for the high exercisers. So what are we doing with two lines?


Right.


[Kristin] (33:47 - 34:01)
We can average those two lines together, the slopes of those two lines together, to get an overall estimate of the relationship between vitamin D and VO2max. And maybe now we don't see much of a relationship anymore between vitamin D and VO2max.
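
[Editor's note: a minimal simulation in Python of the two-bucket idea. The data are fabricated so that exercise drives both vitamin D and VO2max, with no direct vitamin D effect; all numbers are our invention.]

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
exercise = rng.uniform(0, 16, n)                     # hours per week
vitamin_d = 20 + 2 * exercise + rng.normal(0, 5, n)  # confounded by exercise
vo2max = 25 + 1.5 * exercise + rng.normal(0, 3, n)   # no direct vitamin D effect

# Fit a vitamin D -> VO2max line within each bucket, then average the slopes
slopes = [np.polyfit(vitamin_d[mask], vo2max[mask], 1)[0]
          for mask in (exercise < 8, exercise >= 8)]

print(f"Crude slope, everyone pooled: {np.polyfit(vitamin_d, vo2max, 1)[0]:.2f}")
print(f"Average within-bucket slope:  {np.mean(slopes):.2f}")
# The within-bucket slope is smaller but not zero: two buckets are crude,
# so some confounding by exercise remains
```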


[Regina] (34:02 - 34:09)
But we're dividing people into just two buckets, right? High exercise and low exercise. And that's a little crude.


[Kristin] (34:09 - 34:19)
It's definitely crude because in the high exercise bucket, you might have someone like me who's exercising five or six hours a week, and someone like my daughter who's exercising 16 hours a week.


[Regina] (34:19 - 34:23)
And then there's me who's happy to get like three a week.


[Kristin] (34:24 - 34:26)
But we'd all be in the same high exercise bucket.


[Regina] (34:27 - 34:41)
But we wanted to compare apples to apples. And the thing is, okay, maybe we are doing apples to apples, but they're different types of apples, like Granny Smith and Red Delicious and Honeycrisp all together. We need to separate them out.


[Kristin] (34:41 - 35:11)
Right. We need to separate them out. So let's take it a step further.


And instead of just dividing into high and low, any apples versus any oranges, we could divide people into narrower buckets. So everyone who exercises 16 hours per week, that's one bucket. And then everyone who exercises 15 hours per week, that's another, and then 14, and then so on down to zero.


This would give us 17 buckets, and we could fit 17 different regression lines and then average those together. And that's gonna more solidly control for confounding by exercise.


[Regina] (35:12 - 35:15)
Right. It's only Granny Smith compared with Granny Smith now. It's better.


[Kristin] (35:15 - 35:21)
Exactly. And Regina, if you'll permit me, I wanna paint a 3D visual here.


[Regina] (35:21 - 35:29)
Is that okay? Uh-oh. Okay.


So I want to say this might be tricky for people like me who do not have a brain that likes three dimensions.


[Kristin] (35:30 - 35:51)
Yeah, it's a little hard in a podcast without like my chalkboard, but let's try. So let's imagine that we have a scatterplot, just a basic scatterplot, vitamin D on the horizontal axis, VO2 max on the vertical axis, like we talked about before. But now we're going to add another horizontal axis coming out of the page towards you, and that axis is gonna have hours per week of exercise on it.


[Regina] (35:52 - 35:58)
So now we're in 3D space is what you're saying. Yes. We've moved out of the page and now we're in space.


[Kristin] (35:58 - 37:11)
Yes. Each person is a dot in this space based on their vitamin D, VO2 max, and exercise. So imagine this kind of cloud of dots.


All right, so now imagine, let's slice this 3D space by exercise level, kind of like slicing a block of cheese. So we're gonna slice one slice for people who exercise zero hours per week, another slice for people who exercise one hour a week, another for two and so on. And in each slice, we look at just those people and we fit a regression line between vitamin D and VO2 max.


And you can picture it visually. We've now got this stack of 17 different regression lines, one per exercise level. And these lines are probably all tilted a little bit differently, right?


They're not all lined up. So the model rotates them all until they are all tilted exactly in the same direction. And that forms a plane.


[Regina]
Not an airplane, a plane in geometry.


[Kristin]
Yes, thank you. Thank you for that clarification, Regina.


Now I'm picturing little airplanes, yes. Think of a stiff piece of paper or a piece of cardboard. That cardboard is a plane; it's like we literally put a plane through our cloud of dots.


And that plane is your multivariable regression. It's telling you the average relationship between vitamin D and VO2 max at every level of exercise.


[Regina] (37:12 - 37:29)
And I want to say again, if 3D visuals are not your thing, people who are listening to this, do not worry. Just imagine fitting a line for each level of exercise and then using math, magic math, to average those lines into one. That's all the model is really doing.
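
[Editor's note: a minimal sketch in Python of fitting the "plane," using the same fabricated data idea as the sketch above. Putting exercise in the model alongside vitamin D is the averaging-over-slices Kristin and Regina describe.]

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
exercise = rng.uniform(0, 16, n)
vitamin_d = 20 + 2 * exercise + rng.normal(0, 5, n)
vo2max = 25 + 1.5 * exercise + rng.normal(0, 3, n)  # no direct vitamin D effect

# Design matrix: intercept, vitamin D, exercise; lstsq finds the best-fit plane
X = np.column_stack([np.ones(n), vitamin_d, exercise])
coef, *_ = np.linalg.lstsq(X, vo2max, rcond=None)

print(f"Adjusted vitamin D slope: {coef[1]:.3f}")  # near 0: confounding removed
print(f"Exercise slope:           {coef[2]:.3f}")  # near 1.5, the true effect
```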


[Kristin] (37:30 - 37:40)
Yeah, exactly.


And Regina, we could go even further. We can slice the data even more finely, right? Why stop at hours of exercise per week?


We could make a slice at every minute of exercise per week.


[Regina] (37:41 - 37:45)
Well, we don't have enough people in each bucket this way.


[Kristin] (37:45 - 38:14)
Right, of course. Yeah, I might not have any people doing exactly 301 minutes of exercise per week. I can't fit a regression line in each of these slices because we don't have enough data. But that's where we can use math.


Instead of slicing the data into a million tiny buckets and analyzing each one, we're just going to fit the plane. We're going to find the plane that best fits the cloud of data points. You know, find the piece of cardboard.


And again, this represents the effect of vitamin D on VO2 max at every level of exercise. Regina, can I blow your mind even further here?


[Regina] (38:15 - 38:16)
Go right ahead.


[Kristin] (38:16 - 38:57)
All right. What if we also wanted to account for another confounder now? We want to add BMI to the model.


We would have to add another axis for BMI. And this is getting harder to picture. But imagine we slice the data now into different BMI levels, just like we did with exercise.


But now within each slice of BMI, we are fitting a plane between vitamin D and exercise and VO2 max. All right. So we're getting now, instead of a whole bunch of lines, we're getting a bunch of planes, one for each BMI level.


And we're going to stack those planes up together into a cube. And the tilt of that cube now tells us something about the relationship between vitamin D and VO2 max adjusted for BMI and exercise.
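
[Editor's note: extending the sketch above, adding BMI is just one more column in the design matrix; the geometry gets harder to picture, but the math does not change. The effect sizes are, again, our invention.]

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
exercise = rng.uniform(0, 16, n)
bmi = rng.normal(25, 4, n)
vitamin_d = 20 + 2 * exercise - 0.5 * (bmi - 25) + rng.normal(0, 5, n)
vo2max = 25 + 1.5 * exercise - 0.3 * (bmi - 25) + rng.normal(0, 3, n)

# One axis per variable: intercept, vitamin D, exercise, BMI
X = np.column_stack([np.ones(n), vitamin_d, exercise, bmi])
coef, *_ = np.linalg.lstsq(X, vo2max, rcond=None)
print(f"Vitamin D slope, adjusted for exercise and BMI: {coef[1]:.3f}")  # near 0
```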


[Regina] (38:58 - 39:20)
Kristin, this is amazing what you're doing here. So I have lectured about multiple regression, statistical adjustment, for judges. I did a workshop for federal judges and I used slides with animations.


I did not attempt to do it as a podcast and just words. So I'm impressed that you are blowing our minds with this.


[Kristin] (39:21 - 39:44)
I love that workshop that you give. It has some great visuals and great animations. And I should add, Regina, that you and I do give statistical workshops and writing workshops customized for different groups, such as judges.


And we are happy to talk to you about that if you want. Just contact us.


[Regina]
I love that.


That was so smooth, Kristin.


[Kristin]
Did I get that plug in nicely?


[Regina] (39:46 - 39:54)
But I feel like we need some psychedelics if we're going to keep going down this path with the cubes and the hyper cubes.


[Kristin] (39:54 - 39:55)
Think of a Rubik's cube.


[Regina] (39:56 - 40:00)
No, no, no. I think we need mushrooms. This whole thing would be good on mushrooms.


[Kristin] (40:00 - 40:42)
We're not encouraging that, but that reminds me of a boyfriend story, Regina, actually, that we're talking about substances to understand statistics. Do you remember the boyfriend that I was dating actually, when I met you? Our boyfriends at the time were friends.


They introduced us. That boyfriend, I have this very distinct memory. I was sitting, I think, at a bar and I was waiting for him and he came up to me and he had been doing some weed and he said, I finally understand statistics.


He had an epiphany while on marijuana. I remember that moment very distinctly. So maybe there's something to that.


Maybe he didn't remember statistics after the marijuana wore off, but he got it all, well, in that moment.


[Regina] (40:43 - 40:50)
I have never heard this story. Did I never tell you that story? I love it.


And weed is legal in many places. So again, we're not encouraging it.


[Kristin] (40:51 - 40:53)
Technically it was illegal then, but we'll just ignore that part.


[Regina] (40:54 - 41:02)
Right, but now it's not illegal. So we're not discouraging using a little cannabis to help you understand.


[Kristin] (41:02 - 41:05)
Or just get out like a piece of cardboard and a Rubik's Cube. That would be easier.


[Regina] (41:06 - 41:15)
What a party. Bringing the weed.


[Kristin] (41:15 - 41:52)
All right. So yes, this is hard to picture because then if I start saying, let's add another axis, now for some element of diet or supplements or whatever the other next confounder is, we're starting to get in like five dimensions. My brain at that point, I can't picture past a cube, I must admit, Regina, but you can kind of extrapolate and imagine that this would keep going past a cube.


[Regina]
Finally, your brain breaks.


Good to know.


[Kristin]
All right. So that's the mathematical picture.


And that leads us into our discussion of what are the limitations of this? It is super powerful, as we've talked about. But as we said in the alcohol episode, it's not magic.


We are statisticians, Regina, and not magicians.


[Regina] (41:52 - 41:57)
Okay, Kristin, now can I share the statistical adjustment character? Are you ready?


[Kristin] (41:57 - 41:58)
Oh, I can't wait. Yes.


[Regina] (41:59 - 42:02)
I am picturing Charlotte from Sex and the City.


[Kristin] (42:02 - 42:19)
Oh, that's perfect, actually, because she's idealistic. She believes in true love and tidy resolutions. She wants things to work out.


She's willing to put the effort in to make them work out. Yeah, that does sound a lot like statistical adjustment, over-idealized in some ways.


[Regina] (42:19 - 42:36)
Right. She's got this model. She's going to put everything in.


She knows how she's going to get married. Statistical adjustment, trying to smooth over all the messiness and make the relationships clean and meaningful. But sometimes Charlotte tries too hard.


She messes things up.


[Kristin] (42:37 - 42:49)
That is a perfect analogy for statistical adjustment. I like it, Regina. All right, so, Regina, pop quiz.


Do you remember any of the statistical adjustment problems that we talked about back in the alcohol episode? How closely were you paying attention?


[Regina] (42:49 - 42:59)
Oh, very close, of course. I remember Botox. Botox does not erase all wrinkles slash confounding.


So you've got unmeasured confounding and residual confounding.


[Kristin] (43:00 - 43:19)
Everybody's going to remember unmeasured and residual confounding from Botox. All right. So unmeasured confounding, I think that's easy for people to get.


So going back to that VO2max and vitamin D study, that actual study that we looked at in the vitamin D2 episode, they didn't have data on exercise, so they couldn't do anything about it. Right? You just, if you don't have the variable, you're out of luck.


You can't put it in the model.


[Regina] (43:20 - 43:33)
This is just like poor Charlotte and the cautionary tale of her marriage to Trey. You remember this series better than I do, Kristin. How about you give us a recap of what happened with Charlotte and Trey.


[Kristin] (43:34 - 43:53)
Right. So Charlotte's first husband: she meets Trey MacDougal, who is a doctor, looks great on paper, has wealth, has good pedigree, well-groomed, right? But she waits to have sex until they get married, and then she finds out that he cannot get it up.


[Regina] (43:55 - 44:05)
Charlotte made the mistake of unmeasured confounding. She never actually collected data on what turned out to be very important, sex.


[Kristin] (44:05 - 44:18)
Right. By the time she's building the model, it's too late. Just like when you're doing a study, if you're analyzing the data at that point, it's usually too late to go back and collect any other data, so you're out of luck.


And it was too late. She was already married.


[Regina] (44:19 - 44:21)
Isn't that great? That is a good analogy.


[Kristin] (44:21 - 44:22)
Yes. Yes.


[Regina] (44:22 - 44:24)
Collect data on sexual compatibility.


[Kristin] (44:25 - 45:19)
Yes. It's a good idea. Don't leave it unmeasured until it's too late.


But, Regina, there's also the problem of residual confounding, and I'm dying to know, do you have a character for residual confounding?


[Regina]
I have another Charlotte story for this one, too. Okay.


[Kristin] All right. Well, let me just recap what residual confounding is. So, residual confounding is leftover confounding.


And this occurs because, think about it, our model is this huge simplification, like a plane or a cube acting as if everything forms perfect straight-line relationships, which is probably not the case. Think about exercise and VO2max. If you increase your exercise per week from zero hours to one hour, you're probably going to get a pretty big boost in your VO2max.


But once you're already doing 16 hours per week and you increase to 17 hours a week, it's probably not going to give the same boost. You're going to get some tiny gain. So this is not a straight-line relationship between exercise and VO2max.


[Regina] (45:20 - 45:29)
Probably not a straight line. It's known as the law of diminishing returns. And we talked about this in Sugar Sag, I seem to remember, about wrinkles as we get older.


[Kristin] (45:30 - 45:40)
And a lot of those models where wrinkles was the outcome, they had to adjust for age. And of course, that assumes that there is a straight-line relationship between age and wrinkles.


[Regina] (45:41 - 45:52)
But as we get older, whoa, wrinkles just accelerate. Between 10 and 15 years of age, nothing. But between 45 and 50, like, whoa.


[Kristin] (45:52 - 46:07)
Yeah. So it's definitely not a straight line. And so our model isn't ideal.


It's a simplification. Another thing, all of these things that we're measuring are imperfect measurements, right? I mentioned exercise here.


We probably have self-reported exercise. Well, that's not going to be very accurate.


[Regina] (46:08 - 46:19)
We talked about self-report and penis size. I'm guessing, Kristin, people do not always completely, perfectly report their exercise, just like they don't report their penis length.


[Kristin] (46:20 - 46:35)
Right. Social desirability bias. People are probably going to over-report their exercise unless, Regina, we give them a lie detector test or what was it?


Play loud music and make them memorize numbers before we ask them. We had tricks in that episode. People should go back and listen to that.


[Regina] (46:36 - 46:39)
And tell them they get condoms at the end if they are accurate.


[Kristin] (46:40 - 46:42)
I don't know if that one works for reporting exercise correctly.


[Regina] (46:43 - 46:44)
I'm mixing incentives, yeah.


[Kristin] (46:45 - 47:03)
Okay. Fun episode, though. Fun episode that everybody should listen to.


All right. Because our measurements are imperfect and the model's imperfect, our imaginary world is kind of blurry or wonky, as we talked about in the alcohol episode. And we're just not going to be able to mathematically, perfectly remove all the confounding when everything's a little wonky.
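
[Editor's note: a minimal simulation in Python of residual confounding from a straight-line adjustment. In these fabricated data, exercise drives both vitamin D and VO2max with diminishing returns (a square root), but the model adjusts for exercise linearly; a spurious vitamin D slope is left over.]

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000
exercise = rng.uniform(0, 16, n)
vitamin_d = 20 + 8 * np.sqrt(exercise) + rng.normal(0, 2, n)  # diminishing returns
vo2max = 25 + 8 * np.sqrt(exercise) + rng.normal(0, 3, n)     # no direct D effect

def adjusted_d_slope(confounder):
    X = np.column_stack([np.ones(n), vitamin_d, confounder])
    coef, *_ = np.linalg.lstsq(X, vo2max, rcond=None)
    return coef[1]

# Straight-line adjustment leaves leftover (residual) confounding...
print(f"Adjusting linearly for exercise: {adjusted_d_slope(exercise):.3f}")  # well above 0
# ...while adjusting on the correct scale removes it
print(f"Adjusting for sqrt(exercise):    {adjusted_d_slope(np.sqrt(exercise)):.3f}")  # near 0
```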


[Regina] (47:03 - 47:19)
Exactly. Okay, so this relates to my story with poor Charlotte failing in her romantic life through residual confounding. I'm just warning you.


Okay, it's not great. But it did remind me somehow of Brad the Bad Kisser. Do you remember this one?


[Kristin] (47:19 - 48:00)
Oh, I do remember that episode. Yeah, I have watched these too many times.


I think the line was something like, maybe we should change his name from Brad to Bad. I think that was Samantha's line in that one. All right, this might be a bit of a tortured metaphor, Regina.


But I see where you're going with this. Unlike with Trey, where she just failed to measure the variable, she did measure the kissing with Brad. And she did put it into her model in considering whether to keep dating him.


So it's like residual confounding because she did put it in the model, she accounted for it, but there's still this leftover problem and she's not sure whether or not she can live with that problem. And actually, that was the debate of the episode, was the bad kissing a non-negotiable or not?


[Regina] (48:01 - 48:08)
My model could not handle having residual confounding bad kissing. Non-negotiable.


[Kristin] (48:09 - 48:23)
Eventually she realizes that even that residual confounding is too much of a harm in the relationship. She's not going to live with it and she dumps him. Regina, what's another limitation of statistical adjustment that we talked about in Sugar Sag?


Do you remember?


[Regina] (48:24 - 48:30)
I do remember Snapchat filters. Do not overdo it because then you look really weird.


[Kristin] (48:31 - 48:38)
Exactly. So this is when you throw a bunch of variables into a model without thinking and you end up answering questions that make no sense.


[Regina] (48:39 - 48:42)
I do have a Charlotte story for this one.


[Kristin] (48:42 - 49:19)
Are you going to make me wait to hear it? I am. You're teasing it.


Okay. All right. So remember in Sugar Sag, we had this funny little study where wrinkles was actually measured as a binary variable.


We're going to ignore that part. Still can't get over that. But they were looking at how wrinkles were related to diet.


And when the researchers looked at fat intake by itself, no relationship to wrinkles. Carbs by itself, no relationship to wrinkles. But then they threw all these different dietary variables into the same model, vitamin C, thiamine, fats, carbs, all at once, and suddenly, boom, statistically significant results.


And then it looked like your carb intake was linked to wrinkles. But what was really happening, Regina?


[Regina] (49:19 - 49:23)
They were just fitting noise. They were seeing patterns in clouds.


[Kristin] (49:24 - 51:03)
Exactly. They were answering kind of a ridiculous question at that point. It was something like, what is the effect of carbs on wrinkles after holding vitamin C, thiamine, and fat completely constant?


And here's the problem. In real life, those things go together. People who eat a lot of vitamin C tend to eat more thiamine and probably fewer processed carbs.


So when you already have vitamin C in the model, you're already capturing a lot of what it means to have a healthy diet, right? Let's go back to the 3D picture, if we can, Regina. The block of cheese.


Slicing up our data. So let's start by putting just vitamin C in our model. We have vitamin C on the horizontal axis, wrinkles on the vertical axis, right?


Now add thiamine on a second horizontal axis coming out from the board. Now imagine trying to slice the data into levels of thiamine, just like we did with exercise, like these slices of cheese. But remember, vitamin C and thiamine go together.


So if you look at a high thiamine slice of cheese, what you're going to see is all the vitamin C points are going to cluster together in that slice. It's like you have cheese with pimentos, but the pimentos are all clustered up on one end of the slice. If it's a high thiamine slice, the pimentos are all clustered together at high levels of vitamin C.


And because they are so clustered together, it's going to be hard to fit an accurate regression line between vitamin C and wrinkles. The model has to guess how to fit that line, or maybe there's a few weirdos in that high thiamine cheese slice who have really low vitamin C. And because their pimento is at the other side, it's going to draw the line, the regression line to them.


And so you're essentially going to be fitting a line to like this one weird person. So that's going to cause you to see a pattern that's not even there.
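
[Editor's note: a minimal simulation in Python of the pimento problem. Two dietary variables that travel together are fabricated here; neither is related to wrinkles, yet putting both in the model makes the coefficients large and unstable, because the model is fitting noise in the sliver of leftover variation.]

```python
import numpy as np

rng = np.random.default_rng(7)
n = 200
vitamin_c = rng.normal(0, 1, n)
thiamine = vitamin_c + rng.normal(0, 0.05, n)  # nearly identical to vitamin C
wrinkles = rng.normal(0, 1, n)                 # unrelated to either nutrient

def slopes(*predictors):
    X = np.column_stack([np.ones(n), *predictors])
    coef, *_ = np.linalg.lstsq(X, wrinkles, rcond=None)
    return coef[1:]

print("Vitamin C alone:", slopes(vitamin_c))           # tiny, near 0
print("Both together: ", slopes(vitamin_c, thiamine))  # much larger, unstable
```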


[Regina] (51:03 - 51:09)
I'm kind of hungry now with the cheese and pimentos, and I'd like some pickles and a martini, please. Thank you.


[Kristin] (51:11 - 51:38)
And then it gets even worse, Regina, because now we add carbs to the model on even another axis, again, four dimensions. You're compounding the problem and you're really painting yourself into a statistical corner now because you're asking about the effect of carbs at a specific level of vitamin C and thiamine. And at that point, there's probably very little variation left in the carbs.


You've really got a lot of pimentos squished together in your block of cheese. So you're again fitting noise and then you add fat to the model and it's even worse.


[Regina] (51:39 - 52:11)
So it's not that these nutrients, vitamin C, whatever, suddenly matter. It's just that the model is kind of twisting itself into these weird knots to find something, anything, to explain these tiny little bits of variation you've got left over. Exactly.


OK, my Charlotte inspirational hopeful tale about over-adjustment is her marriage to Harry and the problems that she encountered at the very beginning of her relationship with Harry. Do you remember Harry?


[Kristin] (52:12 - 52:22)
Right, of course. Harry was her divorce lawyer when she divorced Trey MacDougal. But Harry was not a bad boyfriend.


He was good. So where are you going with this, Regina?


[Regina] (52:23 - 52:36)
Yeah, but she discounted him at first, right? Oh, she did, yes. Yes, she was like, he's not good husband material.


He's not good for me. And this is because she was throwing too much into the model. Why did she not like him?


[Kristin] (52:37 - 53:08)
Right, because he was bald and hairy and sweaty and not her picture-perfect pedigreed man. Yes, so she was throwing all of those into the model, is what you're saying, and over-adjusting, over-counting all of those dimensions when they, in fact, were not important. And by over-filling her model with all of these unimportant variables, she's answering the wrong question about what makes a good husband.


And she mistakenly believes, based on her initial model, that Harry is not good husband material.


[Regina] (53:08 - 53:20)
But, but, yay Charlotte, she realized the error of her ways, tossed her model, started again, and stopped screwing things up with over-adjustment, and then got married.


[Kristin] (53:20 - 53:22)
Throw out the unimportant variables, right.


[Regina] (53:22 - 53:25)
Right, is she still married in the reboot?


[Kristin] (53:25 - 53:34)
She is still married to Harry in the reboot, yes. That's the only relationship, really, that worked out, although Carrie's kind of back with Aidan, so I guess maybe that would've worked out in the long run, yeah.


[Regina] (53:35 - 53:43)
Okay, but I'm going to see this as a vote for statistical adjustment used properly. And not, not with over-adjustment. You find true love.


[Kristin] (53:43 - 53:48)
When you fix your model, it can be useful, exactly.


[Regina] (53:48 - 54:19)
There you go. Kristin, I feel like we are nearing the end of our episode, and I've got to say, I like this format. It was kind of fun. I was a little suspicious at first, but it was fun to review.


I think that we have done some interesting things today with our bad boyfriends, bad statistical practices, and hopeful sweet statistical adjustment Charlotte. I think that we've actually touched on some deeper truths here.


[Kristin] (54:19 - 54:32)
We have given a lot of advice for both good research and good dating, and it's amazing how much overlap there is between those two seemingly unrelated domains of life. I would not have predicted that.


[Regina] (54:33 - 54:45)
Me neither. But we would like to hear from listeners. Did this work?


Did it not work? What do you like? What's going well?


Give us feedback. We love feedback.


[Kristin] (54:46 - 55:12)
And again, we encourage listener questions and also topics you'd like to hear in more depth in one of these kinds of review episodes. And I'll just give a reminder to everybody that Regina and I do have lots of online courses, which you can find on our website if you like the way we teach about statistics. And writing.


And we also, as I mentioned, give customized workshops. So contact us if you're interested in finding out more about that through normalcurves.com.


[Regina] (55:12 - 55:15)
That was a great, smooth, yet shameless plug.


[Kristin] (55:16 - 55:28)
We are not afraid to shamelessly plug, as women should not be. And Regina, we are going to provide a review sheet with the show notes for anybody who's really trying to follow our podcast to learn statistics in depth.


[Regina] (55:29 - 55:31)
Those are for the overachievers.


[Kristin] (55:32 - 55:39)
Yes, absolutely. Or teachers who are using our material for their class or students who are following this for class.


[Regina] (55:40 - 55:48)
This whole episode has been more fun than I expected. I will give you that one. Thank you, Kristin.


And thank you, everyone, for listening.


[Kristin] (55:48 - 55:50)
Thanks, Regina. Thanks, everyone, for listening.