Sleep and Exercise: Does working out on too little sleep speed up aging?

Can exercise actually be bad for you if you don’t get enough sleep? A widely shared claim says yes—that working out while sleep deprived may speed up aging. In this episode, we put that claim under the microscope. We examine the study behind it, unpack how sleep and aging were measured, and explore key statistical ideas like interaction effects and flexible models that can “dance” to the data. With the help of a $400,000 handbag and a man with seven boats, we also break down what it really takes to show that one variable changes the effect of another. What we find: some clear study bloopers, inconsistent modeling results, and interpretations that are flat-out wrong.
Statistical topics
- Measurement error
- Model specification
- Piecewise linear regression
- Regression models
- Residual confounding
- Splines
- Statistical interactions
- Survey design
Methodological morals
- “Before you believe something shocking, ask what had to go wrong to make it true.”
- “If slight modeling changes flip the story, there wasn't much story to begin with.”
- “Unethical Life Pro Tip: If you do not want your analysis critiqued, then just make it impossible to understand.”
Kristin’s Biological Age Calculator
References
- Original Viral Tweet: Ng D. "People who slept under 6 hours and exercised actually aged faster." X. March 9, 2026.
- Holmer B. Does exercise “age you faster” if you don’t sleep enough? Medium. March 16, 2026.
- You Y, Chen Y, Liu R, et al. Inverted U-shaped relationship between sleep duration and phenotypic age in US adults: a population-based study. Sci Rep. 2024;14:6247.
- Levine ME, Lu AT, Quach A, et al. An epigenetic biomarker of aging for lifespan and healthspan. Aging. 2018;10:573-591.
Kristin and Regina’s online courses:
Demystifying Data: A Modern Approach to Statistical Understanding
Clinical Trials: Design, Strategy, and Analysis
Medical Statistics Certificate Program
Epidemiology and Clinical Research Graduate Certificate Program
Programs that we teach in:
Epidemiology and Clinical Research Graduate Certificate Program
Find us on:
Kristin - LinkedIn & Twitter/X
Regina - LinkedIn & ReginaNuzzo.com
- (00:00) - Introduction
- (04:05) - What is NHANES?
- (06:38) - The Sleep Duration Results
- (12:50) - The 2015 Sleep Mystery
- (17:10) - Measuring Biological Aging
- (22:32) - The Penalized Cox Regression
- (29:13) - Sleep and Aging Results
- (31:00) - Cubic Splines and Dancing
- (38:08) - Adding Exercise to the Mix
- (42:16) - Boats, Handbags, and Interaction Effects
- (49:39) - The Cubic Spline Exercise Analysis
- (52:40) - The Opposite Result
- (57:13) - Academic Writing Gone Wrong
- (59:46) - The Writing Makeover
- (01:02:31) - Rating the Claim with Gatorinis
[Kristin] (0:00 - 0:07)
Yeah, I'm going to go out on a limb here and say that that handbag is more insane than this paper that we're scrutinizing today.
[Regina] (0:09 - 0:12)
For you, that is actually saying a lot.
[Kristin] (0:19 - 0:28)
Welcome to Normal Curves. This is a podcast for anyone who wants to learn about scientific studies and the statistics behind them. I'm Kristin Sainani.
I'm a professor at Stanford University.
[Regina] (0:28 - 0:34)
And I'm Regina Nuzzo. I'm a professor at Gallaudet University and part-time lecturer at Stanford.
[Kristin] (0:35 - 0:40)
We are not medical doctors. We are PhDs. So nothing in this podcast should be construed as medical advice.
[Regina] (0:40 - 0:45)
Also, this podcast is separate from our day jobs at Stanford and Gallaudet University.
[Kristin] (0:45 - 0:49)
Regina, sorry to tell you, today we're going to talk about aging.
[Regina] (0:49 - 0:53)
You always choose the cheeriest topics, don't you, Kristin?
[Kristin] (0:53 - 0:55)
Yeah, this is going to be super cheery, Regina.
[Regina] (0:55 - 1:14)
When it comes to aging, Kristin, you already took away my French fries and grilled cheese, telling me they will give my face more wrinkles. So maybe I'm just going to opt out of this episode entirely. I already feel guilt every time I have French fries.
[Kristin] (1:14 - 1:46)
I feel like my work is done here, Regina. All right. This one comes straight out of something I saw on Twitter recently.
And sorry, I am never going to call it X, so too bad. Someone posted the following tweet along with a graph. They said, Regular exercise is linked to slower biological aging, but only in people sleeping 7 plus hours.
People who slept under 6 hours and exercised actually aged faster. And this tweet went viral.
[Regina] (1:46 - 2:30)
Wow. Let me make sure I understand this, Kristin. So if I regularly get enough sleep, then exercise helps me not age as fast.
I'm not surprised about that. But you're saying if I'm chronically sleep deprived, then exercise makes it worse, not better? That part doesn't make sense to me.
Yeah. Yeah. So statistics.
Can I mention statistics here? In statistical jargon, that's what we call an interaction effect, which is fun to look at. So I'm excited about this episode, but I still want to see why exercise is bad for me if I don't sleep.
I mean, when I don't sleep, I don't feel like exercising either, but I thought it was still good for me.
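A minimal sketch of what an interaction effect means numerically. The four group means below are invented for illustration and are not taken from the paper. The interaction is simply the exercise effect in short sleepers minus the exercise effect in adequate sleepers.

```python
# Hypothetical mean "biological age minus chronological age" (years) in four groups.
# These numbers are made up for illustration; they are not from the study.
mean_age_gap = {
    ("adequate_sleep", "no_exercise"): 1.0,
    ("adequate_sleep", "exercise"): -1.0,
    ("short_sleep", "no_exercise"): 1.5,
    ("short_sleep", "exercise"): 2.0,
}

def exercise_effect(sleep_group):
    """Exercise minus no-exercise difference within one sleep group."""
    return (mean_age_gap[(sleep_group, "exercise")]
            - mean_age_gap[(sleep_group, "no_exercise")])

effect_adequate = exercise_effect("adequate_sleep")
effect_short = exercise_effect("short_sleep")
# The interaction is the difference between the two exercise effects.
interaction = effect_short - effect_adequate
print(effect_adequate, effect_short, interaction)  # -2.0 0.5 2.5
```

If the exercise effect were the same in both sleep groups, the interaction would be zero, no matter how large the effect itself was.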
[Kristin] (2:30 - 2:40)
Yeah. Here's the thing, Regina. This claim is really counterintuitive, which of course is what made it go viral.
Counterintuitive things go viral. So guess what we're doing today, Regina?
[Regina] (2:40 - 2:46)
Are we going to do a deep dive into the evidence behind this counterintuitive viral claim, Kristin?
[Kristin] (2:46 - 3:20)
We sure are. The claim that we're going to evaluate is the second part of that viral tweet. The claim that regular exercise speeds up biological aging in those with too little sleep.
And this claim came straight from a 2024 paper published in Scientific Reports. We're going to dissect this specific paper today. Okay.
Statistically, as you mentioned, we're going to talk about interaction effects. We are also going to talk about NHANES data, questionnaire design, survival analysis, cubic splines and piecewise linear regression.
[Regina] (3:20 - 3:29)
I love all of those topics, almost as much as I love French fries and sleeping. Come to think of it, which is appropriate for this episode.
[Kristin] (3:30 - 3:39)
Okay. So this paper, again, published in 2024 in Scientific Reports, which is a Nature journal. So you'd think it would be a decent paper.
[Regina] (3:40 - 3:47)
The way you say that makes me think not so decent. We may not be so enthusiastic about this paper, huh?
[Kristin] (3:47 - 4:01)
Oh, yeah. I think we will be getting out some soapboxes in this episode. I do want to point out that I'm not the only person who picked up on and critiqued this tweet and this study.
Brady Holmer also wrote about this in a blog post on Medium.
[Regina] (4:02 - 4:05)
Oh, excellent. We will put a link to that in the show notes.
[Kristin] (4:05 - 4:11)
Yes. Regina, let's start by talking about the methods of this study. So first of all, they used NHANES data.
[Regina] (4:12 - 4:18)
We've seen NHANES data before in this podcast, back in, I think, the Vitamin D episodes.
[Kristin] (4:18 - 4:28)
That's right. But there was so much else to talk about with Vitamin D that we did not go into any detail then about NHANES. So let's pause now and talk about NHANES.
[Regina] (4:28 - 4:39)
Right. It's a really important dataset because a lot of national health statistics come from NHANES. It's a well-done, ongoing study, huge dataset.
[Kristin] (4:39 - 4:51)
Exactly. NHANES stands for National Health and Nutrition Examination Survey. It's run by the Centers for Disease Control and Prevention, and it's basically one of the workhorses of U.S. health research.
[Regina] (4:51 - 5:12)
And if I remember correctly, Kristin, the researchers go all around the country and interview people about all kinds of health things, diet, sleep habits, smoking, exercise, and they also have like a little mobile exam center clinic thing where they can put you on a treadmill or take your blood. Right?
[Kristin] (5:12 - 5:24)
Yeah, that's right. The key thing about NHANES is it's cross-sectional. You are not following the same people over time.
Rather, you are taking a snapshot of the U.S. at any given moment.
[Regina] (5:24 - 5:31)
Right. The study itself is ongoing, but each wave of data is like this snapshot freeze frame of the country.
[Kristin] (5:32 - 6:06)
Exactly. They release the data in two-year waves. Some form of NHANES has actually been going on since the 1960s, but the modern version started in 1999.
So we have data back every two years, starting with the 1999 to 2000 wave. It is publicly available data, so anyone can look at it, but the data are just a little tricky to analyze because they use a complex survey design. That means they oversample some smaller subgroups in the U.S. to make sure that they get enough people in those subgroups to make reliable estimates.
[Regina] (6:06 - 6:18)
That means when we analyze the data, we need to weight those oversampled groups differently so that the overall results look like the true distribution in the U.S.
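A small sketch of why the weights matter, using a made-up two-group population rather than real NHANES records. Group B is assumed to be 10% of the country but 50% of the sample, and each weight stands for how many people a respondent represents.

```python
# Hypothetical survey: group B is 10% of the population but was
# deliberately oversampled to make up half of the respondents.
# Each record: (group, hours_of_sleep, survey_weight)
sample = [
    ("A", 8.0, 90.0), ("A", 8.0, 90.0),
    ("B", 6.0, 10.0), ("B", 6.0, 10.0),
]

def unweighted_mean(rows):
    """Treats every respondent as representing the same number of people."""
    return sum(hours for _, hours, _ in rows) / len(rows)

def weighted_mean(rows):
    """Weights each respondent by how many people they represent."""
    total_weight = sum(weight for _, _, weight in rows)
    return sum(hours * weight for _, hours, weight in rows) / total_weight

print(unweighted_mean(sample))  # 7.0
print(weighted_mean(sample))    # 7.8
```

The unweighted average is pulled toward the oversampled group. A real NHANES analysis would also account for strata and clusters when computing standard errors, which this sketch ignores.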
[Kristin] (6:19 - 6:38)
Exactly, because we want to be able to make conclusions that apply to the entire U.S. OK, so this 2024 paper used NHANES data, and the first aim of this paper was just to look at trends over time in the amount of sleep that Americans are getting.
They didn't look at aging yet. They just analyzed sleep data from 2001 to 2020.
[Regina] (6:38 - 6:57)
Sleep data, how exactly do they measure sleep in NHANES? Do you have to report your sleep habits? Because we've talked before about the problems with self-report, or did they have technology?
I'm hoping like an Apple Watch or an Oura Ring, something like that.
[Kristin] (6:57 - 7:37)
It was just self-report, Regina, and that's very important because, of course, that affects the accuracy of the results. So the first aim of the paper was purely descriptive, just to describe trends over time in how much sleep Americans get every night on weekdays or workdays. They did show histograms of the data from each wave, but they didn't actually give any numbers.
And I wanted to know how I compare. So I decided to download the NHANES data and calculate those numbers myself. I used data from 2015 onward.
And, Regina, do you think that you are doing better or worse than most Americans on the amount of sleep that you get each night?
[Regina] (7:37 - 7:50)
Oh, gosh. I am probably doing average to slightly better because I am worthless without sleep. So the only way I can even function is if I get enough sleep.
[Kristin] (7:51 - 8:27)
OK, so you think you're doing better. I'm actually definitely doing worse because I usually get, you know, like 6 something hours per night on weekdays. And it turns out that puts me in the lowest quartile, the bottom 25 percent of Americans, because the first quartile was at 7 hours.
The median was 8 hours and the 75th percentile was 8.5 hours. That tells us that the middle 50 percent of Americans get between 7 and 8.5 hours per night. And, Regina, the average was 7.7 hours. So where do you fall in that, Regina?
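As a rough illustration of how quartiles like these are read, here is a sketch on nine invented sleep values (not NHANES data, and ignoring survey weights). The values were chosen so the quartiles land on 7, 8, and 8.5 hours.

```python
import statistics

# Hypothetical nightly sleep hours for nine people (not NHANES data).
hours = [5, 6, 7, 7.5, 8, 8, 8.5, 9, 10]

# n=4 returns the three cut points that split the data into quarters.
q1, median, q3 = statistics.quantiles(hours, n=4, method="inclusive")
print(q1, median, q3)  # 7.0 8.0 8.5

# Someone sleeping 6.5 hours falls below the first quartile.
in_lowest_quartile = 6.5 < q1
print(in_lowest_quartile)  # True
```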
[Regina] (8:27 - 9:06)
OK. I am looking at my Oura Ring data right now while we're recording this, pulled it up. And it says December to March of this year, I averaged 6 hours and 24 minutes per night, which is not so great, Kristin.
No. No. Let me go back to 2025.
It says March to June, let's see, 6 hours, 15 minutes. Whoa. All right.
Not as good as I thought. And last night, 5 hours and 30 minutes. So I'm going to guess that overall, I'm in the lowest quartile as well.
[Kristin] (9:07 - 9:10)
Yeah, that is surprising to me, Regina. I think you're doing worse than me, actually.
[Regina] (9:11 - 9:35)
I think this explained a lot about my life right now. Maybe why I'm so tired all of the time. But wait a minute.
I see that back in 2024, for two months, two whole months, I averaged 7.5 hours a night. But after that, it dropped drastically. And Kristin, maybe it was because that was when we started podcasting.
[Kristin] (9:36 - 9:38)
You're going to blame everything on me.
[Regina] (9:38 - 9:43)
I am. No boyfriend, no sleep, all the podcasting, all on you.
[Kristin] (9:44 - 9:50)
I'm overworking you. But, Regina, don't worry, because there's going to be some good news for us later in this episode. Hold that thought.
[Regina] (9:51 - 9:55)
You're going to tell me I get to skip exercise and go eat French fries instead?
[Kristin] (9:56 - 9:58)
Not exactly, but there is a silver lining here.
[Regina] (9:58 - 9:59)
Okay.
[Kristin] (9:59 - 10:30)
So back to the paper, they compared sleep duration across 10 different NHANES cycles. So starting in 2001 to 2002, all the way up until the 2019 to 2020 cycle. And they included data from about 48,000 people.
Here's what they concluded. This is straight from their abstract. An examination of the temporal trends in sleep duration revealed a declining proportion of individuals with insufficient and markedly deficient sleep time since the 2015 to 2016 cycle.
[Regina] (10:30 - 10:55)
Okay, that is not the clearest sentence I've ever encountered. So let me unpack this for a moment. Okay.
Declining proportion of individuals with insufficient and deficient sleep, which is a double negative in there, right? So are they saying Americans are doing better on sleep since 2015? They're saying fewer people are sleeping too little?
[Kristin] (10:56 - 11:05)
Good translation, Regina. Yeah. They found that since 2015, Americans are getting less short sleep and more long sleep.
So it's a good thing.
[Regina] (11:05 - 11:10)
Hmm. Okay, but define short sleep and long sleep for me.
[Kristin] (11:10 - 11:29)
Okay. The researchers split sleep into four categories. Less than 6 hours per night was classified as very short sleep.
6 hours to under 7 hours was short sleep. 7 hours to under 8 hours was considered normal sleep. And 8 or more hours was long sleep.
[Regina] (11:29 - 11:49)
I'm not sure I would have thought that 8 hours was long sleep. But apparently compared to us, it clearly is. All right.
So the researchers found that fewer Americans are getting short sleep, which is less than 7 hours, and more are getting long sleep, 8 or more hours.
[Kristin] (11:50 - 12:49)
Exactly. And if you look at figure two in their paper, you can see the actual numbers. That figure gives the percentage of Americans with very short, short, normal, and long sleep in each of the NHANES cycles, and a pattern jumps out immediately.
From 2001 to 2014, the percentage of Americans with either short or very short sleep stays pretty consistent. It's right around 40%. But there is a sudden drop in 2015.
All of a sudden, this percentage drops from 40% to just 25%. Just 25% of Americans are in the short or very short sleep categories. And that stays stable from 2015 on.
And long sleep shows the opposite pattern. From 2001 to 2014, the percent of Americans with long sleep is consistent around 35%. But in 2015, again, there is a sudden jump upwards.
Suddenly, about 50% of Americans are in the long sleep category, and that stays stable from 2015 on.
[Regina] (12:50 - 13:07)
All right. Wait a minute. Basically, overnight, you're saying we had a huge drop in the number of people getting too little sleep, and also a massive, immediate jump in people getting plenty of sleep.
But, Kristin, this cannot be real, right? Yes. No way.
[Kristin] (13:07 - 13:37)
Thank you. Exactly. This can't be real.
There is no plausible mechanism for that kind of immediate population-wide shift. Like sleep habits don't change that fast across an entire country. That would have required Americans to collectively overhaul their sleep habits in a single year, and that is just wildly implausible.
So, Regina, I'm going to ask you to guess what happened here, and I'm going to give you a hint. It is not a mistake in how they analyzed the data. I actually checked the numbers myself.
[Regina] (13:38 - 13:45)
I like that you're giving me that hint. That would have been my first guess. So, Kristin, did they change the NHANES questionnaire?
[Kristin] (13:46 - 14:13)
Exactly. They changed the questionnaire. They changed the question about sleep duration starting in the 2015 to 2016 cycle.
So, this is not a real change in sleep in America. It's just an artifact of a change in the questionnaire. So, Regina, now I'm going to ask you the sleep question as it was asked in NHANES through 2014.
How much sleep do you usually get at night on weekdays or workdays?
[Regina] (14:13 - 14:19)
Well, clearly from the data, let's say it's about 6.25 hours.
[Kristin] (14:19 - 14:32)
And this value is rounded to the nearest integer, so I would record a value for you of 6. Now I'm going to ask you the question as it was asked starting in 2015. Regina, what time do you usually go to sleep on weekdays or workdays?
[Regina] (14:33 - 14:35)
Oh, 10 p.m.
[Kristin] (14:36 - 14:38)
What time do you usually wake up on weekdays or workdays?
[Regina] (14:39 - 14:39)
5 a.m.
[Kristin] (14:40 - 14:54)
So, the interviewer for NHANES asks these two questions, and then the interviewer has to calculate, has to subtract to get the sleep duration. So, here I would do 10 p.m. to 5 a.m., that's 7 hours. And I would record a value of 7 for you.
[Regina] (14:55 - 15:03)
Ah, so there's the difference right there. I get a value of 7 rather than 6 when you ask the question this new way.
[Kristin] (15:03 - 15:05)
Right. So, that could explain this massive shift.
[Regina] (15:06 - 15:37)
Because when you asked, when do I, quote, go to sleep, I interpreted that as when I'm finally kind of getting myself in bed. But do I actually go to sleep right away? Not always, no.
And do I actually wake up at 5 a.m. with my alarm? Or does my stupid body wake me up at 4:15 a.m., replaying every silly thing from my entire life that I regret, and I lie awake, tossing and turning until 5 a.m., and then get out of bed?
[Kristin] (15:37 - 16:19)
I think a lot of us can relate to that, Regina. Yes. But yes, this shows you that the wording of the question matters.
Also, if you ask people, you know, how many hours of sleep did you get, they're going to anchor their answer to common answers out there, like 6, 7, or 8. But if you ask people about bed and wake times, they're going to anchor their answers to different norms and they're probably not doing the math in their heads. Also, Regina, I want to point out that they also changed how they coded the answers.
Before 2015, they only allowed integer values in the data. Starting in 2015, they were allowed to round to the nearest half hour. So all of this makes it look like sleep increased, but it's just due to a change in the question and the coding of the variable.
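A sketch of the two recording schemes described here, using the example answers from this conversation. The function names and the round-half-up rule are assumptions for illustration; the exact NHANES rounding convention is not stated in the discussion.

```python
import math

def hours_between(bed_hhmm, wake_hhmm):
    """Hours from bedtime to wake time, allowing the interval to cross midnight."""
    bed_h, bed_m = map(int, bed_hhmm.split(":"))
    wake_h, wake_m = map(int, wake_hhmm.split(":"))
    minutes = (wake_h * 60 + wake_m) - (bed_h * 60 + bed_m)
    return (minutes % (24 * 60)) / 60

def round_to_step(hours, step):
    """Round half up to the nearest multiple of step (1.0 or 0.5 here)."""
    return math.floor(hours / step + 0.5) * step

# Old-style question: direct self-report, stored as a whole number of hours.
old_value = round_to_step(6.25, 1.0)
# New-style question: duration derived from clock times, stored to the half hour.
new_value = round_to_step(hours_between("22:00", "05:00"), 0.5)
print(old_value, new_value)  # 6.0 7.0
```

Same person, same habits, and the recorded value moves from 6 to 7 purely because of how the question was asked and coded.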
[Regina] (16:20 - 16:31)
That is fascinating. So I would love to think that in 2015, all of a sudden, America and I started sleeping more. Kristin, but you're telling me, no, we did not.
[Kristin] (16:31 - 17:08)
We did not. Here's the disturbing part, though. The authors just took these numbers at face value.
In their discussion, they write, So they never consider that the pattern could be an artifact. They just treat it as real. Now, to be fair, this mistake does not matter for the rest of the analyses in the paper because they were only able to get the aging variable on 13,000 people measured from 2001 to 2010 in NHANES.
So all of the people included in the aging analyses, they had the same sleep measure.
[Regina] (17:09 - 17:10)
Ah, so that's good at least.
[Kristin] (17:10 - 17:13)
Yes. All right, Regina, now I want to talk about how they measured aging.
[Regina] (17:14 - 18:04)
Oh, I'm not sure that I want to talk about aging, but I'm also curious. So let's take a nice short break first.
Welcome back to Normal Curves.
Today we are discussing a paper about sleep, exercise, and aging. One of these is my favorite thing. Two of these are not.
And we just talked about how they measured sleeping, the favorite thing, and we were about to talk about how they measured aging.
[Kristin] (18:04 - 18:31)
All right. Aging was the outcome variable for this study, and it's very interesting where it comes from because there isn't a, quote, aging variable in NHANES, but they used other data in NHANES to derive a measure that they call phenotypic aging. Regina, I prefer the term biological aging, so we're just going to call it biological aging today.
And it's just trying to get at how well are you aging? Like are you biologically young or old for your actual age?
[Regina] (18:31 - 18:44)
I kind of want to hear about this, but I don't really, Kristin. But when you say biological aging, the first thing that comes to my mind because I read too many of these health studies is telomeres. Is that what they did here, telomeres?
[Kristin] (18:44 - 18:51)
Good guess, Regina. That is a common way that we get at this biological aging construct, but that's not what they used here.
[Regina] (18:51 - 19:14)
Okay. So, Kristin, I'm going to try to test myself and remember all of the ways we've talked about in this podcast to measure biological aging. And the first one that comes to mind is collagen from the diaphragm from cadavers.
But now that I say that out loud, it would mean that they would have to kill the NHANES patient first. So, I'm guessing that was not it.
[Kristin] (19:15 - 19:22)
That is not how they measured aging here, but that cadaver study was such a great study. So, I'm going to refer everybody to our Sugar Sag episode to learn more about that.
[Regina] (19:23 - 19:48)
Sugar Sag episode, we talked a lot about aging there, and that was where we talked about the AGE wand, the age wand, which I was going to use to predict sexual function in my dates, but was also related to aging. And that's the one where they also looked at aging by counting the number of wrinkles that you had in your face, most depressing of all. So, either of those?
[Kristin] (19:48 - 20:12)
No, actually, those were really fun, but they did not use anything so fancy here. What they used was an algorithm that someone else had published in 2018 that allows you to estimate biological age from some common blood tests that a lot of people get in their routine blood work. That algorithm was reported in the journal Aging.
The lead author was Morgan Levine, and I want to take a little time and talk about this paper because it's pretty interesting.
[Regina] (20:13 - 20:13)
All right, I'm intrigued.
[Kristin] (20:14 - 20:43)
Okay, so their algorithm, it's basically like an aging calculator. They started by fitting a statistical model using what we call training data to train the model from about 9,000 people from the old NHANES dataset pre-1999. Then they used a completely separate group of people to test the model, and that's called their test data.
And they used about 6,000 people from the modern NHANES dataset. And of course, the purpose of test data is to see if the model actually works when it's applied to new people.
[Regina] (20:44 - 21:00)
I love this. It's training sets and test sets of data. They help us build better models that don't overfit data, and it's just good statistical practice.
And if I remember correctly, we talked about this in the Sugar Sag episode as well.
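A generic sketch of the training and test idea. As described above, the Levine team used two separate NHANES datasets rather than a random split, so the random split below is only the textbook version of the concept, with hypothetical participant IDs.

```python
import random

def train_test_split(records, test_fraction, seed):
    """Shuffle a copy of the records and split it into train and test lists."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

participant_ids = list(range(100))  # hypothetical participant IDs
train, test = train_test_split(participant_ids, test_fraction=0.4, seed=0)
print(len(train), len(test))  # 60 40
```

The model is fit only on `train`; `test` is held back so that performance is checked on people the model has never seen.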
[Kristin] (21:01 - 21:24)
Yeah, that wonderful cadaver study also used training and test data. Right. Okay, so here they built a model to predict death, actually.
The outcome was time to death, and they were able to do this because they linked the NHANES data to national death records, and they took 42 different clinical measures and tried to find the combination of those that best predicted when someone would die.
[Regina] (21:25 - 21:35)
Predicted when they were going to die. Kristin, this is cheerful. When are we going to talk about unicorns and chocolate cake and rainbows?
[Kristin] (21:35 - 21:45)
Not a cheerful model, but this might cheer you up, Regina, because I'm going to tell you what type of model they used. They used a penalized Cox regression to predict death.
[Regina] (21:46 - 21:54)
Did you just say penile Cox regression? I heard you say, because, yeah, that cheered me right up.
[Kristin] (21:55 - 22:23)
Okay, yeah, penalized, but I knew you were going to like that play on words. Penalized, which, just for our listeners, means that model weeds out variables that aren't very predictive. So we started with 42 clinical measures and the model weeded it down to 9 clinical variables plus chronological age.
That's what was left in the model. And these clinical measures weren't anything fancy. It was things like basic blood chemistry, albumin, white blood cell count, creatinine.
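One way to picture how a penalty "weeds out" variables, assuming a lasso-style (L1) penalty, which the discussion above does not actually specify. The soft-thresholding step below shrinks every coefficient toward zero and sets small ones to exactly zero. The marker names and coefficient values are hypothetical.

```python
def soft_threshold(coefficient, penalty):
    """Shrink a coefficient toward zero; set it to exactly zero if it is small."""
    if coefficient > penalty:
        return coefficient - penalty
    if coefficient < -penalty:
        return coefficient + penalty
    return 0.0

# Hypothetical unpenalized coefficients for four made-up predictors.
raw = {"marker_a": 0.90, "marker_b": -0.40, "marker_c": 0.05, "marker_d": -0.02}
shrunk = {name: soft_threshold(beta, penalty=0.10) for name, beta in raw.items()}

# Predictors whose coefficients survive the penalty stay in the model.
kept = [name for name, beta in shrunk.items() if beta != 0.0]
print(kept)  # ['marker_a', 'marker_b']
```

A larger penalty removes more variables. This is only the core shrinkage step, not a full penalized Cox fit.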
[Regina] (22:23 - 22:26)
Oh, so stuff you'd see on a normal blood panel.
[Kristin] (22:27 - 23:25)
Exactly. And then, Regina, they took these variables and they built what's called a Gompertz survival model. And, Regina, do you like how I'm just going to throw in a bunch of fun statistical terms to cheer you up as we're talking about death, Gompertz, it's really fun to say.
Thank you. I'm actually teaching survival analysis right now. And the key idea is this.
If you want to fully specify a survival curve, you have to decide upfront how the risk of the outcome changes with time. And that choice determines the shape of the curve. So here we are looking at the outcome of death.
If the rate of death stayed constant as you aged, you would get an exponential curve. If the rate of death kind of increased linearly, steadily with age, you would get a Weibull curve. But if death increases exponentially with age, then you get a Gompertz curve.
And it turns out that a Gompertz curve is the right one for modeling death as you age because unfortunately, the rate of death increases exponentially.
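The three choices can be sketched as three hazard functions, meaning the rate of death at a given age. All parameter values are hypothetical, and the Weibull case is shown with shape 2, which is the version whose hazard rises as a straight line.

```python
import math

def exponential_hazard(age, rate=0.01):
    """Constant hazard: the risk does not change with age."""
    return rate

def weibull_hazard(age, scale=0.0002):
    """Weibull hazard with shape 2: the risk rises linearly with age."""
    return scale * age

def gompertz_hazard(age, a=0.0001, b=0.09):
    """Gompertz hazard: the risk rises exponentially with age."""
    return a * math.exp(b * age)

for age in (40, 60, 80):
    print(age, exponential_hazard(age), weibull_hazard(age),
          round(gompertz_hazard(age), 4))
```

Going from 40 to 80, the constant hazard stays put, the linear one doubles, and the Gompertz one is multiplied by exp(0.09 × 40), roughly 37 with these made-up parameters.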
[Regina] (23:26 - 23:27)
Oh, my goodness. Yeah.
[Kristin] (23:27 - 23:33)
So did you know this fun fact, Regina, that as adults, our mortality rate roughly doubles every 7 to 8 years?
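The doubling claim can be turned into a yearly growth rate with a little arithmetic. Under a Gompertz hazard proportional to exp(b × age), the rate doubles every ln(2)/b years, so a doubling time of 7 to 8 years implies the figures below. The function name is just for illustration.

```python
import math

def gompertz_growth_rate(doubling_years):
    """Growth rate b such that exp(b * doubling_years) equals 2."""
    return math.log(2) / doubling_years

for years in (7, 8):
    b = gompertz_growth_rate(years)
    yearly_increase = math.exp(b) - 1  # proportional rise in hazard per year
    print(years, round(b, 3), round(yearly_increase, 3))
```

That works out to the mortality rate rising by roughly 9 to 10 percent with each additional year of age.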
[Regina] (23:34 - 23:54)
Oh, just stop. Oh, my gosh.
Just stop. I thought this was supposed to cheer me up. What are you doing?
I'm feeling like, I don't know, this is the Carpe Diem, gather ye rosebuds while ye may before ye exponential decay.
[Kristin] (23:54 - 24:02)
Oh, that was beautiful, Regina. You rhymed. That feels like a rhyming life moral right there.
Good job.
[Regina] (24:03 - 24:19)
Oh, thank you. Death inspires me, apparently. Death and statistics.
All right. But Kristin, this is a death prediction model. How did they get from there to biological aging, both depressing, but different outcomes?
[Kristin] (24:20 - 24:42)
Great question, Regina. So the idea is that you can plug in someone's values for these 9 clinical measures plus their actual chronological age, and the model predicts their mortality risk. And then if their mortality risk looks like, say, the average 60-year-old, then their biological age will be put at 60, even if they're actually only 50 or if they're like 70.
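The matching step can be sketched as inverting a reference curve. This is a simplified illustration that assumes a Gompertz-shaped reference hazard with made-up parameters A and B. It is not the published Levine formula.

```python
import math

# Hypothetical reference curve: typical hazard at each age, A * exp(B * age).
A, B = 0.0001, 0.09

def reference_hazard(age):
    """Typical mortality hazard at a given chronological age (hypothetical)."""
    return A * math.exp(B * age)

def biological_age(predicted_hazard):
    """Age at which the reference hazard equals the person's predicted hazard."""
    return math.log(predicted_hazard / A) / B

# A 50-year-old whose predicted risk matches the typical 60-year-old.
risk = reference_hazard(60)
print(round(biological_age(risk), 1))  # 60.0
```

Chronological age never enters `biological_age` directly here; it only matters through its contribution to the predicted risk.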
[Regina] (24:43 - 24:57)
So biological age here really means how bad your health profile looks, right? It's how high your chances are of dying soon. Okay.
That's actually kind of cool, I'll give you that. Depressing but cool.
[Kristin] (24:57 - 25:08)
It is, yeah. But the fun thing, Regina, and why we're spending so much time on this, is that the Levine paper actually published the model, including all of the coefficients in their model.
[Regina] (25:08 - 25:11)
Ooh, did you make us a spreadsheet calculator, Kristin?
[Kristin] (25:11 - 25:14)
Even one step better. I made a web app.
[Regina] (25:14 - 25:26)
Oh, I am impressed. A web app. That's fancy.
I mean, I wish it were for a less depressing subject than my biological age, but I am still super impressed.
[Kristin] (25:27 - 25:41)
Thank you. And the best part, Regina, is that unlike telomeres, which people can only really get measured in studies, these are all standard blood tests, and so anyone can go to this web app and plug in their numbers and find their biological age, according to Levine.
[Regina] (25:41 - 25:49)
Ooh, we are going to put this app, Kristin, on our website directly, normalcurves.com.
[Kristin] (25:49 - 25:54)
Yes, along with the 36 questions to fall in love, which is a cool web app that you built, Regina.
[Regina] (25:55 - 26:10)
Mine is more cheerful, I'd like to point out, than yours, but maybe what it is, you start with yours, and you see how close you are to death, and then you go over to the how to fall in love, right? Because you have to carpe diem and gather your rosebuds.
[Kristin] (26:11 - 26:25)
Oh, that is a good sequence there, Regina. I do want to point out, though, that your app did not have a Gompertz model in it, and therefore, I win, because cool statistical distribution trumps love and happiness, Regina.
[Regina] (26:26 - 26:27)
In your book, maybe.
[Kristin] (26:28 - 26:44)
But the important question, did you try my web app? Because I plugged in my numbers, Regina, and it told me I am 42.9 years old, so this Levine paper is getting five smooches from me, and it's totally true, it's perfect, there is not a thing wrong with it, and of course, I'm not biased whatsoever.
[Regina] (26:45 - 26:55)
I can absolutely believe you being 42.9, or even younger than that. I can't wait to try, well, I'm a little scared to try mine.
[Kristin] (26:56 - 26:59)
You should put yours in, I'm sure it's going to look good for you, too, Regina.
[Regina] (26:59 - 27:07)
Yes, and if it gives me anything under my actual chronological age, then absolutely five smooches, no questions at all.
[Kristin] (27:08 - 27:41)
Yeah, I think it's going to, and of course, we love this model because it's favorable to us, but Regina, if I'm being fully transparent here, it does have some limitations. Chronological age, it turns out, is doing most of the work in this model, and the other blood markers are not really adding much to it, because the correlation between chronological age and this biological aging construct turns out to be 0.94, which means that most of the variation in this biological age measure tracks with actual age, and the biomarkers add only a small layer of additional information on top.
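A quick piece of arithmetic behind that point: squaring a correlation gives the share of variance in one measure that is linearly accounted for by the other.

```python
r = 0.94                      # correlation quoted in the discussion
shared_variance = r ** 2      # proportion of variance tracked by chronological age
print(round(shared_variance, 2))      # 0.88
print(round(1 - shared_variance, 2))  # 0.12
```

So about 88% of the variation in biological age lines up with chronological age, leaving roughly 12% for everything else, which includes the biomarkers and plain measurement noise.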
[Regina] (27:42 - 27:53)
So it's anchoring by your age, your actual chronological age, so if you're in your 50s, you're not going to somehow get a value of you being like 15 or something. That's right.
[Kristin] (27:53 - 28:08)
All right, Regina, back to the 2024 Scientific Reports paper. They took this calculator from Levine and applied it to about 13,000 NHANES participants who had had these blood markers done, and again, this was people from 2001 to 2010.
[Regina] (28:08 - 28:15)
Right, so we don't have to worry about that whole sleep question snafu. This was all before they changed the question.
[Kristin] (28:16 - 28:29)
Correct. So they looked at the relationship between sleep duration and aging in these 13,000 people, and importantly, Regina, they adjusted their models for chronological age as well as some other potential confounders like smoking.
[Regina] (28:30 - 28:37)
Hmm, and what did they find? Because I would predict longer sleep, less aging.
[Kristin] (28:37 - 28:47)
Interestingly, they found that the relationship is not linear, it's more U-shaped, and both shorter and longer sleep were associated with faster aging, Regina.
[Regina] (28:48 - 29:00)
I can imagine that being true, right? If you're getting two hours of sleep a night or 20 hours of sleep a night, then something else is going on in your life. There is a sweet spot in the middle.
[Kristin] (29:00 - 29:24)
Yeah, a lot of health behaviors look U-shaped like this. Here, though, the bottom of the U, the lowest point for aging, was around 6.5 hours per night, at least in Figure 3. And I even threw that figure into Graph2table to have it extract the exact bottom of the graph, and it was right around 6.5. The Graph2table app, our boyfriend was sharing him.
[Regina] (29:24 - 29:48)
He's a very generous boyfriend. Okay, optimal sleep, you're saying, is 6.5 hours. That was the lowest aging point on this graph, which is basically, Kristin, you and me.
So this is great news. I'm not sure why I don't feel younger, but I will take it.
[Kristin] (29:48 - 30:03)
Yeah, see, I told you this episode has a silver lining. We're getting the optimal sleep according to this analysis. All right, so let me just talk about Figure 3.
It's just a picture of their model, but it comes from a model called a cubic spline. And Regina, can you tell us a little bit about cubic splines?
[Regina] (30:03 - 30:43)
Oh, I love cubic splines. I always think of them as the difference between different kinds of dance. So if you're fitting a straight line model to the data, it's like you are telling the model, you must be a straight line, you must behave in this particular way.
But with a cubic spline, you're being much more loosey-goosey and you're letting the data dictate what's happening. And I think of it as the difference between like a waltz or something that is very precise and constrained and an improv dance. And the cubic spline is the improv dance where you just, you go out there and you let loose on that dance floor.
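The waltz-versus-improv contrast can be sketched on made-up data. The example below is an illustration only: the data are synthetic (a U with its bottom placed at 6.5 hours by construction), the knots at 5, 7, and 9 hours are arbitrary, and the spline is a plain truncated-power cubic spline fit by least squares, not necessarily the flavor used in the paper.

```python
import numpy as np

# Synthetic data (an assumption, not the paper's): a U with its bottom at 6.5 hours.
rng = np.random.default_rng(0)
sleep = rng.uniform(3, 11, 2000)
aging = 0.3 * (sleep - 6.5) ** 2 + rng.normal(0, 0.5, 2000)

def line_basis(x):
    """The waltz: intercept and slope, nothing else."""
    return np.column_stack([np.ones_like(x), x])

def cubic_spline_basis(x, knots):
    """The improv dance: a cubic plus one extra cubic bend at each knot."""
    cols = [np.ones_like(x), x, x**2, x**3]
    cols += [np.clip(x - k, 0, None) ** 3 for k in knots]
    return np.column_stack(cols)

def fit(X, y):
    """Least-squares fit; returns coefficients and sum of squared errors."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return beta, float(resid @ resid)

knots = [5.0, 7.0, 9.0]  # hypothetical knot locations
_, sse_line = fit(line_basis(sleep), aging)
beta_spline, sse_spline = fit(cubic_spline_basis(sleep, knots), aging)

# Trace the fitted curve on a grid and read off its lowest point.
grid = np.linspace(4, 10, 601)
curve = cubic_spline_basis(grid, knots) @ beta_spline
bottom = float(grid[np.argmin(curve)])

print(f"SSE straight line: {sse_line:.1f}")
print(f"SSE cubic spline:  {sse_spline:.1f}")
print(f"bottom of the fitted U: {bottom:.2f} hours")
```

The straight line cannot bend, so it misses the U entirely; the spline follows it. The same flexibility is what lets a spline chase noise when there is little real signal.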
[Kristin] (30:44 - 31:01)
Oh, I love that analogy. Yeah, exactly. This cubic spline lets the data trace out a curve like this U-shape.
But Regina, Figure 3 is just a picture. We never get any details of the model, but they took the same data and they fit a different model on it called a piecewise linear regression.
[Regina] (31:01 - 31:11)
Oh, I love these, too. All of these are what we call semi-parametric models, which just means they let you improvise based on the data a little bit more.
[Kristin] (31:11 - 31:35)
Exactly. Piecewise linear regression just means that you split the data into pieces and fit a straight line to each piece of the data. Here, they split the data at 7 hours, meaning they fit two lines that meet at 7 hours.
Because these are straight lines, this creates a V rather than a U. And the bottom of the V, the lowest point for aging, is at 7 hours.
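A minimal sketch of a two-line fit with the break imposed at 7 hours. The data are synthetic and the "hinge" coding is one common way to make the two lines meet at the break; we do not know how the authors coded theirs.

```python
import numpy as np

# Synthetic data (an assumption): aging is lowest somewhere in the middle.
rng = np.random.default_rng(1)
sleep = rng.uniform(3, 11, 2000)
aging = 0.3 * (sleep - 6.5) ** 2 + rng.normal(0, 0.5, 2000)

BREAK = 7.0  # breakpoint imposed by the analyst, as in the episode's description

# Hinge term: zero below the break, (sleep - BREAK) above it.
# Its coefficient is the CHANGE in slope at the break, so the two lines meet there.
hinge = np.clip(sleep - BREAK, 0, None)
X = np.column_stack([np.ones_like(sleep), sleep, hinge])
beta, *_ = np.linalg.lstsq(X, aging, rcond=None)

slope_below = float(beta[1])
slope_above = float(beta[1] + beta[2])

print(f"slope below {BREAK} h: {slope_below:.2f}")
print(f"slope above {BREAK} h: {slope_above:.2f}")
```

A negative slope below the break and a positive slope above it is the V. Because the break is imposed, the bottom of the V sits at 7 hours no matter where the data are actually lowest.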
[Regina] (31:36 - 31:45)
But Kristin, I thought the bottom of that U that we talked about was 6.5 hours. That was the optimal, not 7. Why didn't they split it at 6.5 instead?
[Kristin] (31:46 - 32:25)
Yeah, this is a little weird, a little inconsistent, Regina. Now, they claim to have let the data choose the split, the bottom of the V. But I'm skeptical.
I wonder if maybe they imposed the split at 7. Because as you noted, 7 is different than 6.5, which is what the spline analysis found. And of course, different methods can give slightly different answers, but you would expect those to be a little closer.
And then conveniently, 7 hours is exactly the cutoff they use to define those different groups, short versus normal sleep. So 7.0 just feels a little too perfect, a little too neat to me.
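For comparison, here is a sketch of what "letting the data choose the split" could look like: try many candidate breakpoints and keep the one with the smallest squared error. The data are synthetic, with a true V at 6.5 hours built in, and the grid search is our assumption about a reasonable procedure, not the authors' documented method.

```python
import numpy as np

# Synthetic data (an assumption): a true V with its bottom at 6.5 hours.
rng = np.random.default_rng(2)
sleep = rng.uniform(3, 11, 5000)
aging = 0.5 * np.abs(sleep - 6.5) + rng.normal(0, 0.5, 5000)

def sse_at(breakpoint):
    """Sum of squared errors for a two-line fit that bends at `breakpoint`."""
    hinge = np.clip(sleep - breakpoint, 0, None)
    X = np.column_stack([np.ones_like(sleep), sleep, hinge])
    beta, *_ = np.linalg.lstsq(X, aging, rcond=None)
    resid = aging - X @ beta
    return float(resid @ resid)

# Candidate breakpoints every 0.1 hours from 5 to 9.
candidates = np.linspace(5.0, 9.0, 41)
sse = np.array([sse_at(b) for b in candidates])
best_break = float(candidates[np.argmin(sse)])

print(f"data-chosen breakpoint: {best_break:.1f} hours")
```

In this toy setup the search lands near 6.5, not 7.0. That says nothing about the paper's data, but it shows why a data-chosen split landing exactly on a round, pre-existing category cutoff would be worth a second look.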
[Regina] (32:25 - 32:29)
It could happen, but it's a little suspicious. Seems a little fishy. Yeah.
[Kristin] (32:29 - 33:20)
Yeah. And Regina, in addition to the cubic spline and the piecewise linear regression, they also ran another model on the same data where they compared the different sleep categories. So short, very short, and long sleep, they compared to normal sleep.
And remember, normal sleep was 7 hours to under 8 hours. Here's what they found. Short sleep, that 6 to under 7 hour category, short sleep was no worse than normal sleep.
There was no significant difference. And Regina, that actually lines up nicely with the spline analysis, where the lowest aging was around 6.5 hours in that short sleep group. Very short sleep, though, the less than 6 hours a night group, compared with the normal sleep group, was associated with about a 0.6 year increase in biological aging. And long sleep, 8 or more hours, was associated with about a 0.7 year increase.
[Regina] (33:21 - 34:13)
OK. So we're talking about basically like an aging penalty, how much increased aging you get if you were sleeping either too little or too much. But you said the effect size here was 6-tenths of a year and 7-tenths of a year, which is just in the order of what, like 8 months or something?
It's not exactly a huge penalty there. No. And it's also kind of weird, though, when you step back and think about it, because we're always told to get at least 8 hours of sleep.
But here you're saying the optimal range is more like 6.5 or 7 hours, depending on the model they're using. So according to this paper, we're all good on like 6 and a half hours of sleep? Yay?
It's only if you get less than 6 that it's a problem?
[Kristin] (34:13 - 35:17)
Yes, that's what the data here are saying. And Regina, I think this is good news for us. And actually, this is probably the counterintuitive finding that they should have emphasized in the paper and made go viral.
Truth be told, though, Regina, I know you're going to be shocked, but I'm not even sure that I believe these results as much as I want them to be true. Here's the problem. Chronological age is a huge confounder here, right?
It's strongly related to both biological age and to sleep, because older people actually tend to sleep either less or more. And even though the models do adjust for age, I'm worried about our old friend residual or leftover confounding. Because remember, they built their aging outcome using a Gompertz model, where the risk increased exponentially with age.
So age is built into our outcome in this very strong nonlinear way. But when they control for age using a linear regression, they're probably just assuming a straight line relationship between chronological age and biological age. That's a mismatch.
And when you misspecify relationships like this, it means some confounding is going to leak through.
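The leak can be demonstrated with a small simulation. Everything here is invented for illustration: the exponential curve, the way sleep spreads out with age, and the noise levels are assumptions, and sleep has no effect on the outcome by construction. The only question is what the regression reports when age is adjusted for with a straight line.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 20000
age = rng.uniform(20, 80, n)
u = (age - 20) / 60  # age rescaled to run from 0 to 1

# Outcome: exponential in age (Gompertz-like), with NO sleep effect built in.
bio_age = 20 * np.exp(1.38 * u) + rng.normal(0, 2, n)

# Exposure: older people drift further from 7 hours, in either direction.
spread = 0.5 + 1.5 * u**2
sleep = 7 + rng.normal(0, 1, n) * spread
off_target = np.abs(sleep - 7)  # hours away from 7, short or long

def sleep_coef(age_terms):
    """Coefficient on off_target after adjusting for the given age terms."""
    X = np.column_stack([np.ones(n), off_target] + age_terms)
    beta, *_ = np.linalg.lstsq(X, bio_age, rcond=None)
    return float(beta[1])

linear_adj = sleep_coef([u])                # straight-line age adjustment
flexible_adj = sleep_coef([u, u**2, u**3])  # cubic age adjustment

print(f"sleep 'effect', linear age adjustment:   {linear_adj:.2f} years per hour")
print(f"sleep 'effect', flexible age adjustment: {flexible_adj:.2f} years per hour")
```

With the straight-line adjustment, the model reports a sleep "penalty" that does not exist; with a more flexible age term, it shrinks toward zero. Whether this is what happened in the paper is unknown, since the model details were not reported.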
[Regina] (35:18 - 35:36)
Well, that's a good point, Kristin. So in some models, they're basically saying aging is a straight line. But then in other models, they're saying, no, it's nonlinear, but then they're kind of mashing them together.
And when you do that, you're going to end up with all kinds of problems. Yeah?
[Kristin] (35:36 - 36:08)
Yeah, exactly. And I'm also just worried that maybe in all of this we're just fitting noise, Regina. Not even confounding, it's just noise.
Because remember, chronological age, which is in the model, already explains almost all of biological age. So there's not much variation left for sleep to explain. So we might just be fitting noise.
Basically, so far, Regina, lots of statistical red flags in this paper. All right, so that's the sleep and aging part. But now, Regina, I want to talk about the result that actually went viral.
This is when we add exercise to the picture.
[Regina] (36:08 - 36:48)
Oh, exercise. Okay, this is going to be very interesting. Let's take a short break first.
Welcome back to Normal Curves. Today, we're looking at sleep, aging, and exercise. And we've already talked about sleep.
We've already talked about aging. We are going to get now to the exercise part.
[Kristin] (36:49 - 37:33)
Remember, I want to remind everybody, the claim that went viral that we started this episode with, that we're evaluating today, that came out of the exercise analysis. And it's not like the person who made that tweet made it up. The claim actually comes directly from the paper.
The abstract of the paper says, according to the dose-response relationship between sleep duration and phenotypic age, long sleep duration can benefit from regular exercise activity, whereas short sleep duration with more exercise tended to have higher phenotypic age. Again, that's not the clearest sentence in the world, Regina. But basically, what they are saying is that for short sleepers, exercise might actually backfire.
And that's the claim that we're going to scrutinize now.
[Regina] (37:34 - 37:38)
That exercise backfires for short sleepers, definitely. Show me the data behind this one.
[Kristin] (37:39 - 38:01)
Okay. First of all, exercise in NHANES is a self-report measure, just like sleep. It's not from like an activity watch.
And the authors used that measure to divide people into three exercise groups. No exercise, one to 149 minutes per week, and 150 minutes or more per week. And this is of moderate to vigorous recreational leisure time exercise.
[Regina] (38:02 - 38:16)
And that 150 minutes is not arbitrary, right? That cutoff. Because if I remember correctly, that two and a half hours per week, that is the national guideline for how much exercise people are supposed to get for good health.
[Kristin] (38:17 - 38:36)
Yeah. Exactly. It's not arbitrary at all.
So, Regina, do you want to know how did Americans do on exercise in NHANES? Mm-hmm. All right.
They report that 65%, about two-thirds, are in the no exercise group, 11% in the medium exercise group, and almost a quarter were in that high exercise group getting adequate exercise.
[Regina] (38:36 - 39:00)
Wait a minute. Two-thirds of Americans were inactive zero minutes per week? Like, are you sure?
Because, come on, guys. You couldn't just say a minute, you get a minute of exercise a week? Five minutes?
Do we really have that many Americans who are doubling down on their couch potato identity? Whoa.
[Kristin] (39:00 - 39:32)
You know, Regina, great Spidey sense here. You're absolutely right to question this because I am pretty sure that they did something weird here. This seemed a little off to me, too, so I went to check the exercise data in NHANES.
It turns out to be a lot more cumbersome to check because that exercise value is derived from a bunch of different questions they ask. But I did a very quick check of the data just from that period, and I got different numbers. It was around 40% reporting no recreational moderate to vigorous physical activity.
[Regina] (39:33 - 39:42)
Actually, that 40%, I can believe, Kristin, that we have less than half the people who are not getting any exercise. Yeah, that's better than the two-thirds.
[Kristin] (39:42 - 40:19)
Yeah, and Regina, because these data are from 2001 to 2010 and they're publicly available, I looked for papers where people had published on this before. And yeah, my numbers seem to line up more closely with what other people have published, so that 65% just seems wrong. I tried to figure out what they might have done wrong here, and I'm not sure, but my best guess is I think they might have only included the vigorous exercise questions and they might have forgotten to include the moderate exercise questions because I can get my numbers to line up more closely with them if I do that.
Now, that's not standard, and they never said that they did that in the paper, so this is just another weird red flag.
[Regina] (40:20 - 40:21)
Yeah, okay, not good, not good.
[Kristin] (40:21 - 40:45)
Yeah, this seems off, but it's nothing we can fix. So, now let's look and see what they did with this exercise variable. They next asked whether the relationship between sleep and aging is modified by exercise.
So, they were technically looking for what we call a statistical interaction or effect modification. And Regina, we actually talked about interactions back in the dating wish list episode.
[Regina] (40:45 - 40:55)
I remember that. Kristin, would you like to do a little statistical detour on interactions right now just to set the stage for your results?
[Kristin] (40:55 - 40:55)
You know I would.
[Regina] (40:57 - 41:14)
And let's riff off of a regression example that you actually came up with in that dating wish list episode because remember the boat guy that I'd had a date with? He had 7 boats, and you thought that was hilarious.
[Kristin] (41:15 - 41:22)
I still think it's hilarious. I don't know why I find this so hilarious, but yeah, that cracks me up, Regina, that he had 7 boats.
[Regina] (41:23 - 41:59)
7 boats, yep. He was trying to impress me, so you used that to come up with this great inspired example to illustrate linear regression. And you said, hey, does the number of boats that a man has, does that predict romantic relationship happiness?
And I think that was a brilliant example. And I think we can take it one step further here, Kristin, because I was recently shopping online and I saw a handbag that cost $400,000.
[Kristin] (41:59 - 42:08)
Did you say $400,000 for a handbag? I mean, you could buy a house for that in most places in America. What are you talking about?
[Regina] (42:09 - 42:41)
It was an Hermes handbag. I don't even know how to pronounce it, so clearly I'm not buying one. It was an Hermes 2023 Matte Himalaya Niloticus Crocodile Diamond Kelly 28.
I'm sure that there are some people, those words all have meaning, important meaning. They mean nothing to me. Do they mean anything to you?
[Kristin] (42:41 - 42:45)
They mean nothing to me either. I have no idea what you're talking about.
[Regina] (42:46 - 42:57)
I thought this was just insane, even more insane than owning 7 boats, because at least with the boats, you could do something with them. You could take them out on the water.
[Kristin] (42:58 - 43:06)
I mean, yeah, I'm going to go out on a limb here and say that that handbag is more insane than this paper that we're scrutinizing today.
[Regina] (43:08 - 43:29)
For you, that is actually saying a lot. Yeah. What are you going to do with a $400,000?
So it is actually made of crocodile skin and it does have diamonds on it. But at that point, you can't actually take it out. You're not going to like go to the grocery store with it.
[Kristin] (43:30 - 43:47)
Can you imagine? Regina, I don't even own a handbag anymore. I think I just take my phone.
I don't understand this culture around you got to have the perfect handbag. I do understand shoes. I like shoes, but handbags I never got.
You have to carry them around and you might lose them and you forget them. Like, I don't get it.
[Regina] (43:48 - 44:31)
I know. And there goes your $400,000 if you accidentally leave it in the Uber. I don't think you're getting that back.
But it wasn't just $400,000. I started looking around at all of the bags. There were so many bags that were like $50,000 or $10,000 even.
So many of these. Then I naturally started wondering what type of person likes handbags so much that they are going to pay $50,000 for one. And then just to bring it back to today, why I think that this relates today, maybe liking expensive bags is also related to whether you can find relationship happiness with a guy with more boats, right?
[Kristin] (44:32 - 44:33)
Oh, interaction effect. Very nice. Yes.
[Regina] (44:33 - 45:03)
That is what an interaction effect is. Exactly. So, let's make it concrete.
Let's say for the sake of this example, bringing it back to the stats, let's oversimplify, Kristin, and say there are two types of women. Those that like expensive handbags for some crazy reason and those of us who don't. And I can guess what group you fall into.
I actually confess that I like really nice camping backpacks, but I don't think that's the same thing.
[Kristin] (45:03 - 45:07)
That does not count. Those are practical, Regina, yes. Those are practical.
[Regina] (45:08 - 45:37)
So, statistically, we can look at the expensive handbag group of women with their Hermes crocodile diamondy things and look at how many boats their boyfriends or husbands have and how happy their relationship is. And we can see if their relationship happiness increases with boat ownership. And what's your prediction on that?
[Kristin] (45:37 - 45:41)
Oh, that definitely. Strong, positive relationship there. Absolutely.
[Regina] (45:43 - 45:53)
And then, exactly, and then we look at the rest of us and see if the same relationship holds. And what is going to be your prediction on that one?
[Kristin] (45:53 - 46:05)
I am guessing for the rest of us, it's not going to be a clear, positive relationship. It might even go in the other direction, because I'm going to say I wouldn't have had a second date with 7-boat guy either.
[Regina] (46:07 - 46:51)
Yeah, that one sadly ended in its infancy. If only I had been showing up to the date with my Hermes crocodile bag, then history would be different. Uh, yeah, but it's not.
So that is basically the interaction effect. And Kristin, of course, to demonstrate that there is an interaction effect, we need to actually show statistically that the regression line between boats and happiness for the expensive handbag group is significantly different than the line between the boats and happiness in the non-handbags group.
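In regression terms, "the two lines differ" is a single coefficient: the boats-by-handbag product term. Below is a sketch on invented data (the slopes, sample size, and noise are all made up), with the p-value taken from a normal approximation rather than the exact t distribution to keep the example dependency-light.

```python
import math
import numpy as np

# Invented data: happiness rises with boats for handbag lovers, falls for everyone else.
rng = np.random.default_rng(42)
n = 400
boats = rng.integers(0, 8, n).astype(float)    # 0 to 7 boats
handbag = rng.integers(0, 2, n).astype(float)  # 1 = likes expensive handbags
happiness = 5 - 0.2 * boats + 0.7 * boats * handbag + rng.normal(0, 1, n)

# Columns: intercept, boats, handbag, boats x handbag (the interaction).
X = np.column_stack([np.ones(n), boats, handbag, boats * handbag])
beta, *_ = np.linalg.lstsq(X, happiness, rcond=None)

# Standard errors from the usual OLS formula.
resid = happiness - X @ beta
sigma2 = resid @ resid / (n - X.shape[1])
se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))

interaction = float(beta[3])  # difference in boat slopes between the two groups
z = interaction / se[3]
p_interaction = math.erfc(abs(z) / math.sqrt(2))  # two-sided, normal approximation

print(f"slope difference: {interaction:.2f} (SE {se[3]:.2f})")
print(f"p for interaction: {p_interaction:.2g}")
```

The p-value attached to that product term is the "p-value for interaction": it tests whether the two slopes differ from each other, not whether either slope differs from zero.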
[Kristin] (46:51 - 47:25)
Regina, that is a really important point. So I want to pause and emphasize it for a minute. It's not enough to just show that the association is significant in one group, but not significant in the other group.
You actually have to formally test for interaction by calculating a p-value for interaction. You have to show that the effects in the two handbag groups differ significantly from each other. And, you know, you see this all the time, Regina, where someone will say significant in group A, but not significant in group B, and therefore there is an interaction.
But that is not how you demonstrate interaction.
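A quick numerical illustration of why "significant in A, not in B" proves nothing about a difference. The slopes and standard errors below are hypothetical, the two groups are assumed independent, and p-values use a normal approximation.

```python
import math

def two_sided_p(z):
    """Two-sided p-value from a z statistic (normal approximation)."""
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical boat slopes in each handbag group.
slope_a, se_a = 0.50, 0.20  # group A: looks "significant"
slope_b, se_b = 0.25, 0.20  # group B: looks "not significant"

p_a = two_sided_p(slope_a / se_a)
p_b = two_sided_p(slope_b / se_b)

# The comparison that actually matters: is the DIFFERENCE in slopes distinguishable from zero?
diff = slope_a - slope_b
se_diff = math.sqrt(se_a**2 + se_b**2)  # valid for independent groups
p_diff = two_sided_p(diff / se_diff)

print(f"group A: p = {p_a:.3f}")
print(f"group B: p = {p_b:.3f}")
print(f"A vs B:  p = {p_diff:.3f}")
```

Group A comes in at about p = 0.01 and group B at about p = 0.21, yet the test of the difference is about p = 0.38. One group clears the bar and the other does not, but there is no evidence the two slopes differ.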
[Regina] (47:26 - 47:33)
That is absolutely not. And, Kristin, of course, this is one of both of our major pet peeves.
[Kristin] (47:33 - 47:34)
Yes. Yeah.
[Regina] (47:34 - 47:47)
Okay, so, Kristin, that is my upgraded version of your already inspired boat guy analogy. So now we've got boat guy and we've got Hermes handbag groups.
[Kristin] (47:47 - 48:17)
I think that's an amazing upgraded analogy, Regina. I love it. Thank you.
Okay, so now getting back to the paper at hand, here they were asking whether the association between sleep and aging, kind of like boats and relationship happiness, depends on how much you exercise, which exercise group you're in instead of which handbag group you're in. Right. And what did they find?
So, Regina, they did two different analyses, and I know you're going to be completely shocked to hear this, but the two analyses don't match.
[Regina] (48:18 - 48:19)
Shocking.
[Kristin] (48:20 - 49:00)
So I'm just going to start by talking about their cubic spline analysis, which is pictured in figure 4B. It's basically the same cubic spline analysis we already talked about, but now they've added exercise to the model. So instead of one curve, we get three U-shaped curves, one for each exercise group, no, medium, and high exercise.
If you look at the right side of the graph, the long sleep side of the graph, there does appear to be some difference between the three exercise groups. So aging goes up with longer sleep for everyone, but it goes up fastest for the no exercise group and slowest for the high exercise group.
[Regina] (49:00 - 49:09)
Okay, so far that matches the first part of their viral claim that exercise slows aging if you're getting a lot of sleep.
[Kristin] (49:09 - 49:37)
Right, except, Regina, that they committed our pet peeve here. They never formally tested for interaction. There's no p-value for interaction anywhere, and so we don't really know if those three lines are statistically different from one another.
Actually, Regina, I'm pretty sure that if they had calculated that p-value, it would not be significant, because even though these lines visually look different, I think the differences are consistent with just random fluctuation, just noise.
[Regina] (49:37 - 49:56)
Oh, no. So right there, that undercuts the entire story. They never properly tested for interaction.
So those differences they're pointing to between exercise groups, those differences could just be random noise. They're over-reading the pattern here.
[Kristin] (49:56 - 50:00)
Yeah, exactly. But, Regina, it's actually worse than that.
[Regina] (50:01 - 50:02)
How could it be worse?
[Kristin] (50:03 - 50:14)
So remember their big claim that exercise is bad for you if you don't sleep enough? It turns out there's nothing in their data to support that claim. There's not even like a non-significant visual pattern.
[Regina] (50:15 - 50:16)
Wait, what? Nothing to support it?
[Kristin] (50:17 - 50:35)
Nothing at all, because if you look at the left side of their graph, this is where people are sleeping less than 6 or so hours per night.
Aging increases with declining sleep for all three exercise groups, but it increases pretty much the same amount. The three lines are almost on top of one another.
[Regina] (50:36 - 50:44)
Wait, so where did they come up then with this claim that exercise essentially backfires when you don't get enough sleep?
[Kristin] (50:44 - 51:10)
Regina, I don't know. It's a total mystery. My best guess is that if you look at those three lines, the high exercise group is technically above the other two by just a hair.
So maybe they looked at that and said, high exercise is slightly above the other two, and therefore it is worse. But that's just silly. Those lines are practically overlapping.
So there's clearly no meaningful difference. And obviously it would not be a statistically significant difference.
[Regina] (51:10 - 51:14)
Wow. So that's just making a claim out of nothing.
[Kristin] (51:14 - 51:17)
Yeah, but Regina, believe it or not, it gets worse.
[Regina] (51:18 - 51:20)
I don't believe you. How could it get worse?
[Kristin] (51:21 - 51:34)
Well, remember that I said that they did two analyses here and they didn't exactly match? The problem is that this next analysis, not only does it not support their viral claim, it actually shows the opposite of their viral claim.
[Regina] (51:34 - 51:38)
The opposite of their claim? What do you mean?
[Kristin] (51:38 - 52:29)
So in addition to the cubic spline analysis, they ran a linear regression analysis. They fit a separate linear regression model for sleep and aging for each exercise group. And these results are in figure 4A.
For the no and medium exercise groups, the results were consistent with what we're seeing in the rest of the paper, where shorter and longer sleep were associated with faster aging. Not all of those associations were statistically significant, but that was the pattern. But Regina, the interesting part was that if you look at the high exercise group, the pattern is completely reversed.
The short, very short, and long sleep were all associated with decreased or slowed aging compared with normal sleep. The regression coefficients were all negative, and this even reached statistical significance for very short sleep.
[Regina] (52:29 - 53:04)
Wait a minute, the claim we were talking about said that if you sleep too little, you don't sleep enough, regular exercise makes you age faster. But in fact, you're saying this result says that if you sleep too little, regular exercise slows down aging. Yep.
It makes you better. So, Kristin, you're saying if you tend to exercise a lot and you have the choice of sleeping either a normal amount or sleeping too little, not getting enough sleep, you should make sure you don't get enough sleep. Is that what you're saying?
[Kristin] (53:05 - 53:16)
That is exactly what I'm saying, Regina. That is what their linear regression model suggests. If you exercise a lot, it's better to sleep less than 6 hours than to get 7 to 8 hours.
[Regina] (53:16 - 53:17)
That's crazy.
[Kristin] (53:17 - 53:34)
It is kind of crazy. And, Regina, what's actually interesting is this result directly contradicts their results from figure 4B, right? Because, pop quiz, what is the shape of the curve that you would expect to see in figure 4B for that high exercise group based on these regression results that I just gave you?
[Regina] (53:35 - 54:00)
Oh, that's a good question. This is a fun exercise. Okay, you said that the worst results were for the normal sleepers for some reason, the 7 to 8 hour folks, and the best results were for the short sleepers and long sleepers.
So the best aging was for us awesome short sleepers and for the awesome lazy but long sleepers, right? So we'd expect an upside down U here, yes?
[Kristin] (54:01 - 54:10)
Yes, exactly. Based on the regression model, we should see an upside down U, but in figure 4B, the line for the high exercise group is not upside down. It's a right side up U.
[Regina] (54:10 - 54:13)
That is weird and it makes no sense.
[Kristin] (54:13 - 54:28)
It just doesn't match. And my best guess here, Regina, is that when they did the cubic spline, I think they threw all of their data into a single model rather than splitting it up. And since that no exercise group is so big, maybe it's just driving the shape for the rest of the groups.
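That guess can be sketched in a toy setup. The assumptions are loud ones: the data are synthetic, the group sizes loosely follow the 65/11/24 split reported in the paper, the high exercise group is given an upside-down U on purpose, and a quadratic stands in for the cubic spline to keep the code short. We do not know how the authors actually specified their model.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 13000
# 0 = no exercise, 1 = medium, 2 = high; proportions loosely follow the paper's report.
group = rng.choice(3, size=n, p=[0.65, 0.11, 0.24])
c = rng.uniform(4, 10, n) - 7.0  # sleep hours, centered at 7

# Invented truth: U-shape for no/medium exercise, upside-down U for high exercise.
true_curv = np.array([0.3, 0.3, -0.1])[group]
aging = true_curv * c**2 + rng.normal(0, 1, n)

def curvature(X, y):
    """Coefficient on the squared-sleep column (last column of X)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return float(beta[-1])

# One pooled model: each group gets its own intercept, but all share one shape.
dummies = np.column_stack([(group == g).astype(float) for g in range(3)])
pooled_curv = curvature(np.column_stack([dummies, c, c**2]), aging)

# Separate model for the high exercise group only.
high = group == 2
high_curv = curvature(
    np.column_stack([np.ones(high.sum()), c[high], c[high] ** 2]), aging[high]
)

print(f"pooled curvature (drawn for every group): {pooled_curv:.2f}")
print(f"high exercise group on its own:           {high_curv:.2f}")
```

A positive curvature is a right-side-up U and a negative one is upside down. In this toy example the pooled model draws a right-side-up U for everyone, including the group whose own data bend the other way, because the larger groups dominate the shared shape.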
[Regina] (54:29 - 54:45)
Ah, yeah. Okay, this is not great, Kristin. I'll just summarize it that way.
When you can flip the story just by changing the model, that is a huge red flag. It means that the result is not stable. It's not robust.
[Kristin] (54:46 - 55:47)
Yes, and Regina, this just strengthens my belief that this entire analysis, this entire endeavor is just an exercise in fitting noise because the model changes so much with slightly different modeling choices. But if there is anything here beyond noise, what the data suggest is that if you exercise a lot, that it's good to get too little sleep, that getting too little sleep will actually slow aging, not accelerate it. So it's the opposite of their counterintuitive viral claim.
Okay, Regina, now I'm gonna quickly summarize all the problems that we've seen with this paper. They misinterpreted a change in the NHANES questionnaire as if it were a real change in sleep behavior in America. They ran multiple models on the same data and these models sometimes contradicted each other.
They miscoded their exercise variable. They did not formally test for interaction, but then made claims as if they had found an interaction. And they misinterpreted the results from one of their models, which led them to make an incorrect claim that then went viral.
[Regina] (55:47 - 55:48)
Oh, this is crazy.
[Kristin] (55:49 - 55:51)
Regina, do we have time for one final complaint?
[Regina] (55:53 - 55:54)
Of course we do.
[Kristin] (55:54 - 56:18)
I wanna talk briefly about the writing in this paper. Regina, we already pointed out a couple of times that some of their conclusions in the abstract, they were a little hard to parse. And I teach a whole course called Writing in the Sciences on Coursera where I try to get scientists to write better. I wanna talk about why this overly complicated, bloated style of writing, even though it's very typical for the academic literature, it's not how we should be writing.
[Regina] (56:18 - 56:41)
It is not. And it's unfortunate because it doesn't actually need to be this way. People feel like they need to be formal and write in a style that shows how smart they are and how many big words they know.
But really, it does not make them sound smarter and it is very hard on the reader. No one actually reads what they write when they do that.
[Kristin] (56:41 - 57:03)
And you know, I think this actually links back to the statistics, Regina, because I'm wondering how this paper got through peer review, got through editors. And one hypothesis is maybe the peer reviewers didn't really read it, right? Because maybe it was just so complicated that they didn't understand anything and therefore they just let it slide through.
How else would these very obvious statistical mistakes have been missed?
[Regina] (57:03 - 57:51)
Yeah, that is very plausible. But side note, Kristin, science really needs to start paying for good peer review. I think a lot of people who are not in academia or science, they would be shocked to hear that the vetting for scientific literature is largely based on volunteer labor, free labor.
Often they're overworked grad students or postdocs or assistant professors who don't really have an incentive to do a careful job. And now that I say that, Kristin, I suppose you and I don't really have a financial incentive either to do what we are doing here, which is essentially post-publication peer review. Volunteer labor, yeah.
[Kristin] (57:52 - 57:59)
That is true, Regina. I put an awful lot of time into this paper unpaid. I think, Regina, we have a pet peeve incentive.
[Regina] (57:59 - 58:11)
That is an excellent point. It's a very powerful incentive, right? I need my daily dose of indignation, and this is how I'm getting it.
[Kristin] (58:12 - 58:23)
We enjoy getting our soapboxes out once in a while. Okay, getting back to the writing, I want to just read one paragraph from this paper just to illustrate this problem, and then I'm going to show you how it could be better.
[Regina] (58:23 - 58:27)
Oh, this is a fun game. I love when you do these writing makeovers.
[Kristin] (58:27 - 58:38)
Actually, in Writing in the Sciences on Coursera, Regina, I actually have these videos where I'm editing entire essays in real time, kind of like those YouTube videos that you see, how to build an Ikea furniture desk.
[Regina] (58:38 - 58:52)
But it's better than that, actually, Kristin. Your course is great. It's kind of like those great Ikea videos, plus a home makeover and a beauty makeup tutorial all rolled into one.
[Kristin] (58:52 - 59:03)
I'd like to think that the writing is more beautiful after. All right, here is the passage I'm going to read. It's from their discussion.
They're giving some background about what's the biology that might link sleep to aging. Are you ready?
[Regina] (59:03 - 59:08)
Are you going to say it in your normal voice, or are you going to read it in your auctioneer voice?
[Kristin] (59:08 - 59:11)
I'm going to have to read a little fast just to get us through this, because it's kind of long. Here we go.
[Regina] (59:11 - 59:12)
Okay.
[Kristin] (59:12 - 1:00:02)
When it comes to the biological mechanisms about the relationship between sleep duration and hallmarks of aging, it ought to be underscored that critical hormonal modulators implicated in the sleep homeostasis framework, such as serum concentrations of testosterone, were shown to be influenced by inadequate sleep duration and disturbance in circadian rhythms. Furthermore, previous literature posited that an escalation in inflammatory processes could serve as a plausible intermediary mechanism responsible for the augmented aging observed in abnormal sleep. It has been reported that transitory deficiency in sleep duration precipitates a reduction in the levels of circulating metabolites orchestrating redox homeostasis and induces alterations in epigenetic profiles, thereby triggering multifarious downstream effects on biological function.
In addition, the accelerated aging associated with extreme sleep duration can be interpreted by cellular senescence, which can be reflected by changes in telomere length. Did you like that, Regina?
[Regina] (1:00:03 - 1:00:05)
Oh, did you say something? I was napping.
[Kristin] (1:00:07 - 1:00:09)
Did we just lose all our listeners right there?
[Regina] (1:00:11 - 1:00:18)
Yeah, you're right. This is a perfect example of bloated academic language. That was painful.
[Kristin] (1:00:19 - 1:00:40)
All right. Here's how I would edit it. Several biological mechanisms have been proposed to link sleep duration to aging. Poor or irregular sleep may disrupt hormones and circadian rhythm, increase inflammation, reduce molecules that protect cells from damage, and alter how genes are regulated.
Extremely short or long sleep has also been linked to signs of cellular aging, such as shorter telomeres.
[Regina] (1:00:41 - 1:00:56)
Oh my gosh, I actually understood that and stayed awake for the entire thing. I'm feeling kind of emotional now, like those people in the home makeover videos right now. I'm crying because of the beauty of this.
[Kristin] (1:00:58 - 1:01:12)
Okay, on that note, Regina, I think we are ready to wrap up the episode and rate the strength of the evidence for the claim, which was that regular exercise speeds up biological aging in those with too little sleep.
[Regina] (1:01:13 - 1:01:40)
Okay, good claim. Now I'm fully appreciating the entirety of the story. And we rate claims on this podcast with our highly scientific trademarked smooch scale, one to five smooches, where one smooch means little to no evidence for the claim, and five means a lot of evidence in favor of the claim.
Kristin, how high, how many smooches are you going to give this one?
[Kristin] (1:01:41 - 1:02:18)
You know, Regina, you know that I have not thrown a martini in the face in a while. This is a martini in the face paper for me. And you know what?
I am not just throwing a martini in the face at the paper. I am also throwing it at the editors and the peer reviewers who let this through peer review because it's ridiculous. It's ridiculous.
There is no evidence supporting this claim. In fact, if anything, the paper says the opposite. And there are just so many things wrong statistically in this paper.
It really shouldn't have been published and it probably should be retracted. So if someone out there, one of our listeners, wants to write a letter to the editor and take it from here, I've done the groundwork. Go.
[Regina] (1:02:19 - 1:02:31)
You have done plenty of groundwork. Well, I cannot argue with that. I am just going to throw some Gatorades in the face instead of martinis in the face.
[Kristin] (1:02:32 - 1:02:33)
Oh, healthier.
[Regina] (1:02:34 - 1:02:41)
Did you know that they have Gatorade margaritas? They're called Gatoritas. Blue Gatoritas.
[Kristin] (1:02:42 - 1:02:48)
I mean, I can see that Gatorade has this nice, beautiful, bright color. So I can see that it might be useful in a cocktail. Absolutely.
[Regina] (1:02:48 - 1:03:05)
And then you add tequila, so it makes it extra healthy. Okay, we've got some martinis and Gatorinis in the face on this one. What about methodological morals?
There's so many here.
[Kristin] (1:03:05 - 1:03:14)
Oh, there's a ton here, Regina. But here's the one I'm going with. Before you believe something shocking, ask what had to go wrong to make it true.
[Regina] (1:03:14 - 1:03:20)
Oh, very nice. Yes, that counterintuitive thing. Stop and think about it.
Nice job.
[Kristin] (1:03:20 - 1:03:21)
How about you, Regina?
[Regina] (1:03:21 - 1:03:28)
How about this one? If slight modeling changes flip the story, there wasn't much story to begin with.
[Kristin] (1:03:29 - 1:03:34)
Oh, I love that. Yeah, so many things here. It's just that slight differences gave us different answers.
Yes, absolutely.
[Regina] (1:03:35 - 1:03:37)
But could I have a bonus one, actually, Kristin?
[Kristin] (1:03:37 - 1:03:39)
Oh, we need a bonus for this episode.
[Regina] (1:03:39 - 1:03:55)
Yes, absolutely. This is less a methodological moral and more an unethical life pro tip. Here it is.
If you do not want your analysis critiqued, then just make it impossible to understand.
[Kristin] (1:03:56 - 1:04:02)
I want to point out that we're being ironic there. We are not suggesting that you do that. Yes, I love it.
[Regina] (1:04:02 - 1:04:37)
Hence the unethical life pro tip. Yeah, there we go. Well, Kristin, this entire episode has been fascinating and not quite what I expected.
I thought I was going to get chided the whole time for me not sleeping enough, not exercising enough and looking old. But I feel like that's not what the end story has been about. This was way more interesting and not the sort of statistical sleuthing that we often do.
This took some unexpected turns. So thank you, Kristin, for a really good time.
[Kristin] (1:04:37 - 1:04:43)
Thanks, Regina. And thanks, everyone, for listening. Bye.










