Polarized Training Deep Dive and TrainerRoad’s Training Plans – Ask a Cycling Coach 299

Strava and anything it links to, Zwift, TR - there’s tons of data out there, but it would be pretty tough to design a solid case-control or cohort study that wouldn’t be full of significant limitations. Large-scale observational studies are typically hypothesis-generating because they will only identify independent associations. You can do analyses that might find statistical/conceptual mediators or look at dose-dependency to strengthen a suggestion of causality, but randomized controlled trials are the gold standard. We’ve now seen the dependence on RCTs to study therapeutics for COVID all over the news.

The TR team has dropped little hints about their data set, like most people not following plans, or a subset of users benefiting from SSBHV. I would bet the data set would just show mostly very poor adherence to plans or high attrition, but you can’t tell the reason why people deviate from the plan. The subset that follows plans rigidly is going to be subject to selection bias. You’ll probably find in that group a lot of slow-twitch triathletes who don’t like riding in groups/outside due to danger or who live in places with bad weather. I think intervals.icu and TR can show time-in-zone type data, but I don’t know how you would draw anything meaningful from this. Some people have smart trainers but no on-bike PMs. Some people have PMs that read quite differently from their smart trainers.

What outcome makes sense? Achieving FTP above a certain level - estimated by a CP curve? Watts/kg depends on people weighing and self-reporting accurately. Change in FTP? That’s going to be heavily influenced by past training experience. Newbies or highly trained athletes rebounding from time off will show gains.

I think there are many reasons why we don’t see big observational studies out there.

3 Likes

@_Matthew, @JPasdf: you are making interesting observations.

Matthew, your comments about observational studies are spot on. But remember that many of those comments also applied to the cohorts that Seiler made his name studying.

That said, think of this: every time that TR puts a factor [Z, say] into their ML algorithm, they are in effect proposing a hypothesis, namely that Z affects some training outcome [whichever outcome we want to define]. They then observe whether factor Z has any effect on the actual performance outcomes achieved by us lot. If it does, then we have a new observation: that Z seems to influence performance outcomes. And bingo, we have a new hypothesis that can be tested through randomised controlled trials. Given the cost and difficulty of running RCTs, it matters that the TR data will have proposed a hypothesis with a reasonable chance of being confirmed.

Consider the case of HRV. Does this really predict performance on a day to day basis? Do other HR metrics do any better?

1 Like

As a way to more easily compute the zones:

Counter point…

Knowing you can accomplish hard b2b2b days is empowering and confidence building. That even if your legs don’t feel great when you hop on the bike you can still knock out big watts and accomplish a workout or push yourself to new limits.

Now when you get to hour 5 of your race and you have an hour climb ahead of you…you know your legs will be there if called upon. Or if you get on your bike to warmup for a race and you don’t quite have that pop you think you should…you know you can still execute.

1 Like

I would be surprised if they are actually using models to predict performance outcomes. They already said they aren’t ready to use outdoor unstructured ride data, so you can assume the models are limited to readily available data. Variables have to be very specific (continuous, categorical: binary vs. ordinal) when they go into statistical models.

While the term machine learning is applied, my guess is that the models are being built around the RPEs that beta testers fed the system after completing workouts. Their successful completion of workouts with categorization like SS 1-10 and RPE will give a workout profile. Future rides and RPEs will “teach the model.”

At the very least, I would add in covariates like age, sex, and TSS history, as well as take power curve data to classify someone as a sprinter, TT, or all-around rider. They probably have some Whoop data and testing HR data to support the subjective RPE. Sleep data would probably help model fit a ton.

The model would learn how RPE/fatigue metrics are influenced by missed rides based on the behavior of these metrics in response to specific stimuli (categorized TR workouts like VO2 level 9 or something).

The model would say, hey you missed a workout, you were supposed to do X, we know from the data of other riders that if you try Y now, your RPE/fatigue metrics will be too high, so we will give you Z instead.

Or, we gave you A but you said it felt too easy, based on other data, an increase in difficulty of N should do the trick, so do workout B.

This would meet the need of adaptive training. The analysis probably grows from subsequent workouts to include all RPE data available and that complexity might graduate from what one might call simply logistic regression to “machine learning.”
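
To make the "you said it felt too easy, so bump the difficulty" idea concrete, here is a toy sketch. It is a hand-written rule, not a fitted model, and certainly not what TR actually runs; the function name, the 1-10 level scale, the target RPE, and the step size are all assumptions made up for illustration.

```python
def next_level(current_level, rpe, target_rpe=7, missed=False, step=0.5):
    """Toy rule: nudge a workout difficulty level (1-10) from RPE feedback.

    All parameters are hypothetical; a real system would learn these
    adjustments from many riders' data instead of hard-coding them.
    """
    if missed:
        # Missed workout: back off a little rather than resume as planned.
        return max(1.0, current_level - step)
    if rpe < target_rpe - 1:
        # Felt too easy: increase difficulty.
        return min(10.0, current_level + step)
    if rpe > target_rpe + 1:
        # Felt too hard: decrease difficulty.
        return max(1.0, current_level - step)
    # RPE close to target: keep the planned progression.
    return current_level


print(next_level(5.0, rpe=4))               # felt easy, level goes up
print(next_level(5.0, rpe=9))               # felt hard, level goes down
print(next_level(5.0, rpe=7, missed=True))  # missed, level goes down
```

A regression or ML version would replace the fixed thresholds with predicted RPE for each candidate workout, but the decision structure would look similar.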

1 Like

Absolutely nothing, that’s why I said there is a chain of dependencies which you seem to reject. Why do you think that isn’t reality?

I think for many people here they are interested in year over year FTP increases, placing in A races, moving up in race categories. These are long term goals that the physiology research world doesn’t investigate because it is hard to do that research. In contrast a 40k TT performance in a lab after 8 weeks of intervention is easy to measure, but is that the outcome most of us are interested in? Does continuing with one of the training methods for 2 years instead of 2 months actually lead to a different result?

4 Likes

I don’t think it’s realistic to study or model race performance. I would think of a training plan like a study plan for a high school student taking the SAT. The study plan should be judged on its ability to get users high scores on the test. It shouldn’t be judged on users’ acceptance into prestigious colleges. It is good to have a high score, but there are so many other factors involved.

Maybe other sports like swimming, running track/distance are determined solely by fitness. In cycling, we have drafting, cornering, pack dynamics - you don’t have to be the fittest rider to win races.

I do think it would be interesting to see data for plan adherence and changes in fitness over the course of an entire season though. In terms of fitness, this could be W’, CP, TTE - not just a proxy of power at MLSS.
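
For anyone wanting to track CP and W’ themselves, a rough sketch follows. It assumes the standard two-parameter critical power model, work = CP × t + W’, and fits it with ordinary least squares on a few maximal efforts. The function name and the effort numbers are invented for illustration, and real data will be noisier than this.

```python
def fit_cp(efforts):
    """Fit CP (watts) and W' (joules) from (duration_s, mean_power_w) pairs.

    Uses the linear work-time form of the two-parameter model:
        work = CP * t + W'
    so CP is the slope and W' the intercept of work against duration.
    """
    durations = [t for t, _ in efforts]
    work = [t * p for t, p in efforts]  # joules = seconds * watts
    n = len(efforts)
    mean_t = sum(durations) / n
    mean_w = sum(work) / n
    sxx = sum((t - mean_t) ** 2 for t in durations)
    sxy = sum((t - mean_t) * (w - mean_w) for t, w in zip(durations, work))
    cp = sxy / sxx
    w_prime = mean_w - cp * mean_t
    return cp, w_prime


# Made-up efforts generated from CP = 250 W and W' = 20 kJ.
efforts = [(t, 250 + 20000 / t) for t in (180, 360, 720)]
cp, w_prime = fit_cp(efforts)
print(round(cp), round(w_prime))  # roughly 250 and 20000
```

Comparing these two numbers at the start and end of a season would give the kind of fitness change mentioned above, with the usual caveat that the estimates depend heavily on which efforts were truly maximal.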

1 Like

I see @hubcyclist replied to this, but not TR

It would be good to know exactly what @Nate_Pearson was suggesting. If we have an HR monitor and a Garmin 30 series device, does he want us to record the activity on the Garmin rather than TR?

Yes

No

They don’t identify the bottlenecks as I was describing them. I’m NOT talking about identifying your weakness, like which metric of your fitness signature is low. I’m talking about the bottlenecks to what you’re trying to improve.

I explained how I would train the model in my post up above. Since I don’t have a huge data set or some experiment to work off of, I can’t give you a full concrete example, because that has to be looked for.

Some assumptions:
-Different intensities train the body in different ways (see your link to training peaks)
-different intensities strain different physiological systems (see the link to the podcast on mitochondria I posted above)

So based on that, it seems easy to think that there can be a chain of dependencies. High intensity intervals produce lactate, and you depend on your mitochondria to process it. So your mitochondria can be a bottleneck limiting what you can do. Identifying the optimal training intensity distribution would then mean training the bottlenecks that are limiting advancement.

Can you answer this? Why do you think that when you exercise you don’t just target only what you want to improve? Why is it important to target other areas?

I like many things about Xert, but I am not sure the classification into types says much, if anything, about the demands of a particular rider type. For example, they say a GC specialist has great 8 minute power and a Rouleur great 6 minute power. At least, I don’t believe there is any underlying complex modelling of performance requirements.

Packaging different focus durations into a training program is probably better than hitting the same focus duration over and over again. But is there any science to this? In any case, I don’t see anything about identifying and targeting limiters for a particular event (type).

It would probably also require a fairly sophisticated understanding of the performance demands of the target event?

Agreed, there are definitely times (events) where b2b2b hard days can be really beneficial.

Personally I would find that physically and mentally tough week after week.

I guess the POL counter point is: I’ve done plenty of 4hr rides so 5hrs isn’t that much longer. This climb looks like a VO2 max effort - yep, done plenty of that too. Not a problem - let’s keep going :slight_smile:

IMHO there is no right / wrong - just what works for the individual.

2 Likes

I’m having trouble understanding your graphs as when I try and guess the values you’ve used for each zone I get different PI values to what you’ve stated. Are you able to share the raw data used? Thanks.

Yet another possible metric, but read part 5 and look at the limitations and you’ll see yet again that there are a multitude of factors that could affect outcomes. And that’s before you even look at the fact that currently only a few HR straps produce data that is even close to being sufficiently artifact-free to be used for any quality results.

I’m sure the research and tech will get there at some point, but we are still a long way from that right now. But once again I honestly think it’s missing the point to some degree - if you train at some intensity level that ultimately impacts your ability to perform your high intensity sessions, then you’ve got it wrong. It doesn’t take long to know how hard to go on easy days to allow yourself to hit the z3 days fresh enough to nail them. It’s not like trying to bash out another SST session when you feel a little tired. Trying to nail 4x8m @105% will quickly tell you if you went too hard in the previous days…

1 Like

Well, here’s hoping I got this right…

For each of the Z1 percentages (from 100% down to 0%) I back-calculated the split of the remaining time between the other two zones that gave a certain P.I.

The table below presents percentages but as discussed above, you have to use a decimal ratio or drop the x100 from the equation:

P.I. = 2
Z1 Z2 Z3 P.I.
100% 0.00% 0.00%
99% 0.50% 0.50% 2.00
98% 0.99% 1.01% 2.00
97% 1.48% 1.52% 2.00
96% 1.96% 2.04% 2.00
95% 2.44% 2.56% 2.00
94% 2.91% 3.09% 2.00
93% 3.37% 3.63% 2.00
92% 3.83% 4.17% 2.00
91% 4.29% 4.71% 2.00
90% 4.74% 5.26% 2.00
89% 5.18% 5.82% 2.00
88% 5.62% 6.38% 2.00
87% 6.05% 6.95% 2.00
86% 6.47% 7.53% 2.00
85% 6.89% 8.11% 2.00
84% 7.30% 8.70% 2.00
83% 7.71% 9.29% 2.00
82% 8.11% 9.89% 2.00
81% 8.50% 10.50% 2.00
80% 8.89% 11.11% 2.00
79% 9.27% 11.73% 2.00
78% 9.64% 12.36% 2.00
77% 10.01% 12.99% 2.00
76% 10.36% 13.64% 2.00
75% 10.71% 14.29% 2.00
74% 11.06% 14.94% 2.00
73% 11.39% 15.61% 2.00
72% 11.72% 16.28% 2.00
71% 12.04% 16.96% 2.00
70% 12.35% 17.65% 2.00
69% 12.66% 18.34% 2.00
68% 12.95% 19.05% 2.00
67% 13.24% 19.76% 2.00
66% 13.52% 20.48% 2.00
65% 13.79% 21.21% 2.00
64% 14.05% 21.95% 2.00
63% 14.30% 22.70% 2.00
62% 14.54% 23.46% 2.00
61% 14.78% 24.22% 2.00
60% 15.00% 25.00% 2.00
59% 15.21% 25.79% 2.00
58% 15.42% 26.58% 2.00
57% 15.61% 27.39% 2.00
56% 15.79% 28.21% 2.00
55% 15.97% 29.03% 2.00
54% 16.13% 29.87% 2.00
53% 16.28% 30.72% 2.00
52% 16.42% 31.58% 2.00
51% 16.55% 32.45% 2.00
50% 16.67% 33.33% 2.00
49% 16.77% 34.23% 2.00
48% 16.86% 35.14% 2.00
47% 16.95% 36.05% 2.00
46% 17.01% 36.99% 2.00
45% 17.07% 37.93% 2.00
44% 17.11% 38.89% 2.00
43% 17.14% 39.86% 2.00
42% 17.15% 40.85% 2.00
41% 17.16% 41.84% 2.00
40% 17.14% 42.86% 2.00
39% 17.12% 43.88% 2.00
38% 17.07% 44.93% 2.00
37% 17.01% 45.99% 2.00
36% 16.94% 47.06% 2.00
35% 16.85% 48.15% 2.00
34% 16.75% 49.25% 2.00
33% 16.62% 50.38% 2.00
32% 16.48% 51.52% 2.00
31% 16.33% 52.67% 2.00
30% 16.15% 53.85% 2.00
29% 15.96% 55.04% 2.00
28% 15.75% 56.25% 2.00
27% 15.52% 57.48% 2.00
26% 15.27% 58.73% 2.00
25% 15.00% 60.00% 2.00
24% 14.71% 61.29% 2.00
23% 14.40% 62.60% 2.00
22% 14.07% 63.93% 2.00
21% 13.71% 65.29% 2.00
20% 13.33% 66.67% 2.00
19% 12.93% 68.07% 2.00
18% 12.51% 69.49% 2.00
17% 12.06% 70.94% 2.00
16% 11.59% 72.41% 2.00
15% 11.09% 73.91% 2.00
14% 10.56% 75.44% 2.00
13% 10.01% 76.99% 2.00
12% 9.43% 78.57% 2.00
11% 8.82% 80.18% 2.00
10% 8.18% 81.82% 2.00
9% 7.51% 83.49% 2.00
8% 6.81% 85.19% 2.00
7% 6.08% 86.92% 2.00
6% 5.32% 88.68% 2.00
5% 4.52% 90.48% 2.00
4% 3.69% 92.31% 2.00
3% 2.83% 94.17% 2.00
2% 1.92% 96.08% 2.00
1% 0.98% 98.02% 2.00
0% 0.00% 100.00% 2.00
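
For anyone who wants to reproduce the table, here is a minimal sketch of the back-calculation. It assumes the P.I. equation being discussed is log10(Z1 / Z2 × Z3 × 100) with the zones given as decimal fractions that sum to 1; the function names are made up for illustration.

```python
import math


def polarization_index(z1, z2, z3):
    """P.I. = log10(Z1 / Z2 * Z3 * 100), zones as decimal fractions.

    Undefined when z2 is 0 (division by zero) or when z1 or z3 is 0.
    """
    return math.log10(z1 / z2 * z3 * 100)


def split_for_pi(z1, target_pi=2.0):
    """Split the non-Z1 time between Z2 and Z3 so the P.I. hits target_pi.

    With r = 1 - z1 and k = 10 ** target_pi, solving
        z1 * z3 * 100 = k * (r - z3)
    for z3 gives z3 = k * r / (100 * z1 + k).
    """
    k = 10 ** target_pi
    rest = 1.0 - z1
    z3 = k * rest / (100 * z1 + k)
    z2 = rest - z3
    return z2, z3


z2, z3 = split_for_pi(0.80)
print(f"{z2:.2%} {z3:.2%}")                        # matches the 80% row
print(round(polarization_index(0.80, z2, z3), 2))  # 2.0
```

Feeding percentages (80, 8.89, 11.11) into the same equation instead of fractions multiplies Z3 by 100 while leaving Z1/Z2 unchanged, so the result comes out 2 higher. That is the decimal-ratio pitfall mentioned above.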

Mike

2 Likes

@_Matthew It looks as though we are having a side conversation here!

I agree that what they are talking about is using ML rather than regression techniques. But what I am suggesting is this.

Suppose that the machine has learned from the existing kinds of data about us that we know TR has access to – age, gender, weight and training history. From this it proposes workouts for us and can learn from our performance on those workouts.

Now suppose that TR says, feed us your HRV [or whatever data] and then feed that into the machine. The machine has to learn whether the HR metric makes any difference to performance on the proposed workouts. As far as adaptive training is concerned, the question is simply whether the HR metric makes any difference to the proposed workout – ie, whether it improves the ability to accurately propose appropriate workouts.

But as far as science is concerned, the question is what the machine learns about the implications of HRV for performance. If the machine learns that HRV seems to make no difference, that is one hypothesis for the RCT people. If the machine learns that HRV seems to make a difference [perhaps for some types of people], then there is another hypothesis for the RCT people to go after.

That is, there are two interested “users” here – we, the people being trained; and the scientists, looking for hypotheses that have a good chance of being correct [or, for the pedants, of not being proven incorrect].

4 Likes

Really enjoyed the podcast and give two thumbs up to the TR team, Amber, and Nate.

2 Likes

I don’t see how you’re making this assumption. Yes, optical HR sensors don’t do well at reporting accurate RR intervals during exercise, but chest straps work fine:
https://www.hrv4training.com/blog/hardware-for-hrv-what-sensor-should-you-use

There’s a big thread about HRV which includes posts by someone involved in developing and marketing an HRV app that uses either the phone’s camera or a chest strap.

1 Like