Engineer's Critique and Insights of TR AI workouts and FTP

Yeah, another thread on this, but I think I’ve got some insights from my engineer’s brain that may help others, and what good is the autism if it doesn’t help in particularly niche and nerdy situations like this?

Caveats:

-I don’t work for TR, so I don’t know that this is how it works, but I have what I think are good guesses, albeit dumbed down from reality.

-I don’t have any skin in the game besides wanting the system to train me so I don’t have to think about it (gives me more time to think about how it trains me).

-I’m not an exercise physiologist or coach or trainer or anything besides a scientific-minded layman.

-I actually don’t care about FTP, I only care about good workouts, but I understand why some people care (and in many cases care a bit too much lol)

AI Workouts - Excellent

I’ve seen a lot of comments here and on Reddit that indicate their AI selected workouts have been really good despite the AI FTP changing their supposed FTP. My workouts have been excellent. The intensity always seems to be correct for my hard workouts. This makes sense because the workout AI doesn’t use FTP to select a workout as they’ve straight up said a bunch of times, and I believe them.

Based on interpreting some things said on the podcast and some of the responses to others on this forum by TR staff, I’m pretty sure the workout selector AI looks at something like your 90 day power curve and finds workouts that are above that power curve in two domains - VO2 Max and Threshold. For VO2 it finds workouts with a power curve above your 90 day curve particularly in the sub-4 min domain (maybe up to 10 mins), and for Threshold it finds workouts where the power curve is above your 90 day curve in the 4-min to 60-min (maybe fewer, like 40 or 45 mins) region.

From that subset, the workout AI will pick a workout that meets your time demands, and depending on how you rated your previous workout on the 90 day curve (moderate, hard, v hard…) it selects for a workout curve closer to (very hard) or further above (moderate) the 90 day curve.
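To make my guess concrete, here is a minimal sketch of that selection logic. It is only an illustration of the idea, not how TR actually does it: the names (Workout, overload, pick_workout), the sampled durations, the target margins in watts, and the example curves are all made up for the example.

```python
# Hypothetical sketch: every name and number here is an illustrative assumption.
from dataclasses import dataclass

# Durations (seconds) sampled in each domain, following the guesses in the post.
DOMAINS = {
    "vo2": [60, 120, 180],
    "threshold": [300, 600, 1200, 2400],
}

# Assumed target margin (watts above the 90 day curve) by previous survey rating:
# a "very hard" rating keeps the next workout close to the curve,
# a "moderate" rating pushes it further above.
TARGET_MARGIN = {"moderate": 15.0, "hard": 8.0, "very hard": 3.0}


@dataclass
class Workout:
    name: str
    minutes: int
    curve: dict  # duration in seconds -> best average watts inside the workout


def overload(workout, rider_curve, domain):
    """Largest margin (watts) by which the workout beats the rider's curve in a domain."""
    return max(workout.curve[d] - rider_curve[d] for d in DOMAINS[domain])


def pick_workout(workouts, rider_curve, domain, max_minutes, last_rating):
    """Pick the workout that fits the time budget and best matches the target margin."""
    target = TARGET_MARGIN[last_rating]
    candidates = [
        w for w in workouts
        if w.minutes <= max_minutes and overload(w, rider_curve, domain) > 0
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda w: abs(overload(w, rider_curve, domain) - target))


# Made-up 90 day curve and workout library, threshold durations only.
rider_90_day = {300: 290, 600: 270, 1200: 255, 2400: 240}
library = [
    Workout("small bump", 60, {300: 293, 600: 274, 1200: 250, 2400: 230}),
    Workout("big bump", 60, {300: 305, 600: 286, 1200: 260, 2400: 235}),
    Workout("long", 90, {300: 298, 600: 275, 1200: 258, 2400: 245}),
    Workout("below curve", 45, {300: 280, 600: 262, 1200: 248, 2400: 235}),
]

choice = pick_workout(library, rider_90_day, "threshold", 60, "very hard")
```

With these made-up numbers, a previous "very hard" rating steers the pick toward the workout that only just clears the curve, while a "moderate" rating steers it toward the bigger bump.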

I think this system works extremely well for me, and seems it works very well for most from the comments. And the methodology makes intuitive sense to me based on progressive overload principles, and backs up why most have been saying their workouts have been good despite wildly varying feelings about their AI FTPs.

AI Predictor and Post-Workout Survey Interaction- Room for Improvement

The predictor seems to be kind of sensitive to the end of workout questionnaire, but I do think it’s ultimately self-correcting. I work out in Zwift, and its end of workout rating is 1-10, but those scores don’t actually correlate very well with TR’s qualitative assessment (Post-Workout Surveys – TrainerRoad).

I find a 7 or 8 RPE typically has at least a rep in reserve, so it is “hard” and not “very hard” per TR, but it is automatically logged as Very Hard. I think it doesn’t matter too much long term, and actually might be better for long-term growth as it’s more measured progress, but something to watch for and could leave some short-term gains on the table.

This also could kind of hurt people in the short-term if they see this and intentionally misrepresent their effort levels to impress the FTP predictor (guarantee some are this insecure). However, I think it will just lead to getting spanked on a workout from the workout selector AI as it will give you a workout with a power curve that’s a big bump above your 90 day curve.

AI FTP - Room for Improvement

Per the podcast and things I’ve seen posted by TR, your AI FTP is based on a Threshold Level 3 workout. I interpret this to mean that a Threshold 3 workout using the AIFTP is guessed by the Workout AI (based on your 90 day curve) to be hard or very hard with a fairly low chance of failure, and based on the workouts in my calendar that seems to be approximately the case.

I think revising your FTP in a manner similar to this actually could work quite well. It makes sense to me that a 60 or 75 min threshold workout should be able to provide some progressive overload, so increasing FTP and reducing the work durations makes sense to me.

However, it is my lay opinion that the threshold level 3 workout structures are actually just too easy (I’m thinking Fang Mountain -2, eg), resulting in an overestimated FTP. I have much more background in running, and the threshold structures used in running are more like Threshold 4 to 4.3 workouts in TR (eg Stromlo, 4x8 mins with 4 min rest). As it currently stands, Stromlo sits just below my 90 day curve, but after my anticipated AI FTP bump, the AI reckons that Stromlo will be too hard for me. I could be mistaken because cycling and running aren’t 1:1, but I think the Stromlo or San Pedro structures would be a fantastic benchmark/bread-and-butter Threshold workout. I think the workout AI is correct that at my predicted AI FTP these workouts will be too hard, but this leads me to believe it’s the AI FTP that is too high.

As currently constructed, I think the threshold development and VO2 development will be fine regardless, because the previously discussed way the workout AI selects workouts means that you’re bumping up your power curve in that domain with the easier workout structures. However, the consequence of the slight inflation to FTP is that workouts below your power curve, eg in the sweet spot and endurance domains, could reasonably come in kinda hot, particularly for those pushing lower watts, as each “zone” shrinks.
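Some quick arithmetic on the zone shrinking point. This is just a sketch under assumptions: I’m taking sweet spot as 88-94% of FTP (a commonly used band, not something TR told me) and a 5% FTP overestimate purely as an example figure.

```python
# Assumed sweet spot band as fractions of FTP (a common convention, not from TR).
SWEET_SPOT = (0.88, 0.94)


def zone_watts(ftp, band=SWEET_SPOT):
    """Lower and upper bound of a zone in watts for a given FTP."""
    return ftp * band[0], ftp * band[1]


def zone_width(ftp, band=SWEET_SPOT):
    """Width of a zone in watts; it scales linearly with FTP."""
    low, high = zone_watts(ftp, band)
    return high - low


def true_intensity(fraction_of_set_ftp, inflation):
    """Fraction of the rider's real FTP when the set FTP is inflated.

    inflation=0.05 means the set FTP is 5% above the real one (example value).
    """
    return fraction_of_set_ftp * (1 + inflation)


width_200 = zone_width(200)   # about 12 W wide at a 200 W FTP
width_300 = zone_width(300)   # about 18 W wide at a 300 W FTP
top_of_ss = true_intensity(0.94, 0.05)  # about 0.987 of real FTP
```

So under these assumptions the top of sweet spot at a 5% inflated FTP lands at roughly 99% of real FTP, which is basically a threshold ride, and the rider at 200 W has only about 12 W of zone to absorb the error versus about 18 W at 300 W.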

TR touts the tempo and sweet spot zones as a great area for growth, and I won’t disagree, and I understand that physiology happens in a gradient or spectrum and it isn’t light switches for the most part, but I think overestimating FTP for someone working a lot of SS could be maybe the worst scenario. I don’t think the low tempo/endurance transition matters as much, so for those operating more polarized it’s less important.

Anyway, I think benching off something like a 4.0 or 4.2 that provides a very small power curve bump, instead of basing it on a 3.0, may address the sweet spot concern as well as some others people have been having, like just feeling their FTP is impossibly high. But I understand you have data and people smarter than I am whose field this is, so I’m not going to push. Like I said up top, I don’t actually care about my FTP, but I do find it interesting.

Closing Remarks

The workout selection has been so, so good, so chapeau for that! I’ll be interested to follow the development of the AI FTP and predictor models. Best of luck and don’t let the haters get you down. I can tell y’all aren’t half-assing this.

17 Likes

Well put mate. And big up da TR staff.

I have to agree they smashed it.

I agree with the take on AiFTP being normalised on the 4s, not the 3s. And as someone who is still wedded to FTP, it is still a bit difficult to let go, but I’m coping, day by day, baby steps and all that. That said, that’s my only issue, and that’s because I’m an addict. The workout selection has been 10/10. I’m getting pushed just enough. I see the selection trying to nudge my performance up in different lower durations within a specified workout domain. It’s not relying on WLs; I can see it picking different structures because I need to improve. For example, I have good 30s power and can keep repeating it, but I get served a lot of 2 min power to push me there, which is working, while also keeping my 30s power moving up.

I will swap a slightly confusing take on my FTP (confusing for me, not speaking for all), for the impressive training I am getting.

Good work TR :clap:t5:

3 Likes

Wow, long post, will ask ChatGPT to summarise it for me…

6 Likes

I think you may be right that normalising back to a Threshold 4.0 might give an FTP number that is more similar to what people expect. However, looking at the stats Nate shared, this might have led to many TR users seeing even more of an FTP reduction, which might have led to a rebellion.

As it happens, my FTP increased by 1W so it wasn’t a big deal for me.

2 Likes

A 7 should map to a hard. 8 and 9 map to very hard with 10 mapping to all out.
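As a tiny sketch, that mapping could be written as a lookup. Only the 7 to 10 entries come from the line above; the function name is made up, and scores below 7 are deliberately left unmapped because nothing here says where they land.

```python
# Only the 7-10 mappings are stated above; lower scores are left unmapped on purpose.
SCORE_TO_SURVEY = {7: "Hard", 8: "Very Hard", 9: "Very Hard", 10: "All Out"}


def survey_rating(score):
    """Return the survey label for a 1-10 score, or None if the mapping is unknown."""
    if not 1 <= score <= 10:
        raise ValueError("score must be between 1 and 10")
    return SCORE_TO_SURVEY.get(score)
```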

5 Likes

Is Fang Mountain-2 what you are getting as your first Threshold after your detection? I thought most were getting a Cloud Ripper so something in the 3.6 to 4.0 range.

My first one was Yukon a 3.7 but that was because I did a zwift race the day before.

So we already are in the space of high 3s. I agree that having a way to shift threshold up would be nice but I haven’t seen any evidence that the new FTP is indexed to a 3.0 workout.

Just did an experiment, I added an interval Threshold after my AI detection and it’s a 3.9:

3 Likes

Your description of the AI workout selection does not match how models are trained and how inference works. Think more in terms of it predicting your RPE and heart rate for the next 5 (1?) seconds of a workout and iteratively predicting for the full workout time. It uses your previous workouts, RPE ratings, rest time, and heart rate to feed into predicting each 5 second window. At the end it selects the workout that results in a final RPE prediction curve that best matches the programmed ideal curve.

Predict means the model has been trained over numerous rounds to understand how to weight the impact of historic 5 second intervals on the next 5 second interval. Feed it 100,000 users x 1 year of data, get initial weights (think OLS), feed it an additional 50,000 users x 1 year of data to refine the weights, and give it a final 25,000 users x 1 year of data. Stop adding more rounds when the weights are converging well enough and you aren’t getting payoff from more data.

The threshold level 3 is again a prediction loop and search algorithm - pick a few 3.0 workouts, run the prediction loop on those workouts for different FTP values, and pick the FTP value that gives an RPE prediction curve closest to the ideal.
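A toy version of that loop and search, to show the shape of it. Everything here is an assumption made for illustration: predict_next_rpe is a hand-written rule standing in for a trained model, capacity_watts stands in for whatever the model would have learned from your history, and the search compares only the peak predicted RPE to a target instead of matching a full ideal curve.

```python
STEP_S = 5  # prediction window in seconds


def predict_next_rpe(rpe, intensity):
    """Toy stand-in for a trained model: RPE for the next window.

    intensity is power divided by the rider's (hypothetical) sustainable power.
    The 0.9 cutoff and both rates are invented numbers.
    """
    delta = 0.25 * (intensity - 0.9) if intensity > 0.9 else -0.02
    return min(10.0, max(0.0, rpe + delta))


def predict_rpe_curve(workout, ftp, capacity_watts, start_rpe=1.0):
    """Roll the one-step predictor forward over the whole workout."""
    rpe = start_rpe
    curve = []
    for seconds, fraction_of_ftp in workout:
        intensity = fraction_of_ftp * ftp / capacity_watts
        for _ in range(seconds // STEP_S):
            rpe = predict_next_rpe(rpe, intensity)
            curve.append(rpe)
    return curve


def search_ftp(workout, candidate_ftps, capacity_watts, target_peak_rpe):
    """Grid search: the FTP whose predicted peak RPE is closest to the target."""
    def miss(ftp):
        peak = max(predict_rpe_curve(workout, ftp, capacity_watts))
        return abs(peak - target_peak_rpe)
    return min(candidate_ftps, key=miss)


# Made-up workout: 3 x 10 min at 100% of FTP with 5 min rests at 50%.
WORKOUT = [(600, 1.0), (300, 0.5), (600, 1.0), (300, 0.5), (600, 1.0)]

best_ftp = search_ftp(WORKOUT, range(240, 301, 5),
                      capacity_watts=270, target_peak_rpe=7.0)
```

The real thing would swap the hand-written rule for learned weights and feed in heart rate, rest time, and past ratings, but the structure of predicting step by step and then searching over FTP values is the same.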

2 Likes

After the first detection it gave me Ishika 2.8 and after my next prediction tomorrow it has Avalanche Spire -1 for 2.9.

That you’re receiving 3.6 to 4.0 is interesting and certainly contrary to what I expect. A variable is that you’re certainly a more developed rider than I am, so that’s a likely factor there. Someone else commented about workout selection being based on a predicted RPE curve rather than power curves (certainly those would correlate well, but which is the mechanism makes a difference). Maybe the predictor just knows you tolerate a lot of work at threshold at a lower RPE. Would you say you are particularly strong in threshold type efforts?

Oh, interesting. Thanks for the insight. That mechanism for workout selection makes sense based on how, if you change a workout’s RPE, it will change your planned rides. And if that’s the mechanism, I can also see why my reasoning that they are comparing the power curve of the workout (I was thinking OLS, as you mentioned) to your 90 day curve seemed to fit, too. I would expect that power curve comparison to be doing some heavy lifting, with past RPE, rests, and HR refining the RPE prediction.

I’m not a computer scientist, so my understanding of computational complexity is pedestrian, but it seems like an excessive amount of computing to look at all 3000+ workouts. It predicts RPE really fast when you open a workout, but I wouldn’t want my computer to look at every workout each time it’s picking a workout. I would probably start with selecting a subset using a curve comparison, or probably even simpler just to grab from +1/-0.5 from current workout level or something, before implementing the predictor.
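Roughly what I mean by narrowing things down first, as a sketch. The names, the made-up library, and the toy scorer are all assumptions; the +1/-0.5 window is just the number I threw out above.

```python
def prefilter_by_level(workouts, current_level, up=1.0, down=0.5):
    """Keep workouts within [current_level - down, current_level + up]."""
    low, high = current_level - down, current_level + up
    return [w for w in workouts if low <= w["level"] <= high]


def pick(workouts, current_level, expensive_score):
    """Run the costly scorer only on the shortlist and return the best workout."""
    shortlist = prefilter_by_level(workouts, current_level)
    return min(shortlist, key=expensive_score) if shortlist else None


# Made-up library; a real one would hold thousands of workouts.
library = [{"name": n, "level": lv} for n, lv in
           [("A", 1.5), ("B", 2.5), ("C", 2.8), ("D", 3.0),
            ("E", 3.6), ("F", 4.0), ("G", 4.2), ("H", 6.0)]]

calls = []


def toy_score(workout):
    """Stand-in for the real predictor: records each call, prefers level 3.4."""
    calls.append(workout["name"])
    return abs(workout["level"] - 3.4)


choice = pick(library, current_level=3.0, expensive_score=toy_score)
```

Here the scorer only runs on the five workouts inside the window instead of all eight, and the same idea would cut 3000+ candidates down to a shortlist before any heavier prediction runs.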

I know I’m wrong about how this works lol, but trying to be only kind of wrong instead of totally. It’s fun to think about.

I wouldn’t say I’m particularly strong in Threshold. I am particularly weak in Anaerobic work though. :slight_smile: Or I could be in newb gains mode as I try real structured training vs Peloton with vibe mountain biking and that is bumping up my relative threshold level

Thanks for giving your workouts. I guess it shows the AI FTP isn’t as cookie cutter as “raise FTP until a 3.0 is hard, then set AI FTP there”. When you got your initial new FTP, did it go up much? Or go down? I had just had a detection the day before and then went up 2 watts. The previous day’s detection was a decent bump from 268 to 283, then 285 the next day. Most recent AI bump was 293, predicted to go up to 306.

My initial detection only slightly raised my FTP… :wink:

I’m quite large at 6’2” 200lbs, so W/kg is medium low, but predicted to go up 8% to 285 tomorrow despite not much riding. I’m a trail runner primarily, so if I feel I need rest for future running, I don’t ride the bike. All newbie gains taking my decent fitness from running and adding cycling specific muscle and neuro pathways. No running past week or two due to onset of some plantar fasciitis :frowning:.

I knew it wasn’t as simple as I was guessing, but your case has me puzzled. If the AI FTP is running RPE prediction loops on threshold workouts, why is it that yours is not moving you back down in structure to 3.0 after the update? Maybe they are benchmarking people to assorted workout levels and seeing how they perform to try to find an optimal level or workout to use for the benchmark? Probably will never know, but I do like a mystery.

I think my first threshold workout after detection was like a 5.9. VO2 max workouts were in the 7s. I haven’t done a threshold or VO2 max workout since cross season back in December. I could probably do them since my FTP right now is considerably lower than it was in December, but we’ll see this next week or so.

1 Like

Whoa, that’s interesting. My case is almost as they described with my threshold levels going down to around 3 after each update, but you’re the second person on this thread that this isn’t true for, so clearly it’s not so simple as give them an FTP where a level 3 workout is 50/50 hard/very hard or whatever. It’s more interesting to be wrong haha. I have more questions.

1 Like

Now that I go back and look, my first VO2 max workout after my last detection was an 8.3. Crazy. I didn’t do it and did a SS ride instead, I think. I’ve marked most of my endurance rides as easy because, well, they’re easy. Some moderate based on duration. They’re basically my recovery watts from when I was doing intervals in the summer/fall last year. Tempo and SS are usually moderate or hard depending on the type of work. I may make some changes to the plan as it has me doing very little endurance/SS work. Then again, I get a lot of my endurance from splitboarding or XC skiing (2-3 days a week).

I have another detection coming up soon. I’ve been swapping those high level VO2 workouts for some lower ones.

1 Like

Well, just so you have some evidence that the new FTP can be indexed to a 3.0 workout: my first workout after my FTP increase is Moose’s Tooth -1 with a 3.0 WL. I have no problem with it; so far my workouts have been perfect for me!

2 Likes

I think I only ever got one thresh close to 3 after the very first ftp update during the beta, which raised my ftp. Everything else has been closer to 4. After ftp update this coming Wednesday (+3.8%), it’s got a 4.2 on Friday.

2 Likes

Just got a new FTP increase (+4.8%). Next threshold workout on my calendar is a 4.2 also (Starlight). I got pretty minimal increase in workouts last block despite exceeding power/duration because rides were feeling easy. Power value for the next ride (with new FTP) pretty much exactly matches the power I increased my last rides to with the prior FTP.

Predicted RPE is split between medium/hard. That matches my expectation for the ride.

Overall the progression on my AI workouts seemed slow last block, with a lot of rides feeling too easy. New starting point looks spot on though.

I put in my notes on several rides last block that the RPE survey needs an option for “this was too easy and felt like the wrong training zone”

2 Likes

Maybe it has something to do with your threshold being relatively strong either vs VO2 or vs longer endurance efforts… Do you think maybe it’s trying to keep your VO2 or endurance work easier since maybe you don’t hit as high in either of those domains?

I think it could make sense that the VO2 work could be set to very high level. If you compare the power profile on the 8.3 workout to your 90 day power profile, is the 8.3 VO2 workout much higher than what you’ve done? If you’re a cross racer, you might just have a really high top end relative to your threshold, so you can punch hard relative to your FTP.

My AiFTP rolled over last Wednesday with a 15w increase (245 -> 260) and I had my first threshold workout on Saturday, which was served up as Ishka 2.7.

Difficulty was HARD as predicted. You can see my HR climbing for the overs and decreasing during the unders, so bang on I’d say for an over/under threshold interval workout at a 260w FTP.

Next Saturday AI has lined me up for Brutal 3.4 which will give a nice lift on the NP power records for the last 6 weeks. The unders are also “under” enough that I think this will be achievable at a “Hard” rating.

If I ask for a harder workout, all the 4.0s are not recommended. I would agree with this, as my HR would likely not recover enough during the unders and that would result in a failed workout.

Therefore for me anyway, basing my FTP around a productive 3.0 Level threshold workout seems about right.

3 Likes