ADDRESSED: Mounting Frustration with the AI Update

Hi Caro,

Thanks for looking into my calendar, I appreciate the response. I think if you take a close look at the 3-hour ride I did on Saturday, it becomes less of a red flag than it initially appears. I was assigned a two-hour endurance workout from TR. I rode outside and actually did a 90-minute climb at an easy pace (my power meter died, so unfortunately this was based on HR and RPE). The next 30 minutes were a descent with no real pedaling. The next hour (recorded as a separate workout) was actually a photoshoot I helped out with for a team sponsor. That “ride” basically involved pedaling around in Z1, maybe low Z2, for a hundred yards or so, then stopping for a few minutes, and then repeating. I genuinely don’t believe either of these rides created any excess fatigue.

My Monday strength training was purely upper-body and core work. No legs, and I’ve also been consistently strength training for years. I am certain my strength training adds additional stress/fatigue, but it’s been extraordinarily consistent for the last decade, so I don’t think this particular session was any harder or easier than any other in my history.

I have noticed, however, that my de-load week this time was full of actual, full-length endurance rides. In the past I am pretty sure TR has recommended shorter and easier rides that felt more like recovery spins than actual endurance workouts. I am wondering if that was due to the recent update?

Yeah! You’ve been crushing it! Most everything was in spec from this period you were talking about. The only struggles I’m seeing are coming from Threshold workouts, but that’s minimal. Overall things were going great!

When I run the admin tool to find out what your FTP would have been with the new version of AI FTP Detection on Jan 10 (the date you were given 316 by the old version), it shows 309, so I’d use that as the point of reference.

Crucial to point out your FTP did NOT drop from 316 to 309. It was already at 309.

The first workout you were prescribed after the switch to using the new version of AI FTP Detection was on the easier side. That said, and judging by the comment, I’m not sure if you were trying to sway the model with the survey response, but you rated this one Easy even though you hit HR of 177, which is 5 bpm higher than what you have hit in other Sweet Spot workouts recently that you rated as Very Hard. I don’t know for sure, but I think this was a bit of an anomaly for the model to deal with, and possibly affected your next Threshold workout more than it would have if the survey was not answered as Easy. Of course, I could be wrong in my assessment that this was not Easy, but in other situations where you’ve marked workouts as Easy you are typically between 140-160HR, and Moderate is typically 150-160HR.

After this it looks like your next VO2 Max workout was appropriately difficult.

Then for your next VO2 Max workout, it was 55% likely to be Very Hard and 40% likely to be Hard. You rated it Hard, which isn’t far out of spec, but you hit 182HR in this one and had tons of 6 week PRs, so this one looked like it was pushing your limits well. Whether “Hard” accurately reflected what you felt is something I can’t know. Signs would point to this one being Very Hard, but it’s still not far out of spec if it’s rated “Hard”.

Your next Threshold workout was 65% likely to be Hard and 25% likely to be Very Hard. It’s an objectively harder workout than what you got the week before and you reached 179HR. If we look at the average power you did for those intervals, it was only 2w off of your 6 week PR, so that info, combined with your HR data, would suggest this was harder than “Moderate”. Again, I’m not sure if you were trying to sway the model here or if it did indeed feel moderate, but the data makes a “Moderate” rating a bit puzzling.
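To make the "in spec" comparison concrete, here is a minimal Python sketch using only the percentages quoted above. The function name `rating_likelihood` and the idea of treating the unquoted remainder as an upper bound are assumptions for illustration, not how TrainerRoad's model actually scores a survey.

```python
# Hypothetical illustration only; not TrainerRoad's actual scoring.
def rating_likelihood(forecast_pct: dict[str, int], reported: str) -> tuple[int, bool]:
    """Return (percent, is_exact) for the rating the rider reported.

    If the reported rating was quoted in the forecast, return its percent.
    Otherwise return whatever is left after the quoted ratings, which is
    only an upper bound because it is shared by all unquoted ratings.
    """
    if reported in forecast_pct:
        return forecast_pct[reported], True
    return 100 - sum(forecast_pct.values()), False


# Percentages quoted in the post above.
vo2_forecast = {"Very Hard": 55, "Hard": 40}
threshold_forecast = {"Hard": 65, "Very Hard": 25}

print(rating_likelihood(vo2_forecast, "Hard"))            # (40, True)
print(rating_likelihood(threshold_forecast, "Moderate"))  # (10, False)
```

Read this way, a "Hard" rating on the VO2 Max workout matched a 40% forecast, while "Moderate" on the Threshold workout had at most 10% left over for it and every other unquoted rating combined.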

After that you had a recovery week and then you had Blue Top.

You ended this workout early and hit an HR of 182, and based on what you’ve said here, I assume this was due to intensity, but you didn’t mark that down as the reason. Instead you marked “Other” and stated “Your shitty fucking AI.” as the reason :melting_face:. It would be super helpful if you could mark the reason why you had to end this one early. :slight_smile:

Looking at the prescribed power on this workout, it was only a handful of watts higher than what you had done for 2.5-3.5min for the last 6 weeks, and when we look at that same duration compared to your all-time power, it was 128-148w lower.

TR AI doesn’t use Workout Levels to pick your workouts, so if we view them from this lens, it gets confusing. It’s better to look at the power you’re doing during the sets/intervals and see how that compares to what you’ve done recently.

When you do this for all of the VO2 Max and Threshold workouts you’ve been prescribed since starting to use the new version of AI FTP Detection, you can see that there’s actually a pretty linear progression.

That first workout after starting to use it was a bit conservative, but that’s the only outlier I’m seeing when looking at the Power Records charts.

But above all, and this is super important, TrainerRoad AI has been picking your workouts for over six months now behind the scenes. So when you perceived things were going great, that was all due to TrainerRoad AI and that hasn’t changed.

The only “change” is that you started using the new version of AI FTP Detection. It didn’t lower your FTP, you just started measuring FTP with a different scale that is proven to deliver better training outcomes.

In terms of the workout prescriptions after that point, other than one workout being a bit on the easy side (and predicted to be so), your training looks pretty well locked in based on the Power Records charts.

My suggestion would be to stick with it through this next block and to fill out your surveys objectively, based on the question it asks: “How did that effort feel?” This will always lead to better results.

44 Likes

@Jonathan said “shitty” :winking_face_with_tongue:

(Yeah, yeah, I know it was a quote, but it had to hurt to type)

14 Likes

The AI system doesn’t know if you’ve slept badly, had a bad day at work, had a big argument with your spouse, had too much to drink the night before, etc. That’s one reason why you may have to tone down some workouts and not others.

7 Likes

I also failed a workout today (Thursday). Antelope +1. I had successfully completed Geiger +2 on Tuesday but just had absolutely nothing going for me today. It felt like I was riding through treacle. Heart rate was several beats higher than Geiger for similar power. Recovery week next week, but Cloudripper -2 to get through on Saturday before then :face_exhaling:

I’ll bet CR-2 gets replaced

I love seeing Jonathan or other TR crew jumping in and analyzing folks’ ride histories and such, but here’s a possible big issue for me: the after-ride survey. We’ve all seen the chart on how to rate how a workout felt, and I would say capital FELT. Honestly, I find it very difficult to know where exactly I should be rating things, even with that knowledge and even though I’ve been on here for years. Then I see Jonathan breaking down someone’s HR and the 6 week power PRs they hit in a workout, and how that maybe should have meant the workout was rated one way, but they called it something else.

It’s too much. I can’t be digging into the stats on every ride I do to make sure my HR or my power was X. I thought the survey was just how it felt on that day. I feel like we’re getting mixed signals on how “correct” we need to be answering the survey or we will then get the wrong stuff as follow-up rides. How important is the survey really, I guess? I don’t think the training schedule should be set by it. You have all the data, the power we did and the HR response to it. I thought that was the whole point of this advanced system that has all of my data and all of the data from millions of rides from other people, that it can crunch all of that and give me the right training. I would much prefer that over, well, you called that last ride hard and we thought it would be moderate, therefore you now get easier stuff for the next week. I don’t want my coach to give me workouts only based on how they felt, I want what is going to make me faster.

10 Likes

The biggest issue I’ve got with the survey is that it’s the same survey no matter the type of workout.

An anaerobic workout will basically always be hard for me. Even if it’s “easy”, it’s still hard, because it makes my legs hurt. Same with VO2 workouts: even if I can do a lvl 6, a lvl 2 will feel hard.

Should these be marked easy as they are easy relative to other similar workouts? Or should they be marked hard, because they are hard?

I feel that I’m not managing to get my point across correctly… Let’s try again.

1h @ 50% = easy

1h with 5x 5 min intervals @ 100% = easy

But they are very different easy.

7 Likes

No need to over-think this: if it felt hard overall, rate it hard. If it felt easy, rate it easy. Etc.

I just rate things how they felt and don’t spend much time thinking about it. For reference, I’ve generally rated Endurance workouts Easy, but if long enough (e.g. >2.5-3 hrs) then it could become Moderate. Sweet Spot workouts are normally Moderate at lower TTEs, rising to Hard for longer TTEs. Threshold are usually Hard, rising to Very Hard at longer TTEs. VO2max Hard to Very Hard, or very occasionally Moderate when starting out with less taxing workouts.

That’s been my normal pattern, historically, but obviously I depart from these norms where my RPE felt different, such as a SS workout feeling very hard due to tiredness. Those departures are precisely the signals the AI is looking for to indicate something could be going off track and an adjustment might be needed.

11 Likes

I don’t know why you think it does that, and it does not do that. The model is driven predominantly by measured data, but feedback is an additional essential part - just as it would be if working with a human coach - in order to validate “the model” that the system has of you, and to adjust as necessary.

4 Likes

I’m simply asking how important the surveys are because I see them used in examples as to WHY things get changed on the calendar. They should be part of the equation, but the actual data (that I’m not willing/able/have time to dig into) should be the biggest piece of the puzzle by far.

You can do a little experiment. Go to your last workout (hopefully rated hard). Change it to max effort and watch the change on your yellow and red days, on your future workouts, and your predicted FTP. Then do the same for easy. Then restore the true rating. The effects are huge.

It’s important to answer honestly, because you can trick the system into inflating your FTP, but you can also trick the system into giving you workouts you’ll fail, like OP.

2 Likes

You’ve now changed what you’re saying, after previously suggesting the system was only using RPE data (“I don’t want my coach to give me workouts only based on how they felt”).

It is using your data as the primary input, else we’d have people with 150 FTPs being asked to do Threshold workouts at 400w, and vice versa, plus other crazy examples like this, which we are not seeing…

What makes you think that the system isn’t using your data as the primary input into its model?

From the examples posted where TR staff have drilled down into a few individual workout histories, the main thing that seems to have occurred is people being very inconsistent in how they answer their surveys. Not subtle, borderline issues where someone is unsure over Hard vs. Very Hard, but strange-looking examples, such as someone achieving a close-to-PR power record but responding that it was Easy, in a manner that’s quite inconsistent with how they responded to a previous survey at similar levels. Some of these instances look like people are trying to manipulate the AI for some reason, rather than just giving honest feedback on how the workout felt. Keep it Simple applies here - just rate it how it felt overall - no need to overthink it.

10 Likes

Wrong. Re-read my original question. I never ever said it only uses the survey to program. I’m asking how much of the pie, or how important it is.

Also, I don’t give them much thought, I just answer them. My issue was that I don’t know if I’m answering them correctly and if that is somehow screwing things up. Same as working with a coach, we need to understand what is being prescribed and trust it. We only have the survey to give our side of the equation, we can’t talk it out as to why it may have felt that way or whatever. But unlike a coach, TR has the AI or whatever that can churn through all of my workouts and all of the workouts of similar athletes so that it can then decide if what I just did actually looked hard or very hard or easy given my usual power and HR response to that kind of power. I want that intelligence to guide my training more than, well, you answered this on a survey. It just seems more scientific to me. So, again… how important are the surveys?

2 Likes

I think users need to understand that rating intense workouts as “easy” isn’t necessarily pushing the model to give you much more difficult workouts. It’s seen this trick before, many times, as evidenced by so many forum posts. It’s not an algorithmic equation, it’s a smart ML model.

“Easy” is reserved for endurance days where during the hardest part of the workout you’re sending work emails and answering phone calls about the 4 hour cable guy appointment window being pushed out to 6 hours. Remember, when you call an intense day easy, you are also saying that that recovery ride the other day is just as hard as today’s VO2, where you set power PRs and nearly maxed out your HR. Yes, it’s smart enough to know the difference, which also means it’s smart enough to know when you’re BSing.

There are more clever ways to manipulate the program, but I would just stick to what @Jonathan says and be super honest about the relative effort.

8 Likes

We’re entering into semantics here: the effects can be material, but they’re not huge, because the model is still anchored by your measured performance data. A “huge” effect in my taxonomy would be something 30-50% off your forecast FTP, not tweaks to upcoming workouts, or suggesting that a rest day be taken, or a modest change to forecast FTP 3 weeks out. IMO these examples are just what you’d expect to occur as RPE feedback gets incorporated by the AI in order to validate its model of you/me/us.

3 Likes

Also, one of the really cool things about TR is that you can go back and rerate all of those intensity days you called easy through a more objective lens. This should help the model do its best work for you.

2 Likes

You clearly like my answers, so I’ll continue… :laughing:

You’re answering the survey correctly if you (A) answer honestly and (B) answer consistently.

I think the piece of the puzzle you’re missing is that the AI needs to continually validate its model of you.

Prior to you completing a workout, it has simulated how it thinks you’ll perform it, and produced a forecast for expected difficulty. After you’ve completed the workout it has your data from the workout - “your actual performance” - but it doesn’t know how it felt to you.

Your survey response provides this information, so that it can validate whether its model of you is correct. For example, if it forecasted a difficulty of Easy for your workout yet you responded with Hard, then something is amiss. It likely cannot see, or infer, that from your performance data, and instead needs your subjective assessment in order to obtain that information, to adjust its model, and take whatever steps it then deems appropriate. It cannot read your mind as to how it felt, hence the survey feedback is required. Without this feedback, the “control system” of keeping your training on track would just not work well at all, either digging you into giant fatigue holes or not stretching you sufficiently to improve.
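As a rough illustration of that forecast-versus-survey check, here is a minimal Python sketch. Everything in it is an assumption made for the example: the ordering of the rating scale, the function names `survey_gap` and `needs_review`, and the one-step tolerance. It is not TrainerRoad's actual model, which is not public.

```python
# Hypothetical illustration only; not TrainerRoad's actual model.
# Assumed ordering of the survey scale, easiest to hardest.
RATINGS = ["Easy", "Moderate", "Hard", "Very Hard", "All Out"]


def survey_gap(forecast: str, reported: str) -> int:
    """Signed number of steps from the forecast rating to the reported one.

    Positive means the workout felt harder than forecast, negative easier.
    """
    return RATINGS.index(reported) - RATINGS.index(forecast)


def needs_review(forecast: str, reported: str, tolerance: int = 1) -> bool:
    """Flag a workout whose survey differs from the forecast by more than
    `tolerance` steps (the tolerance value is an assumption)."""
    return abs(survey_gap(forecast, reported)) > tolerance


# The example from the post: forecast Easy, rider answered Hard.
print(survey_gap("Easy", "Hard"))         # 2
print(needs_review("Easy", "Hard"))       # True
print(needs_review("Hard", "Very Hard"))  # False
```

The point of the sketch is only that the survey supplies the one input the performance data cannot: a mismatch like this is the signal that the model of the rider needs adjusting.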

6 Likes

I’ve said the same thing in the past. I’ve never in my entire riding career done a VO2 max workout that’s rated less than Very Hard. And I’d say that most are All Out. To me, if a VO2 workout is less than that, you didn’t go hard enough. Every VO2 max workout should be Very Hard or All Out. That’s the point of the workout.

In a similar sense, I’m not sure I’ve ever done a Threshold workout that wasn’t rated Hard or Very Hard. And before anybody says my FTP is too high, I can hold that power for just about an hour, so I think it’s set fine. Threshold is just hard. If you’re doing Threshold and it’s easy then your FTP is set too low.

So what I get is Z2 is always Easy or Moderate. SweetSpot is Easy (rarely), Moderate, or sometimes Hard. Threshold is almost always Hard, sometimes Very Hard. VO2 is always Very Hard or All Out.

It might be nice if it was zone specific. So for a Threshold workout, instead of Hard, I could say yea it was Hard but it was actually Moderate for a Threshold workout. Or yea that VO2 max was still Very Hard overall, but it was on the easier side for VO2. Might create more confusion though.

I also wonder if HR can throw things off. I don’t run fans on Z2 rides because I don’t like them. So I can have an indoor Z2 ride that’s easy but my HR is 140bpm (my max is around 185). It feels easy. I don’t feel like it’s any harder without fans. And I rate it Easy. But I’m sure the system sees the 140bpm and thinks I’m overexerting myself. I’ll do the same power outside and my HR is 110.

5 Likes

To @CaptainThunderpants’s point (who I assume by “doing them properly” means out of ERG mode and pretty much ignoring the power target): if you do it that way then VO2max could always be Very Hard or All Out, but the system should also capture that you over- or under-achieved the power target.

As long as you rate the work you did then you should be fine.

3 Likes