Zwift step test too high vs TR?

That video is fundamentally flawed as a comparison of the tests - the power is being driven by TrainerRoad and both bits of software are just taking the last 1 minute of power and multiplying by 0.75 to get FTP. The Zwift protocol is different and was not tested. On the Zwift screen you can see that the power targets are way off.
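To make the calculation concrete, here is a minimal sketch of the "best 1 minute times 0.75" estimate. The function name `ftp_from_ramp`, the one-sample-per-second input and the made-up ride are all assumptions for illustration. The 0.75 factor is simply the one quoted in this thread, not something confirmed from either app's code.

```python
def ftp_from_ramp(power_watts, window_s=60, factor=0.75):
    """Estimate FTP as `factor` times the best average power over `window_s` seconds.

    Assumes `power_watts` holds one power sample per second (an assumption
    made for this sketch, not something taken from either app).
    """
    if len(power_watts) < window_s:
        raise ValueError("need at least one full window of data")
    # Rolling sum over the window, keeping track of the best one seen.
    window_sum = sum(power_watts[:window_s])
    best = window_sum
    for i in range(window_s, len(power_watts)):
        window_sum += power_watts[i] - power_watts[i - window_s]
        best = max(best, window_sum)
    return factor * best / window_s


# Made-up ride: 20 W steps from 100 W, one step per minute, ending after the 300 W step.
ride = [100 + 20 * (t // 60) for t in range(11 * 60)]
print(round(ftp_from_ramp(ride)))  # 225
```

If both apps read the same power stream and apply the same factor, they have to land on the same number, which is why the video says nothing about the two protocols themselves.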



It’s like weighing yourself on different scales.

I think you just have to pick a method that works for you (i.e. has the TR test set your zones accurately so that sweetspot is manageable but uncomfortable, and vo2 is horribly painful but just about possible to complete?), and stick with it.


All good questions. I haven’t done a side-by-side comparison, and I would love to have those kinds of gains; I haven’t had that in years. I’d just be shocked if I went from 216 to 230 in 4 weeks.

In terms of power, I’ve been using the same PM (calibrated before every ride). I was actually thinking of taking the TR test again to see if I can match the results. Funnily enough, Zwift estimated my FTP at around 214 (close to the 216) based on my previous races where I felt like puking at the end, which makes me think that the TR ramp is much more accurate.

I had watched the same video indicated above and thought it seemed flawed. If the ending calculation is the same then it doesn’t matter how you get there if it’s all being driven by the same source.

All this said, FTP is a funny relative metric and in my opinion does 2 things: 1) “appropriately” sets your workout thresholds, but that is subject to 2) not potentially creating an “artificial ceiling”. For example, if I “know” my FTP is 216 and I’m doing a 230 interval, one would naturally start thinking “I can’t sustain this for long since it’s above my FTP”, resulting in mentally checking out > physical checkout > not training at your capabilities > not seeing real improvements… think you all get where I’m going here and recognise how important the mental part is…

May just make sense to boost it to work harder and if it doesn’t work, drop it down…


The difference between the two is only about 6%, and given that your actual achievable threshold power differs every single day, I wouldn’t worry about it much.

If you go into every workout targeting a precise power number you’ll have days you can hit it and some you can’t. As long as you’re in the ballpark, train consistently using whatever number and method you like and you will see improvement.


Unfortunately this is an inherent limitation of estimating workloads from a ramp test without physiological data to reveal what is actually happening in the body. The particular peak workload/power you land on will be protocol dependent, and therefore so too will be any threshold estimate if it’s based on a flat percentage of peak power.

Shorter stages or higher steps will produce a higher peak power and higher threshold estimate. Longer stages or lower steps will produce a lower peak power & threshold.


Many (most?) studies use ramps of 20-30 W/min over continuous (ramp test) or 1-3min stages (incremental step test) and are designed to take the subject to exhaustion in ~12-15min. The specific protocol often depends on the target population, ie. more well-trained subjects will have a steeper ramp to reach exhaustion in 12-15min, while less-trained/general pop will have a shallower ramp to reach exhaustion in the same time.

So both Zwift & TR are in line with traditional ramp test protocol. I don’t think we can say either is overall more accurate than the other. It will depend on your individual physiology, which neither measures.

I’m less familiar with the design of the Zwift ramp test, but from the TR podcast and articles, I know they’ve dialed in their ramp protocol (6% steps, 75% of peak 1min power) based on a massive dataset from the TR user population. Probably the widest data set on ramp tests ever collected. So I trust the TR protocol is valid on average… but you don’t care about average. You care how accurate the test is for you.

IMHO, the best thing you can do with the results of either ramp test is just start training knowing your FTP is probably somewhere within that ~15 W range, and dial in your workout targets as appropriate without being too beholden to a specific FTP number.


Assuming both tests use the same percentage of the final step to set your FTP, which I doubt is true. Both of those protocols may work fine if they are calculated differently to end up at a similar result.



Zwift ramp test:

  • Starts at 100w after a “free ride” warm up with no set wattage.
  • Uses a fixed 20w step per minute.

TR ramp test:

  • Starts at 52% of starting FTP.
  • Uses 6% of the starting FTP per minute.

  • 20w would be 6% steps for a 333w FTP.

In Zwift:

  • If you have an FTP around 100w, you may have real trouble with this version.
  • If you are further away from 333w FTP, there may be more of an impact in the feel of the steps.
  • The lack of adjustment to a base FTP around the rider seems like a problem, depending on where your FTP really is.
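A rough sketch of how the two protocols described in this thread scale with rider FTP: fixed 20w steps from 100w, versus 6% steps from 52% of starting FTP. The constants come from the posts here. The function name `compare_protocols` and the sample FTP values are assumptions for illustration only.

```python
ZWIFT_START_W = 100   # fixed starting power, as described in this thread
ZWIFT_STEP_W = 20     # fixed step per minute
TR_START_FRAC = 0.52  # starting power as a fraction of starting FTP
TR_STEP_FRAC = 0.06   # step per minute as a fraction of starting FTP


def compare_protocols(ftp_w):
    """Return start and step sizes for both ramps, in watts or % of FTP."""
    return {
        "zwift_start_pct": 100 * ZWIFT_START_W / ftp_w,
        "zwift_step_pct": 100 * ZWIFT_STEP_W / ftp_w,
        "tr_start_w": TR_START_FRAC * ftp_w,
        "tr_step_w": TR_STEP_FRAC * ftp_w,
    }


for ftp in (100, 216, 333):
    c = compare_protocols(ftp)
    print(
        f"FTP {ftp} W: Zwift starts at {c['zwift_start_pct']:.0f}% of FTP with "
        f"{c['zwift_step_pct']:.1f}% steps; TR starts at {c['tr_start_w']:.0f} W "
        f"with {c['tr_step_w']:.1f} W steps"
    )
```

For a 100w rider the fixed protocol starts right at FTP and climbs 20% of FTP per minute, while at 333w the two step sizes are essentially the same. That is the "further away from 333w" effect in numbers.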

Why doesn’t someone with Zwift actually try it out? It’s not like you actually have to ride to exhaustion to work out the multiplication factor. As far as I understand from the video, the factor is 0.75.


Looking at the TrainerRoad bell curve I’d say 100w isn’t a bad place to start for a lot of cyclists on an FTP test. That is the issue with anything besides an actual lactate test, it’s all in the math and each protocol will be skewed in one aspect or another. No test will make everyone happy.

I personally am of a mind that ramp tests in general are too biased to short power and reward training plans heavy on HIIT that I don’t like to overemphasize. Doesn’t mean it’s wrong. Just not for me.


That’s not necessarily true. There’s an unofficial ramp test you can download to do on Zwift that has 2.5 minute steps going up 25W each time, and calculates your FTP as 0.825 of your best 2.5 minutes. Which means that you only ever reach 121% of your FTP.
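The 121% figure follows directly from the multiplier: if FTP is a fixed fraction of peak power, then peak power is one over that fraction of FTP. A quick check is below. The helper name `peak_as_pct_of_ftp` is made up for this sketch, and the 0.75 line uses the factor quoted elsewhere in this thread for comparison.

```python
def peak_as_pct_of_ftp(factor):
    """If FTP = factor * peak power, the peak is 100 / factor percent of FTP."""
    return 100 / factor


print(round(peak_as_pct_of_ftp(0.825)))  # 121
print(round(peak_as_pct_of_ftp(0.75)))   # 133
```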


Totally agree. The value of the ramp test is that it doesn’t take too much out of you and doesn’t take long, so can be performed regularly.

On the other hand, if you are looking to perform steady state endurance rides a la Gran Fondo style, then you are unlikely to be hitting peak power on a regular basis. Therefore, I lean towards the 20’ test or even the best 60’ seen in a time trial race.


That’s still VO2 max territory, well above threshold, which was my point about ramp tests. They calculate threshold using short effort at supra threshold intensity.

Not trying to argue for or against. To each his/her own. Ramp tests are quick and less taxing than longer protocols so obviously more popular and marketable.

Great post here, and it’s what I was trying to get at with my explanation. Your graphic explains it well.

What is “right” is simply subject to what population you’re in rather than the actual test.

I’m sure if you had two populations of the same subjects who completed both tests, they would be proportionally in line with each other within the platform but not necessarily across platforms. In other words, everyone’s FTP would be x percent higher or lower on the respective platform.

I’ll probably try the 230 (which I know is probably too high from my zwift races which estimate me at 215-220) and see what happens

As a PSA for folks:

Every short test (Ramp Test, 8 minute test, and 20 minute test) has an anaerobic contribution to the results and they all have curves with a range of variability. The reason the Ramp is popular with people is because it removes pacing, which helps the execution and consistency of the results.

If you want to narrow the range of variability, do a long test at threshold to exhaustion.

But for the love of all things sacred, if you want to track progress, pick a type of test and stick to it for several cycles. Test hopping makes your results very difficult to compare.


I canceled my Zwift after they didn’t grandfather me in to what I was already paying. I guess I’m just spoiled by TR :P

I just used it for the occasional race anyway and to have something other than the blue blocks to look at. Now I just watch racing vids on YouTube instead.


I still have access to it and can look to see if I have time to fit in a Z Ramp and TR Ramp within a reasonable time for simple comparison (non-scientific).


Would be great to hear someone else’s feedback on this and their experience. Let us know your results if you do it!


So I gave a workout a test at my new FTP… Now, maybe it was mental, and maybe I’m being a baby… But I tried Dade+1 (2:30 VO2max intervals) and it was probably more difficult than expected. I found myself gassed after about 2min, leaving those last 15-30 seconds as a total fail. Dropping it 5% to my old FTP made it difficult but manageable.

Long and short I think the zwift test probably overestimates, or I’ve psyched myself out.


I did both the Zwift and TR ramp tests within a few days of each other. Zwift gave me 246, which I don’t think is accurate because I know that I can’t sustain 246 for an hour. In contrast, TR gave me 231, which I believe is closer to where my fitness is. I felt that because Zwift uses a fixed 20 watt step, it brings you up to the higher intensity faster, so you can actually reach a higher power because you haven’t been working for as long. On the other hand, TR is based on 6% of your FTP, which in general is a better approach.


FTP is not pegged to a set duration; it can be anywhere from 40-70 minutes depending on your fitness. Zwift’s MAP test is intended for elite males (it is one of three test protocols developed by British Cycling). Non-elite males should use a 25w ramp, and females a 15w ramp. TR’s is a 6% of FTP ramp, which will give a much longer burn (depending on the starting point; 14w in your case). N=1: my non-elite MAP, TR’s ramp test, Kolie Moore’s empirical FTP test, and 30 minute TT matched pretty well.