Zwift step test too high vs TR?

Hey everyone, long time TR user here. I’ve used the TR ramp test for quite some time now (from beta) and have generally tested pretty consistently with nominal improvement. After spending some time doing weekend Zwift races, I found my FTP increased nicely from 208 to 216 (using the same step test), probably due to the sustained nature. After another 4 weeks riding at 216 I tested again, this time using Zwift’s step test.

First thing I noticed was the steps are 20w vs 6% of FTP, which in my case was 20w vs around 12w. As a result, I was able to sustain a higher wattage (300+) for 1min, possibly because I got there quicker, rather than the slower ramp to get to my previous max of 288. Zwift then determined my FTP to be 230 (which I’d surely take as an increase from 216).

So the question is… What’s going on? Is this zwift thing accurate? Have I actually improved from pushing harder? Or is it a result of being fresher at higher wattage resulting from a shorter test?

Granted, FTP is nothing but a benchmark for setting your interval targets, and after trying 230, if I’m able to suffer my way through, I can’t imagine being any worse off. If anything it may result in more gains?

Any thoughts here?

The Zwift and TR ramp tests use different models to calculate FTP.
TR’s model has been tuned using data from lots of TR users, but Zwift’s model hasn’t. So I think TR’s ramp test is much more accurate.

Have you done a TR Ramp Test for a side-by-side comparison? I don’t think your new FTP is beyond the realms of possibility.


Agree… sounds like you had 4 productive weeks. If you question the test - take an easy day and do the TR ramp test tomorrow.


It doesn’t seem like an unreasonable gain. What’s your power source and have you changed anything recently?

The two tests seem fairly equal - Zwift FTP RAMP Test vs TrainerRoad RAMP Test - YouTube

That video is fundamentally flawed as a comparison of the tests - the power is being driven by TrainerRoad and both bits of software are just taking the last 1 minute of power and multiplying by 0.75 to get FTP. The Zwift protocol is different and was not tested. On the Zwift screen you can see that the power targets are way off.
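For anyone who wants to check the maths on their own file, here is a minimal sketch of the "best 1 minute times 0.75" calculation described above. It assumes 1-second power samples; the function name `ramp_ftp` and its parameters are made up for illustration and are not taken from either app.

```python
def ramp_ftp(power_samples_w, factor=0.75, window_s=60):
    """Estimate FTP as factor x best rolling average power over window_s.

    Assumes power_samples_w holds one reading per second (an assumption,
    not something either app documents here).
    """
    if len(power_samples_w) < window_s:
        raise ValueError("need at least one full window of samples")
    # Best average power over any window_s-second stretch of the ride.
    best_avg = max(
        sum(power_samples_w[i:i + window_s]) / window_s
        for i in range(len(power_samples_w) - window_s + 1)
    )
    return factor * best_avg


# A rider whose best minute is 288w lands on 216w with the 0.75 factor.
samples = [200] * 60 + [288] * 60
print(ramp_ftp(samples))
```

By the same formula, a 230w result implies a best minute of roughly 307w, which lines up with the "300+" mentioned in the first post.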



It’s like weighing yourself on different scales.

I think you just have to pick a method that works for you (i.e. has the TR test set your zones accurately so that sweetspot is manageable but uncomfortable, and vo2 is horribly painful but just about possible to complete?), and stick with it.


All good questions. I haven’t done them side by side and would love to have those kind of gains, haven’t had that in years. I’d just be shocked if I went from 216 to 230 in 4 weeks.

In terms of power, I’ve been using the same PM (calibrated before every ride). Was actually thinking of taking the TR test again to see if I can match the results. Funny enough, Zwift estimated my FTP at around 214 (close to the 216) based on my previous races where I’d like to puke at the end, which makes me think that the TR ramp is much more accurate.

I had watched the same video indicated above and thought it seemed flawed. If the ending calculation is the same then it doesn’t matter how you get there if it’s all being driven by the same source.

All this said, FTP is a funny relative metric and in my opinion does 2 things: 1) “appropriately” sets your workout thresholds, but that is subject to 2) not potentially creating an “artificial ceiling”. For example, if I “know” my FTP is 216 and I’m doing a 230 interval, one would naturally start thinking “I can’t sustain this for long since it’s above my FTP” resulting in mentally checking out > physical checkout > not training at your capabilities > not seeing real improvements… think you all get where I’m going here and recognise how important the mental part is…

May just make sense to boost it to work harder and if it doesn’t work, drop it down…


The difference between the two is only 6% and given that your actual achievable threshold power differs every single day, I wouldn’t worry about it much.

If you go into every workout targeting a precise power number you’ll have days you can hit it and some you can’t. As long as you’re in the ballpark, train consistently using whatever number and method you like and you will see improvement.


Unfortunately this is an inherent limitation of estimating workloads from a ramp test without physiological data to reveal what is actually happening in the body. The particular peak workload/power you land on will be protocol dependent, and therefore so too will be any threshold estimate if it’s based on a flat percentage of peak power.

Shorter stages or higher steps will produce a higher peak power and higher threshold estimate. Longer stages or lower steps will produce a lower peak power & threshold.
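One rough way to see why a steeper ramp gives a higher peak is the two-parameter critical power model, which is my choice of illustration here and not something either app claims to use. It assumes a smooth ramp rather than 1-minute steps, a rider who burns a fixed reserve `w_prime_j` at a rate of (power minus `cp_w`) above critical power, and made-up values for both. Under those assumptions the peak works out to CP + sqrt(2 x W' x ramp rate).

```python
import math


def ramp_peak_power(cp_w, w_prime_j, ramp_w_per_min):
    """Peak power reached on a smooth ramp under the critical power model.

    Above cp_w the rider spends (power - cp_w) joules per second from a
    reserve of w_prime_j joules, and the test ends when the reserve is empty.
    Integrating a linear ramp gives peak = cp_w + sqrt(2 * w_prime_j * rate).
    """
    if ramp_w_per_min <= 0:
        raise ValueError("ramp rate must be positive")
    ramp_w_per_s = ramp_w_per_min / 60.0
    return cp_w + math.sqrt(2.0 * w_prime_j * ramp_w_per_s)


# Hypothetical rider: CP 220w, W' 15 kJ (illustrative values only).
for rate in (13, 20):
    peak = ramp_peak_power(220, 15000, rate)
    print(rate, round(peak), round(0.75 * peak))
```

With these made-up numbers the 20w/min ramp peaks around 320w and the 13w/min ramp around 300w, so the same flat 75% gives two different threshold estimates for the same rider.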


Many (most?) studies use ramps of 20-30 W/min over continuous (ramp test) or 1-3min stages (incremental step test) and are designed to take the subject to exhaustion in ~12-15min. The specific protocol often depends on the target population, ie. more well-trained subjects will have a steeper ramp to reach exhaustion in 12-15min, while less-trained/general pop will have a shallower ramp to reach exhaustion in the same time.

So both Zwift & TR are in line with traditional ramp test protocol. I don’t think we can say either is overall more accurate than the other. It will depend on your individual physiology, which neither measures.

I’m less familiar with the design of the Zwift ramp test, but from the TR podcast and articles, I know they’ve dialed in their ramp protocol (6% steps, 75% of peak 1min power) based on a massive dataset from the TR user population. Probably the widest data set on ramp tests ever collected. So I trust the TR protocol is valid on average… but you don’t care about average. You care how accurate the test is for you.

IMHO, best thing you can do with the results of either ramp test is just start training knowing your FTP is probably somewhere within that ~15 W range. And dial in your workout targets as appropriate without being too beholden to a specific FTP number.


Assuming both tests use the same percentage of the final step to set your FTP, which I doubt is true. Both of those protocols may work fine if they are calculated differently to end up at a similar result.



Zwift ramp test:

  • Starts at 100w after a “free ride” warm up with no set wattage.
  • Uses a fixed 20w step per minute.

TR ramp test:

  • Starts at 52% of starting FTP.
  • Uses 6% of the starting FTP per minute.

  • 20w would be 6% steps for a 333w FTP.

In Zwift:

  • If you have an FTP around 100w, you may have real trouble with this version.
  • If you are further away from 333w FTP, there may be more of an impact in the feel of the steps.
  • The lack of adjustment to a base FTP around the rider seems like a problem, depending on where your FTP really is.
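To put numbers on the two step schemes, here is a small sketch that builds the per-minute targets from the figures quoted in this thread (100w start and 20w steps for Zwift, 52% start and 6% steps for TR). The function names are hypothetical and the numbers are only as good as the descriptions above.

```python
def zwift_targets(minutes, start_w=100, step_w=20):
    """Per-minute targets with a fixed start and a fixed watt step."""
    return [start_w + step_w * m for m in range(minutes)]


def tr_targets(minutes, ftp_w, start_frac=0.52, step_frac=0.06):
    """Per-minute targets scaled to the rider's starting FTP."""
    return [ftp_w * (start_frac + step_frac * m) for m in range(minutes)]


def breakeven_ftp(step_w=20, step_frac=0.06):
    """FTP at which a percentage step equals the fixed watt step."""
    return step_w / step_frac


print(zwift_targets(5))
print([round(w) for w in tr_targets(5, ftp_w=216)])
print(round(breakeven_ftp()))
```

For a 216w starting FTP the percentage-based steps come out at about 13w each, so the fixed 20w steps climb roughly half again as fast. The two only match for a rider around 333w.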

Why doesn’t someone with Zwift actually try it out? It’s not like you actually have to ride to exhaustion to work out the multiplication factor. As far as I understand from the video the factor is 0.75.


Looking at the TrainerRoad bell curve I’d say 100w isn’t a bad place to start for a lot of cyclists on an FTP test. That is the issue with anything besides an actual lactate test, it’s all in the math and each protocol will be skewed in one aspect or another. No test will make everyone happy.

I personally am of a mind that ramp tests in general are too biased to short power and reward training plans heavy on HIIT that I don’t like to overemphasize. Doesn’t mean it’s wrong. Just not for me.


That’s not necessarily true. There’s an unofficial ramp test you can download to do on Zwift that has 2.5 minute steps going up 25W each time, and calculates your FTP as 0.825 of your best 2.5 minutes. Which means that you only ever reach 121% of your FTP.
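The 121% figure is just the inverse of the multiplier, and the same arithmetic shows how far above FTP a 0.75-factor test finishes. A quick sketch (the helper name is made up):

```python
def peak_as_pct_of_ftp(factor):
    """If FTP = factor x peak power, then peak = FTP / factor."""
    return 100.0 / factor


for factor in (0.825, 0.75):
    print(factor, round(peak_as_pct_of_ftp(factor)))
```

So a 0.825 factor tops out at about 121% of FTP, while a 0.75 factor means the final minute is ridden at about 133% of the FTP it produces.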


Totally agree. The value of the ramp test is that it doesn’t take too much out of you and doesn’t take long, so can be performed regularly.

On the other hand, if you are looking to perform steady state endurance rides a la Gran Fondo style, then you are unlikely to be hitting peak power on a regular basis. Therefore, I lean towards the 20’ test or even the best 60’ seen in a time trial race.


That’s still VO2 max territory, well above threshold, which was my point about ramp tests. They calculate threshold using short effort at supra threshold intensity.

Not trying to argue for or against. To each his/her own. Ramp tests are quick and less taxing than longer protocols so obviously more popular and marketable.

Great post here and what I was trying to get at with my explanation. Your graphic explains it well.

What is “right” is simply subject to what population you’re in rather than the actual test.

I’m sure if you had two populations of the same subjects who completed both tests, they would be proportionally in line with each other within the platform but not necessarily across platforms. In other words, everyone’s FTP would be x percent higher or lower on the respective platform.

I’ll probably try the 230 (which I know is probably too high from my Zwift races which estimate me at 215-220) and see what happens.

As a PSA for folks:

Every short test (Ramp Test, 8 minute test, and 20 minute test) has an anaerobic contribution to the results and they all have curves with a range of variability. The reason the Ramp is popular with people is because it removes pacing, which helps the execution and consistency of the results.

If you want to narrow the range of variability, do a long test at threshold to exhaustion.

But for the love of all things sacred, if you want to track progress, pick a type of test and stick to it for several cycles. Test hopping makes your results very difficult to compare.