I think you have some assumptions baked into your thoughts here that are coloring your conclusions in ways you may have missed.
- The ramp test works.
This is a known false assumption. It works well for many people (it works pretty well for me, though it maybe reads a bit high), but it flat out does not work for many others. High anaerobic capacity breaks it, outside stress breaks it, and many other things break it. It's an estimation tool that works pretty well for people in the middle of the bell curve, not a test.
- The ‘Same Situation’ exists.
You are getting a cold and will be sick tomorrow but don't know it yet. You slept 35 minutes less on average for the last 4 days. You had coffee, or you didn't. There are millions of things like this. I can take the same test a few days apart and get an 8% different result under ‘identical’ conditions.
- You need to do a max effort to compare data.
You just don't. If we know what your ‘all out’ looks like, what your ‘hard’ looks like, and so on, we can compare that against the same things in a few million other data sets. Many of those other data sets also include real max hour efforts. You can combine them pretty well; it's just hard. The ‘old’ way to do this was to build a model (like Coggan's chart). The more modern (insert big data/ML/buzzword here) way is not to look at a sample or an average but to look at ALL the data: billions of data points, and take your best guess from there. Use all that data to predict your FTP last month, then use this month's data to see whether the prediction was right and tweak it. Use that best guess to predict forward, then use the actual forward results to test the prediction and tweak it again. Machine learning is a funny phrase because it is totally not learning; it is looking at lots of data and getting lots and lots of wrong answers really, really fast until you find the least wrong one. You don't need a ‘base measurement’; you use all the measurements. If doing x makes 90% of people faster, that is great, but a quality ML system will also have data for the other 10% and do something better for them as well, once it has enough data and the logic is sound (it's still early days here for TR; they are in the fun part now, troubleshooting a magic black box). A ramp test works for 30-60% of people and screws everyone else forever.
- TR cares that the FTP they get is really your FTP.
They dance around this, but I think you will be happier thinking of the number they give you as your functional training power. They don't need the number that you could, in theory, hold for an hour on a perfect day. They need a number that sets workout levels. It would have been easier for them to call it the flux capacity setting, make it 0-10000, and avoid the confusion, but marketing requires it to be the FTP concept people know.
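
To make the "predict, check, tweak" loop from the max-effort point above a bit more concrete, here is a toy sketch in Python. It is not how TR does it; the function name, the single `ratio` parameter, the `learning_rate`, and the rider numbers are all made up for illustration. The only idea it shows is: guess FTP from a hard (not max) effort, wait for real data, see how wrong the guess was, and nudge the guess.

```python
def predict_then_tweak(history, ratio=0.90, learning_rate=0.5):
    """Toy predict/check/tweak loop (hypothetical, for illustration only).

    history: list of (hard_effort_watts, later_observed_ftp) pairs in time order.
    ratio:   starting guess for FTP as a fraction of the hard-effort power.
    Returns the final ratio and the absolute error of each prediction.
    """
    errors = []
    for hard_watts, observed_ftp in history:
        # Predict before looking at the outcome.
        predicted = ratio * hard_watts
        # Check the prediction against what actually happened later.
        errors.append(abs(predicted - observed_ftp))
        # Tweak: move the ratio part of the way toward the one that
        # would have been exactly right for this data point.
        ideal_ratio = observed_ftp / hard_watts
        ratio += learning_rate * (ideal_ratio - ratio)
    return ratio, errors


# Made-up rider whose FTP is consistently 80% of their hard-effort power.
history = [(300, 240), (310, 248), (305, 244), (320, 256)]
ratio, errors = predict_then_tweak(history)
print(ratio, errors)
```

With this made-up data the error shrinks on every pass, which is the "lots of wrong answers until you find the least wrong one" part. A real system would have far more inputs than one ratio and would fit them across many riders, not one.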