What does the future of training look like to you?

A big dataset is a giant flywheel. TR is monetizing it by developing its ML models and by being able to run statistical analyses. From my experience, you need about 1,000 data “points” (each usually composed of many data files) to train and test an ML model. (In this context, a data point could be a part of the training history of a particular TR athlete.)

With 500 athletes, you’d be hard-pressed to do a lot of things once you take e.g. gender and age into account.
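A back-of-the-envelope version of that argument, as a rough sketch only. The five age bands, the even split across groups and the one-data-point-per-athlete simplification are my assumptions for illustration, not real numbers from anyone's dataset:

```python
# Rough arithmetic: how thin does a 500-athlete dataset get once you stratify?
# ASSUMPTIONS (illustrative only): 2 gender groups, 5 age bands, an even split,
# and one data point per athlete.
athletes = 500
gender_groups = 2
age_bands = 5
needed_per_model = 1000  # rule of thumb mentioned above

strata = gender_groups * age_bands
per_stratum = athletes / strata
fraction_of_needed = per_stratum / needed_per_model

print(f"{strata} strata, {per_stratum:.0f} athletes each, "
      f"{fraction_of_needed:.0%} of the ~{needed_per_model} rule of thumb")
```

Real populations are not evenly split, so the smallest groups would be even thinner than this suggests.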

Compare this with what FasCat Coaching has done: their xFTP algorithm was (going from memory) tested on around 100 data files from 6 athletes.

The second point is the cost of each data “point”: if that point is a chunk of training history of many athletes, that’d be expensive. In some industries and branches of science, usable datasets cost millions.

That is assuming that most of the other companies want to use their dataset to improve the training of their athletes. Companies like Strava, methinks, want to monetize their dataset by selling ads or data that lead to ads.

Yes, and to me the criterion is whether in the eyes of athletes this has improved their training or not. Compared to when I joined, many common operations are at least a lot easier or completely automated (adapting training plans to life conditions, illness/time off the bike, etc.).

I don’t mind going back to friction shifters, so long as they’re Simplex Retrofriction. None of that Campag rubbish. :wink:

Garmin has heat maps as well. Sure, it doesn’t have data from Wahoo or other GPS makes or from phone users, but it’s still a pretty big dataset.

I have used TP for a number of years (free and paid), including, for a brief period, as a coach. I didn’t count individual training plans, but it is a lot.

I also have experience in ML. The key is automated generation of datasets, in particular metadata. At my old job, dataset generation was 90+% of the work. No exaggeration. That was because of a multitude of factors, e.g. that the workflows were not designed with long-term data management and retention in mind. The changes we proposed would have meant more work for the process engineers, who are the ones generating the data. Plus, there were lots of other complicating factors (tools from different manufacturers that lack key capabilities, no control over their software, etc.).

Seeing how, hmmm, old school TP is, I wouldn’t count on the necessary tech being in place.

Who is they? TP? How many of those are cycling workouts? How many training plans are comparable? And most importantly, how much effort would it be to generate datasets for scientific analysis (by ML or otherwise)?

No, but dataset generation probably is, which is the lion’s share of the work. Roughly speaking, if you want to do ML in a systematic fashion, you need

  • automated dataset generation based on certain criteria and
  • good analysis tools for the entire dataset.

(This is based on my own experience in that area, admittedly in a completely different industry, but I think these points also apply to TR.)
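To make the first bullet a bit more concrete, here is a minimal sketch of what “automated dataset generation based on certain criteria” could look like. Everything in it (the `Workout` fields, the `select_workouts` helper, the sample records) is hypothetical and made up for illustration; I have no idea how TR or TP actually store their data:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workout:
    """Hypothetical minimal workout record (not any real product's schema)."""
    athlete_id: str
    sport: str
    duration_min: int
    has_power_data: bool


def select_workouts(workouts, sport, min_duration_min):
    """Return the workouts matching the criteria.

    Because the criteria are plain parameters, the same dataset can be
    regenerated later without anyone picking files by hand.
    """
    return [
        w for w in workouts
        if w.sport == sport
        and w.duration_min >= min_duration_min
        and w.has_power_data
    ]


# Made-up sample data.
workouts = [
    Workout("a1", "cycling", 60, True),
    Workout("a1", "running", 45, False),
    Workout("a2", "cycling", 30, True),
    Workout("a3", "cycling", 90, False),
]

dataset = select_workouts(workouts, sport="cycling", min_duration_min=45)
print(dataset)
```

The filter itself is trivial. The hard part in practice is having records clean and consistent enough that a filter like this can be trusted.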

These tools are totally invisible to the end user. TR started experimenting with ML in 2015, I think (going from memory here). And the ML-based features it has rolled out indicate to me what kind of capability they have in the dataset generation and analysis workflow. They first rolled out Progression Levels, which indicates to me that at that time, they could select single workouts that match certain criteria in order to answer questions like “Of those two comparable workouts, which is harder and by how much?”

The latest feature is an ML-based Plan Builder that proposes e.g. training volume and number of intense days based on your training history. To me this suggests that TR can now create datasets consisting of entire training histories that match certain criteria. That isn’t easy: you need to specify and implement tons of boring stuff such as

  • how to package the data,
  • how to keep track of, save and standardize metadata,
  • how to write custom analysis software that makes use of these standards,
  • how to build an efficient interface to the database, etc.

All of that takes years of development as there are tons of stakeholders (because you might need to make changes to the database, etc.). And then you can ask the scientifically interesting questions.
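As a toy illustration of the “boring stuff” in the list above, here is a sketch of packaging records together with standardized metadata, using only Python’s standard library. The field names, the `SCHEMA_VERSION` tag and both helper functions are my own inventions for the example, not anything TR or TP is known to use:

```python
import json
from dataclasses import dataclass, asdict

SCHEMA_VERSION = 1  # hypothetical version tag for the package format


@dataclass
class DatasetMetadata:
    """Hypothetical metadata stored alongside every generated dataset."""
    name: str
    criteria: dict  # the selection criteria that produced the dataset
    n_records: int
    schema_version: int = SCHEMA_VERSION


def package_dataset(name, criteria, records):
    """Bundle the records and their metadata into one JSON string."""
    meta = DatasetMetadata(name=name, criteria=criteria, n_records=len(records))
    return json.dumps({"metadata": asdict(meta), "records": records}, sort_keys=True)


def load_dataset(blob):
    """Load a package, refusing formats the analysis code does not know."""
    package = json.loads(blob)
    if package["metadata"]["schema_version"] != SCHEMA_VERSION:
        raise ValueError("unsupported schema version")
    return package["metadata"], package["records"]


blob = package_dataset(
    "cycling_45min",
    {"sport": "cycling", "min_duration_min": 45},
    [{"athlete_id": "a1", "duration_min": 60}],
)
metadata, records = load_dataset(blob)
print(metadata["criteria"], len(records))
```

Storing the selection criteria next to the data is the point: months later, the analysis code can still tell how a dataset was produced and whether it understands the format.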

Given TP’s comments you posted, I am led to believe that none of that infrastructure is in place whereas TR’s released features suggest you can do that with TR’s data pool. Hence, my comment that TR’s data pool is a gold mine.

But maybe the comparison to a gold mine is not the right one since it takes significant effort to mine and refine gold ore. Maybe TP is the gold mine (or palladium mine), and TR is a complicated water tap by comparison? :wink:

ML can definitely help coaches. How many times have people recreated the same workout in TP? A lot of coaches would likely welcome something like AT if they knew what it would do and it offered enough knobs to tweak things.

IMHO their idea to not use ML will eventually put them out of business.

I’ve never had that much luck with the heat maps. What I’d pay for is the ability to sort it by things like:

  • most popular 25 mile loops,
  • most popular 50 mile loops,
  • most popular 2 hour rides, etc.

I’d also want to exclude short commutes and things. I’m more interested in training rides.

Indeed, that’s what I did pay for: every time I moved in the past year, I’d subscribe to Strava for a month or two to figure out what routes of given length are good options if I want to go on a ride.

I’ve used Strava several times while traveling to find routes. I’d start with the heat maps. Then I’d look at the segments, find a few folks of about my ability, and go to the rides that included my stretch to see their whole route. Do that a few times and you find the local routes pretty quickly.

I’ve done that too. Find a bike club in the area, then find the people riding more than 100 miles a week and see where they ride.