A few months ago I re-upped my WHOOP subscription because I was getting back into training and having problems sleeping and recovering properly. I used to use my Garmin Fenix (no subscription at the time) but found its sleep metrics a little lacking in accuracy and depth, and I preferred WHOOP’s recovery metrics.
Recently, WHOOP launched an AI chatbot named “Coach” that is integrated with your data and can prescribe all sorts of things and do a deep dive into your numbers. This interested me, but really it boiled down to telling me things I already knew, or not having a long enough leash to continue with a prompt outside of its own domain, so to speak.
I completed a TrainerRoad workout today, an hour-long Z2 endurance ride, which for me resulted in 430 Calories burned. I use a Kickr v5 indoors and I believe it to be fairly accurate. My WHOOP, however, using only HR, said I burned 220. To my knowledge, the total work registered by the power meter is a fairly solid basis for estimating caloric expenditure, especially in a controlled environment. So, in an effort to dispel any biases, I asked Coach why there was such a stark difference.
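For anyone who wants to sanity-check the power-based number, here is a rough sketch of the arithmetic as I understand it: mechanical work in kJ, divided by an assumed gross efficiency, converted to kcal. The 120 W average power is a made-up figure for illustration (I didn't list my actual average above), and the 20 to 25% efficiency range is the commonly cited ballpark, not something measured on me. The function names are my own.

```python
# Rough sketch: estimate calories burned from power meter data.
# Assumptions (not from the ride file): 120 W average power and a
# gross efficiency somewhere in the commonly cited 20-25% range.

KJ_PER_KCAL = 4.184  # 1 kcal = 4.184 kJ


def mechanical_work_kj(avg_power_w: float, duration_s: float) -> float:
    """Work done at the pedals: watts * seconds = joules, then to kJ."""
    return avg_power_w * duration_s / 1000.0


def kcal_from_work(work_kj: float, gross_efficiency: float = 0.24) -> float:
    """Metabolic energy in kcal needed to produce work_kj at the pedals."""
    if not 0.0 < gross_efficiency <= 1.0:
        raise ValueError("gross_efficiency must be in (0, 1]")
    return work_kj / gross_efficiency / KJ_PER_KCAL


def implied_efficiency(work_kj: float, reported_kcal: float) -> float:
    """Efficiency a rider would need for reported_kcal to match work_kj."""
    return work_kj / (reported_kcal * KJ_PER_KCAL)


if __name__ == "__main__":
    work = mechanical_work_kj(avg_power_w=120, duration_s=3600)  # hypothetical ride
    for eff in (0.20, 0.24, 0.25):
        print(f"efficiency {eff:.0%}: {kcal_from_work(work, eff):.0f} kcal")
    print(f"efficiency implied by 220 kcal: {implied_efficiency(work, 220):.0%}")
```

With those made-up inputs, the estimate lands somewhere around 413 to 516 kcal across the 20 to 25% range, and a 220 kcal figure would imply an efficiency of roughly 47%, which is far above the range usually quoted for cycling. Swap in your own average power and duration to see where you land.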
It told me that “power meters can overstate calories for less efficient rider or understate them for highly efficient ones.” I pointed out that HR can also be affected by extraneous things like caffeine, sleep, and stress. Its response was that “WHOOP’s algorithm is conservative to avoid overestimating, especially for steady, low-intensity efforts,” and that I should make sure my strap was properly placed. It is; it just wasn’t a very hard ride.
It then went on to explain that HR is a better indicator of overall strain, since the strain isn’t coming just from the pedals but from the whole body (upper body, sweat, etc.). And while I can see that having some merit, when there is a power meter in the mix for a steady-state ride indoors, I have to believe the power number is more accurate. I told it that I needed accurate readouts so I can track my caloric deficit and plan nutrition accordingly. Instead of answering, it said I might need help from member services and ended the chat.
When I asked ChatGPT the same thing, it said (and I am assuming this is true) that power-based calorie estimates are more accurate than HR-based ones, for exactly the reasons I listed. It makes me wonder if WHOOP has trained their model (inadvertently or otherwise) to favor their brand over proven science, even stretching the truth in some cases so the product doesn’t get shown in a bad light. To me, that makes it pretty useless, and it’s kind of a troubling trend in AI, especially with the proliferation of chatbots that are supposed to be used in unbiased analytical contexts. To be fair, it is in “Beta,” but in the months I’ve occasionally used it, its responses and abilities don’t really seem to have changed.
Love TR though and their AI integration is fairly useful. Rant over, thoughts?