The trouble with Key Results

Posted on Sun 19 March 2023 in blog • 8 min read

I like Mastodon. I really do. Ever since turning my back on the birds(h)ite, I enjoy the consistent quality of the discussions I’ve been having on the Fediverse. It’s nice that I can disagree with someone, without it turning into a roiling flamefest.

I recently had one such disagreement.

Coming across a post (in German) in which someone extolled the virtues of the OKR method, I took the liberty of replying with a simple, “I once wrote something about that”, with a link to my Meaningless Metrics, Treacherous Targets article. The method’s metrics obsession (every Key Result must be quantifiable and measurable, otherwise it’s not a Key Result) is something that I consider highly problematic, for the reasons which that article outlines.

A person other than the original poster¹ then stepped in and defended the method by arguing in favour of having objectives. Those by themselves are of course not something that I disagree with in the least. It’s just that without Key Results (and the fixation on metrics they bring in), it’s not OKR anymore. It’s not a novel approach either: without metrics obsession, you can trace objectives-based management/leadership back to at least 1888, when the Prussian army extended Auftragstaktik to all levels of command.

But then the person pointed out that in their mind, Objectives were really more about habit-forming than about short-term goals.² They gave the example of an Objective being “living healthy”, and a Key Result being “exercise twice a week.”

And now that’s an interesting proposition that I want to get into in a little more detail. Because obviously living healthy is a good and sensible goal to pursue. But OKR is a great way to muck it up — just like it’s a great way to muck up most good and sensible goals one might pursue.

Let me explain.

First off, “living healthy” is a goal that an OKR practitioner employing accepted Best Practices would probably reject outright, because it is “business as usual”. Living healthy is clearly a long-term objective, not one that you should define for a year or a quarter or a month, pursue with great rigour, and then move further on down your priority list in favour of the next period’s OKRs.

It is also not an “actionable” goal, because living healthy is as much about doing things (exercising, getting enough sleep, eating healthily) as it is about refraining from things (smoking tobacco, consuming alcohol and drugs).

But, let’s think the scenario through and let’s conjure up a person, aged about 40, male, 185cm tall, slightly overweight, not always eating well. Let’s call him Frank.

Assume further that Frank is a former habitual smoker who managed to quit five years ago. And suppose Frank wanted to use the OKR method to attain an Objective of “living healthy” for a quarter, setting the following Key Results:

Do 26 hours of exercise (that’s 1 hour, twice a week, for the 13 weeks in a quarter).
Get an average of no less than 7 hours and 30 minutes of sleep per night.
Smoke zero cigarettes and refrain from consuming tobacco products in any other form.
Eat junk food no more than 5 times.
Attain a weight of 82kg at the end of the quarter.

Now, all of these Key Results sound perfectly reasonable. They are something that a person in that situation should strive for, are they not?

Now assume that — in accordance with the OKR method — we consider the Objective achieved when (and only when) all Key Results are met. And assume further — as is commonplace in organizations that use OKR — that there is some sort of reward that awaits Frank upon meeting his Objective. Assume, for example, that Frank can put his defined Key Results into an app, and if he meets his Objective the app gives him a badge that we’ll call the Goal Keeper award. Frank can share this with his friends on social media, and it gets Frank a 20% discount on his next purchase of athletic shoes, at his preferred store.³

OK, now. We are 11 weeks into the quarter, and Frank is in the following situation:

He has already completed 32 hours of exercise, because he managed to put in one extra hour of exercise in more than half of the preceding weeks.
He has slept a little less than planned, and his average stands at 7 hours and 15 minutes per night.
He has smoked not a single cigarette and has not consumed tobacco in any form.
He has had junk food only 3 times.
Since he has dampened his junk food cravings on several occasions by eating more sugar (which does not factor into a Key Result by itself), his weight now stands at 85kg, 3kg above his goal weight.
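
To make the all-or-nothing rule concrete, here is a minimal sketch that checks Frank's week-11 numbers against his five Key Results. The `KeyResult` class and `objective_met` function are hypothetical names made up for this illustration, and reading “attain a weight of 82kg” as “at most 82kg” is my assumption.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class KeyResult:
    """One measurable target plus Frank's current value (hypothetical helper)."""
    name: str
    current: float
    is_met: Callable[[float], bool]


# Frank's situation 11 weeks into the quarter, taken from the list above.
week_11 = [
    KeyResult("exercise hours", 32, lambda v: v >= 26),
    KeyResult("average sleep (hours/night)", 7.25, lambda v: v >= 7.5),
    KeyResult("cigarettes smoked", 0, lambda v: v == 0),
    KeyResult("junk food meals", 3, lambda v: v <= 5),
    # Assumption: "attain 82kg" is read as "weigh at most 82kg".
    KeyResult("weight (kg)", 85, lambda v: v <= 82),
]


def objective_met(key_results):
    """The Objective counts as met only if every single Key Result is met."""
    return all(kr.is_met(kr.current) for kr in key_results)


missed = [kr.name for kr in week_11 if not kr.is_met(kr.current)]
print(missed)                  # the two Key Results Frank is currently missing
print(objective_met(week_11))  # False
```

Three of the five boxes are ticked, but under the all-or-nothing rule that counts for exactly as much as zero of five.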

Now, at face value, what should Frank be doing in order to live healthy (which was his original goal)? He should probably continue his exercise regime, go to bed a little earlier, and consume a bit less sugar. Quite probably, he can lose those three extra kilograms easily in six weeks, at a healthy weight loss of 500g/week.

But with a potential reward looming that is tied to meeting the Objective (this is where Goodhart’s Law comes in), it stands to reason that Frank will follow a different line of reasoning:

“I can lose 3 kilograms in two weeks if I crash-fast. It will drain all my energy, but I don’t need that energy for exercise anymore, and if it makes me really tired and I sleep for 9 hours a night for the next two weeks, so much the better: it’ll push my sleep average above 7.5 hours, so I’ve got that box ticked as well.”
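
Frank's back-of-the-envelope reasoning does check out, which is exactly the problem. The following sketch redoes the arithmetic, assuming a 13-week quarter with 7 nights per week; all variable names are made up for the illustration.

```python
NIGHTS_PER_WEEK = 7
weeks_elapsed, weeks_left = 11, 2

# Sleep: 7h15m average so far, 9 hours a night while crash-fasting.
avg_so_far, crash_sleep, sleep_target = 7.25, 9.0, 7.5
nights_elapsed = weeks_elapsed * NIGHTS_PER_WEEK  # 77 nights
nights_left = weeks_left * NIGHTS_PER_WEEK        # 14 nights
total_nights = nights_elapsed + nights_left       # 91 nights

final_avg = (nights_elapsed * avg_so_far + nights_left * crash_sleep) / total_nights

# Minimum nightly sleep needed over the remaining nights to reach the target.
needed_per_night = (total_nights * sleep_target - nights_elapsed * avg_so_far) / nights_left

# Weight: 3kg over target, healthy loss rate of 0.5kg per week.
excess_kg, healthy_rate = 85 - 82, 0.5
weeks_needed_healthy = excess_kg / healthy_rate  # weeks at the healthy rate
crash_rate = excess_kg / weeks_left              # kg per week to make the deadline

print(f"final sleep average: {final_avg:.2f} h")
print(f"sleep needed per remaining night: {needed_per_night:.3f} h")
print(f"healthy pace needs {weeks_needed_healthy:.0f} weeks, {weeks_left} are left")
print(f"crash-fast pace: {crash_rate} kg/week, {crash_rate / healthy_rate:.0f}x the healthy rate")
```

Nine hours a night lifts the average to about 7.52 hours, just over the line, and the only way to make the weight target in time is to lose weight at three times the healthy rate.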

And that is a clear example, and a depressingly common one, of a perverse incentive: Frank now stands to gain from being less healthy at the end of the period than he otherwise could have been. At the end of the quarter, he will be two weeks out of training, he will have slept more than he needed (which has no health benefits), and his caloric balance will probably be solidly upset, so that his weight will bounce right back up once he has passed the end of the quarter, collected his discount code, and ceased his crash-fast.

But it gets worse. Up until this point, we have assumed that Frank set his goals naively, with no plans to game them, and only started gaming them once he felt an incentive to do so. But of course, that’s not how humans operate, at least not beyond the first iteration. The next time around, Frank will build some wiggle room into the goal measures in the first place.

To discuss what that means, let’s return to the tobacco consumption goal. It’s blindingly obvious what’s a “good” number of cigarettes to smoke for a former habitual smoker: zero. The tobacco inhalation itself serves no beneficial purpose at all — in contrast to eating junk food, which at least does contain energy that your body can burn for useful purposes. And even a single cigarette may re-trigger substance dependency. So, clearly, zero it is.

But, setting this goal is extremely risky in terms of attaining the overall Objective, using the OKR method: if Frank had a weak moment and said yes to a cigarette proffered by a colleague on a break during an extremely stressful workday, and thereby failed his zero cigarettes goal, and this happened two weeks into a new quarter, any motivation to stick to the other Key Results for the remaining 11 weeks evaporates. I cannot meet my Objective anyway, the reasoning goes, so why should I bother trying for the other Key Results?

And so, the “clever” thing to do in the twisted logic of the method is to set a target of no more than two or three cigarettes a quarter, rather than zero which is the obviously more healthy choice.
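
The difference between the two targets can be sketched in a few lines. This assumes a “no more than N” style Key Result; the function name `still_attainable` is hypothetical.

```python
def still_attainable(ceiling: int, used_so_far: int) -> bool:
    """A 'no more than N' Key Result can only be met while the count stays within N."""
    return used_so_far <= ceiling


# Two weeks into the quarter, Frank accepts a single cigarette.
slips = 1

zero_target_alive = still_attainable(ceiling=0, used_so_far=slips)
padded_target_alive = still_attainable(ceiling=3, used_so_far=slips)

print(zero_target_alive)    # False: the Objective is lost for the remaining 11 weeks
print(padded_target_alive)  # True: the less healthy target keeps the reward in reach
```

The healthier target is the one that kills the Objective on the first slip, so the method rewards choosing the less healthy one.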

And now here comes an interesting twist: when you confront someone with such examples of why a particular method is inherently flawed, they quickly retreat to a position of “well no method is 100% perfect; pointing out an imperfection does not render the method invalid”. That is true in principle — except when the thing that is broken is what defines the method.

I’m also pretty certain that someone will come forward pointing out that the hypothetical example I presented here is contrived, that I am discussing something no one in their right mind would define as an Objective, and that my Key Results are all nonsensical. The problem is, whenever I then challenge them to give me a good Objective and 3-5 good Key Results, it usually takes about five minutes to identify either a way to game them, or a perverse incentive that they managed to build in.

Again, there’s no OKR without the KR. And the fact that Key Results must be measurable, and must be 3 to 5 in number, and meeting all of them is what constitutes meeting the Objective, no exceptions — that is what defines the method. And if that is broken, then the method is broken.

I’ve heard the same from Scrum practitioners who have told me that “we do Scrum, but without sprints”. Yes, sprints are one of the many things that are broken about Scrum, but if you find sprints perpetually strung back-to-back to be nonsensical (and you should!) then you must also find Scrum nonsensical, because without sprints there is no Scrum.

And I’ve come to find this sort of goalpost-moving increasingly annoying. You don’t get to say “let’s use X” and, when confronted with the fact that X must include Y, which is counterproductive and toxic, retreat to a position of “well then let’s use X without Y, but let’s still use X”. You can’t. Just own up to the fact that without Y, there’s no X.

¹ I am not including the names of the people involved in the conversation, nor am I linking to the original Mastodon thread from here. This is because I really don’t want to finger-point at any one person: the arguments that were brought forward in favour of OKRs in the discussion are arguments that I have heard from almost every proponent of the method that I have talked to.
² OKR does not foster anything long-term or sustainable in my mind; I think it emphasises short-term gains. But that’s just my opinion.
³ In my example I use a “reward” that has a social component, and a component that looks like a monetary benefit but really isn’t (it just lets you buy something cheaper that you might not even need). In an organization the reward might be financial, but it might also be as simple as being recognized or commended by a manager, or even just having less fear of being hit by the next round of layoffs.