Metrics matter
INTERVIEW – In this thought-provoking conversation, Michael Ballé and Mark Graban discuss metrics and the managerial attitude towards tracking them.
Words: Michael Ballé, lean author and executive coach, and Mark Graban, lean author and coach.
Michael Ballé: Reading your book, Measures of Success, I realized that when we introduce lean in a company, metrics are often the enemy: the endless reporting of targets and monitoring are precisely what keeps the company where it is and hampers change at any level. As a result, I feel that we tend to push forward with lean techniques, such as pull and andon and kaizen, and try to ignore the metrics in place, which generally don’t make much sense from an operational point of view. Thinking about it, I fear this is a mistake because metrics do matter, if only to the people who are measured by them. We should pay a lot more attention to what reporting is in place and what exactly it measures.
Mark Graban: Thanks for reading the book, Michael. I think it’s fair to say that metrics are a reality in most modern organizations. They can be helpful – for example, a balanced scorecard of metrics around safety, quality, delivery, cost and morale. A relatively small number of metrics can help us gauge the health of a business, and they can be aligned with “True North”. That said, we shouldn’t just track things that are easy to measure or “vanity metrics” that paint a picture of success. Additionally, metrics can lead to dysfunction that causes great harm if leaders focus too much on any one metric (like cost). Or, if there’s pressure to hit a target “no matter what”, then we’re likely to see fudging of the numbers rather than real improvement.
MB: I’m not sure I believe that “what gets measured gets done”, as the saying goes, but I do believe that people find what they seek – and often measure what they seek as a way to find it. A lot of reporting I see is just there to reassure management that things are okay, really. I suspect that part of the difficulty is distinguishing metrics that have to be reported all the time (accidents, for instance, or sales), because there’s not one day they’re not going to be important, and metrics that should be enquiry based – numbers we check for a period of time because we either want to better understand how a process behaves on one matter, or because we want to try a change and check its impact. Ideally, we would have few permanent reporting metrics and many enquiry metrics, but of course organizations tend to work the other way around, because no one wants to be caught out by their boss asking “what is the number of... this month?” and not have the answer. Still, finding the right balance is not easy and coming upon examples of people who have gotten it right isn’t either.
MG: I like the way you frame that. Permanent reporting metrics, like employee injuries or incidents of patient harm, should always be tracked, even as we work to drive those toward zero. After reading my book, one person I know and respect paraphrased Ohno and Shingo as saying we shouldn’t measure things we should be trying to reduce. I think it’s a false choice to say we can’t measure and try to improve. Toyota absolutely has metrics around employee injuries and assembly defects. They are still on the long, if not never-ending path to get those to zero.
What you call “enquiry” metrics help us answer the important question: “Has our countermeasure improved our results?” Part of the Toyota Business Practices (a.k.a. 8-step problem solving) is to identify a gap between current performance and a target. Instead of just pressuring people to do better, lean leaders collaborate with employees to improve the system and, therefore, the results that are generated. Methods like SPC-based Process Behavior Charts can help us prove that we improved, looking for a significant and sustained shift in a metric instead of declaring victory after a data point or two.
MB: An excellent point you make is that SPC control limits are not targets or tolerances. They’re a statistical calculation to reveal how much natural variation is in the system the way it’s currently set up. This is a frequent battle on the shop floor, and I’m also very surprised every time we calculate actual control limits, even at six sigma, how wide the tunnel is and how much more the process is in statistical control than we would have expected. This would indicate that reality is a lot stickier than we think and that middle managers are indeed absurdly micro-managing something that doesn’t move that much – or, conversely, because they so micro-manage it, it is fairly stable. It’s really hard to know which way this works.
MG: You’re absolutely right about that – the target and our lower and upper limits are two different dimensions. It’s a very common question that I get, when somebody creates an SPC chart and a leader says, “I don’t like those limits, they are too wide.” Those limits are part of the current reality, as you said. If we’re not happy with the average performance or the level of routine variation (as illustrated by the calculated limits), then a leader needs to help improve the system as a way to improve the average and reduce variation. The limits are the “voice of the process”. The process is speaking to us through an SPC chart, whether we want to listen or not.
You bring up another important problem, when leaders overreact to every up and down in a metric. Many metrics tend to fluctuate around an average. They will tend to continue fluctuating within the lower and upper limits unless the system changes significantly. We create a lot of wasted motion when we overprocess, if you will, the routine variation in a metric. SPC teaches us that there is no root cause for any of the small changes in our metric, the so-called noise. When we identify a “signal” in the chart, such as a data point outside of the limits, that’s the voice of the process telling us where to go investigate. If it’s a signal that is in the direction of worse performance, we should start root-cause analysis and restore the system to its previous performance. If it’s a signal in the positive direction, we should make sure we understand what has changed so we can ensure that’s the new standardized work.
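To make the idea of calculated limits and signals concrete, here is a minimal Python sketch of one common way to compute them, the XmR (individuals and moving range) form of a Process Behavior Chart. It assumes the usual XmR convention of placing the limits at the average plus or minus 2.66 times the average moving range. The function names and the data are hypothetical and chosen only for illustration; this is not taken from the book.

```python
from statistics import mean


def xmr_limits(values):
    """Return (lower, center, upper) natural process limits for an XmR chart.

    Assumes the conventional XmR scaling factor of 2.66 applied to the
    average moving range between consecutive points.
    """
    if len(values) < 2:
        raise ValueError("need at least two data points")
    center = mean(values)
    # Moving range: absolute difference between each point and the previous one.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    spread = 2.66 * mean(moving_ranges)
    return center - spread, center, center + spread


def points_outside_limits(values, lower, upper):
    """Return the indices of points that fall outside the limits (signals)."""
    return [i for i, v in enumerate(values) if v < lower or v > upper]


# Hypothetical baseline data for a daily metric.
baseline = [52, 48, 50, 53, 47, 51, 49, 50, 52, 48]
lower, center, upper = xmr_limits(baseline)

# Hypothetical new observations checked against the baseline limits.
new_points = [51, 47, 68]
signals = points_outside_limits(new_points, lower, upper)
print(f"limits: {lower:.1f} to {upper:.1f}, center {center:.1f}, signals at {signals}")
```

In this made-up example, the first two new points sit inside the limits and would be treated as noise, while the third falls above the upper limit and would be worth investigating.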
MB: I suspect that one of the reasons SPC charts are not used more in companies is not that they haven’t been taught – after all, six sigma programs have been at it for years – but a lack of curiosity about the causes of system behavior. As you point out, it’s often difficult to distinguish causation from correlation, particularly in multivariate, changing environments such as a business. Consequently, doing real 5 why exercises is exceptionally hard (funny how it’s still the old Ohno example of shavings in the pump that is quoted in lean books). It takes a profound, serious commitment to figure out what the real causes of process behavior are, and a determination to isolate real variation (special causes) from noise (common causes). Most people, I find, prefer to think in terms of targets and then feel confident they already know the causes and that what is needed is more managerial pressure to get results – Deming’s point with the red beads experiment. The root cause, as I see it, is not lack of knowledge, but lack of curiosity.
MG: It’s indeed interesting that many managers think the root causes of problems are things like “people aren’t trying hard enough” or “people aren’t being careful”. When I worked at GM in 1995, the division headquarters gave us some internal consultants they had hired away from Japanese suppliers. They set up a process to track hourly and daily production and to do a daily report out between shifts. Unfortunately, our plant superintendent used this data as a way to know who to yell and scream at. The data was supposed to be used for analysis and improvement, but he used it as a weapon. His two root causes that he’d yell about were “lack of urgency” and “lack of intensity”. Every day.
Dr. Deming would have said that he (and our plant managers) were responsible for the system. Instead of taking responsibility, they made a career out of blaming others. It’s really cruel and disrespectful to pressure people into somehow performing better than the system design will allow. The red bead game makes this very clear. I set a goal and offer an incentive for anybody who can get three red beads or fewer. That performance is better than the lower limit of the Process Behavior Chart for the number of red beads.
Asking for impossible levels of performance isn’t lean. Improving performance means improving the system, such as stopping the practice, in the game, of dumping red beads back into the container after they’re pulled. Improving a real-world system is more difficult and closing a performance gap generally requires multiple countermeasures and redesigns of the process, value stream, and systems.
MB: It probably doesn’t help that, in traditional management, people are rewarded or reprimanded on the basis of whether they reach their targets or not. The temptation to manage the numbers, not the process, is understandably hard to resist. Managing people around processes would mean evaluating how seriously they take challenges, how responsible they feel for customer satisfaction, how hard they work at understanding what is really going on – using metrics for what you call the voice of the process – and how good they are at getting their teams to put their heads together and change things for the better. Ironically, although we’re discussing metrics, it would seem that lean management requires evaluating people on both their results and the effort they put into solving problems. After all, the obstetrician with the worst post-op record could be either a very bad surgeon, or the best who gets all the impossible cases.
MG: Yes, processes lead to results. I cringe when I hear people say things like, “Lean is all about process.” It’s both. The right process leads to the right results. This is very different from the older “management by objectives” approach where leaders say, “Your job is to hit the target – I don’t care how you do it.” As Donald J. Wheeler says, “without context, data have no meaning.” In the healthcare realm, I worry about doctors or hospitals being punished for taking on the sickest, most challenging patients that might make their mortality rate look high. I have a number of teachers in my family and I don’t like the idea of ranking or incentivizing teachers based on test scores. My mother was a teacher in a very poor area, where “the system” put many of the young children at a great disadvantage. As you said, you could have a great teacher who has students with poor test scores because the system didn’t want to hold the student back in the last grade because they couldn’t read yet. Any teacher ranking (or physician ranking) should include what an Olympic judge might call “the degree of difficulty” but then that maybe becomes too difficult to administer. As much as I write about metrics in the book, I also think it’s very true to say “not everything that matters can be measured,” whether that quote is attributed to Deming or Einstein.
MB: As I understand it, Deming’s contribution to quality in Japan was to teach Japanese industrials not to copy the American system of achieving quality through inspecting out defective work, but on the contrary to get it right first time at every step of the process – which meant component standardization at every level. I remember an old-time sensei showing me a “money board” in the plant: a board tracking right first time – because this is where the money really is. Still, I find that right first time is seldom measured in the companies I visit, and it is often hard to measure because none of the reporting systems are set up that way. Some metrics are clearly more useful than others, and some are easier to obtain than others. I wonder what our criteria should be – I know that Tracey and Ernie Richardson argue for measuring leading indicators (which give you a hint of how the process will behave) rather than lagging indicators (which tell you how the process behaved a while back), as most finance-based measures are, but I find few companies have such maturity. In many cases, unfortunately, finance is driving the implementation of ERPs to make it easier to get financial reporting done, not to make it easier to see non-right-first-time.
MG: Did you see news reports that claimed Tesla’s “first time yield” through assembly was only 14%? That’s quite a bit lower than the rest of the auto industry, of course. That’s being measured at the end of the line (vehicles not requiring rework), so we could call that a lagging indicator. If we were able to look at the causes of rework, we could go back and look at process indicators and metrics further upstream, such as a measure of dimensions for stamped parts or bolt torques. I think a lean thinker would prefer to understand and fix the root causes of rework instead of just measuring the amount of rework. SPC and the process behavior chart methodology can be applied to leading or lagging indicators. I think the lessons are the same: avoid overreacting to every up and down and use SPC to prove when there has been a meaningful shift in the metric, which allows us to better connect the cause-and-effect of our improvement work and our results.
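One way to check for a meaningful, sustained shift rather than a one-off data point is a run rule. A commonly used version treats eight consecutive points on the same side of the center line as a signal. The Python sketch below illustrates that rule under that assumption; the function name, the run length default and the data are hypothetical, not taken from the book.

```python
def sustained_shift_start(values, center, run_length=8):
    """Return the index where the first run of `run_length` consecutive
    points on the same side of `center` begins, or None if there is none.

    Points exactly on the center line break a run.
    """
    run_side = 0   # +1 for above the center line, -1 for below, 0 for none
    run_count = 0
    for i, v in enumerate(values):
        side = (v > center) - (v < center)
        if side != 0 and side == run_side:
            run_count += 1
        else:
            run_side = side
            run_count = 1 if side != 0 else 0
        if run_count >= run_length:
            return i - run_length + 1
    return None


# Hypothetical metric with a baseline average of 50, observed after a countermeasure.
after_change = [49, 51, 46, 45, 47, 44, 46, 45, 47, 46]
shift_index = sustained_shift_start(after_change, center=50)
print(shift_index)
```

In this made-up series, the first two points alternate around the old average, and the run of eight points below it begins at the third point, which is the kind of evidence that would support recalculating the average and limits for the improved system.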
MB: In the end, I’ve come to believe that quality is as much a result of systems and tools, such as SPC, as of “quality spirit” – that mysterious, invisible component that makes a person care about it. When we see a child looking into a well, we spontaneously pull them back, whether it is our kid or not. The real puzzle is how to develop the same spirit regarding quality, or non-right-first-time issues. Juran taught us the reliability of a process rests on its repeatability. Deming taught us that getting a process to be repeatable requires improving it change by change through PDCA – which is paradoxical. Ohno taught us that to do both we need to be curious and investigate root causes. These ideas are in fact very demanding, as you show in your book, as the situations we live in tend to have so many factors that it’s hard to know where to start. Leadership, I guess, is the answer, but beyond that fine-sounding word, developing that elusive “quality spirit” remains a mystery, unless we create the tension of just-in-time and jidoka to reveal problems and measure the time it takes to find immediate countermeasures and then solve the deeper problem so that the fix stays fixed.
MG: You raise some really deep and important philosophical questions there. Dr. Wheeler writes that SPC is “a mindset with some tools attached.” We can say the same thing about TPS and lean, right? I think SPC is very compatible with a lean philosophy in a number of ways. For example, we should not blame individuals for performance that is driven by the system. In a way, “respect for people” means not punishing people for routine variation and I think it’s also respectful to not waste people’s time by asking them to do a root-cause analysis for a small fluctuation in a metric. SPC can show us when it’s appropriate to react (when we have a signal in the data). When we identify and eliminate the root causes of significant changes in a metric, we can step back and more calmly work on improving the underlying system in a less-reactive way. Whether it’s lean or SPC (and I think these two concepts should be linked and intertwined), we can’t just teach a tool like 5S or control charts to front-line staff – executives and leaders at all levels need to understand the mindsets and lead by example.