Thursday, April 7, 2011

More Circular Reasoning On Statistical Process Control

After my last post I received a number of comments on the website KPIExperts.  Most of them completely misunderstood my point, and their misunderstanding was so fundamental that rather than reply to their comments individually I decided to write a new post.

I myself have trained thousands of people in SPC over more than two decades. All the books on the topic that I read explained it the same orthodox way, and it was consistent with my graduate training in statistics. Yet over the years, my doubts about it increased.

The comments remind me of what E.T. Jaynes said in the preface to his book "Probability Theory: The Logic of Science." He said: "A previous acquaintance with probability and statistics is not necessary; indeed a certain amount of innocence in this area may be desirable, because there will be less to unlearn."

Probability is an extension of logic. Any argument based on probability must be logical. In my last post on SPC I pointed out the illogic in saying that "If the process mean does not change, then 99% of all future measurements can be expected to fall between the control limits. Therefore if a particular measurement falls between the control limits, or if 99% fall between the control limits, the mean has not changed."

The probability that a future measurement falls between the control limits, given that the mean is unchanged, is NOT the same as the probability that the mean is unchanged, given that a measurement falls between the limits. The two probabilities can be quite different. (For the mathematically oriented, p(A|B) ≠ p(B|A).)
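To see the difference concretely, here is a minimal sketch of the Bayesian calculation. The prior probability of a shift (10%) and the size of the shift (3 standard deviations) are assumptions chosen purely for illustration; nothing here comes from Wheeler or the commenters.

```python
# Illustrative only: the 10% prior and the 3-sigma shift are assumptions.
from math import erf, sqrt

def phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

# P(measurement falls within the +/-3 sigma limits | mean unchanged)
p_within_given_stable = phi(3) - phi(-3)      # ~0.9973

# P(measurement falls within the same limits | mean shifted up by 3 sigma)
p_within_given_shift = phi(0) - phi(-6)       # ~0.5

prior_shift = 0.10                            # assumed prior P(mean has shifted)

# Bayes' rule: P(mean unchanged | measurement within limits)
p_stable_given_within = (
    p_within_given_stable * (1 - prior_shift)
    / (p_within_given_stable * (1 - prior_shift)
       + p_within_given_shift * prior_shift)
)

print(f"P(within limits | mean unchanged) = {p_within_given_stable:.4f}")
print(f"P(mean unchanged | within limits) = {p_stable_given_within:.4f}")
```

With these assumed numbers the first probability is about 0.997 and the second about 0.95: two different quantities answering two different questions.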

This does not mean SPC is useless. It means the logic used to justify the construction and use of control limits is wrong.

One commenter said I should read Donald Wheeler to learn how good practitioners do it. Another said that he had never "met an experienced Six Sigma expert who recommended to ignore data between control limits."

Here is what Wheeler says in "Understanding Variation: The Key to Managing Chaos."

On page 27 Wheeler constructs a control chart of monthly US trade deficits for a year. All 12 points fall within the limits. He says "… this chart indicates that it will be a waste to analyze any one month to see what is different from preceding months." In other words, because we can plot control limits and the points stay within them, we should ignore everything else we know about trade deficits and what causes them, simply because the points are within the limits! If that is not plainly saying to ignore information, I do not know what is.

Wheeler goes on to talk about control limits and how they find signals. He describes on page 43 how one particularly high value was within the control limits and was thus not a signal. This again is illogical. If you know that a process generates white noise, and thus that you cannot extract a signal, you can reasonably infer that a data point within the control limits is not a signal. But if you do not know this, you cannot draw control limits and reasonably infer from these control limits that everything between them is white noise and therefore, not a signal. Wheeler even says "The Natural Process Limits are the voice of the process." Once again, this is saying ignore the data between the control limits, because the limits tell you everything you want to know.
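For readers who have not seen how such limits are typically drawn, here is a minimal sketch of the usual individuals-and-moving-range (XmR) calculation behind "Natural Process Limits." The twelve values are made-up placeholders, not the trade-deficit figures from the book.

```python
# Minimal sketch of the XmR "Natural Process Limits" calculation.
# The twelve values are made-up placeholders, not the trade-deficit data.
import numpy as np

x = np.array([10.3, 11.1, 9.8, 10.7, 12.0, 10.1, 9.5, 11.4, 10.9, 10.2, 11.8, 10.6])

x_bar = x.mean()                       # centre line
moving_range = np.abs(np.diff(x))      # |x[i] - x[i-1]|
mr_bar = moving_range.mean()

# 2.66 = 3 / d2, with d2 = 1.128 for moving ranges of two points
upper_limit = x_bar + 2.66 * mr_bar
lower_limit = x_bar - 2.66 * mr_bar

print(f"centre line = {x_bar:.2f}")
print(f"natural process limits = [{lower_limit:.2f}, {upper_limit:.2f}]")
print("all points within limits:", bool(np.all((x > lower_limit) & (x < upper_limit))))
```

Note that the limits are computed entirely from the point-to-point variation in the data themselves; they carry no outside knowledge about the process.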

Some commentators pointed to the Westinghouse decision rules for finding out-of-control points or assignable causes. The rules have people search for patterns such as two out of three points falling between two and three standard deviations from the mean, or seven points in a row on one side of the mean. While it is true that these rules do require people to look at data within the control limits (which is good), the logic used to justify them is, once again, fatally flawed.
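As a point of reference, here is a minimal sketch of how the two rules just mentioned are usually checked against a series. It assumes the process mean and standard deviation have already been estimated elsewhere, and the function names are mine, not from any standard library.

```python
# Sketch of the two run rules named above, applied to an existing series.
# Assumes the process mean and sigma are known (or estimated elsewhere).
import numpy as np

def rule_two_of_three_beyond_2sigma(x, mean, sigma):
    """Flag indices where 2 of 3 consecutive points lie more than
    2 sigma from the mean on the same side."""
    z = (np.asarray(x) - mean) / sigma
    hits = []
    for i in range(len(z) - 2):
        window = z[i:i + 3]
        if np.sum(window > 2) >= 2 or np.sum(window < -2) >= 2:
            hits.append(i + 2)        # index of the last point in the window
    return hits

def rule_seven_on_one_side(x, mean):
    """Flag indices where 7 consecutive points fall on the same side of the mean."""
    above = np.asarray(x) > mean
    hits = []
    for i in range(len(above) - 6):
        window = above[i:i + 7]
        if window.all() or not window.any():
            hits.append(i + 6)
    return hits

# Example on a short made-up series (mean 0, sigma 1 assumed known)
x = [0.2, 2.3, 2.6, 0.1, -0.4, 0.5, 0.3, 0.7, 0.2, 0.9, 0.4, 0.6]
print(rule_two_of_three_beyond_2sigma(x, mean=0.0, sigma=1.0))   # -> [2, 3]
print(rule_seven_on_one_side(x, mean=0.0))                       # -> [11]
```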

Even in a white noise process, the probabilities used to determine the rules do not apply to existing data: they only apply to data you have not yet collected! In a white noise process with a constant mean, the probability that the next seven data points (not yet measured) all fall above the mean is ½×½×½×½×½×½×½ = 1/128, and doubling this for a run on either side still gives only 1/64. That is a completely different, and much lower, probability than the probability of finding seven data points in a row on one side of the mean somewhere in the last N measurements. If you are unconvinced, generate 100 sets of 100 Gaussian random deviates and check for yourself with a statistical package, as sketched below.
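Taking up that suggestion, here is a minimal simulation sketch in Python with NumPy. The 10,000 repetitions are my choice, made simply to get steadier estimates than 100 sets would give.

```python
# Monte Carlo comparison: the chance that the NEXT seven points all fall on
# one side of the mean, versus the chance that SOME run of seven on one side
# appears somewhere in 100 existing points of pure white noise.
import numpy as np

rng = np.random.default_rng(0)
n_sets, n_points = 10_000, 100       # 10,000 sets for a steadier estimate than 100

def has_run_of_seven(above):
    """True if some 7 consecutive values are all on the same side of the mean."""
    for i in range(len(above) - 6):
        window = above[i:i + 7]
        if window.all() or not window.any():
            return True
    return False

data = rng.standard_normal((n_sets, n_points))   # mean 0, so "above" is just x > 0
above = data > 0

p_next_seven = np.mean([row[:7].all() or not row[:7].any() for row in above])
p_run_somewhere = np.mean([has_run_of_seven(row) for row in above])

print(f"P(next 7 on one side)                 ~ {p_next_seven:.4f}  (theory: 1/64 = {1/64:.4f})")
print(f"P(a run of 7 somewhere in 100 points) ~ {p_run_somewhere:.4f}")
```

In runs of this sketch the second estimate comes out more than an order of magnitude larger than the first, which is the whole point: a rule calibrated for data not yet collected is being applied to data already in hand.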

In a process that we know is not white noise (which is the reason for using SPC in the first place), the probabilities used to justify the Westinghouse rules still do not stack up, and for the same reasons they do not stack up for control limits.

One commentator said that "Patterns, [can often] only be detected by practitioners with extensive experience with many hundreds of control charts…[such as] designed system changes …  repeating patterns, cycles, drift … and freaks … such as a tool breakage, machine breakdown, or software glitch". This is good: such practitioners are using their background information about processes to make inferences about signals, rather than assuming the process is white noise and relying on flawed determinations of probability.

This commentator also quoted Pyzdek to "show" my conclusions are invalid: Control charts, according to Pyzdek, "contain a vast store of information on potential improvements."

Yes they do.  Use SPC, use the information, and use your background information. But don't rely on flawed determinations of probability.

Thanks to Phil Green / KPI Library
