- Gabor Szabo

# How Should One Assess Operator Differences?

Updated: Nov 22, 2020

How are you all doing? I hope you are still safe and spending time doing what is important to you. I know many of us have faced some hardships over the past month and a half; I am no exception. It is, however, important that we not lose sight of the fact that there is light at the end of the tunnel.

I had been wanting to publish this blog post, and since you are reading it right now, it looks like I finally got around to it!

OK, so what I will talk about this time is how to practically assess operator differences. This poses the following question: what kind of operator differences are there?

- Differences between the consistency of operators
- Differences between operator means
- Other differences, such as operator-part interaction

**Differences between the consistency of operators**

Assess this first using the Multi-Vari Chart and the Range Chart. If you look at the example below, it is apparent on the Multi-Vari Chart that Operator 2 is less consistent, i.e. they have more repeatability error than the other operators. Looking at the Range Chart, it becomes clear that the difference in consistency is about two- to three-fold. Also, although there is only one subgroup range outside of the control limit (part 6), the ranges for Operator 2 average higher than those of the other two operators. Operator 2 is doing something that causes their measurements to be less consistent. It could simply be that they are not as experienced or as well trained on the method as the other two operators, or they could actually be using a different technique. It is definitely worth taking a closer look at this. Remember, only consistent measurements are predictable, and this particular measurement process doesn't pass the smell test for consistency.
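If you would like to check the same thing numerically, here is a minimal sketch of the Range Chart logic. The readings are made up for illustration (they are not the data behind the charts in this post), and the function and variable names are my own. The only fixed value is `D4 = 3.267`, the standard Range Chart constant for subgroups of two trials.

```python
from statistics import mean

# Hypothetical readings: data[operator][part] = repeated trials (2 trials per part).
data = {
    "Operator 1": [[10.1, 10.2], [12.0, 11.9], [9.8, 9.9]],
    "Operator 2": [[10.4, 9.9], [12.3, 11.7], [9.5, 10.1]],
    "Operator 3": [[10.0, 10.1], [11.9, 12.0], [9.9, 9.8]],
}

D4 = 3.267  # Range Chart constant for subgroup size 2


def subgroup_ranges(parts):
    """Range (max - min) of the repeated trials for each part."""
    return [max(trials) - min(trials) for trials in parts]


ranges = {op: subgroup_ranges(parts) for op, parts in data.items()}
avg_range = {op: mean(r) for op, r in ranges.items()}

# The overall average range sets the upper control limit for every subgroup.
r_bar = mean([r for rs in ranges.values() for r in rs])
ucl = D4 * r_bar

for op in data:
    flagged = [i + 1 for i, r in enumerate(ranges[op]) if r > ucl]
    print(f"{op}: average range = {avg_range[op]:.3f}, parts above UCL = {flagged}")
print(f"Overall average range = {r_bar:.3f}, UCL = {ucl:.3f}")
```

Note that in this made-up data no single range exceeds the limit, yet Operator 2's average range is several times that of the others. That is exactly the kind of pattern to look for in addition to points outside the limit.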

**Differences between operator means**

This is a good one. The traditional way to assess differences between operator means is the Reproducibility statistic, calculated as a standard deviation (see both the ANOVA and Range methods for Gage R&R). I do not recommend using the Reproducibility statistic, as it provides very little value for practical analysis. Reproducibility, when expressed as a standard deviation, is a measure of consistency and is driven by the repeatability error: the more repeatability error you have, the farther the operator means are likely to fall from each other. Remember, in a gage study multiple measurements are taken from each part, so each operator average is the average of multiple measurements, and the less consistent those measurements are, the more error there is in the means, per the rule of the *standard error of the mean*. Let's refresh our memory on how that is calculated: it is the sample standard deviation over the square root of the sample size.

$$SE_{\bar{x}} = \frac{s}{\sqrt{N}}$$

Here $s$ is the sample standard deviation and $N$ is the sample size.
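To see the rule in action, here is a small simulation sketch. It assumes normally distributed repeatability error and an unbiased operator, and all of the numbers (the two sigma values, 20 readings per average, 3000 repetitions) are hypothetical choices of mine. It simply shows that the spread of simulated operator averages tracks the standard error of the mean.

```python
import random
from math import sqrt
from statistics import mean, stdev


def simulated_operator_average(sigma_repeat, n_readings, rng, true_mean=100.0):
    """Average of n_readings measurements that differ only by repeatability error."""
    return mean(rng.gauss(true_mean, sigma_repeat) for _ in range(n_readings))


rng = random.Random(42)  # fixed seed so the sketch is repeatable
n_readings = 20          # hypothetical number of readings behind each average

results = {}
for sigma_repeat in (0.5, 2.0):
    averages = [
        simulated_operator_average(sigma_repeat, n_readings, rng)
        for _ in range(3000)
    ]
    # Observed spread of the averages vs. the standard error of the mean.
    results[sigma_repeat] = (stdev(averages), sigma_repeat / sqrt(n_readings))

for sigma_repeat, (observed, predicted) in results.items():
    print(
        f"repeatability sigma = {sigma_repeat}: "
        f"spread of averages = {observed:.3f}, sigma / sqrt(N) = {predicted:.3f}"
    )
```

Four times the repeatability error gives roughly four times the spread in the averages, even though no operator in the simulation is biased.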

In a crossed gage study, this equation applies to the operator means in the following way: since the same samples are measured by all operators, we know that there is no difference between the true operator means; they are identical (since they come from all operators measuring the same parts). So, the standard deviation in the numerator refers to the random error (sampling error), that is, the repeatability error! The sample size in the denominator refers to the number of samples times the number of trials, that is, n × k. This modified equation looks like this:

$$SE_{\text{operator mean}} = \frac{\sigma_{\text{repeatability}}}{\sqrt{n \times k}}$$

Think of it this way: given *this much* repeatability error and this many samples, this is how much error there can be between operator means. Now, take 3 times that amount of error in each direction from the grand mean (the mean of all operator averages), and you have constructed decision limits around the grand mean.

If all operator averages fall within these bounds, the differences between them are merely due to repeatability error. If any of them falls outside of these limits, there is likely true operator bias present.
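Here is a minimal sketch of that calculation, under a few assumptions of my own: the readings are hypothetical, every operator measures every part the same number of times, and the repeatability error is estimated as the pooled within-cell standard deviation (an estimate based on the average range would work just as well). The function name `operator_bias_limits` is made up for this example.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical crossed study: data[operator][part] = list of k repeated trials.
data = {
    "Operator 1": [[10.1, 10.2], [12.0, 11.9], [9.8, 9.9]],
    "Operator 2": [[10.2, 10.1], [11.9, 12.1], [9.9, 9.8]],
    "Operator 3": [[10.3, 10.4], [12.2, 12.1], [10.0, 10.1]],
}


def operator_bias_limits(data, multiplier=3.0):
    """Return operator means, grand mean and decision limits around it."""
    cells = [trials for parts in data.values() for trials in parts]
    # Repeatability: pooled within-cell standard deviation (balanced study,
    # so the pooled variance is the plain average of the cell variances).
    sigma_repeat = sqrt(mean(variance(trials) for trials in cells))
    n_parts = len(next(iter(data.values())))
    k_trials = len(cells[0])
    # Standard error of an operator mean: sigma_repeat / sqrt(n x k).
    se = sigma_repeat / sqrt(n_parts * k_trials)
    op_means = {
        op: mean(x for trials in parts for x in trials)
        for op, parts in data.items()
    }
    grand_mean = mean(list(op_means.values()))
    return op_means, grand_mean, grand_mean - multiplier * se, grand_mean + multiplier * se


op_means, grand_mean, lower, upper = operator_bias_limits(data)
outside = [op for op, m in op_means.items() if not lower <= m <= upper]

print(f"Grand mean = {grand_mean:.3f}, limits = ({lower:.3f}, {upper:.3f})")
for op, m in op_means.items():
    print(f"{op}: mean = {m:.3f}")
print("Outside the decision limits:", outside)
```

In this made-up data set, Operators 1 and 2 land inside the limits and Operator 3 lands above the upper limit, which is the pattern you would expect when one operator reads consistently higher than the others.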

What is operator bias? It is a systematic difference that is not driven by the repeatability error; if there is detectable operator bias present, it is likely that the biased operator is doing something inherently different from the other operators. An example of operator bias is seen in the figure below, where the outer diameter (OD) of a cylindrical component with a tapered end is measured. Operators 1 and 2 are measuring the OD right at the tip, whereas Operator 3 is measuring it at the intersection of the non-tapered and the tapered sections. Since this is clearly a different method and results in the OD appearing bigger than when measured at the tip, it will most probably show up as a statistically significant bias.

Now, let's look at which graphical method one can use to assess operator bias. The name of the chart that is most useful for assessing operator bias is, you guessed it, the Operator Bias Chart. It was introduced by Don Wheeler in his book Evaluating The Measurement Process (EMP). The chart is essentially constructed by plotting each operator average, the grand mean of the operator averages, and the decision limits explained earlier in this post.

Look at the example below: the Multi-Vari Chart shows what appears to be a noticeable difference between the operator averages (there is a slope to the green line connecting the operator averages). Operator 1 seems to be averaging higher than Operator 2, and Operator 2 seems to be averaging higher than Operator 3. Is this operator bias, or are these differences solely due to the repeatability error? Look at the Operator Bias Chart on the right-hand side; it shows that all three operator averages fall within the decision limits, which means that the difference between the averages is solely driven by the repeatability error, which actually appears high compared to the product variation.

This second example shows another scenario where the green line on the Multi-Vari Chart connecting the operator averages has a slope very similar to the previous example. A quick glance at the Operator Bias Chart confirms a statistically significant bias for Operator 3.

The third example introduces a scenario where Operator 1 is very clearly and significantly biased relative to Operators 2 and 3. The Multi-Vari Chart shows this very well. However, on the Operator Bias Chart, all three operators seem to fall outside of the decision limits. What is going on here?

Remember, all charts and plots are to be interpreted practically. What is going on here is that the average for Operator 1 is so far from those of the other operators that it pulls the grand average down significantly, causing Operators 2 and 3 to appear to be outside of the limits. In fact, the two of them would fit within the width of the limits. This shows that this chart, like any other chart, requires practical interpretation. The conclusion here is that Operator 1 probably needs to be retrained to take their measurements the same way as Operators 2 and 3.
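One way to sanity-check this kind of situation in numbers is sketched below. The operator means and the standard error are hypothetical values I picked to mimic the scenario, and `fits_within_limit_width` is a made-up helper, not part of any package. It asks whether the remaining operators, taken on their own, are closer together than the full width of the decision limits.

```python
from statistics import mean

# Hypothetical operator means and standard error of an operator mean.
op_means = {"Operator 1": 80.0, "Operator 2": 101.0, "Operator 3": 102.0}
se = 1.0
MULTIPLIER = 3.0

# Decision limits around the grand mean, as on the Operator Bias Chart.
grand_mean = mean(list(op_means.values()))
lower = grand_mean - MULTIPLIER * se
upper = grand_mean + MULTIPLIER * se
outside = [op for op, m in op_means.items() if not lower <= m <= upper]


def fits_within_limit_width(op_means, se, exclude, multiplier=MULTIPLIER):
    """True if the operators other than `exclude` span less than the limit width."""
    remaining = [m for op, m in op_means.items() if op != exclude]
    spread = max(remaining) - min(remaining)
    return spread <= 2 * multiplier * se


print(f"Grand mean = {grand_mean:.2f}, limits = ({lower:.2f}, {upper:.2f})")
print("Outside the limits:", outside)
print(
    "Operators 2 and 3 fit within the limit width:",
    fits_within_limit_width(op_means, se, exclude="Operator 1"),
)
```

With these numbers all three operators plot outside the limits, yet Operators 2 and 3 are only one unit apart against a limit width of six units, so the sensible reading is that Operator 1 is the odd one out.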

The specification limits for the above characteristic are 50 and 150 units. Adding these limits and re-plotting the chart, one can see that while Operator 1's bias is statistically significant (Operator Bias Chart), the difference is not necessarily practically significant or important (see Multi-Vari Chart) in relation to the specification limits. These kinds of situations should always be assessed practically, taking into consideration the product variation, where the product sits relative to the specification limits, and the purpose of the measurement (product screening vs. process control).
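If you want a rough number to go with the picture, one option (my own illustration, not a replacement for looking at the charts) is to express the bias as a share of the tolerance. The specification limits of 50 and 150 come from the example above; the operator means are hypothetical, and `bias_vs_tolerance` is a made-up helper name.

```python
from statistics import mean

LSL, USL = 50.0, 150.0  # specification limits quoted in the post

# Hypothetical operator means for illustration only.
op_means = {"Operator 1": 96.0, "Operator 2": 101.0, "Operator 3": 101.4}


def bias_vs_tolerance(op_means, suspect, lsl, usl):
    """Bias of `suspect` vs. the other operators, and its share of the tolerance."""
    others = [m for op, m in op_means.items() if op != suspect]
    bias = op_means[suspect] - mean(others)
    return bias, abs(bias) / (usl - lsl)


bias, share = bias_vs_tolerance(op_means, "Operator 1", LSL, USL)
print(f"Operator 1 bias = {bias:.2f} units, {share:.1%} of the tolerance")
```

Whether a given share of the tolerance matters still depends on the product variation, how close the product runs to a specification limit, and what the measurement is used for, so treat the number as an input to the practical assessment rather than a verdict.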

**Key Takeaways**

- Assess operator differences in a practical way instead of relying on summary statistics
- Always start with assessing operator consistency
- Verify whether operator bias is present using the Operator Bias Chart
- Always be practical and follow the three steps of practical analysis: Apply Common Sense, Visualize Your Data and Calculate Metrics. More on this in my eBook Practical MSA: Laying The Foundations.

As always, be safe and stay tuned for future blog posts!