This vignette covers the three classical Shewhart charts for continuous measurements (variables): I-MR for individual observations, Xbar-R for small rational subgroups, and Xbar-S for larger or unequal-sized subgroups. The choice between them depends entirely on the structure of your data.
When to use which
| Situation | Chart |
|---|---|
| Single measurements, no rational subgroup | shewhart_i_mr() |
| Subgroups of size 2-10, equal sizes | shewhart_xbar_r() |
| Subgroups of size > 10, or unequal sizes | shewhart_xbar_s() |
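The cutoff at 10 reflects estimator efficiency. For small subgroups the range-based estimator of sigma, $\bar{R}/d_2$, is almost as efficient as the estimator based on the sample standard deviation (relative efficiency is roughly 0.95 at $n = 5$) and is simpler to compute. From about $n = 10$ onwards the loss becomes appreciable (roughly 0.85) and the standard deviation is preferred (Montgomery 2019, Section 6.4.1).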
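The efficiency comparison behind this cutoff can be checked with a small Monte Carlo sketch in base R. It is only an illustration and assumes normally distributed data. The helper names (c4, rel_eff) are made up here and are not part of the package, and the d2 values are the tabulated ones for n = 5 and n = 10.

```r
# Bias-correction constant for the sample standard deviation
c4 <- function(n) sqrt(2 / (n - 1)) * gamma(n / 2) / gamma((n - 1) / 2)

# Relative efficiency of R / d2 versus S / c4, estimated by simulation
rel_eff <- function(n, d2, reps = 20000) {
  x <- matrix(rnorm(reps * n), nrow = reps)            # one subgroup per row
  sig_r <- apply(x, 1, function(v) diff(range(v))) / d2
  sig_s <- apply(x, 1, sd) / c4(n)
  var(sig_s) / var(sig_r)                              # below 1: R / d2 is noisier
}

set.seed(1)
eff_5  <- rel_eff(5,  d2 = 2.326)
eff_10 <- rel_eff(10, d2 = 3.078)
c(n5 = eff_5, n10 = eff_10)
```

The ratio should drop noticeably between n = 5 and n = 10, which is the pattern the cutoff is based on.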
I-MR
bottle_fill is a 100-point series of fill volumes. The default estimator of sigma is based on the mean moving range:
fit <- shewhart_i_mr(bottle_fill, value = ml, index = observation)
broom::glance(fit)
#> # A tibble: 1 × 8
#> type n phase sigma_hat sigma_method n_violations n_rules pct_violations
#> <chr> <int> <chr> <dbl> <chr> <int> <int> <dbl>
#> 1 i_mr 100 phase_1 1.18 mr 0 2 0
autoplot(fit)
Robust alternatives
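The mean moving range is sensitive to outliers. Three alternative estimators are built in. "median_mr" and "biweight" are robust; "sd" is not robust in itself and is included here for comparison: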
fit_med <- shewhart_i_mr(bottle_fill, value = ml,
                         sigma_method = "median_mr")
fit_bw  <- shewhart_i_mr(bottle_fill, value = ml,
                         sigma_method = "biweight")
fit_sd  <- shewhart_i_mr(bottle_fill, value = ml,
                         sigma_method = "sd")
dplyr::bind_rows(
  broom::glance(fit)     |> mutate(method = "mr (default)"),
  broom::glance(fit_med) |> mutate(method = "median_mr"),
  broom::glance(fit_bw)  |> mutate(method = "biweight"),
  broom::glance(fit_sd)  |> mutate(method = "sd")
) |>
  select(method, sigma_hat, n_violations)
#> # A tibble: 4 × 3
#> method sigma_hat n_violations
#> <chr> <dbl> <int>
#> 1 mr (default) 1.18 0
#> 2 median_mr 1.18 0
#> 3 biweight 1.23 0
#> 4 sd 1.22 0
The biweight is Tukey’s M-estimator with a redescending influence function (Tukey 1977; Mosteller & Tukey 1977). Use it when you suspect a small fraction of contaminated observations and don’t want them to inflate the sigma estimate.
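To see why a robust estimator helps, the sketch below contaminates a simulated series with one gross outlier and compares two moving-range estimators computed by hand in base R. It assumes the usual normal-theory constants for moving ranges of span 2 (1.128 for the mean, 0.954 for the median). The data are simulated rather than bottle_fill, the function names are hypothetical, and the package's own implementation may differ.

```r
# Sigma estimates from moving ranges of span 2
sigma_mean_mr   <- function(x) mean(abs(diff(x))) / 1.128
sigma_median_mr <- function(x) median(abs(diff(x))) / 0.954

set.seed(42)
clean <- rnorm(100, mean = 500, sd = 1)  # hypothetical in-control series
dirty <- clean
dirty[50] <- dirty[50] + 20              # one gross outlier

est <- rbind(
  clean = c(mean_mr = sigma_mean_mr(clean), median_mr = sigma_median_mr(clean)),
  dirty = c(mean_mr = sigma_mean_mr(dirty), median_mr = sigma_median_mr(dirty))
)
round(est, 2)
```

A single outlier enters two moving ranges, so it pulls the mean-based estimate upwards while the median-based estimate barely moves.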
Xbar-R
tablet_weight records 25 batches of 5 tablets each.
fit_xbar <- shewhart_xbar_r(tablet_weight,
                            value = weight,
                            subgroup = subgroup)
broom::tidy(fit_xbar)
#> # A tibble: 6 × 3
#> chart line value
#> <chr> <chr> <dbl>
#> 1 Xbar CL 250.
#> 2 Xbar UCL 253.
#> 3 Xbar LCL 248.
#> 4 R CL 3.53
#> 5 R UCL 7.46
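#> 6 R LCL 0
The Xbar limits use the constant $A_2$ (limits at $\bar{\bar{x}} \pm A_2 \bar{R}$); the R limits use $D_3$ and $D_4$ (limits at $D_3 \bar{R}$ and $D_4 \bar{R}$). These constants come from Montgomery (2019), Appendix VI, and are exposed via shewhart_constants():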
shewhart_constants(c(2, 5, 10, 25))
#> n A2 A3 c4 d2 d3 B3 B4 B5 B6 D3 D4
#> 1 2 1.880 2.659 0.7979 1.128 0.853 0.000 3.267 0.000 2.606 0.000 3.267
#> 2 5 0.577 1.427 0.9400 2.326 0.864 0.000 2.089 0.000 1.964 0.000 2.114
#> 3 10 0.308 0.975 0.9727 3.078 0.797 0.284 1.716 0.276 1.669 0.223 1.777
#> 4 25 0.153 0.606 0.9896 3.931 0.708 0.565 1.435 0.559 1.420 0.459 1.541
shewhart_xbar_r() insists on equal subgroup sizes (otherwise the constants don’t apply uniformly). For unequal sizes use shewhart_xbar_s() with sigma_method = "pooled_sd".
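The chart constants are related through $d_2$ and $d_3$, which makes the printed limits easy to check by hand. The base-R sketch below rebuilds $A_2$, $D_3$ and $D_4$ for $n = 5$ from the tabulated $d_2$ and $d_3$, then applies them to the R center line of 3.53 from the tidy() output. It is a hand calculation for illustration, not the package's code, and the variable names are arbitrary.

```r
n  <- 5
d2 <- 2.326   # tabulated mean of the relative range for n = 5
d3 <- 0.864   # tabulated standard deviation of the relative range

A2 <- 3 / (d2 * sqrt(n))           # half-width factor for the Xbar chart
D3 <- max(0, 1 - 3 * d3 / d2)      # lower factor for the R chart, floored at 0
D4 <- 1 + 3 * d3 / d2              # upper factor for the R chart

r_bar <- 3.53                      # R center line taken from the tidy() output
limits <- c(
  xbar_half_width = A2 * r_bar,
  r_lcl           = D3 * r_bar,
  r_ucl           = D4 * r_bar,
  sigma_hat       = r_bar / d2
)
round(c(A2 = A2, D3 = D3, D4 = D4), 3)
round(limits, 2)
```

The computed upper R limit agrees with the 7.46 printed by tidy(), and the floor at zero explains why the R chart has LCL = 0 for small subgroups.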
Xbar-S
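The S chart uses the bias-corrected sample standard deviation. Sigma is estimated as $\hat{\sigma} = \bar{S} / c_4$, where $\bar{S}$ is the average of the subgroup standard deviations and $c_4$ is the bias-correction constant for the subgroup size: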
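As a rough illustration of the S-based sigma estimate, the base-R sketch below computes the average subgroup standard deviation divided by $c_4$ on simulated subgroups and, for comparison, a pooled standard deviation that weights each subgroup by its degrees of freedom. The data are simulated rather than tablet_weight and the names are hypothetical. The pooled formula shown is one common definition; the package's "pooled_sd" method may differ in detail, for example in whether a bias correction is applied.

```r
# Bias-correction constant for the sample standard deviation
c4 <- function(n) sqrt(2 / (n - 1)) * gamma(n / 2) / gamma((n - 1) / 2)

set.seed(7)
# 25 simulated subgroups of 5 with true sigma = 1.5
dat <- data.frame(
  subgroup = rep(1:25, each = 5),
  weight   = rnorm(125, mean = 250, sd = 1.5)
)

s_i <- tapply(dat$weight, dat$subgroup, sd)      # subgroup standard deviations
n_i <- tapply(dat$weight, dat$subgroup, length)  # subgroup sizes

sigma_sbar   <- mean(s_i) / c4(5)                            # S-bar / c4
sigma_pooled <- sqrt(sum((n_i - 1) * s_i^2) / sum(n_i - 1))  # pooled SD

c(sbar = sigma_sbar, pooled = sigma_pooled)
```

The pooled form needs only the per-subgroup sizes and variances, which is why it extends naturally to unequal subgroup sizes.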
# Same data, but using S instead of R (preferred for n > 10)
n_per <- 5
df <- tablet_weight |>
  group_by(subgroup) |>
  filter(n() == n_per) |>  # in case of any incomplete batches
  ungroup()
fit_xbs <- shewhart_xbar_s(df, value = weight, subgroup = subgroup)
broom::glance(fit_xbs)
#> # A tibble: 1 × 8
#> type n phase sigma_hat sigma_method n_violations n_rules pct_violations
#> <chr> <int> <chr> <dbl> <chr> <int> <int> <dbl>
#> 1 xbar_s 25 phase… 1.49 sbar 2 2 0.08
For variable subgroup sizes:
fit_xbs_pooled <- shewhart_xbar_s(df, value = weight, subgroup = subgroup,
                                  sigma_method = "pooled_sd")
Reading the results
Every variables chart returns a shewhart_chart object with the following per-observation columns (broom::augment()):
- .value: the plotted statistic (individual, Xbar, etc.)
- .center, .sigma, .upper, .lower: control-line series
- For two-panel charts, additional .mr*, .r*, .s* columns for the lower panel
- .flag_<rule> and .flag_any: TRUE/FALSE for each runs rule configured
The rules argument controls which rules are flagged. By default, Nelson 1 (a point beyond 3 sigma) and Nelson 2 (nine consecutive points on the same side of the center line) are checked, and we recommend keeping at least these two. Add more with:
fit_full <- shewhart_i_mr(bottle_fill, value = ml,
                          rules = c("nelson_1_beyond_3s",
                                    "nelson_2_nine_same",
                                    "nelson_3_six_trend",
                                    "nelson_5_two_of_three",
                                    "nelson_6_four_of_five"))
fit_full$violations
#> # A tibble: 0 × 5
#> # ℹ 5 variables: position <int>, rule <chr>, description <chr>, value <dbl>,
#> # severity <chr>
Each additional rule increases the chart’s sensitivity at the cost of a shorter in-control ARL (average run length), that is, more frequent false alarms. See vignette("arl-simulation") for how to quantify this trade-off.
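For intuition about what the two default rules check, here is a minimal base-R sketch of Nelson rules 1 and 2, together with the textbook in-control ARL for rule 1 alone under normality. The function names are hypothetical, and the package's own rule engine may handle edge cases (such as points exactly on the center line) differently.

```r
# Rule 1: a point more than 3 sigma from the center line
nelson_1 <- function(x, center, sigma) abs(x - center) > 3 * sigma

# Rule 2: the ninth (or later) consecutive point on the same side of the center line
nelson_2 <- function(x, center, run = 9) {
  side <- sign(x - center)
  streak <- 0
  flag <- logical(length(x))
  for (i in seq_along(x)) {
    same <- i > 1 && side[i] != 0 && side[i] == side[i - 1]
    streak <- if (same) streak + 1 else as.numeric(side[i] != 0)
    flag[i] <- streak >= run
  }
  flag
}

x <- c(rep(1, 9), -1, 5)           # toy series with center 0 and sigma 1
which(nelson_1(x, center = 0, sigma = 1))
which(nelson_2(x, center = 0))

# In-control ARL when only rule 1 is active: 1 / P(point beyond 3 sigma)
arl_rule_1 <- 1 / (2 * pnorm(-3))
arl_rule_1
```

Adding rules can only add signals, so the combined in-control ARL falls below this single-rule baseline of about 370.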
References
- Montgomery, D. C. (2019). Introduction to Statistical Quality Control (8th ed.). Wiley. Chapter 6.
- Tukey, J. W. (1977). Exploratory Data Analysis. Addison-Wesley.
- Mosteller, F., & Tukey, J. W. (1977). Data Analysis and Regression. Addison-Wesley.
- Wheeler, D. J., & Chambers, D. S. (1992). Understanding Statistical Process Control (2nd ed.). SPC Press.
- Cryer, J. D., & Ryan, T. P. (1990). The Estimation of Sigma for an X Chart: $\overline{MR}/d_2$ or $S/c_4$? Journal of Quality Technology, 22(3), 187-192.