7. Estimation IV: Prediction Intervals for a Future Outcome :math:`x_0` and Tolerance Intervals for Content :math:`p` with Confidence :math:`1-\gamma` (Specs :math:`\text{LSL}/\text{USL}`)
==================================================================================================================================================================================
7.0 Notation Table
-----------------------
.. list-table::
   :widths: 22 78
   :header-rows: 1

   * - Symbol
     - Meaning
   * - :math:`X_1,\dots,X_n`
     - random sample (independent)
   * - :math:`n`
     - sample size
   * - :math:`\bar{X}`
     - sample mean
   * - :math:`S`
     - sample standard deviation
   * - :math:`\mu`
     - population mean (unknown)
   * - :math:`\sigma`
     - population standard deviation (unknown)
   * - :math:`x_0`
     - one future observation
   * - :math:`\alpha`
     - error rate for a :math:`100(1-\alpha)\%` interval
   * - :math:`t_{\alpha/2,\;n-1}`
     - t critical value (df :math:`n-1`)
   * - :math:`p`
     - tolerance content (population proportion covered)
   * - :math:`\gamma`
     - tolerance risk (confidence is :math:`1-\gamma`)
   * - :math:`k`
     - tolerance factor in :math:`\bar{x}\pm k s`
   * - :math:`\text{LSL},\ \text{USL}`
     - lower/upper specification limits
7.1 Introduction
-------------------
In earlier sessions, interval estimation focused on parameters, especially the population mean :math:`\mu`. A confidence interval answers a parameter question: which values of :math:`\mu` are plausible given the observed sample and a stated confidence level.
In many operations and quality settings, the practical question is different. Decision makers often need bounds for a single future item (shipment time, next unit’s diameter) or bounds that describe where most of the process output lies over the long run. This module distinguishes confidence intervals (for :math:`\mu`), prediction intervals (for :math:`x_0`), and tolerance intervals (for a large fraction of the population).
7.2 Learning Outcomes
------------------------
By the end of this session, students should be able to:
- Distinguish the inference target of a confidence interval, a prediction interval, and a tolerance interval
- Construct and interpret a two-sided prediction interval for one future observation under Normal sampling
- Explain why prediction intervals remain wide even when :math:`n` is large
- State what a tolerance interval guarantees (content :math:`p` with confidence :math:`1-\gamma`) and interpret it in a process setting
- Use tolerance intervals to report performance relative to engineering specifications (:math:`\text{LSL},\text{USL}`)
- Identify common interpretation errors and the role of the Normality assumption
7.3 Main Concepts
--------------------
7.3.1 Three interval targets
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Confidence intervals, prediction intervals, and tolerance intervals have similar-looking algebra but different targets and probability statements. The difference is not a minor wording issue; it changes what the interval is designed to cover.
The targets can be summarized as follows.
- A confidence interval targets the unknown parameter :math:`\mu`
- A prediction interval targets a single future outcome :math:`x_0`
- A tolerance interval targets a central portion of the population distribution, expressed as a content level :math:`p`
The same numeric confidence level (for example, 95%) does not make these intervals interchangeable. The event “covers :math:`\mu`” is not the same as the event “covers :math:`x_0`,” and neither is the same as the event “covers at least :math:`p` of the population.”
7.3.2 Confidence interval on :math:`\mu` (recap for comparison)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Assume :math:`X_1,\dots,X_n` are independent observations from a Normal distribution with unknown mean :math:`\mu` and unknown standard deviation :math:`\sigma`. The standard confidence interval for :math:`\mu` uses the statistic :math:`(\bar{X}-\mu)/(S/\sqrt{n})`, which follows a t distribution with :math:`n-1` degrees of freedom under Normal sampling.
A two-sided :math:`100(1-\alpha)\%` confidence interval for :math:`\mu` is:
.. math::

   \bar{x} \pm t_{\alpha/2,\;n-1}\frac{s}{\sqrt{n}}
This interval is about estimation error of a parameter. As :math:`n` increases, the half-width decreases roughly like :math:`1/\sqrt{n}`, reflecting improved precision in estimating :math:`\mu`.
7.3.3 Prediction interval for one future observation :math:`x_0`
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In forecasting and quality inspection, it is common to need an interval for the next unit produced or the next demand realization. A point predictor for a future observation is the sample mean :math:`\bar{X}`, but interval prediction must account for two sources of variability.
The first source is estimation uncertainty in :math:`\mu` (because :math:`\bar{X}` is random). The second source is the intrinsic unit-to-unit variability of a single observation around :math:`\mu`. These two components add in the prediction error :math:`x_0-\bar{X}`.
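The way the two components combine can be written out explicitly. The following is a short derivation sketch, assuming :math:`x_0` is drawn independently of the sample from the same Normal distribution:

.. math::

   x_0 - \bar{X} = (x_0 - \mu) - (\bar{X} - \mu), \qquad
   \text{Var}(x_0 - \bar{X}) = \sigma^2 + \frac{\sigma^2}{n} = \sigma^2\left(1 + \frac{1}{n}\right)

Replacing the unknown :math:`\sigma` with :math:`S` turns the standardized prediction error into a t pivot with :math:`n-1` degrees of freedom:

.. math::

   \frac{x_0 - \bar{X}}{S\sqrt{1 + 1/n}} \sim t_{n-1}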
Under Normal sampling with unknown :math:`\sigma`, a two-sided :math:`100(1-\alpha)\%` prediction interval for :math:`x_0` is:
.. math::

   \bar{x} \pm t_{\alpha/2,\;n-1}\, s\sqrt{1+\frac{1}{n}}
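At the same confidence level, this interval is always wider than the confidence interval for :math:`\mu`: the prediction half-width is :math:`\sqrt{n+1}` times the confidence half-width. As :math:`n` grows, the factor :math:`\sqrt{1+1/n}` tends to 1 rather than 0, so the half-width approaches roughly the standard Normal critical value times :math:`\sigma`. Uncertainty about a single future unit never vanishes.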
7.3.4 Tolerance intervals for long-run process coverage
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A manager may care less about the next item and more about long-run performance. For example, a buyer may require that “at least 90% of units fall within a specified range,” or a quality engineer may need to report bounds that contain most of the process output.
A two-sided Normal tolerance interval has the form :math:`\bar{x}\pm k s`. The factor :math:`k` is not a fixed critical value such as 1.96: because both :math:`\bar{X}` and :math:`S` are random, the population coverage actually achieved varies from sample to sample, and :math:`k` must be large enough to allow for that variation.
A tolerance interval statement has two levels:
- Content :math:`p`: the fraction of the population distribution intended to be covered
- Confidence :math:`1-\gamma`: the reliability of achieving at least that content across repeated samples
For Normal sampling with unknown :math:`\mu` and :math:`\sigma`, two-sided tolerance limits are:
.. math::

   \bar{x} \pm k s
The factor :math:`k` is chosen so that one can assert, with confidence :math:`1-\gamma`, that the interval contains at least proportion :math:`p` of the population. In practice, :math:`k` is obtained from a tolerance-factor table or computed numerically from the Normal model.
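The two-level statement can also be written formally. In the sketch below, :math:`C` denotes the true population content of the computed interval and :math:`\Phi` is the standard Normal CDF; both symbols are introduced here only for illustration:

.. math::

   C(\bar{X}, S) = \Phi\!\left(\frac{\bar{X} + kS - \mu}{\sigma}\right) - \Phi\!\left(\frac{\bar{X} - kS - \mu}{\sigma}\right), \qquad
   P\big(C(\bar{X}, S) \ge p\big) = 1 - \gamma

The inner quantity :math:`C` is random because it depends on the sample, and :math:`k` is the value that makes the outer probability equal to :math:`1-\gamma`.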
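One-sided tolerance bounds are used when only one tail matters, such as a minimum strength requirement. A lower one-sided bound is reported in the form :math:`\bar{x}-k s` and interpreted as a conservative level that at least proportion :math:`p` of the population exceeds, with confidence :math:`1-\gamma`. The one-sided factor :math:`k` differs from the two-sided factor for the same :math:`p` and :math:`\gamma`, so it must come from a one-sided table or computation.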
7.3.5 Engineering specification reporting
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Specifications such as :math:`\text{LSL}` and :math:`\text{USL}` define acceptability for individual units. Interval reporting should match the decision question and the risk definition.
If the decision is about the next unit, a prediction interval communicates likely variability for a single future outcome. If the decision is about meeting specifications most of the time, a tolerance interval is the appropriate reporting tool because it addresses population coverage.
A practical reporting structure for a Normal process dimension includes:
- A confidence interval for :math:`\mu` when the mean itself is important for drift monitoring
- A prediction interval for :math:`x_0` when near-term single-unit risk matters
- A tolerance interval (two-sided or one-sided) when long-run conformance to :math:`\text{LSL}/\text{USL}` is the central objective
A common quality interpretation is the “containment comparison.” If a two-sided tolerance interval for content :math:`p` lies entirely inside :math:`[\text{LSL},\text{USL}]`, then one has statistical evidence (at the chosen confidence level) that at least :math:`p` of output meets the specs under the Normal model.
Example 7.1 (Order lead time: prediction vs confidence)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A distribution center monitors order lead time (hours) for a stable picking process. The manager wants a statement about the next urgent order rather than only the average lead time. A random sample of :math:`n=25` urgent orders gives :math:`\bar{x}=18.4` hours and :math:`s=3.2` hours, and Normal sampling is considered reasonable.
**Question:** Compute a 95% prediction interval for the next urgent order lead time :math:`x_0`, and explain how its meaning differs from a 95% confidence interval for :math:`\mu`.
The 95% prediction interval uses :math:`t_{\alpha/2,\;n-1}` with :math:`\alpha=0.05` and df :math:`24`. The half-width is :math:`t_{0.025,24}\, s\sqrt{1+1/n}` because prediction must include both unit-to-unit variation and uncertainty in estimating :math:`\mu`. With df :math:`24`, :math:`t_{0.025,24}\approx 2.064`, so the half-width is :math:`2.064(3.2)\sqrt{1+1/25}\approx 6.74` hours, about five times the confidence half-width :math:`2.064(3.2)/\sqrt{25}\approx 1.32` hours.
**Answer:** A 95% prediction interval is :math:`18.4 \pm 2.064(3.2)\sqrt{1+1/25}`, i.e., approximately :math:`(11.7,\ 25.1)` hours. This interval targets a single future order time :math:`x_0`, whereas a 95% confidence interval targets the unknown mean :math:`\mu` and is much narrower, approximately :math:`(17.1,\ 19.7)` hours.
Example 7.2 (Machine diameter: three intervals, three questions)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A machining process produces shafts whose diameters must be controlled. A technician collects :math:`n=10` consecutive diameters (mm) from stable operation and obtains :math:`\bar{x}=12.006` and :math:`s=0.012`. The engineering team wants to communicate both average performance and near-term risk for a single part, under an approximately Normal model.
**Question:** Write the 99% confidence interval for :math:`\mu` and the 99% prediction interval for the next diameter :math:`x_0`. Explain why the prediction interval is wider.
Both intervals use :math:`t_{\alpha/2,\;n-1}` with :math:`\alpha=0.01` and df :math:`9`. The confidence interval for :math:`\mu` is :math:`\bar{x}\pm t_{0.005,9}(s/\sqrt{n})`, reflecting uncertainty only in estimating the mean. The prediction interval is :math:`\bar{x}\pm t_{0.005,9}\, s\sqrt{1+1/n}`, which adds the intrinsic variability of one future observation.
**Answer:** The 99% confidence interval is :math:`12.006 \pm t_{0.005,9}\,(0.012/\sqrt{10})`, while the 99% prediction interval is :math:`12.006 \pm t_{0.005,9}\,(0.012)\sqrt{1+1/10}`. The prediction interval is wider because it is designed to cover a single future unit, not merely the mean, so it must account for unit-to-unit variation that does not vanish as :math:`n` grows.
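The arithmetic in Examples 7.1 and 7.2 can be checked with a few lines of Python. This is a minimal sketch: the helper names ``interval`` and ``ci_and_pi`` are hypothetical, the critical value 2.064 is the one quoted in Example 7.1, and 3.250 for :math:`t_{0.005,9}` is an assumed t-table value that should be confirmed against a table or software.

```python
import math

def interval(center, half_width):
    """Return (lower, upper) for center +/- half_width."""
    return (center - half_width, center + half_width)

def ci_and_pi(xbar, s, n, t_crit):
    """Return (confidence interval for mu, prediction interval for x0)."""
    ci_half = t_crit * s / math.sqrt(n)          # estimation error only
    pi_half = t_crit * s * math.sqrt(1 + 1 / n)  # estimation error + one new unit
    return interval(xbar, ci_half), interval(xbar, pi_half)

# Example 7.1: n = 25, t_{0.025,24} ~= 2.064 (quoted in the example)
ci_71, pi_71 = ci_and_pi(xbar=18.4, s=3.2, n=25, t_crit=2.064)

# Example 7.2: n = 10, t_{0.005,9} ~= 3.250 (assumed table value)
ci_72, pi_72 = ci_and_pi(xbar=12.006, s=0.012, n=10, t_crit=3.250)

for label, ci, pi in [("7.1", ci_71, pi_71), ("7.2", ci_72, pi_72)]:
    print(f"Example {label}: CI = ({ci[0]:.3f}, {ci[1]:.3f}), "
          f"PI = ({pi[0]:.3f}, {pi[1]:.3f})")
```

Under these assumptions the Example 7.2 intervals come out at roughly (11.994, 12.018) mm for the mean and (11.965, 12.047) mm for the next diameter.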
Example 7.3 (Specification reporting: one-sided tolerance bound)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A supplier reports breaking strength (N) for a polymer component. The buyer requires that at least 99% of units exceed :math:`\text{LSL}=420` N, and the supplier wants a conservative lower bound that supports this claim statistically. A random sample of :math:`n=30` units gives :math:`\bar{x}=465` N and :math:`s=18` N, and the Normal model is used for reporting.
**Question:** Describe the correct interval tool for this requirement and write the form of the bound to report, including its interpretation.
This is not a prediction question about the next unit and not a mean question about :math:`\mu`. The requirement is about a high proportion of the population exceeding a minimum level, so a one-sided lower tolerance bound is appropriate. The reported bound has the form :math:`L=\bar{x}-k s`, where :math:`k` is chosen so that one can assert with confidence :math:`1-\gamma` that at least proportion :math:`p=0.99` of the population exceeds :math:`L`.
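**Answer:** Report a one-sided lower tolerance bound :math:`L=\bar{x}-k s` with content :math:`p=0.99` and a stated confidence level (for example, :math:`1-\gamma=0.95`). The interpretation is: “We are 95% confident that at least 99% of all breaking strengths exceed :math:`L`,” and this statement can then be compared directly to :math:`\text{LSL}=420` N. The requirement is supported only if :math:`L\ge 420` N; if :math:`L` falls below :math:`\text{LSL}`, the sample does not demonstrate the claim at the stated confidence, even when :math:`\bar{x}` is well above :math:`\text{LSL}`.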
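To make the comparison with :math:`\text{LSL}` concrete, the bound can be sketched numerically. The code below does not use an exact tolerance factor. It uses a widely quoted closed-form approximation for the one-sided Normal factor (given, for example, in the NIST/SEMATECH e-Handbook), and the function name ``one_sided_k_approx`` is hypothetical. A formal report should take :math:`k` from a one-sided tolerance-factor table or exact software.

```python
import math
from statistics import NormalDist

def one_sided_k_approx(n, p, conf):
    """Approximate one-sided Normal tolerance factor (closed-form approximation).

    n: sample size, p: content, conf: confidence level 1 - gamma.
    """
    z_p = NormalDist().inv_cdf(p)     # Normal quantile for the content
    z_c = NormalDist().inv_cdf(conf)  # Normal quantile for the confidence
    a = 1 - z_c**2 / (2 * (n - 1))
    b = z_p**2 - z_c**2 / n
    return (z_p + math.sqrt(z_p**2 - a * b)) / a

# Example 7.3 inputs
n, xbar, s, lsl = 30, 465.0, 18.0, 420.0
k = one_sided_k_approx(n, p=0.99, conf=0.95)
lower_bound = xbar - k * s
supports_claim = lower_bound >= lsl

print(f"k ~= {k:.3f}, L ~= {lower_bound:.1f} N, supports claim: {supports_claim}")
```

With the example's numbers this approximation gives :math:`k` of about 3.05 and a bound near 410 N, which is below 420 N. The naive plug-in value :math:`\bar{x}-2.326\,s\approx 423` N would clear the limit, which shows why the tolerance factor, not a plain Normal quantile, has to be used for a proportion claim.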
7.3.6 Figure 7.1: Confidence vs prediction interval length as :math:`n` changes
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This figure is based on theoretical interval formulas under a Normal sampling model with unknown :math:`\sigma`, so no real dataset is required. The plotted curves are deterministic functions of :math:`n` and the chosen confidence level, which is appropriate because the learning goal is to isolate how the interval target changes width. In this figure, :math:`n` denotes the sample size used to compute :math:`\bar{X}` and :math:`S`, and there is no repeated sampling.
To read the figure, first pick a confidence level from the dropdown and then compare the two curves at the same :math:`n`. The vertical axis shows interval length in units of the standard deviation, so the comparison is scale-free and focuses on structure rather than measurement units. The confidence-interval curve corresponds to :math:`2t_{\alpha/2,n-1}S/\sqrt{n}` and the prediction-interval curve to :math:`2t_{\alpha/2,n-1}S\sqrt{1+1/n}`, each with the standard deviation set to one unit, which is why the curves are deterministic.
The main message is that prediction intervals remain wide because a single future unit remains variable even when the mean is estimated precisely. As :math:`n` increases, the confidence interval shrinks rapidly because the estimation error of :math:`\bar{X}` decreases like :math:`1/\sqrt{n}`. In contrast, the prediction interval approaches a positive length because the dominant uncertainty is the new observation’s random deviation from :math:`\mu`, which does not disappear.
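The structural part of this comparison can be tabulated without any data. The sketch below divides both lengths by :math:`t_{\alpha/2,n-1}\,s`, so only the :math:`n`-dependent factors remain; the function name ``length_factors`` is hypothetical. The critical value itself also shrinks slightly with :math:`n`, which this sketch deliberately leaves out.

```python
import math

def length_factors(n):
    """Interval lengths divided by t_{alpha/2, n-1} * s."""
    ci = 2 / math.sqrt(n)          # confidence interval: shrinks to 0
    pi = 2 * math.sqrt(1 + 1 / n)  # prediction interval: approaches 2
    return ci, pi

for n in (5, 25, 100, 10_000):
    ci, pi = length_factors(n)
    print(f"n={n:>6}: CI factor={ci:.4f}, PI factor={pi:.4f}, "
          f"ratio PI/CI={pi / ci:.2f}")
```

The ratio of the two lengths equals :math:`\sqrt{n+1}`, so the gap between the curves widens as the sample grows.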
This width behavior determines which interval is appropriate for a decision. If the decision concerns average performance or drift in :math:`\mu`, the confidence interval is the correct tool because it quantifies estimation precision. If the decision concerns the next unit or next outcome, the prediction interval is required because it quantifies outcome uncertainty, not only parameter uncertainty.
7.3.7 Figure 7.2: Estimation error vs prediction error under repetition
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This figure uses simulated data from a Normal model to visualize why prediction intervals are longer than confidence intervals. Simulation is appropriate because the comparison depends on the distribution of errors across many hypothetical repetitions of the sampling-and-forecasting procedure. Here, one “repetition” means drawing a fresh sample of size :math:`n`, computing :math:`\bar{X}`, and then drawing one additional future value :math:`X_{n+1}`.
To read the figure, choose :math:`n` from the dropdown and compare the two histograms on the same horizontal scale. The estimation error is :math:`\bar{X}-\mu`, while the prediction error is :math:`X_{n+1}-\bar{X}`; both are centered near zero, but their spreads differ. The smooth reference curves represent the corresponding Normal distributions predicted by the model, providing a theoretical benchmark against the empirical histograms.
The key message is that prediction error has larger variance because it combines uncertainty from two random quantities. As :math:`n` increases, the estimation error distribution tightens around zero because :math:`\text{Var}(\bar{X}-\mu)=\sigma^2/n` decreases. The prediction error distribution remains relatively wide because :math:`\text{Var}(X_{n+1}-\bar{X})=\sigma^2(1+1/n)` approaches :math:`\sigma^2` rather than zero.
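The two variance formulas can be checked with a small simulation. This is a minimal sketch using only the Python standard library; the function name ``simulate_errors``, the seed, the number of repetitions, and the choice :math:`\mu=0`, :math:`\sigma=1`, :math:`n=10` are all illustrative assumptions, not the settings behind the figure.

```python
import random
import statistics

def simulate_errors(n, reps=20_000, mu=0.0, sigma=1.0, seed=0):
    """Repeat: draw a sample of size n, then one future value."""
    rng = random.Random(seed)
    est_errors, pred_errors = [], []
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = statistics.fmean(sample)
        x_next = rng.gauss(mu, sigma)      # independent future observation
        est_errors.append(xbar - mu)       # estimation error
        pred_errors.append(x_next - xbar)  # prediction error
    return est_errors, pred_errors

n = 10
est, pred = simulate_errors(n)
var_est = statistics.pvariance(est)
var_pred = statistics.pvariance(pred)

print(f"estimation error variance ~= {var_est:.3f} (theory {1 / n:.3f})")
print(f"prediction error variance ~= {var_pred:.3f} (theory {1 + 1 / n:.3f})")
```

Increasing ``n`` drives the first variance toward zero, while the second stays just above :math:`\sigma^2`.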
This directly explains interval selection in operations. A confidence interval is linked to estimation error and therefore becomes very narrow when data are plentiful. A prediction interval is linked to prediction error and therefore remains materially wide, which is the correct representation of risk for a single future unit.
7.3.8 Figure 7.3: Tolerance interval content varies across samples
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This figure uses simulated Normal samples because tolerance intervals are defined through repeated-sampling reliability, not a single dataset. Simulation is appropriate because the key concept is the probability that the constructed interval achieves at least the target content :math:`p` over many possible samples. In this figure, one “repetition” means drawing a fresh sample of size :math:`n`, computing :math:`\bar{x}` and :math:`s`, forming :math:`\bar{x}\pm k s`, and then evaluating the true population content covered by that interval.
To read the figure, choose :math:`n` from the dropdown and examine the histogram of achieved content values. The vertical reference line marks the target content :math:`p`, and the mass of the histogram to the left of that line represents samples for which the tolerance interval failed to cover :math:`p` of the population. The plotted distribution is empirical, but the tolerance factor :math:`k` is computed from the Normal model so that the failure probability is approximately :math:`\gamma`.
The main message is that tolerance intervals control long-run reliability, not exact content in every single sample. When :math:`n` increases, the histogram concentrates more tightly just above the target :math:`p`, because :math:`\bar{X}` and :math:`S` become more stable and the required factor :math:`k` shrinks toward the corresponding Normal quantile. For small :math:`n`, the variability in :math:`S` is substantial, so the achieved content fluctuates widely even though the tolerance factor is selected to keep the failure rate near :math:`\gamma`.
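The repetition described above can be sketched in a few lines. The code simulates standard Normal samples, computes the true content of :math:`\bar{x}\pm ks` for each, and reports the share of samples that reach the target. The function name ``achieved_contents``, the seed, and the repetition count are illustrative assumptions, and the factor 2.839 is an assumed two-sided table value for :math:`n=10`, :math:`p=0.90`, :math:`1-\gamma=0.95` that should be confirmed against a tolerance-factor table.

```python
import random
import statistics
from statistics import NormalDist

def achieved_contents(n, k, reps=10_000, seed=1):
    """True N(0,1) content of xbar +/- k*s over many simulated samples."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    contents = []
    for _ in range(reps):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        xbar = statistics.fmean(sample)
        s = statistics.stdev(sample)
        lower, upper = xbar - k * s, xbar + k * s
        contents.append(std_normal.cdf(upper) - std_normal.cdf(lower))
    return contents

n, p, k = 10, 0.90, 2.839  # k: assumed table value (see lead-in)
contents = achieved_contents(n, k)
share_ok = sum(c >= p for c in contents) / len(contents)

print(f"share of samples with content >= {p}: {share_ok:.3f}")
print(f"smallest achieved content: {min(contents):.3f}")
```

If the assumed factor is right, the reported share should land near 0.95, while individual samples still fall short of the target content some of the time.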
This is the correct reporting logic for specification compliance statements. A tolerance interval is designed to support a claim like “at least :math:`p` of output lies within these bounds, with confidence :math:`1-\gamma`,” which aligns with long-run quality objectives. Prediction intervals and confidence intervals do not provide this population-coverage guarantee, so they are not substitutes when specifications are expressed in terms of proportions.
7.4 Discussion and Common Errors
--------------------------------
A frequent error is to interpret a confidence interval for :math:`\mu` as if it described most individual outcomes. A narrow confidence interval can occur even when individual outcomes vary widely, so it does not certify that most items meet :math:`\text{LSL}/\text{USL}`. Another frequent error is to use a prediction interval to make a long-run proportion claim; a prediction interval is about one future item, not the bulk of the population.
Normality assumptions matter more for prediction and tolerance than for confidence intervals for :math:`\mu`. For moderate to large :math:`n`, the confidence interval for :math:`\mu` can be approximately valid under mild non-Normality because of averaging, whereas prediction and tolerance statements concern individual-level behavior and distribution tails. When the Normal model is not defensible, prediction and tolerance results can be misleading, especially for high coverage targets (such as :math:`p=0.99`).
A notation error that causes real confusion is mixing the meanings of :math:`\alpha`, :math:`p`, and :math:`\gamma`. In this module, :math:`\alpha` controls the confidence level of a confidence or prediction interval, while :math:`p` is the intended population content of a tolerance interval and :math:`\gamma` is the tolerance failure probability. Reports should state both :math:`p` and :math:`1-\gamma` explicitly to avoid ambiguous claims.
7.5 Summary
-----------
Confidence intervals estimate parameters, prediction intervals forecast a single future value, and tolerance intervals describe where most of the population lies. Under Normal sampling with unknown :math:`\sigma`, the prediction interval uses :math:`\bar{x}\pm t_{\alpha/2,n-1}s\sqrt{1+1/n}`, which remains wide because individual outcomes remain variable. Tolerance intervals use :math:`\bar{x}\pm ks` with a factor :math:`k` chosen so that the interval covers at least proportion :math:`p` of the population with confidence :math:`1-\gamma`.
For engineering communication, interval choice must match the requirement statement. If the requirement is expressed in terms of long-run proportions meeting specifications, tolerance bounds are the correct reporting tool. Clear reporting includes the interval type, the target (mean, next value, or population content), and the associated probability statement.