3. Sampling Distributions III: :math:`\chi^2`, :math:`t`, and :math:`F` Under Normal Theory for :math:`S^2`, :math:`\mu`, and :math:`\sigma_1^2/\sigma_2^2` with Q–Q Diagnostics
================================================================================================================================================================================================

3.0 Notation Table
------------------

.. list-table::
   :header-rows: 1
   :widths: 32 68

   * - Notation
     - Meaning
   * - :math:`X_1,\ldots,X_n`
     - Independent random sample
   * - :math:`\mu,\ \sigma^2,\ \sigma`
     - Population mean, variance, sd
   * - :math:`\bar{X}`
     - Sample mean
   * - :math:`S^2,\ S`
     - Sample variance, sample sd
   * - :math:`\bar{x},\ s^2,\ s`
     - Observed values of :math:`\bar{X}, S^2, S`
   * - :math:`n,\ n_1,\ n_2`
     - Sample size(s)
   * - :math:`v,\ v_1,\ v_2`
     - Degrees of freedom (d.f.)
   * - :math:`\alpha`
     - Tail probability / significance level
   * - :math:`\chi^2_v`
     - Chi-square distribution, d.f. :math:`v` (:math:`\chi^2\ge 0`)
   * - :math:`\chi^2_{\alpha}(v)`
     - Right-tail chi-square critical value
   * - :math:`t_v`
     - Student :math:`t` distribution, d.f. :math:`v`
   * - :math:`t_{\alpha}(v)`
     - Right-tail :math:`t` critical value
   * - :math:`F_{v_1,v_2}`
     - :math:`F` distribution, d.f. :math:`(v_1,v_2)` (:math:`F\ge 0`)
   * - :math:`f_{\alpha}(v_1,v_2)`
     - Right-tail :math:`F` critical value
   * - :math:`H_0,\ H_1`
     - Null and alternative hypotheses
   * - :math:`y_{(i)}`
     - :math:`i`-th order statistic
   * - :math:`f_i`
     - Plotting position for :math:`y_{(i)}`
   * - :math:`\Phi^{-1}(\cdot)`
     - Standard Normal quantile function
   * - :math:`z_i`
     - Normal score :math:`z_i=\Phi^{-1}(f_i)`


3.1 Introduction
----------------
In the previous module, the sampling distribution of :math:`\bar{X}` supported inference about :math:`\mu` using Normal models and large-sample approximations. That pathway is simplest when the process standard deviation :math:`\sigma` is known, or when large :math:`n` makes estimation error in :math:`\sigma` relatively small.
In many operational and quality settings, :math:`\sigma` is not known and must be estimated from the same sample used to estimate :math:`\mu`. This module develops the exact Normal-theory sampling distributions for :math:`S^2`, for the standardized mean with :math:`S` in the denominator, and for ratios of sample variances. It also introduces quantile-based plots as diagnostics for whether Normal-based inference is plausible in practice.

3.2 Learning Outcomes
---------------------
After this module, you should be able to:

- State the sampling distribution of :math:`(n-1)S^2/\sigma^2` under Normal sampling and interpret degrees of freedom
- Use chi-square critical values to form probability statements and confidence intervals for :math:`\sigma^2` and :math:`\sigma`
- Define the :math:`t` statistic for inference on :math:`\mu` when :math:`\sigma` is unknown and explain why the tails are heavier than :math:`N(0,1)`
- Define the :math:`F` statistic for comparing two variances and correctly identify numerator and denominator degrees of freedom
- Construct and interpret quantile plots and Normal probability (Q–Q) plots as diagnostics for Normal assumptions

3.3 Main Concepts
-----------------

3.3.1 Sampling Distribution of :math:`S^2` and the Chi-Square Family
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
When the population is Normal, the sample variance has an exact sampling distribution that does not rely on large-sample approximations. The key idea is to scale :math:`S^2` so that it matches a standard reference distribution with known tail areas.
Under independent Normal sampling with variance :math:`\sigma^2`, the statistic below follows a chi-square distribution with :math:`v=n-1` degrees of freedom:

.. math::

   \chi^2 = \frac{(n-1)S^2}{\sigma^2} \sim \chi^2_v,\quad v=n-1

The parameter :math:`v=n-1` reflects that one degree of freedom is spent estimating :math:`\mu` with :math:`\bar{X}` before spread is measured around that estimate. This is why :math:`S^2` divides by :math:`n-1`, and why the chi-square statistic has :math:`v=n-1` rather than :math:`n` degrees of freedom.
Right-tail critical values are written :math:`\chi^2_{\alpha}(v)` such that :math:`P(\chi^2_v \ge \chi^2_{\alpha}(v))=\alpha`. Because :math:`\chi^2_v` is right-skewed for small :math:`v`, left-tail and right-tail cutoffs are not symmetric, and two-sided work requires two distinct values.
A standard two-sided :math:`(1-\alpha)` confidence interval for :math:`\sigma^2` follows by rearranging the chi-square statement:

.. math::

   \frac{(n-1)S^2}{\chi^2_{\alpha/2}(v)} \le \sigma^2 \le \frac{(n-1)S^2}{\chi^2_{1-\alpha/2}(v)}

This same structure supports tests about process variability. A statistic far in the right tail suggests that the hypothesized :math:`\sigma^2` is too small, while a statistic far in the left tail suggests that the hypothesized :math:`\sigma^2` is too large. The right-skewness is strongest at small :math:`v`, and it decreases as :math:`v` increases because the distribution becomes more concentrated.
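As a computational companion, the interval can be evaluated with SciPy's chi-square quantiles. This is a minimal sketch, not part of the module's required toolkit; the helper name ``var_interval`` and the numeric inputs are illustrative:

```python
from scipy import stats

def var_interval(s2, n, alpha=0.05):
    """Two-sided (1 - alpha) confidence interval for sigma^2 under Normal sampling.

    In the module's right-tail notation, chi^2_{alpha/2}(v) is the large cutoff,
    which SciPy computes as the left-tail quantile chi2.ppf(1 - alpha/2, v).
    """
    v = n - 1
    lower = v * s2 / stats.chi2.ppf(1 - alpha / 2, v)  # divide by the large cutoff
    upper = v * s2 / stats.chi2.ppf(alpha / 2, v)      # divide by the small cutoff
    return lower, upper

# Illustrative: n = 12 observations with sample variance s^2 = 7.84
lo, hi = var_interval(7.84, 12)
```

Note that SciPy's ``ppf`` takes left-tail probabilities, so the module's right-tail :math:`\chi^2_{\alpha}(v)` corresponds to ``chi2.ppf(1 - alpha, v)``.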

Example 3.1
^^^^^^^^^^^
A call center monitors variability of service times because high variability creates staffing risk and increases queue instability. Historical performance is approximately Normal, and the historical standard deviation is :math:`\sigma_0=2.0` minutes.
A new training program is introduced, and a sample of :math:`n=12` calls yields a sample standard deviation :math:`s=2.8` minutes.
**Question:** Is there evidence at :math:`\alpha=0.05` that variability has increased above :math:`\sigma_0`?
Under the Normal model and assuming independence, the upper-tail chi-square statistic is

.. math::

   \chi^2 = \frac{(n-1)s^2}{\sigma_0^2}

Here :math:`(n-1)=11`, so :math:`\chi^2 = 11(2.8^2)/(2.0^2)=21.56`. The p-value is the right-tail probability :math:`P(\chi^2_{11}\ge 21.56)`, which is approximately :math:`0.028`. Since this probability is below :math:`0.05`, the observed sample variability is unusually large under :math:`\sigma_0=2.0`.
Therefore, there is evidence that the service-time standard deviation has increased beyond :math:`2.0` minutes, which indicates greater operational risk for the call center.
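The arithmetic in this example can be reproduced in a few lines. This sketch uses SciPy, where ``chi2.sf`` gives the right-tail probability directly:

```python
from scipy import stats

# Example 3.1 inputs: n = 12 calls, s = 2.8 minutes, historical sigma_0 = 2.0 minutes
n, s, sigma0 = 12, 2.8, 2.0

chi2_obs = (n - 1) * s**2 / sigma0**2        # 11 * 7.84 / 4.0 = 21.56
p_value = stats.chi2.sf(chi2_obs, df=n - 1)  # right-tail area beyond 21.56, about 0.028
```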

Figure 3.1: Chi-square sampling for :math:`(n-1)S^2/\sigma^2`
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The figure is based on simulated repeated samples from a Normal population, because the chi-square result is exact under Normal sampling and simulation makes the sampling shape visible. One “repetition” means drawing a fresh random sample of size :math:`n` from the same Normal population and recomputing :math:`S^2` and :math:`(n-1)S^2/\sigma^2`. For this figure, :math:`n` is the number of observations in each repeated sample that produces one value of the statistic.
To read the plot, begin with the histogram, which is the empirical sampling distribution of :math:`(n-1)S^2/\sigma^2` over many repetitions. Then compare it to the smooth reference curve, which is the theoretical :math:`\chi^2_v` density with :math:`v=n-1`. Use the dropdown to switch :math:`n` while keeping the x-axis fixed so that shape changes can be compared fairly.
The main message is that :math:`\chi^2_v` is strongly right-skewed for small :math:`v` and becomes less skewed as :math:`v` increases. When :math:`n` increases, :math:`S^2` concentrates more tightly around :math:`\sigma^2`, so the statistic concentrates more tightly around its mean :math:`v`. This matters because variance inference is driven by tail areas, and tail stability improves with larger degrees of freedom.
A practical reading goal is to connect tails to decisions. Values far to the right correspond to unusually large sample variances relative to :math:`\sigma^2`, and values far to the left correspond to unusually small sample variances. The figure shows why right-tail and left-tail cutoffs are not symmetric for small :math:`v`, and why two-sided inference must use two different chi-square critical values.
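The figure's repetition scheme can be sketched with the standard library alone. The settings below (``mu``, ``sigma``, ``n``, ``reps``) are illustrative stand-ins, not the figure's exact configuration:

```python
import random
from statistics import variance  # sample variance with the n-1 denominator

random.seed(1)
mu, sigma, n, reps = 10.0, 2.0, 8, 5000

values = []
for _ in range(reps):
    # One "repetition": a fresh Normal sample of size n, one statistic value
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    values.append((n - 1) * variance(sample) / sigma**2)  # ~ chi^2 with v = n - 1

# The chi-square mean is v = n - 1 = 7, so the simulated average should sit near 7
mean_hat = sum(values) / reps
```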

3.3.2 The :math:`t` Distribution When :math:`\sigma` Is Unknown
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
When :math:`\sigma` is unknown, replacing it with :math:`S` introduces additional randomness into standardized mean inference. Even under Normal sampling, the standardized mean no longer follows :math:`N(0,1)` in small samples.
Under independent Normal sampling, the statistic

.. math::

   T=\frac{\bar{X}-\mu}{S/\sqrt{n}} \sim t_v,\quad v=n-1

has a Student :math:`t` distribution with :math:`v=n-1` degrees of freedom. The :math:`t` distribution is symmetric about zero, but its tails are heavier than :math:`N(0,1)` because :math:`S` fluctuates from sample to sample. The heavier tails represent the additional uncertainty from estimating variability using the same data.
Right-tail critical values are written :math:`t_{\alpha}(v)` with :math:`P(t_v \ge t_{\alpha}(v))=\alpha`. By symmetry, two-sided cutoffs can be written as :math:`\pm t_{\alpha/2}(v)`, and the degrees of freedom must match the sample size. As :math:`v` increases, :math:`t_v` approaches :math:`N(0,1)`, which explains why large-sample practice often uses Normal approximations.
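The heavier-tail claim can be checked directly. A brief SciPy comparison of right-tail areas at a fixed cutoff of 2.0 (the cutoff and the degrees-of-freedom values are illustrative choices):

```python
from scipy import stats

# Right-tail area beyond 2.0 under N(0,1) versus t_v for several v
p_normal = stats.norm.sf(2.0)
p_t = {v: stats.t.sf(2.0, df=v) for v in (3, 10, 30, 100)}

# Each t_v places more mass beyond 2.0 than N(0,1), and the excess
# shrinks toward zero as v grows, illustrating t_v -> N(0,1).
```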

Example 3.2
^^^^^^^^^^^
A packaging line targets a mean fill weight of :math:`\mu_0=50.0` grams, and deviations above target can increase cost. The process standard deviation is not known for the current shift, so it must be estimated from the sample before making a mean-based decision.
A sample of :math:`n=16` packages yields :math:`\bar{x}=52.1` grams and :math:`s=4.0` grams.
**Question:** What is the one-sided p-value for testing :math:`H_0:\mu=50.0` versus :math:`H_1:\mu>50.0`?
Under Normal sampling, the test statistic is

.. math::

   t=\frac{\bar{x}-\mu_0}{s/\sqrt{n}}

Substituting the values gives :math:`t = (52.1-50.0)/(4.0/\sqrt{16})=2.10` with :math:`v=15` degrees of freedom. The p-value is :math:`P(t_{15}\ge 2.10)`, which is approximately :math:`0.026`. This probability is small, so the observed sample mean is unusually large under :math:`\mu_0=50.0`.
Therefore, the data provide evidence that the mean fill weight exceeds :math:`50.0` grams, suggesting a potential overfill issue with cost implications.
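This example's calculation can also be verified numerically; a short SciPy sketch using ``t.sf`` for the one-sided right-tail probability:

```python
import math
from scipy import stats

# Example 3.2 inputs: n = 16 packages, xbar = 52.1 g, s = 4.0 g, target mu_0 = 50.0 g
n, xbar, s, mu0 = 16, 52.1, 4.0, 50.0

t_obs = (xbar - mu0) / (s / math.sqrt(n))  # 2.1 / 1.0 = 2.10
p_value = stats.t.sf(t_obs, df=n - 1)      # one-sided p-value, about 0.026
```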

Figure 3.2: Sampling distribution of :math:`T` and the role of :math:`v`
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The figure uses simulation from a Normal population so that the theoretical :math:`t` reference is the correct benchmark in each setting. One “repetition” means drawing a fresh Normal sample of size :math:`n`, computing :math:`\bar{X}` and :math:`S`, and then computing :math:`T=(\bar{X}-\mu)/(S/\sqrt{n})`. For this figure, :math:`n` is the within-repetition sample size that determines the degrees of freedom :math:`v=n-1`.
To read the plot, focus first on the histogram as the empirical distribution of :math:`T` across many repetitions. Compare it to the theoretical :math:`t_v` curve shown on the same axes, and then compare both to the standard Normal reference curve. Use the dropdown to change :math:`n` (and hence :math:`v`) and keep the x-axis fixed so that tail thickness can be compared directly.
The main message is that small :math:`v` produces heavier tails than :math:`N(0,1)`, so extreme standardized values are more plausible than under a Normal benchmark. When :math:`n` increases, :math:`S` becomes a more stable estimator of :math:`\sigma`, so the :math:`t` curve approaches the Normal curve and tail differences diminish. This matters because p-values and confidence intervals depend strongly on tail probabilities.
A practical reading goal is to interpret “extra tail mass” as a conservative adjustment for estimating :math:`\sigma`. For a fixed observed statistic value, the right-tail probability under :math:`t_v` is larger than under :math:`N(0,1)` when :math:`v` is small, which increases p-values and widens intervals relative to Normal-based calculations. The figure visualizes why small-sample mean inference requires degrees of freedom and cannot be treated as “just a CLT result.”

3.3.3 The :math:`F` Distribution and Comparing Two Variances
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
In two-sample settings, variability comparison can be as important as mean comparison. Under Normal sampling with independent samples, ratios of scaled chi-square variables define the :math:`F` family, and ratios of sample variances become :math:`F` statistics.
If :math:`U\sim \chi^2_{v_1}` and :math:`V\sim \chi^2_{v_2}` are independent, then

.. math::

   F=\frac{U/v_1}{V/v_2}\sim F_{v_1,v_2}

For two independent Normal samples of sizes :math:`n_1` and :math:`n_2`, with sample variances :math:`S_1^2` and :math:`S_2^2`, a standard variance-ratio statistic under :math:`\sigma_1^2=\sigma_2^2` is

.. math::

   F=\frac{S_1^2}{S_2^2}\sim F_{v_1,v_2},\quad v_1=n_1-1,\ v_2=n_2-1

The order of :math:`(v_1,v_2)` matters because :math:`F_{v_1,v_2}` is not symmetric in its degrees of freedom; the distribution is supported on :math:`[0,\infty)` and is right-skewed. Right-tail critical values are written :math:`f_{\alpha}(v_1,v_2)` with :math:`P(F_{v_1,v_2}\ge f_{\alpha}(v_1,v_2))=\alpha`. A useful identity for moving between tails is the reciprocal relationship:

.. math::

   f_{1-\alpha}(v_1,v_2)=\frac{1}{f_{\alpha}(v_2,v_1)}

A second structural link connects mean inference and variance ratios. If :math:`T\sim t_v`, then :math:`T^2\sim F_{1,v}`, which shows that :math:`t`-based mean inference can be viewed through an :math:`F`-ratio perspective under Normal theory. These relationships help keep numerator and denominator degrees of freedom consistent with the statistic definition.
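Both identities can be verified numerically. In this SciPy sketch, remember that ``ppf`` takes left-tail probabilities, so the module's right-tail :math:`f_{\alpha}(v_1,v_2)` corresponds to ``f.ppf(1 - alpha, v1, v2)``:

```python
from scipy import stats

alpha, v1, v2 = 0.05, 9, 13

# Reciprocal identity: f_{1-alpha}(v1, v2) = 1 / f_alpha(v2, v1)
left_cut = stats.f.ppf(alpha, v1, v2)         # f_{1-alpha}(v1, v2): right-tail prob 1 - alpha
recip = 1.0 / stats.f.ppf(1 - alpha, v2, v1)  # 1 / f_alpha(v2, v1)

# T^2 ~ F_{1,v}: a squared two-sided t cutoff equals a one-sided F cutoff
v = 15
t_cut = stats.t.ppf(1 - alpha / 2, v)  # t_{alpha/2}(v)
f_cut = stats.f.ppf(1 - alpha, 1, v)   # f_alpha(1, v)
```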

Example 3.3
^^^^^^^^^^^
Two suppliers provide the same component, and a quality engineer compares variability in a critical dimension because high variability increases scrap risk. Measurements are approximately Normal within each supplier, and the two samples are independent due to separate production lines.
Supplier A provides :math:`n_1=10` parts with :math:`s_1=1.8` units, and Supplier B provides :math:`n_2=14` parts with :math:`s_2=1.2` units.
**Question:** What is the one-sided p-value for testing :math:`H_0:\sigma_1^2=\sigma_2^2` versus :math:`H_1:\sigma_1^2>\sigma_2^2`?
Under :math:`H_0`, the statistic is :math:`F=s_1^2/s_2^2`. The observed value is :math:`F=1.8^2/1.2^2=2.25`, with numerator d.f. :math:`v_1=9` and denominator d.f. :math:`v_2=13`. The p-value is the right-tail probability :math:`P(F_{9,13}\ge 2.25)`, which is approximately :math:`0.089`. This probability is not below :math:`0.05`, so the evidence is not strong at that threshold.
Therefore, at :math:`\alpha=0.05` there is insufficient evidence that Supplier A is more variable, although the observed ratio may motivate additional sampling for quality screening.
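The variance-ratio calculation can be checked with SciPy's ``f.sf``, which returns the right-tail probability of the :math:`F` distribution:

```python
from scipy import stats

# Example 3.3 inputs: Supplier A n1 = 10, s1 = 1.8; Supplier B n2 = 14, s2 = 1.2
n1, s1, n2, s2 = 10, 1.8, 14, 1.2

F_obs = s1**2 / s2**2                                 # 3.24 / 1.44 = 2.25
p_value = stats.f.sf(F_obs, n1 - 1, n2 - 1)           # right tail of F_{9,13}, about 0.089
```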

Figure 3.3: Sampling distribution of :math:`S_1^2/S_2^2` and the :math:`F` family
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The figure is based on simulated repeated samples from Normal populations, because the :math:`F` result is exact under Normal sampling with independent samples. One “repetition” means drawing two fresh samples of sizes :math:`n_1` and :math:`n_2`, computing :math:`S_1^2` and :math:`S_2^2`, and forming the ratio :math:`S_1^2/S_2^2` under equal population variances. For this figure, :math:`n_1` and :math:`n_2` are the within-repetition sample sizes that determine :math:`v_1=n_1-1` and :math:`v_2=n_2-1`.
To read the plot, compare the histogram of simulated variance ratios to the theoretical :math:`F_{v_1,v_2}` density curve. Use the dropdown to switch among :math:`(n_1,n_2)` settings, and keep the x-axis fixed so that right-tail behavior is comparable across settings. The histogram is empirical, while the smooth curve represents the Normal-theory reference distribution.
The main message is that :math:`F` distributions are right-skewed, especially when either degree of freedom is small, and they concentrate around 1 as both sample sizes grow. When :math:`n_1` and :math:`n_2` increase, each :math:`S^2` becomes more stable, so their ratio becomes less variable and extreme ratios become less frequent. This matters because variance comparisons are sensitive to rare but large ratio values that drive right-tail probabilities.
A practical reading goal is to connect the skewness to test direction and statistic definition. If the question is “is population 1 more variable,” using :math:`F=S_1^2/S_2^2` makes large values evidence in the right tail, and the correct critical value uses :math:`(v_1,v_2)` in that order. If the ratio is inverted, the degrees of freedom must also be swapped, and the reciprocal identity provides a safe way to translate cutoffs.

3.3.4 Quantile and Probability Plots as Diagnostics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The chi-square, :math:`t`, and :math:`F` results above are exact under Normal sampling, so it is important to assess whether Normality is plausible before relying on those results. Graphical diagnostics are useful because they reveal distributional shape, tail behavior, and outliers in a way that numerical summaries may hide.
A quantile is the data value corresponding to a given cumulative probability. If the observations are sorted as :math:`y_{(1)}\le \cdots \le y_{(n)}`, a quantile plot places :math:`y_{(i)}` on the vertical axis and an empirical probability on the horizontal axis. A common plotting-position rule is

.. math::

   f_i=\frac{i-3/8}{n+1/4},\quad i=1,\ldots,n

In a Normal Q–Q plot (Normal probability plot), the horizontal axis is transformed to theoretical Normal quantiles. Define :math:`z_i=\Phi^{-1}(f_i)` and plot :math:`(z_i,\ y_{(i)})`. If the Normal model is appropriate, the points should fall close to a straight line, with random scatter expected, and systematic curvature indicating a mismatch in shape.
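The construction above can be implemented with the standard library alone. This minimal sketch (the helper name ``qq_points`` is ours) returns the :math:`(z_i,\ y_{(i)})` pairs that would be plotted:

```python
from statistics import NormalDist

def qq_points(y):
    """Return (z_i, y_(i)) pairs for a Normal Q-Q plot.

    Uses the plotting position f_i = (i - 3/8) / (n + 1/4) and
    Normal scores z_i = Phi^{-1}(f_i).
    """
    ys = sorted(y)                          # order statistics y_(1) <= ... <= y_(n)
    n = len(ys)
    points = []
    for i, yi in enumerate(ys, start=1):
        f = (i - 0.375) / (n + 0.25)        # plotting position f_i
        z = NormalDist().inv_cdf(f)         # standard Normal quantile
        points.append((z, yi))
    return points
```

If the Normal model fits, a scatter of these pairs should be close to a straight line.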

Example 3.4
^^^^^^^^^^^
A logistics team wants to use a :math:`t`-based confidence interval for mean delivery lead time with a moderate sample size. The method is sensitive to strong departures from Normality because tail probabilities determine both interval endpoints and p-values.
The team collects a sample of lead times and prepares both a quantile plot and a Normal Q–Q plot to assess whether Normal-based inference is reasonable.
**Question:** What plot-based evidence supports Normality, and what patterns raise concern?
A quantile plot should increase smoothly without abrupt acceleration in the upper tail, and it should not show long flat segments followed by sharp jumps. A Normal Q–Q plot should show points that align approximately on a straight line, where mild scatter is expected due to sampling variability.
Systematic curvature is the key warning sign because it indicates that the sample’s distributional shape differs from the Normal reference. Tail deviations are especially important because small-sample inference about :math:`\mu` and :math:`\sigma^2` depends on tail areas and extreme values. With larger :math:`n`, departures become easier to see because the ordered points fill the tails more densely and patterns become clearer.

Figure 3.4: Quantile plot as an empirical distribution view
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The figure uses synthetic data to compare a bell-shaped baseline with a right-skewed operational-time mechanism, with both mechanisms scaled to a comparable unit spread. This choice is pedagogically appropriate because it isolates the effect of distributional shape rather than mixing shape with a large change in scale. One plotted sample is sorted to form :math:`y_{(1)},\ldots,y_{(n)}`, and here :math:`n` is the number of observed times included in that single displayed sample.
To read the plot, start with the horizontal axis, which is the plotting position :math:`f_i` and approximates cumulative probability. Then trace how :math:`y_{(i)}` changes as :math:`f_i` increases from near 0 to near 1, paying special attention to the upper-tail region near :math:`f_i\approx 0.9` and above. Use the dropdown to compare a smaller :math:`n` to a larger :math:`n`, and compare the bell-shaped mechanism to the right-skewed mechanism.
The main message is that right-skewed data rise slowly at first and then accelerate in the upper tail, reflecting a long right tail that becomes visible in the upper quantiles. When :math:`n` increases, the quantile plot becomes smoother and tail structure becomes more stable because more ordered points describe the empirical distribution. This matters because many operational conclusions are driven by tail performance, such as delays and unusually long service times.
A practical reading goal is to connect shape to risk in Normal-theory inference. If the upper tail accelerates strongly, Normal-based variance and mean procedures can be distorted because extreme values occur more often than the Normal model predicts. The quantile plot is therefore a first-pass diagnostic that motivates whether a Normal Q–Q plot and additional checks should be used before applying :math:`t`, :math:`\chi^2`, or :math:`F` procedures.

Figure 3.5: Normal probability plot (Normal Q–Q plot)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The figure uses the same synthetic mechanisms as the quantile plot so that the “Normal-like” case and the “right-skewed” case differ mainly in shape while remaining on a comparable standardized scale. This is pedagogically appropriate because it focuses attention on linearity versus curvature in the probability plot, which is the key diagnostic criterion. For this figure, :math:`n` is the number of observations in the single sample that is ordered to produce the plotted points.
To read the plot, first examine whether the point cloud aligns approximately along a straight line across the central region. Then check the tails, where departures often appear as systematic bending away from the line, especially in the far right tail for right-skewed data. The plotted points are empirical ordered observations against theoretical Normal quantiles, while the straight reference line is a fitted benchmark for what “Normal consistency” would look like for that sample’s location and scale.
The main message is that skewness and heavy tails create patterned curvature, whereas Normal data produce near-linear alignment with only moderate scatter. When :math:`n` is small, curvature can be difficult to distinguish from random noise, but when :math:`n` is larger, the tail regions contain more points and the pattern becomes clearer. This matters because Normal-theory “exact” results can fail when tail behavior differs from Normal assumptions, particularly for small and moderate samples.
A practical reading goal is to translate geometry into inference readiness. If the plot is close to linear with no strong tail bending, Normal-based :math:`t`, :math:`\chi^2`, and :math:`F` procedures are more defensible for that dataset. If the plot bends upward in the right tail or shows an S-shape, it signals that tail probabilities are misaligned with Normal theory, and analysts should consider transformations, robust methods, or larger samples before relying on exact small-sample distributions.
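As a numeric companion to the visual check, the Normal scores can be correlated with the ordered data; values near 1 indicate near-linear alignment. This probability-plot correlation idea is a supplement we add here rather than part of the module's required toolkit, and the helper name and simulated samples are illustrative:

```python
import math
import random
from statistics import NormalDist

def qq_correlation(y):
    """Pearson correlation between Normal scores and ordered data.

    Values near 1 suggest the Q-Q scatter is close to a straight line.
    """
    ys = sorted(y)
    n = len(ys)
    z = [NormalDist().inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]
    mz, my = sum(z) / n, sum(ys) / n
    cov = sum((a - mz) * (b - my) for a, b in zip(z, ys))
    sz = math.sqrt(sum((a - mz) ** 2 for a in z))
    sy = math.sqrt(sum((b - my) ** 2 for b in ys))
    return cov / (sz * sy)

random.seed(7)
bell = [random.gauss(0.0, 1.0) for _ in range(200)]       # Normal-like mechanism
skewed = [random.expovariate(1.0) for _ in range(200)]    # right-skewed mechanism
```

For the bell-shaped sample the correlation sits very close to 1, while the right-skewed sample produces a visibly lower value, mirroring the curvature seen in the plot.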

3.4 Discussion and Common Errors
--------------------------------
A frequent error is treating :math:`t` methods as purely CLT-based and therefore safe for any distribution at small :math:`n`. The exact :math:`t` result depends on Normal sampling, and strong skewness or heavy tails can distort both p-values and confidence interval coverage when :math:`n` is modest. Diagnostics such as Q–Q plots are therefore not optional when sample sizes are small or when the process mechanism plausibly produces skewness.
Degrees of freedom are often misapplied by using :math:`n` instead of :math:`n-1` for variance-related statistics. This error changes critical values for chi-square, :math:`t`, and :math:`F`, and it typically produces intervals that are too narrow or tests with incorrect Type I error behavior. In practice, the degree-of-freedom error is especially costly when :math:`n` is small because tail cutoffs change substantially.
For :math:`F` inference, the most common mistake is swapping numerator and denominator degrees of freedom. Because :math:`F_{v_1,v_2}` is not symmetric, exchanging :math:`(v_1,v_2)` changes tail probabilities and critical values, and it can reverse the meaning of “large” evidence. The statistic definition must match the question direction, and the reciprocal identity should be used carefully when translating left-tail to right-tail statements.
In probability plots, it is incorrect to demand a perfectly straight line. Scatter is expected under sampling variability, and the correct focus is on systematic shape and tail behavior. Tail deviations deserve extra attention because small-sample inference is driven by tail probabilities rather than by the center of the distribution.

3.5 Summary
-----------
This module developed three exact Normal-theory sampling distributions used throughout statistical inference. The chi-square result describes the sampling behavior of :math:`S^2` and supports inference about :math:`\sigma^2`, the :math:`t` distribution supports inference about :math:`\mu` when :math:`\sigma` is unknown, and the :math:`F` distribution models ratios of sample variances for comparing variability.
Quantile plots and Normal Q–Q plots provide practical diagnostics for Normal assumptions. These plots are most informative when they are read for systematic curvature and tail behavior rather than for perfect linearity. The combined message is that correct distributional assumptions and correct degrees of freedom are central to reliable small-sample inference.