1. Sampling Distributions I: Random Sampling, Sampling Distributions, and Standard Error

1.0 Notation Table

\(X\): one population measurement (random variable)
\(x\): one observed value of \(X\)
\(X_1,\ldots,X_n\): i.i.d. random sample (size \(n\))
\(n\): sample size (number of observed units)
\(\mu\): population mean
\(\sigma\): population standard deviation
\(\sigma^2\): population variance
\(\bar{X}\): sample mean
\(S^2\): sample variance (random variable)
\(s^2\): observed value of \(S^2\)
\(Y\): Bernoulli indicator variable (\(0/1\))
\(p\): population proportion \(P(Y=1)\)
\(\hat{p}\): sample proportion
\(\mathrm{SE}(\cdot)\): standard error (SD of a sampling distribution)
\(R\): number of repetitions in a simulation of repeated sampling

1.1 Introduction

In earlier modules on probability models, a random variable described uncertainty in a single measurement. Statistical inference adds a second layer of uncertainty because we usually observe only a sample, not the full population or process.

This module formalizes the idea that common summaries such as \(\bar{X}\), \(S^2\), and \(\hat{p}\) are themselves random variables. Their randomness comes from the sampling procedure, so the same study design could produce different values if it were repeated under the same conditions.

The key bridge to later inference is the sampling distribution, which is the probability distribution of a statistic under repeated sampling. Standard error is the numerical summary of that sampling distribution that will be used repeatedly in confidence intervals and hypothesis tests.

Throughout the module, the applied theme is service process performance for a retail counter, where transaction completion time varies because customers and requested services vary. This context is useful because the underlying time distribution is often right-skewed, which makes the difference between a population distribution and a sampling distribution visually clear.

1.2 Learning Outcomes

After completing this module, students should be able to:

  • Define a population and a sample in the language of random variables and probability models. Students should explain why inference requires sampling rather than full enumeration in many operational settings.

  • State the i.i.d. assumptions for random sampling and describe what can go wrong when they fail. Students should connect non-random sampling or dependence to biased or overly optimistic conclusions.

  • Distinguish parameters from statistics and identify which objects are random before sampling. Students should correctly classify \(\mu, \sigma^2, p\) versus \(\bar{X}, S^2, \hat{p}\).

  • Define a sampling distribution and interpret it as repeated-sample behavior under the same design. Students should explain what “repetition” means and why it is a conceptual tool rather than a practical requirement.

  • Define standard error as the standard deviation of a sampling distribution and interpret it as “typical error.” Students should explain why standard error decreases with \(n\) for common estimators.

  • Compute and interpret standard error formulas for \(\bar{X}\) and \(\hat{p}\) under standard conditions. Students should state the assumptions under which the formulas are valid.

1.3 Main Concepts

1.3.1 Population, sample, parameter, and statistic

A population is the full set of outcomes that could be produced by a stable process or target group under a defined measurement rule. In operations and management, a “population” may be finite (all shipments in a month) or effectively infinite (all future transactions under stable operation).

A parameter is a fixed numerical feature of that population or process. Typical parameters include \(\mu\) (mean), \(\sigma^2\) (variance), and \(p\) (event probability), and they are treated as fixed but unknown.

A sample is the subset of observations that is actually collected. A statistic is any function of the random sample, and it is random before sampling because a different sample would usually produce a different statistic.

Example 1.1: Defect rate as a proportion

A quality engineer monitors a packaging line and defines a “defect” as a seal failure detected by a standard inspection rule. Because inspecting every unit is expensive, the engineer inspects a random subset of produced units each hour and summarizes the hour by a single number.

Question: Identify the parameter and a natural statistic to estimate it, and explain why the statistic is random.

Let \(Y=1\) indicate a defect and \(Y=0\) indicate a non-defect for a randomly selected unit produced under stable conditions. The population parameter of interest is the defect probability \(p=P(Y=1)\), which is fixed for the process condition being studied.

If the engineer inspects \(n\) units in an hour, the natural estimator is the sample proportion \(\hat{p}=\frac{1}{n}\sum_{i=1}^{n}Y_i\). The statistic is random because the inspected units vary across repeated hours (or repeated samples within an hour), so the observed defect count can change even when the process parameter \(p\) is stable.

Answer: The parameter is \(p=P(Y=1)\), and a natural statistic is \(\hat{p}\). The statistic is random before sampling because it depends on which units are selected into the sample.

1.3.2 Random sampling and the i.i.d. model

To connect a sample to a population, we need a sampling model. A common baseline model is that the sample is i.i.d., meaning independent and identically distributed.

Formally, a random sample of size \(n\) is written \(X_1,\ldots,X_n\), where each \(X_i\) has the same population distribution and the variables are independent. Under this model, the joint probability model factorizes as a product of marginal distributions.

\[f(x_1,\ldots,x_n)=\prod_{i=1}^{n} f(x_i)\]

The i.i.d. model is not only convenient; it is a statement about study design. If sampling is biased (for example, only sampling at quiet times) or dependent (for example, repeated measurements from the same customer), then sampling distributions can shift or widen relative to i.i.d. expectations.
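The factorization above can be checked numerically. The sketch below computes the joint probability of one specific i.i.d. Bernoulli sample as a product of marginals; the value \(p=0.2\) and the sample itself are illustrative assumptions, not data from the text.

```python
# Under the i.i.d. model, the probability of one specific Bernoulli sample
# factorizes into a product of marginal probabilities.
p = 0.2                    # assumed illustrative defect probability
sample = [1, 0, 0, 1, 0]   # hypothetical observed defect indicators

joint = 1.0
for y in sample:
    joint *= p if y == 1 else (1 - p)  # multiply one marginal per observation

# Equivalent closed form: p^k * (1-p)^(n-k) with k ones out of n
k, n = sum(sample), len(sample)
closed_form = p**k * (1 - p) ** (n - k)
assert abs(joint - closed_form) < 1e-12
```

The same factorization would fail under dependence, which is exactly why the i.i.d. assumption is a statement about study design rather than a formality.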

1.3.3 Core estimators as random variables

The sample mean is the statistic used to estimate \(\mu\). It averages the observed values, and it becomes more stable as \(n\) increases under i.i.d. sampling.

\[\bar{X}=\frac{1}{n}\sum_{i=1}^{n}X_i\]

The sample variance is a statistic used to estimate \(\sigma^2\). It measures dispersion around \(\bar{X}\) and uses the divisor \(n-1\), which becomes important for unbiasedness properties later.

\[S^2=\frac{1}{n-1}\sum_{i=1}^{n}(X_i-\bar{X})^2\]

For event-rate questions, define an indicator \(Y_i\in\{0,1\}\) and estimate \(p=P(Y=1)\) by the sample proportion. The sample proportion is also an average, but of Bernoulli outcomes.

\[\hat{p}=\frac{1}{n}\sum_{i=1}^{n}Y_i\]
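The three estimators can be computed directly from data. A minimal sketch, using a small hypothetical sample of transaction times (the values and the 50-second event threshold are illustrative assumptions):

```python
import statistics

# Hypothetical transaction completion times in seconds
times = [41.0, 55.5, 38.2, 62.1, 47.9, 51.3]

xbar = sum(times) / len(times)     # sample mean, estimates mu
s2 = statistics.variance(times)    # sample variance, uses the n-1 divisor

# Indicator for a "slow transaction" event, here defined as time > 50 seconds
y = [1 if t > 50 else 0 for t in times]
p_hat = sum(y) / len(y)            # sample proportion, estimates p
```

Note that `statistics.variance` already divides by \(n-1\), matching the definition of \(S^2\) above, and that \(\hat{p}\) is simply the sample mean of the 0/1 indicators.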

Example 1.2: Average transaction time as a point estimate

A store manager tracks transaction completion time (in seconds) at a service counter using a consistent operational definition of start and end time. The manager selects \(n=25\) transactions during a shift and reports the average time as a single-number summary for that shift.

Question: What parameter is being estimated by \(\bar{X}\), and why should \(\bar{X}\) be treated as random?

If \(X\) is the transaction time for one randomly selected transaction from the stable operating process, then the population mean \(\mu=E(X)\) is the long-run average time. The sample mean \(\bar{X}\) estimates \(\mu\) because it aggregates observed times and targets the process mean under i.i.d. sampling.

Even when the process does not change, different samples of \(n=25\) transactions would contain different customer mixes and service tasks. Therefore the realized value of \(\bar{X}\) would fluctuate across repeated samples, so it must be treated as a random variable before sampling.

Answer: \(\bar{X}\) estimates \(\mu=E(X)\). It is random because it depends on which transactions fall into the sample.

1.3.4 Sampling distributions and “repetition”

A sampling distribution is the probability distribution of a statistic under the sampling model. It answers the question: if the same sampling design were repeated many times under the same population or process, how would the statistic vary?

“Repetition” is a conceptual device, not a requirement to run the study many times in practice. The point is that uncertainty in inference is about what could have happened under repeated sampling, even though only one sample is usually observed.

Sampling distributions depend on three elements. They depend on the population distribution, the sample size \(n\), and the sampling method (for example, i.i.d. sampling versus clustered sampling).

1.3.5 Standard error as typical error

The standard error (SE) of a statistic is the standard deviation of its sampling distribution. It describes the typical size of the difference between the statistic and its target parameter under repeated sampling.

For the sample mean under i.i.d. sampling with population variance \(\sigma^2\), the mean and variance of \(\bar{X}\) satisfy:

\[E(\bar{X})=\mu\]
\[\mathrm{Var}(\bar{X})=\frac{\sigma^2}{n}\]

Therefore, the standard error of \(\bar{X}\) is:

\[\mathrm{SE}(\bar{X})=\frac{\sigma}{\sqrt{n}}\]

This formula shows a key scaling law: multiplying the sample size by 4 cuts \(\mathrm{SE}(\bar{X})\) in half. In practice \(\sigma\) is usually unknown, so a common plug-in estimate is \(s/\sqrt{n}\), where \(s\) is the observed sample standard deviation.
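The \(1/\sqrt{n}\) scaling law is easy to verify numerically. A minimal sketch, assuming an illustrative population standard deviation of \(\sigma = 12\) seconds (not a value from the text):

```python
import math

def se_mean(sigma, n):
    """Standard error of the sample mean under i.i.d. sampling."""
    return sigma / math.sqrt(n)

sigma = 12.0  # assumed illustrative population SD

# Quadrupling the sample size halves the standard error
assert math.isclose(se_mean(sigma, 100), se_mean(sigma, 25) / 2)
```

The same function with \(s\) in place of \(\sigma\) gives the plug-in estimate \(s/\sqrt{n}\) mentioned above.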

For the sample proportion under i.i.d. Bernoulli sampling with parameter \(p\), the sampling variance is:

\[\mathrm{Var}(\hat{p})=\frac{p(1-p)}{n}\]

and the standard error is:

\[\mathrm{SE}(\hat{p})=\sqrt{\frac{p(1-p)}{n}}\]

When \(p\) is unknown, a common plug-in estimate replaces \(p\) by \(\hat{p}\). This replacement is reasonable when the sample is large enough that \(\hat{p}\) is a stable estimate of \(p\).

Example 1.3: Standard error planning for an event rate

An operations team defines a “slow transaction” event as completion time exceeding a service target. The team wants an event-rate estimate with small typical sampling fluctuation for a daily dashboard, using a simple random sample of transactions.

Question: If the process event probability is approximately \(p=0.20\), compare \(\mathrm{SE}(\hat{p})\) for \(n=50\) versus \(n=200\), and interpret the change.

Under i.i.d. Bernoulli sampling, \(\mathrm{SE}(\hat{p})=\sqrt{p(1-p)/n}\). For \(n=50\), the standard error is \(\sqrt{0.20\cdot 0.80/50}=\sqrt{0.0032}\approx 0.0566\), which corresponds to a typical fluctuation of about 5.7 percentage points around \(p\).

For \(n=200\), the standard error is \(\sqrt{0.20\cdot 0.80/200}=\sqrt{0.0008}\approx 0.0283\). This is about half the standard error at \(n=50\), consistent with the \(1/\sqrt{n}\) scaling.

Answer: \(\mathrm{SE}(\hat{p})\approx 0.0566\) for \(n=50\) and \(\approx 0.0283\) for \(n=200\). Increasing the sample size by a factor of 4 halves the typical sampling fluctuation of the estimated event rate.
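The arithmetic in Example 1.3 can be reproduced with a short helper function; the numbers below match the worked values in the example:

```python
import math

def se_prop(p, n):
    """Standard error of the sample proportion under i.i.d. Bernoulli sampling."""
    return math.sqrt(p * (1 - p) / n)

print(round(se_prop(0.20, 50), 4))   # 0.0566
print(round(se_prop(0.20, 200), 4))  # 0.0283
```

Dividing the two values confirms the factor-of-2 reduction from quadrupling \(n\).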

1.3.6 Simulation as an approximation tool for sampling distributions

Exact sampling distributions are sometimes difficult to compute, especially for complex statistics or nonstandard data-generating processes. Simulation provides a controlled way to approximate a sampling distribution by explicitly implementing repeated sampling on a computer.

A simulation-based approximation follows a consistent structure. One specifies a population model (or a large baseline dataset representing the process), fixes a sample size \(n\), repeats the sampling procedure \(R\) times, and records the statistic each time.

The resulting \(R\) recorded values approximate the sampling distribution. As \(R\) increases, the histogram of simulated values becomes a more stable picture of the sampling variability implied by the model.
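The recipe above can be sketched in a few lines. This example assumes a right-skewed exponential population model with mean 50 seconds (an illustrative choice, not the text's data); the spread of the \(R\) recorded means should land near \(\sigma/\sqrt{n} = 50/\sqrt{25} = 10\) because an exponential's SD equals its mean.

```python
import random
import statistics

random.seed(1)       # fixed seed so the sketch is reproducible
mu = 50.0            # mean of the assumed exponential population model
n, R = 25, 2000      # sample size per repetition, number of repetitions

sample_means = []
for _ in range(R):
    # One repetition: draw n i.i.d. service times, record one statistic
    sample = [random.expovariate(1 / mu) for _ in range(n)]
    sample_means.append(sum(sample) / n)

# The R recorded values approximate the sampling distribution of X-bar
print(statistics.mean(sample_means))   # near mu = 50
print(statistics.stdev(sample_means))  # near sigma / sqrt(n) = 10
```

Increasing \(R\) stabilizes the histogram of recorded values; increasing \(n\) is what actually tightens the sampling distribution itself.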

Figure 1.1 — Population variability and the sampling distribution of \(\bar{X}\)

This figure uses simulated data from a right-skewed service-time population to make the sampling idea visible. Simulation is pedagogically appropriate here because we can repeat the same sampling design many times under a fixed population model and directly observe how \(\bar{X}\) changes. A “repetition” means one independent draw of \(n\) transactions from the same population model, followed by computing one sample mean. In this figure, \(n\) is the number of transactions observed per repetition, and the dropdown changes \(n\) while holding the population model fixed.

To read the figure, first look at the top panel, which summarizes the population distribution of transaction time \(X\). Then locate the reference line for the population mean \(\mu\), and compare it to the vertical markers representing sample means from repeated samples. Next, move to the bottom panel, where the same repeated sample means are displayed as a histogram that approximates the sampling distribution of \(\bar{X}\). The histogram is empirical, while the reference lines and the \(\mu \pm 2\cdot \mathrm{SE}\) band are theoretical summaries derived from \(\mathrm{SE}(\bar{X})=\sigma/\sqrt{n}\).

The main message is that the sampling distribution of \(\bar{X}\) is much less variable than the population distribution of \(X\). As \(n\) increases, the histogram of \(\bar{X}\) becomes tighter around \(\mu\) because averaging reduces variability by a factor of \(1/\sqrt{n}\). When the dropdown compares small versus large \(n\), the larger \(n\) case shows a visibly narrower sampling distribution and a smaller \(\mathrm{SE}(\bar{X})\). This matters operationally because a larger sample produces a more precise estimate of the mean service time, even if the underlying process remains skewed.

A common misreading is to confuse the population spread with the spread of the sample mean. The figure shows that “typical error” for estimating \(\mu\) is not \(\sigma\) but rather \(\sigma/\sqrt{n}\), and the difference can be substantial. The figure also anticipates later results: even with a skewed population, the distribution of \(\bar{X}\) tends to become more regular as \(n\) grows. The correct takeaway is that standard error quantifies the expected fluctuation of the estimator across repeated samples, not the variability of individual observations.

Figure 1.2 — Sampling distribution of \(\hat{p}\) for a slow-transaction event

This figure uses simulation to study the sampling distribution of an event-rate estimator. Simulation is appropriate because “coverage” and “typical fluctuation” are defined by repetition, and repetition can be implemented under a fixed model without waiting for many real days of data. A “repetition” means sampling \(n\) transactions from the same stable process, converting each transaction into an indicator \(Y_i\in\{0,1\}\), and computing one \(\hat{p}\). In this figure, \(n\) is the number of transactions in each repeated sample, and the dropdown changes \(n\) while the underlying event probability \(p\) is held fixed.

To read the figure, start by identifying the vertical reference line at \(p\), which is the event probability under the population model. Then examine the histogram of simulated \(\hat{p}\) values, which is empirical and approximates the sampling distribution of \(\hat{p}\). If a smooth reference curve is shown, it represents a normal approximation with mean \(p\) and standard deviation \(\sqrt{p(1-p)/n}\). The empirical histogram should be compared to the reference line and curve to judge both centering (bias) and spread (standard error).

The main message is that the sampling distribution of \(\hat{p}\) tightens around \(p\) as \(n\) increases. For small \(n\), the histogram is wider and can be noticeably discrete because \(\hat{p}\) changes in increments of \(1/n\). For large \(n\), the distribution becomes narrower and more concentrated, consistent with \(\mathrm{SE}(\hat{p})=\sqrt{p(1-p)/n}\). This matters because operational decisions based on event rates (such as staffing triggers) are more stable when the sampling variability is small relative to the decision thresholds.

A frequent error is to interpret a single observed \(\hat{p}\) as if it were a fixed truth about the process. The figure shows that even under a stable \(p\), repeated samples produce different \(\hat{p}\) values, and the typical deviation is quantified by the standard error. Another common error is to treat \(n\) as “number of repetitions” rather than “sample size per repetition,” which changes the meaning of the histogram. The correct takeaway is that larger \(n\) improves precision of the event-rate estimate, while repetition is a conceptual tool used to define and approximate the sampling distribution.
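The repetition scheme behind Figure 1.2 can be sketched directly; here \(p = 0.20\) is held fixed while \(n\) varies, and the empirical spread of the simulated \(\hat{p}\) values is compared to the theoretical \(\sqrt{p(1-p)/n}\). The seed and \(R\) are illustrative choices.

```python
import math
import random

random.seed(2)
p, R = 0.20, 2000   # fixed event probability, number of repetitions

for n in (50, 200):
    p_hats = []
    for _ in range(R):
        # One repetition: n Bernoulli indicators, one sample proportion
        y = [1 if random.random() < p else 0 for _ in range(n)]
        p_hats.append(sum(y) / n)
    # Root-mean-square deviation of p-hat around p (empirical "typical error")
    spread = math.sqrt(sum((ph - p) ** 2 for ph in p_hats) / R)
    theory = math.sqrt(p * (1 - p) / n)
    print(n, round(spread, 3), round(theory, 3))  # empirical vs theoretical SE
```

Note that `R` controls only how well the histogram approximates the sampling distribution, while `n` controls how tight that distribution is, which is exactly the distinction the paragraph above warns about.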

1.4 Discussion and Common Errors

Confusing a parameter with a statistic is a persistent source of incorrect reasoning. A parameter such as \(\mu\) or \(p\) is fixed for the population or process condition, while a statistic such as \(\bar{X}\) or \(\hat{p}\) varies from sample to sample.

Confusing standard deviation with standard error leads to incorrect claims about estimation precision. The standard deviation \(\sigma\) describes variability of individual observations, while \(\mathrm{SE}(\bar{X})=\sigma/\sqrt{n}\) and \(\mathrm{SE}(\hat{p})=\sqrt{p(1-p)/n}\) describe variability of estimators across repeated samples.

Assuming i.i.d. sampling without checking the study design can produce misleading conclusions. Convenience sampling, clustered sampling, or repeated measurements on the same unit can introduce bias or dependence, which can shift or widen the sampling distribution relative to the baseline formulas.

Interpreting “repetition” as a requirement to rerun the real-world study is a conceptual mistake. Repetition is primarily a model-based device that defines sampling distributions and justifies probability statements about estimators, while in practice we usually collect one sample and quantify uncertainty through standard error.

1.5 Summary

A population is the full set of outcomes produced by a target group or stable process under a defined measurement rule. A parameter such as \(\mu\), \(\sigma^2\), or \(p\) describes that population and is fixed but unknown.

A sample is the observed subset of outcomes, and a statistic is any function of the sample. Statistics such as \(\bar{X}\), \(S^2\), and \(\hat{p}\) are random variables before sampling because their values depend on which units are selected.

A sampling distribution is the probability distribution of a statistic under repeated sampling from the same model and design. Standard error is the standard deviation of that sampling distribution and represents typical estimation fluctuation due to sampling.

Under i.i.d. sampling, \(\mathrm{SE}(\bar{X})=\sigma/\sqrt{n}\) and \(\mathrm{SE}(\hat{p})=\sqrt{p(1-p)/n}\). These formulas explain why larger samples typically yield more precise point estimates, forming the foundation for confidence intervals and hypothesis tests in later modules.