Statistics 2nd ed

t-test

Appendix 8 — Glossary of Key Terms

Mean (average)
Sum of all scores divided by number of scores.
Example: (6 + 8 + 10) / 3 = 8.

Median
Middle score when data are ordered.
Example: For [5, 7, 8], median = 7.

Mode
Most frequent score.
Example: For [2, 3, 3, 5], mode = 3.

Variance (s²)
Average squared deviation from the mean. For a sample, the sum of squared deviations is divided by n − 1.

Standard Deviation (s)
Square root of variance. Spread of scores around the mean.

Standard Error of the Mean (SEM)
How much sample means vary from one sample to the next (the standard deviation of the sampling distribution of the mean).
Formula: $$SEM = \frac{s}{\sqrt{n}}$$

t-test
Compares two means.

ANOVA (F-test)
Compares three or more means.

Post Hoc Test
Used after ANOVA to find which groups differ.

Correlation (r)
Strength and direction of a linear relationship. Range: –1 to +1.

Regression
Equation that predicts Y from X.
Example: $$\hat{Y} = a + bX$$

Chi-square (χ²)
Test for categorical data (counts).

Degrees of Freedom (df)
Independent pieces of information in a test.

p-value
Probability of getting the observed result (or more extreme) if the null hypothesis is true.
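
The descriptive terms in this glossary can be checked with a few lines of Python. This is a minimal sketch using only the standard library and the glossary's own example numbers; the variable names are illustrative, and the SEM value is computed for the [6, 8, 10] example rather than taken from the text.

```python
import math
import statistics

scores = [6, 8, 10]                       # the glossary's mean example

mean = statistics.mean(scores)            # (6 + 8 + 10) / 3
median = statistics.median([5, 7, 8])     # middle value of the ordered data
mode = statistics.mode([2, 3, 3, 5])      # most frequent value

variance = statistics.variance(scores)    # sample variance s^2 (divides by n - 1)
sd = statistics.stdev(scores)             # sample standard deviation s
sem = sd / math.sqrt(len(scores))         # SEM = s / sqrt(n)

print(mean, median, mode)
print(variance, sd, round(sem, 3))
```

Note that `statistics.variance` and `statistics.stdev` use the sample (n − 1) formulas, which match the symbols s² and s used here.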

📱 QR: Interactive glossary (search symbols, formulas, definitions)

Practice self-test quiz

In the space below, please find practice problems and self-test quizzes. For full access, please sign up for free.


Appendix 6 — Data Sets for Practice

```html

Working with real numbers is the best way to learn statistics. This appendix provides small “mini datasets” you can analyze by hand (or with a calculator), plus larger files for practice with spreadsheets.

Dataset Provenance (Read This First)

Pedagogical = small, simplified numbers chosen to make learning and checking easier.
Simulated = computer-generated numbers designed to resemble real data (not collected from real people).
Empirical = collected from real observations (only used if explicitly stated).

Note: Unless a dataset is explicitly labeled Empirical, you should treat it as Pedagogical or Simulated practice data.

Mini Datasets (In-Page)

1) Quiz Scores

Provenance: Pedagogical
n: 10
Scale: Ratio (points)
Data: 6, 7, 8, 9, 10, 7, 8, 6, 9, 10

Suggested Lessons:
- Lesson 2 — The Averages: mean, median, mode
- Lesson 3 — Variance & Standard Deviation: variance, SD, z-scores
- Lesson 4 — The Standard Normal Curve: interpret z-scores (as a bridge)
Check values (optional): Mean = 8.0; SD ≈ 1.49 (sample formula, dividing by n − 1) or ≈ 1.41 (population formula, dividing by n)

2) Reaction Times (ms)

Provenance: Pedagogical (human-like values)
n: 8
Scale: Ratio (milliseconds)
Units: ms
Data: 220, 250, 270, 230, 260, 280, 240, 300

Suggested Lessons:
- Lesson 3 — Variance & Standard Deviation: spread, outliers, SD
- Lesson 6 — The t-test: use as a template dataset (e.g., compare two conditions by splitting into two groups)
- Lesson 7 — ANOVA: extend to 3+ groups by creating conditions
Instructor tip: reaction time data often show mild skew in real life. If you want skew, see the larger practice files below.

3) Stress Reduction Scores (Three Groups)

Provenance: Pedagogical (grouped scores)
Scale: Interval/Ratio (score units; treat as interval for ANOVA practice)
Groups:

Meditation (n = 3): 65, 70, 72
Exercise (n = 3): 68, 71, 75
Music (n = 3): 75, 78, 82
Suggested Lessons:
- Lesson 7 — ANOVA: one-way ANOVA (three independent groups)
- Lesson 8 — Post Hoc Tests: follow-up comparisons after ANOVA (conceptual)
- Lesson 13 — Degrees of Freedom Cookbook: df for one-way ANOVA
Important note: The sample sizes are intentionally small for learning mechanics. In real studies, groups are usually larger.

Larger Practice Datasets (Download Files)

These datasets are designed for spreadsheet work, graphing, and full problem sets.

Exam Scores (n = 100)
Provenance: Simulated
Suggested Lessons: Lesson 4 (normal curve), Lesson 5 (SEM), Lesson 6 (t-test foundations)
Survey Data (preferences by gender/age)
Provenance: Simulated (categorical practice)
Suggested Lessons: Lesson 12 (chi-square), Lesson 1 (why statistics matters in decisions)
Simulated Medical Trial (treatment vs. control, repeated measures)
Provenance: Simulated (instructional “trial-style” dataset; not clinical research)
Suggested Lessons: Lesson 6 (t-test concepts), Lesson 7 (variance partitioning concepts), and for advanced learners: repeated-measures ideas (optional)

Downloads: CSV and Excel files are provided via the QR code(s) on this page (and/or direct links, if enabled on your device).

Reproducibility note (simulated files): If you revise these datasets in future editions, consider generating them with a fixed random seed so instructors and students can reproduce results across versions.

Trusted External Sources (Optional)

If you want additional datasets beyond the practice files above, the following repositories are widely used for learning and benchmarking:

NIST Statistical Reference Datasets (SRD)
High-quality benchmark datasets for practice and verification (excellent for checking calculations and software).
UCI Machine Learning Repository
Larger, more complex datasets. Recommended only for advanced students or enrichment projects.

Visual Reference

Figure F.1 — Example spreadsheet view of a dataset (columns such as ID, Score, Group). Use this as a template for organizing your own data before running calculations.


```
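
The check values for the Quiz Scores mini dataset can be reproduced with the short sketch below. It uses only Python's standard library; the variable names are illustrative. It prints both the sample SD (n − 1) and the population SD (n), since the two formulas give slightly different answers for the same ten scores.

```python
import statistics

quiz_scores = [6, 7, 8, 9, 10, 7, 8, 6, 9, 10]   # Mini dataset 1 (Quiz Scores)

mean = statistics.mean(quiz_scores)
sample_sd = statistics.stdev(quiz_scores)         # divides by n - 1
population_sd = statistics.pstdev(quiz_scores)    # divides by n

# z-score of each quiz score relative to the mean and the sample SD
z_scores = [(x - mean) / sample_sd for x in quiz_scores]

print(f"mean = {mean}, sample SD = {sample_sd:.2f}, population SD = {population_sd:.2f}")
print([round(z, 2) for z in z_scores])
```

Swapping `sample_sd` for `population_sd` in the z-score line shows how the choice of formula changes each z-score slightly.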


Appendix 5 — Technology Tips (On Your Phone & Laptop)

Statistics can be done with calculators, spreadsheets, or software. Here’s a quick guide.

Excel / Google Sheets

Task	Formula	Example
Mean	`=AVERAGE(A1:A10)`	Mean of scores in A1–A10
Standard Deviation	`=STDEV.S(A1:A10)`	Spread of scores
t-test	`=T.TEST(A1:A10,B1:B10,2,2)`	Compare two groups

R (RStudio or RStudio Cloud)

Task	Command	Example
Mean	`mean(x)`	`mean(c(6,8,10)) = 8`
SD	`sd(x)`	`sd(c(6,8,10)) = 2`
t-test	`t.test(x,y)`	Compare two groups

Python (NumPy / SciPy / Pandas)

Task	Command	Example
Mean	`np.mean(x)`	`np.mean([6,8,10]) = 8`
SD	`np.std(x, ddof=1)`	`np.std([6,8,10],ddof=1) = 2`
t-test	`stats.ttest_ind(x,y)`	Compare two groups

iPhone Calculator

Rotate sideways → scientific mode
Use √ for square root
Parentheses matter: type numerator, then divide by denominator
Fine for small problems, but not for full datasets

Summary

For quick homework: iPhone calculator
For assignments: Excel / Google Sheets
For coding: Python (Colab) or R (RStudio Cloud)

📱 QR: Open sample data in Google Sheets (ready to practice mean, SD, t-test)

Visuals

Figure E.1 — Screenshots of the same mean calculation in Sheets, R, and Python side by side.


Appendix 3 — Using the t-table and F-table

Tables give the critical values we compare our test statistic against.
They depend on:

The significance level (α, often 0.05)
The degrees of freedom (df)

t-table

Rows = degrees of freedom (df)
Columns = significance level (α)

Example:

Independent-samples t-test with n₁ = 12, n₂ = 12
df = 12 + 12 – 2 = 22
At α = 0.05 (two-tailed) → critical t ≈ 2.07
If $$|t| \geq 2.07$$ → significant

F-table

Needs two df values:
- df between (numerator)
- df within (denominator)

Example:

One-way ANOVA, 3 groups, N = 24
df between = k – 1 = 2
df within = N – k = 21
At α = 0.05 → critical F ≈ 3.47
If computed F ≥ 3.47 → significant
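
The two table lookups above follow the same pattern: compute df, read the critical value, compare. A minimal sketch of that pattern follows. The function names are hypothetical, the critical values are typed in from the examples above rather than computed, and the test statistics passed in at the end are made-up numbers for illustration.

```python
def df_independent_t(n1, n2):
    """Degrees of freedom for an independent-samples t-test."""
    return n1 + n2 - 2

def df_one_way_anova(k, n_total):
    """(df between, df within) for a one-way ANOVA with k groups."""
    return k - 1, n_total - k

def is_significant(statistic, critical_value, two_tailed=True):
    """Compare a test statistic with a critical value read from a table."""
    observed = abs(statistic) if two_tailed else statistic
    return observed >= critical_value

T_CRIT = 2.07   # t-table: df = 22, alpha = 0.05, two-tailed
F_CRIT = 3.47   # F-table: df = (2, 21), alpha = 0.05

print(df_independent_t(12, 12))                          # df for the t example
print(df_one_way_anova(3, 24))                           # df pair for the F example
print(is_significant(-2.50, T_CRIT))                     # hypothetical t = -2.50
print(is_significant(2.90, F_CRIT, two_tailed=False))    # hypothetical F = 2.90
```

The F comparison is one-sided because F is a ratio of variances and cannot be negative.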

Student Tips

Always compute df correctly.
Use tables if no software is available.
Most calculators or apps today give exact p-values — faster than tables.

📱 QR: Interactive critical value calculator (t and F tables online)

Visuals

Figure C.1 — Snippet of a t-table row (df = 22, α = 0.05 highlighted).
Figure C.2 — F-table grid with numerator df = 2, denominator df = 21 marked.


Appendix 1 — Symbols and Notation (Cheat Sheet)

A quick reference to the symbols used in this book.

Symbol	Meaning	Example
$$\Sigma$$	Summation (add them up)	$$\Sigma X = 2+4+6=12$$
$$\bar{X}$$	Sample mean	$$\bar{X} = \tfrac{12}{3} = 4$$
$$\mu$$	Population mean	“The true average of all scores”
$$s$$	Sample standard deviation	Spread of quiz scores
$$\sigma$$	Population standard deviation	Spread of SAT scores
$$df$$	Degrees of freedom	$$df = n-1 = 29$$ if $$n=30$$
$$t$$	t-test statistic	Compare two group means
$$F$$	ANOVA statistic	Compare 3+ group means
$$r$$	Pearson correlation	Strength of linear relationship
$$R^2$$	Coefficient of determination	Proportion of variance explained
$$\chi^2$$	Chi-square statistic	Compare observed vs. expected counts
$$p$$	Probability value	“p < 0.05” → significant result


Applications: Cases and Examples

Case 1 — Independent t-test (Two Groups)

Scenario: A teacher wants to compare math test scores between students taught with traditional lectures and those taught with interactive software.

Question: Are the two teaching methods different in average test score?

Design/Test: Independent-samples t-test.

Worked Example:

Group A (Lecture): mean = 78, SD = 10, n = 20
Group B (Software): mean = 85, SD = 12, n = 20

Formula:
$$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\tfrac{s_1^2}{n_1} + \tfrac{s_2^2}{n_2}}}$$

In words:
$$t = \frac{\text{mean}_1 - \text{mean}_2}{\sqrt{\tfrac{\text{variance}_1}{n_1} + \tfrac{\text{variance}_2}{n_2}}}$$

Plugging in values:
$$t = \frac{78 - 85}{\sqrt{\tfrac{100}{20} + \tfrac{144}{20}}} = \frac{-7}{\sqrt{5 + 7.2}} = \frac{-7}{\sqrt{12.2}} \approx \frac{-7}{3.493} \approx -2.00$$

Degrees of freedom = 38.

Case 2 — Paired t-test (Before and After)

Scenario: Students take a memory test before and after a week of practice.

Question: Did memory scores improve after training?

Design/Test: Paired-samples t-test.

Worked Example:

Differences (After – Before): 2, 4, 3, 5, 6

Mean difference:
$$\bar{D} = \frac{2+4+3+5+6}{5} = 4$$
Standard deviation of differences: $$s_D = 1.58$$

Formula:
$$t = \frac{\bar{D}}{s_D / \sqrt{n}}$$

Plugging in values:
$$t = \frac{4}{1.58/\sqrt{5}} \approx \frac{4}{0.707} \approx 5.66$$

Degrees of freedom = 4.

Case 3 — One-way ANOVA (Three Groups)

Scenario: A psychologist tests three methods of stress reduction: meditation, exercise, and music.

Question: Do the methods differ in average stress score?

Design/Test: One-way ANOVA.

Worked Example (summary):

Group means: Meditation = 65, Exercise = 70, Music = 80
$$SS_{\text{between}} = 300, \quad df_{\text{between}} = 2, \quad MS_{\text{between}} = 150$$
$$SS_{\text{within}} = 200, \quad df_{\text{within}} = 12, \quad MS_{\text{within}} = 16.7$$

Formula:
$$F = \frac{MS_{\text{between}}}{MS_{\text{within}}}$$

Plugging in values:
$$F = \frac{150}{16.7} = 9.0$$

df = (2, 12).


Part 4 — Applications (Cases and Examples)

Welcome to Part 4 — Applications (Cases and Examples) of this free online high school statistics textbook. This hands-on section brings statistical concepts to life through detailed, worked-out case studies and real-world examples. High school students explore complete applications of hypothesis testing—including t-tests, ANOVA designs, chi-square tests, and non-parametric methods—covering everything from formulating research questions and selecting the appropriate test to performing calculations, interpreting results, and drawing meaningful conclusions.

Ideal for AP Statistics practice and pre-college preparation, Part 4 features 10 comprehensive cases with step-by-step explanations, formulas, data examples, and practical scenarios (e.g., comparing teaching methods, stress reduction programs, and categorical associations). These worked examples reinforce descriptive statistics, inferential statistics, and critical statistical thinking in an engaging, example-driven format.

Case Studies in Part 4: Applications

Case 1: Independent t-Test – Comparing two independent groups (e.g., different teaching methods).
Case 2: Paired t-Test – Analyzing before-and-after data in the same subjects.
Case 3: One-Way ANOVA – Testing differences across three or more groups.
Case 4: Factorial ANOVA (2×2 Design) – Examining main effects and interactions.
Case 5: Repeated-Measures ANOVA – Handling multiple measurements on the same subjects.
Case 6: Mixed ANOVA – Combining between-subjects and within-subjects factors.
Case 7: Chi-Square Goodness-of-Fit – Assessing observed vs. expected frequencies.
Case 8: Chi-Square Test of Independence – Exploring relationships in categorical data.
Case 9: Mann-Whitney U Test – Non-parametric alternative for two independent samples.
Case 10: Wilcoxon Signed-Rank Test – Non-parametric option for paired data.

A practice self-test quiz follows the cases; signing up (free) unlocks full interactive access.

Case 1 — Independent t-test (Two Groups)

Scenario: A teacher compares math scores of students taught by lecture vs. interactive software.

Question: Are the two teaching methods different in average score?

Design/Test: Independent-samples t-test.

Worked Example:

Group A (Lecture): mean = 78, SD = 10, n = 20
Group B (Software): mean = 85, SD = 12, n = 20

Formula:
$$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\tfrac{s_1^2}{n_1} + \tfrac{s_2^2}{n_2}}}$$

In words:
$$t = \frac{\text{mean}_1 - \text{mean}_2}{\sqrt{\tfrac{\text{variance}_1}{n_1} + \tfrac{\text{variance}_2}{n_2}}}$$

Plugging in values:
$$t = \frac{78 - 85}{\sqrt{\tfrac{100}{20} + \tfrac{144}{20}}} = \frac{-7}{\sqrt{12.2}} \approx \frac{-7}{3.493} \approx -2.00$$

Degrees of freedom = 38.
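
The Case 1 arithmetic can be checked from the summary statistics alone. The sketch below is one way to do it with Python's standard library; `independent_t` is a hypothetical helper name, not a library function. Carried through without intermediate rounding, t comes out at about −2.00; rounding the denominator to 3.49 before dividing gives −2.01.

```python
import math

def independent_t(mean1, sd1, n1, mean2, sd2, n2):
    """t statistic and df for two independent groups, from summary statistics."""
    standard_error = math.sqrt(sd1**2 / n1 + sd2**2 / n2)
    t = (mean1 - mean2) / standard_error
    df = n1 + n2 - 2
    return t, df

# Group A (Lecture) vs. Group B (Software)
t, df = independent_t(78, 10, 20, 85, 12, 20)
print(f"t = {t:.2f}, df = {df}")
```

The sign of t only reflects which group is listed first; a two-tailed test compares |t| with the critical value.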

Case 2 — Paired t-test (Before and After)

Scenario: Students take a memory test before and after a week of practice.

Question: Did scores improve after training?

Design/Test: Paired-samples t-test.

Worked Example:

Differences (After – Before): 2, 4, 3, 5, 6

Mean difference:
$$\bar{D} = \frac{2+4+3+5+6}{5} = 4$$
Standard deviation of differences: $$s_D = 1.58$$

Formula:
$$t = \frac{\bar{D}}{s_D / \sqrt{n}}$$

Plugging in values:
$$t = \frac{4}{1.58/\sqrt{5}} \approx \frac{4}{0.707} \approx 5.66$$

Degrees of freedom = 4.
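
The paired calculation in Case 2 can be reproduced directly from the five difference scores. This is a minimal standard-library sketch with illustrative variable names. Without intermediate rounding the result is t ≈ 5.66, since the standard error is exactly √0.5 ≈ 0.707.

```python
import math
import statistics

differences = [2, 4, 3, 5, 6]               # After - Before, one value per student

n = len(differences)
mean_diff = statistics.mean(differences)    # D-bar
sd_diff = statistics.stdev(differences)     # s_D, sample SD of the differences
standard_error = sd_diff / math.sqrt(n)     # s_D / sqrt(n)

t = mean_diff / standard_error
df = n - 1

print(f"mean difference = {mean_diff}, s_D = {sd_diff:.2f}")
print(f"t = {t:.2f}, df = {df}")
```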

Case 3 — One-way ANOVA (Three Groups)

Scenario: A psychologist tests meditation, exercise, and music as stress-reduction methods.

Question: Do the methods differ in mean stress score?

Design/Test: One-way ANOVA.

Worked Example:

Group means: Meditation = 65, Exercise = 70, Music = 80
$$SS_{\text{between}} = 300, \quad df_{\text{between}} = 2, \quad MS_{\text{between}} = 150$$
$$SS_{\text{within}} = 200, \quad df_{\text{within}} = 12, \quad MS_{\text{within}} = 16.7$$

Formula:
$$F = \frac{MS_{\text{between}}}{MS_{\text{within}}}$$

$$F = \frac{150}{16.7} = 9.0, \quad df = (2,12)$$

Case 4 — Factorial ANOVA (2 × 2 Design)

Scenario: A researcher studies teaching method (Lecture vs. Online) × Time of Day (Morning vs. Afternoon).

Question: Do method, time, or their interaction affect performance?

Design/Test: Two-way (factorial) ANOVA.

Worked Example (summary):

Lecture: Morning = 70, Afternoon = 90
Online: Morning = 80, Afternoon = 80

Interaction: Lecture scores rise with time, Online stays flat.

Formulas:

$$df_A = a - 1, \quad df_B = b - 1, \quad df_{A \times B} = (a-1)(b-1), \quad df_{\text{within}} = N - ab$$
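
To see where the interaction in Case 4 comes from, the four cell means can be summarized as below. This is a sketch only: it assumes equal cell sizes (so marginal means are simple averages of cell means), and it stops short of a full ANOVA because the case gives no raw scores or total N. The variable names are illustrative.

```python
cell_means = {
    ("Lecture", "Morning"): 70, ("Lecture", "Afternoon"): 90,
    ("Online", "Morning"): 80, ("Online", "Afternoon"): 80,
}
methods = ["Lecture", "Online"]
times = ["Morning", "Afternoon"]

# Marginal means (main effects), assuming equal cell sizes
method_means = {m: sum(cell_means[(m, t)] for t in times) / len(times) for m in methods}
time_means = {t: sum(cell_means[(m, t)] for m in methods) / len(methods) for t in times}

# Interaction contrast: does the Morning-to-Afternoon change depend on method?
lecture_change = cell_means[("Lecture", "Afternoon")] - cell_means[("Lecture", "Morning")]
online_change = cell_means[("Online", "Afternoon")] - cell_means[("Online", "Morning")]
interaction_contrast = lecture_change - online_change

# Degrees of freedom that depend only on the design (a = 2, b = 2)
a, b = len(methods), len(times)
df_a, df_b, df_ab = a - 1, b - 1, (a - 1) * (b - 1)

print(method_means, time_means)
print(lecture_change, online_change, interaction_contrast)
print(df_a, df_b, df_ab)
```

Both methods average 80, so these means show no main effect of method, while the nonzero contrast (20 vs. 0) is the interaction described in the case.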

Case 5 — Repeated-Measures ANOVA

Scenario: Five students are tested across three conditions.

Question: Do scores differ across conditions?

Design/Test: Repeated-measures ANOVA.

Worked Example (summary):

Means increase steadily: 70 → 75 → 80
df:
$$df_{\text{rows}} = n - 1, \quad df_{\text{columns}} = k - 1, \quad df_{\text{error}} = (n-1)(k-1)$$

Formula:
$$F = \frac{MS_{\text{columns}}}{MS_{\text{error}}}$$

Case 6 — Mixed ANOVA

Scenario: Two groups (Drug, Placebo) tested across three weeks.

Question: Is there an effect of group, time, or interaction?

Design/Test: Mixed (split-plot) ANOVA.

Worked Example (summary):

Drug: 70 → 80 → 90
Placebo: 70 → 72 → 74
Drug improves over time, Placebo stays flat.

Formula:
$$F = \frac{MS_{\text{effect}}}{MS_{\text{error}}}$$

Case 7 — Chi-square Goodness-of-Fit

Scenario: A survey asks students to choose a favorite subject: Math, Science, or English.

Question: Is the distribution of responses different from equal chance?

Design/Test: Chi-square goodness-of-fit test.

Formula:
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

In words:
$$\chi^2 = \frac{(\text{Observed} - \text{Expected})^2}{\text{Expected}}, \quad \text{summed across categories}$$
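
Case 7 gives no counts, so the sketch below uses hypothetical survey counts purely to show the mechanics of the formula. Under "equal chance" every category has the same expected count, N divided by the number of categories.

```python
# Hypothetical counts for illustration only (not from the textbook)
observed = {"Math": 30, "Science": 20, "English": 10}

total = sum(observed.values())
expected = total / len(observed)            # equal chance: same expected count per category

chi_square = sum((o - expected) ** 2 / expected for o in observed.values())
df = len(observed) - 1                      # number of categories minus 1

print(f"expected per category = {expected}")
print(f"chi-square = {chi_square:.2f}, df = {df}")
```

With df = 2, the α = 0.05 critical value in a chi-square table is 5.99, so these made-up counts would be judged different from equal chance.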

Case 8 — Chi-square Test of Independence

Scenario: A researcher tests whether gender (Male, Female) is related to sport preference (Soccer, Basketball, Tennis).

Question: Is there an association between gender and sport?

Design/Test: Chi-square test of independence.

Formula:
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
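
For a test of independence, the expected count in each cell is (row total × column total) / grand total. Case 8 lists no data, so the 2 × 3 table below is hypothetical and is there only to show how the expected counts and χ² are built up; the variable names are illustrative.

```python
# Hypothetical counts for illustration only (not from the textbook)
# Columns: Soccer, Basketball, Tennis
observed = [
    [20, 15, 5],    # Male
    [10, 15, 15],   # Female
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count per cell = row total * column total / grand total
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(observed) - 1) * (len(observed[0]) - 1)   # (rows - 1)(columns - 1)

print(expected)
print(f"chi-square = {chi_square:.2f}, df = {df}")
```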

Case 9 — Mann–Whitney U Test

Scenario: Students in two different schools are ranked by teacher ratings.

Question: Do the two groups differ in median rank?

Design/Test: Mann–Whitney U test (non-parametric).

Formula:
$$U = n_1 n_2 + \frac{n_1 (n_1 + 1)}{2} - R_1$$

Where $$R_1$$ = sum of ranks for group 1.
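
A small sketch of the U formula follows. The ratings are hypothetical, `mann_whitney_u` is an illustrative helper rather than a library function, and the ranking step assumes there are no tied scores (ties would need averaged ranks, which this sketch does not handle).

```python
def mann_whitney_u(group1, group2):
    """U for each group from rank sums; assumes no tied scores."""
    combined = sorted(group1 + group2)
    # Rank 1 goes to the smallest score in the combined data
    rank_of = {value: rank for rank, value in enumerate(combined, start=1)}
    r1 = sum(rank_of[v] for v in group1)             # R1: sum of ranks for group 1
    n1, n2 = len(group1), len(group2)
    u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1            # formula from the text
    u2 = n1 * n2 - u1                                # the two U values sum to n1 * n2
    return u1, u2

# Hypothetical teacher ratings for illustration only
school_a = [3, 5, 8]
school_b = [1, 2, 4, 6]

u1, u2 = mann_whitney_u(school_a, school_b)
print(u1, u2, min(u1, u2))
```

Tables of critical values are usually entered with the smaller of the two U values.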

Case 10 — Wilcoxon Signed-Rank Test

Scenario: The same students are ranked before and after training.

Question: Did the ranks change?

Design/Test: Wilcoxon signed-rank test (non-parametric).

Formula (summary):

Compute differences (After – Before).
Rank the absolute differences.
Give each rank the sign of its difference, then sum the positive ranks and the negative ranks separately.
Test statistic = the smaller of the two sums (ignoring sign).
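
The steps above can be sketched in a few lines. The difference scores are hypothetical, `wilcoxon_signed_rank` is an illustrative helper name, and the sketch assumes no tied absolute differences; zero differences are dropped before ranking, which is a common convention rather than something stated in the text.

```python
def wilcoxon_signed_rank(differences):
    """Return (sum of positive ranks, sum of negative ranks, test statistic)."""
    nonzero = [d for d in differences if d != 0]     # drop zero differences
    ordered = sorted(nonzero, key=abs)               # rank by absolute size (no ties assumed)
    w_plus = sum(rank for rank, d in enumerate(ordered, start=1) if d > 0)
    w_minus = sum(rank for rank, d in enumerate(ordered, start=1) if d < 0)
    return w_plus, w_minus, min(w_plus, w_minus)

# Hypothetical After - Before differences for illustration only
differences = [3, -1, 4, -2, 5]

w_plus, w_minus, statistic = wilcoxon_signed_rank(differences)
print(w_plus, w_minus, statistic)
```

A quick check: the two rank sums should add up to n(n + 1)/2, here 5 × 6 / 2 = 15.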

Practice self-test quiz

In the space below, please find practice problems and self-test quizzes. For full access, please sign up for free.

Identify the Design

Case 1

Scenario: A teacher compares test scores of students in two different classrooms (Class A vs. Class B).
Question: Are the two groups significantly different in mean score?
Answer: Independent-samples t-test.

Case 2

Scenario: A researcher tests the same group of students before and after tutoring.
Question: Did their scores improve after the program?
Answer: Paired-samples t-test (dependent t-test).

Case 3

Scenario: Three groups of students use different study methods: flashcards, highlighting, and practice tests.
Question: Do the study methods lead to different mean scores?
Answer: One-way ANOVA.

Case 4

Scenario: A psychologist measures anxiety scores in patients given three different drugs.
Question: Do the drugs produce different mean anxiety scores?
Answer: One-way ANOVA.

Case 5

Scenario: A study compares two groups of athletes: runners vs. swimmers, on reaction time.
Question: Are the two sports groups different in mean reaction time?
Answer: Independent-samples t-test.

Case 6

Scenario: Students are tested at three times: beginning, middle, and end of the semester.
Question: Did their scores change over time?
Answer: Repeated-measures ANOVA.

Case 7

Scenario: Two teaching methods (Lecture, Online) are tested across two times of day (Morning, Afternoon).
Question: What are the effects of method, time, and their interaction?
Answer: Two-way (factorial) ANOVA.

Case 8

Scenario: A company compares productivity of three work shifts (Day, Evening, Night) across two departments (Sales, Service).
Question: Are there main effects of shift and department, and is there an interaction?
Answer: Two-way (factorial) ANOVA.

Case 9

Scenario: Students are randomly assigned to a control or experimental group, and both groups are measured three times (Weeks 1, 2, 3).
Question: Is there an effect of group, time, and interaction?
Answer: Mixed (split-plot) ANOVA.

Case 10

Scenario: A survey asks students to choose their favorite subject: Math, Science, or English.
Question: Is the distribution of responses different from chance?
Answer: Chi-square goodness-of-fit test.

Case 11

Scenario: A researcher studies whether gender (Male, Female) is related to preference for sports (Soccer, Basketball, Tennis).
Question: Is there an association between gender and sport preference?
Answer: Chi-square test of independence.

Case 12

Scenario: Students are ranked by teacher ratings: 1st, 2nd, 3rd, etc. Two different teaching methods are compared on these ranks.
Question: Do the groups differ in median ranks?
Answer: Mann–Whitney U test (non-parametric).

Case 13

Scenario: The same students are ranked before and after a training program.
Question: Did the ranks change after training?
Answer: Wilcoxon signed-rank test (non-parametric).


Lecture 6 — ANOVA (Partitioning the Variance)

The t-test compares two means. But what if we have three or more groups?
We could run multiple t-tests, but each extra test adds another chance of a false positive, so the overall Type I error rate rises above α.

The solution is the Analysis of Variance (ANOVA).
ANOVA partitions the variability into two parts: between groups and within groups.

Partitioning the Variance

Total variability = variability between groups + variability within groups.

Between groups: differences due to the factor (treatment).
Within groups: differences due to chance or individual variation.

Symbolic formula:
$$F = \frac{MS_{\text{between}}}{MS_{\text{within}}}$$

Formula in words:
$$F = \frac{\text{mean square between groups}}{\text{mean square within groups}}$$

Where:

$$MS_{\text{between}} = \tfrac{SS_{\text{between}}}{df_{\text{between}}}$$
$$MS_{\text{within}} = \tfrac{SS_{\text{within}}}{df_{\text{within}}}$$

Degrees of Freedom

$$df_{\text{between}} = k - 1$$
$$df_{\text{within}} = N - k$$
$$df_{\text{total}} = N - 1$$

Where $$k$$ = number of groups, $$N$$ = total number of observations.

Example (One-way ANOVA)

Three groups of students use different study techniques:

Group A: mean = 70
Group B: mean = 75
Group C: mean = 85

Suppose calculations give:

$$SS_{\text{between}} = 300, \quad df_{\text{between}} = 2 \Rightarrow MS_{\text{between}} = 150$$
$$SS_{\text{within}} = 200, \quad df_{\text{within}} = 12 \Rightarrow MS_{\text{within}} = 16.7$$

Then:

$$F = \frac{150}{16.7} = 9.0$$

This F value is compared to the F table at df = (2, 12).
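
The example above starts from sums of squares that are already computed. To show the partitioning itself, the sketch below works from raw scores, using the three-group Stress Reduction mini dataset from Appendix 6 (so its F differs from the 9.0 above). It relies only on the standard library; the variable names are illustrative.

```python
import statistics

groups = {
    "Meditation": [65, 70, 72],
    "Exercise": [68, 71, 75],
    "Music": [75, 78, 82],
}

all_scores = [x for scores in groups.values() for x in scores]
grand_mean = statistics.mean(all_scores)
k = len(groups)                 # number of groups
n_total = len(all_scores)       # N, total observations

# Between groups: each group mean vs. the grand mean, weighted by group size
ss_between = sum(len(s) * (statistics.mean(s) - grand_mean) ** 2 for s in groups.values())
# Within groups: each score vs. its own group mean
ss_within = sum((x - statistics.mean(s)) ** 2 for s in groups.values() for x in s)
# Total: each score vs. the grand mean
ss_total = sum((x - grand_mean) ** 2 for x in all_scores)

df_between, df_within = k - 1, n_total - k
ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_ratio = ms_between / ms_within

print(f"SS between = {ss_between:.2f}, SS within = {ss_within:.2f}, SS total = {ss_total:.2f}")
print(f"F({df_between}, {df_within}) = {f_ratio:.2f}")
```

The printed totals confirm the partition: SS between plus SS within equals SS total.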

Definition

ANOVA: compares means across three or more groups.
F ratio: signal-to-noise ratio (treatment effect vs. error).

Visual Placeholders

Figure L6.1 — Partitioning Variance. Total variability divided into Between vs. Within.

Figure L6.2 — One-way ANOVA Layout. Bar graph with three groups (A, B, C).

Figure L6.3 — ANOVA Summary Table. Source | SS | df | MS | F | p.

Why This Matters

ANOVA generalizes the t-test to multiple groups.
It is one of the most widely used tools in psychology, education, and medicine.
Understanding the F ratio is key: a large F means treatment differences are greater than chance variation.


Lecture 5 — The t-test

This lecture emphasizes conceptual understanding of the t-test, its logic, and how it fits into the broader structure of statistical reasoning.

The t-test is one of the most widely used statistical tools.
It compares two means and asks: Is the difference between them real, or could it be due to chance?

The t-test is closely related to the z-test.
When the population standard deviation is unknown and must be estimated from the sample, we use t instead of z; the difference matters most when the sample is small.

Types of t-Tests

One-sample t-test: compares a sample mean to a known or hypothesized population mean.
Independent-samples t-test: compares means from two separate groups.
Paired-samples t-test: compares two scores from the same group (before vs. after).

Symbolic Formulas

One-sample t-test
$$t = \frac{\bar{X} - \mu_0}{s / \sqrt{n}}$$

Independent-samples t-test
$$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\tfrac{s_1^2}{n_1} + \tfrac{s_2^2}{n_2}}}$$

Paired-samples t-test
$$t = \frac{\bar{D}}{s_D / \sqrt{n}}$$

Degrees of Freedom

One-sample: $$df = n - 1$$
Independent-samples: $$df = n_1 + n_2 - 2$$
Paired-samples: $$df = n - 1$$

Example (Independent t-Test)

Two groups of students try different study methods:

Group A: $$n = 10$$, mean = 80, SD = 10
Group B: $$n = 10$$, mean = 90, SD = 10

$$t = \frac{80 - 90}{\sqrt{\tfrac{10^2}{10} + \tfrac{10^2}{10}}} = \frac{-10}{\sqrt{10 + 10}} = \frac{-10}{\sqrt{20}} = \frac{-10}{4.47} = -2.24$$

Degrees of freedom = 18.
Compare this t-value to the critical value in the t-table at $$df = 18$$.
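
One detail worth checking: df = n₁ + n₂ − 2 belongs to the pooled-variance form of the t-test, while the formula above uses each group's variance separately. The pooled formula in the sketch below is the standard one and is an addition, not taken from the text above. With equal group sizes, as in this example, the two forms give the same t. The function names are illustrative, and the unequal-n numbers at the end are hypothetical.

```python
import math

def t_separate(m1, s1, n1, m2, s2, n2):
    """t using each group's own variance (the formula in the text)."""
    return (m1 - m2) / math.sqrt(s1**2 / n1 + s2**2 / n2)

def t_pooled(m1, s1, n1, m2, s2, n2):
    """t using a pooled variance estimate with df = n1 + n2 - 2."""
    pooled_var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    return (m1 - m2) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))

example = (80, 10, 10, 90, 10, 10)      # Group A and Group B from the example above
print(round(t_separate(*example), 2), round(t_pooled(*example), 2))

unequal = (80, 10, 10, 90, 15, 20)      # hypothetical: unequal n and unequal SD
print(round(t_separate(*unequal), 2), round(t_pooled(*unequal), 2))
```

When group sizes and spreads differ, the two versions no longer agree, which is why software asks whether to assume equal variances.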

Example (Paired t-Test)

Students take a test before and after tutoring.
Differences (After − Before): 4, 6, 5, 3, 2.

Mean difference:
$$\bar{D} = \frac{4 + 6 + 5 + 3 + 2}{5} = 4$$

Standard deviation of differences:
$$s_D = 1.58$$

$$t = \frac{4}{1.58 / \sqrt{5}} \approx \frac{4}{0.707} \approx 5.66$$

Degrees of freedom = 4.
This large t-value indicates strong evidence of improvement.

Definition

Independent t-test: compares two separate groups.
Paired t-test: compares the same group measured twice.
Degrees of freedom (df): number of independent pieces of information.

Visuals

Figure L5.1 — Independent t-Test. Bar graph of two groups (A and B) with means and SEM error bars.

Figure L5.2 — Paired t-Test. Line plot showing before vs. after scores for each student.

Figure L5.3 — t vs. z Distribution. Overlay of the normal (z) curve and t curves with df = 5 and 20.

Why This Matters

The t-test is the workhorse of statistics.
It forms the foundation for many other methods (ANOVA, regression, mixed models).
Understanding t means understanding how we compare signal (mean difference) to noise (variability).
