Statistics 2nd ed

Appendix 8 — Glossary of Key Terms

Mean (average)
Sum of all scores divided by number of scores.
Example: (6 + 8 + 10) / 3 = 8.

Median
Middle score when data are ordered; with an even number of scores, the average of the two middle scores.
Example: For [5, 7, 8], median = 7.

Mode
Most frequent score.
Example: For [2, 3, 3, 5], mode = 3.
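
The three examples above can be checked with a short sketch. It assumes Python's standard `statistics` module, which is not part of this book; the variable names are illustrative only.

```python
import statistics

# Data taken from the three glossary examples above.
mean_value = statistics.mean([6, 8, 10])      # sum of scores / number of scores
median_value = statistics.median([5, 7, 8])   # middle score of the ordered data
mode_value = statistics.mode([2, 3, 3, 5])    # most frequent score

print(mean_value, median_value, mode_value)
```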

Variance (s²)
Average squared deviation from the mean. For a sample, the sum of squared deviations is divided by n − 1.

Standard Deviation (s)
Square root of variance. Spread of scores around the mean.
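
A minimal sketch of both quantities, assuming the sample form that divides by n − 1 (the glossary does not state the divisor, so treat that as an assumption). The scores reuse the Mean example; the variable names are illustrative.

```python
import math
import statistics

scores = [6, 8, 10]  # same scores as the Mean example
n = len(scores)
mean = sum(scores) / n

# Sum of squared deviations from the mean.
ss = sum((x - mean) ** 2 for x in scores)

# Assumption: sample variance, so divide by n - 1 rather than n.
variance = ss / (n - 1)
sd = math.sqrt(variance)

print(variance, sd)

# Cross-check against the standard library's sample versions.
print(statistics.variance(scores), statistics.stdev(scores))
```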

Standard Error of the Mean (SEM)
How much sample means vary from sample to sample; the estimated standard deviation of the sampling distribution of the mean.
Formula: $$SEM = \frac{s}{\sqrt{n}}$$
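
The formula can be sketched as a small function. The function name `standard_error` and the input values (s = 10, n = 25) are hypothetical, chosen only to make the arithmetic easy to follow.

```python
import math


def standard_error(s: float, n: int) -> float:
    """Return SEM = s / sqrt(n) for sample standard deviation s and sample size n."""
    if n <= 0:
        raise ValueError("n must be a positive sample size")
    return s / math.sqrt(n)


# Hypothetical values: s = 10 and n = 25, so SEM = 10 / 5.
sem = standard_error(s=10.0, n=25)
print(sem)
```

Quadrupling the sample size halves the SEM, because n enters through its square root.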

t-test
Compares two means.

ANOVA (F-test)
Compares three or more means.

Post Hoc Test
Used after ANOVA to find which groups differ.

Correlation (r)
Strength and direction of a linear relationship. Range: –1 to +1.

Regression
Equation that predicts Y from X.
Example: $$\hat{Y} = a + bX$$
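
The entry gives only the form of the equation. One common way to obtain a and b is an ordinary least-squares fit; that choice is an assumption here, since the glossary does not name a fitting method. The sketch below uses the study-hours data from Lesson 9, and `fit_line` is a hypothetical helper name.

```python
def fit_line(xs, ys):
    """Least-squares intercept a and slope b for the line Y-hat = a + b*X."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    if ss_x == 0:
        raise ValueError("X has no variation, so the slope is undefined")
    b = cross / ss_x          # slope
    a = mean_y - b * mean_x   # intercept
    return a, b


# Study hours (X) and test scores (Y) from the Lesson 9 example.
a, b = fit_line([2, 4, 6], [50, 60, 80])
predicted = a + b * 5  # predicted score for 5 hours of study
print(round(a, 2), round(b, 2), round(predicted, 2))
```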

Chi-square (χ²)
Test for categorical data (counts).
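
A minimal sketch of the statistic itself, using the standard form χ² = Σ (O − E)² / E over observed (O) and expected (E) counts. The counts below are hypothetical, and `chi_square` is an illustrative name; turning the statistic into a p-value needs a chi-square table or a statistics library and is not shown.

```python
def chi_square(observed, expected):
    """Sum of (O - E)^2 / E across categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts: 50 people choose between two options,
# and an even 25/25 split is expected.
statistic = chi_square(observed=[30, 20], expected=[25, 25])
print(statistic)
```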

Degrees of Freedom (df)
Independent pieces of information in a test.

p-value
Probability of getting the observed result (or more extreme) if the null hypothesis is true.


Practice self-test quiz

In the space below, please find practice problems and self-test quizzes. For full access, please sign up for free.

Appendix 1 — Symbols and Notation (Cheat Sheet)

Symbols and Notation

A quick reference to the symbols used in this book.

| Symbol | Meaning | Example |
|---|---|---|
| $$\Sigma$$ | Summation (add them up) | $$\Sigma X = 2+4+6=12$$ |
| $$\bar{X}$$ | Sample mean | $$\bar{X} = \tfrac{12}{3} = 4$$ |
| $$\mu$$ | Population mean | “The true average of all scores” |
| $$s$$ | Sample standard deviation | Spread of quiz scores |
| $$\sigma$$ | Population standard deviation | Spread of SAT scores |
| $$df$$ | Degrees of freedom | $$df = n-1 = 29$$ if $$n=30$$ |
| $$t$$ | t-test statistic | Compare two group means |
| $$F$$ | ANOVA statistic | Compare 3+ group means |
| $$r$$ | Pearson correlation | Strength of linear relationship |
| $$R^2$$ | Coefficient of determination | Proportion of variance explained |
| $$\chi^2$$ | Chi-square statistic | Compare observed vs. expected counts |
| $$p$$ | Probability value | “p < 0.05” → significant result |

Lesson 9 — Correlation

Correlation measures the strength and direction of the linear relationship between two variables.
It tells us whether high values of one variable go with high (or low) values of another.


Pearson’s r

The most common measure is Pearson’s correlation coefficient, $$r$$.
It ranges from –1 to +1.

  • $$r = +1$$ → perfect positive correlation (as X increases, Y increases).
  • $$r = -1$$ → perfect negative correlation (as X increases, Y decreases).
  • $$r = 0$$ → no linear relationship.

Symbolic formula:
$$r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum (X - \bar{X})^2 \, \sum (Y - \bar{Y})^2}}$$

Formula in words:
$$r = \frac{\text{sum of the cross-products of deviations from the mean}}{\text{square root of (sum of squared deviations in X × sum of squared deviations in Y)}}$$
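
The formula translates almost term for term into code. This is a minimal sketch rather than a library routine; `pearson_r` is an illustrative name, and the two small data sets are hypothetical, chosen so that the points lie exactly on a rising and a falling line.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r: cross-products of deviations over sqrt(SS_x * SS_y)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Numerator: sum of cross-products of deviations from the means.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator terms: sums of squared deviations in X and in Y.
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when a variable has no variation")
    return cross / math.sqrt(ss_x * ss_y)


r_rising = pearson_r([1, 2, 3], [2, 4, 6])   # points on a rising line
r_falling = pearson_r([1, 2, 3], [6, 4, 2])  # points on a falling line
print(r_rising, r_falling)
```

The explicit check for zero variation matters: if every X (or every Y) is identical, the denominator is zero and r has no value.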


Example

Suppose study hours (X) and test scores (Y) are:

  • X = [2, 4, 6]
  • Y = [50, 60, 80]

Means:

  • $$\bar{X} = 4$$
  • $$\bar{Y} = 63.3$$

Cross-products of deviations (using the rounded mean 63.3):

  • (2–4)(50–63.3) = (–2)(–13.3) = 26.6
  • (4–4)(60–63.3) = (0)(–3.3) = 0
  • (6–4)(80–63.3) = (2)(16.7) = 33.4

Sum of cross-products = 26.6 + 0 + 33.4 = 60

Sum of squares for X = (–2)² + 0² + 2² = 8
Sum of squares for Y = (–13.3)² + (–3.3)² + 16.7² ≈ 466.7

So:
$$r = \frac{60}{\sqrt{8 \times 466.7}} = \frac{60}{\sqrt{3733.6}} \approx \frac{60}{61.1} \approx 0.98$$

A very strong positive correlation.
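
The hand calculation can be repeated without rounding the mean. The sketch below keeps each intermediate quantity in its own variable so it can be compared with the steps above; the variable names are illustrative.

```python
import math

hours = [2, 4, 6]      # X: study hours
scores = [50, 60, 80]  # Y: test scores

mean_x = sum(hours) / len(hours)
mean_y = sum(scores) / len(scores)

# One cross-product of deviations per (X, Y) pair.
cross_products = [(x - mean_x) * (y - mean_y) for x, y in zip(hours, scores)]
sum_cross = sum(cross_products)

# Sums of squared deviations for X and for Y.
ss_x = sum((x - mean_x) ** 2 for x in hours)
ss_y = sum((y - mean_y) ** 2 for y in scores)

r = sum_cross / math.sqrt(ss_x * ss_y)

print(round(sum_cross, 1), round(ss_x, 1), round(ss_y, 1), round(r, 2))
```

Rounded to the same precision, these values agree with the hand calculation, so rounding the mean to 63.3 does not change the result here.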


Coefficient of Determination

Squaring the correlation gives the coefficient of determination, $$r^2$$.
It represents the proportion of variance in Y explained by its linear relationship with X.

Example above:
$$r^2 = (0.98)^2 = 0.96$$

So about 96% of the variation in test scores in this sample is accounted for by the linear relationship with study hours.
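
One way to see what "proportion of variance explained" means is to fit a straight line and measure how much variation in Y is left over. The sketch below assumes a least-squares line, which this lesson does not derive, and compares r² with 1 − (residual variation / total variation) for the study-hours data.

```python
hours = [2, 4, 6]      # X: study hours
scores = [50, 60, 80]  # Y: test scores
n = len(hours)
mean_x = sum(hours) / n
mean_y = sum(scores) / n

cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(hours, scores))
ss_x = sum((x - mean_x) ** 2 for x in hours)
ss_y = sum((y - mean_y) ** 2 for y in scores)  # total variation in Y

# r squared, straight from the definition of Pearson's r.
r_squared = cross ** 2 / (ss_x * ss_y)

# Assumption: least-squares line Y-hat = a + b*X (not derived in this lesson).
b = cross / ss_x
a = mean_y - b * mean_x

# Variation in Y that the line leaves unexplained.
ss_residual = sum((y - (a + b * x)) ** 2 for x, y in zip(hours, scores))
explained = 1 - ss_residual / ss_y

print(round(r_squared, 4), round(explained, 4))
```

The two printed values match, and both round to 0.96, in line with the figure above.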


Definitions

  • Correlation: degree of linear relationship between two variables.
  • Pearson’s r: ranges from –1 to +1.
  • Coefficient of determination (r²): proportion of explained variance.

Visual Placeholders

Figure 9.1 — Scatterplot with positive correlation (points rising, line upward).

Figure 9.2 — Scatterplots showing r ≈ +1, r ≈ 0, r ≈ –1.


Why This Matters

Correlation is the first step in studying relationships.
It helps identify whether variables move together, setting the stage for regression analysis.
