Statistics 2nd ed

Appendix 7 — Study Tips for Statistics

Learning statistics is not about memorizing formulas — it’s about thinking with data.
Here are some strategies to make it easier.


1. Read Formulas in Two Ways

  • Symbolic: $$\bar{X} = \frac{\sum X}{n}$$
  • Words: “Mean = sum of scores / number of scores”

2. Practice by Hand First

  • Work out a mean or variance with a small dataset.
  • Then check the result with a calculator or Excel.
  • This builds intuition and confidence.
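The "by hand first, then check" routine can also be sketched in Python. This is only an illustration: the dataset `scores` is made up, and Python's built-in `statistics` module stands in for the calculator or Excel.

```python
import math
import statistics

# Hypothetical small dataset, chosen only for illustration.
scores = [2, 4, 6, 8]

# Step 1: mean = sum of scores / number of scores.
n = len(scores)
mean_by_hand = sum(scores) / n

# Step 2: sample variance = sum of squared deviations / (n - 1).
sum_of_squares = sum((x - mean_by_hand) ** 2 for x in scores)
variance_by_hand = sum_of_squares / (n - 1)

# Step 3: check the hand results against the library.
assert math.isclose(mean_by_hand, statistics.mean(scores))
assert math.isclose(variance_by_hand, statistics.variance(scores))

print(mean_by_hand)                # 5.0
print(round(variance_by_hand, 2))  # 6.67
```

If the hand result and the library result disagree, the most likely culprit is dividing by n instead of n - 1.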

3. Draw Pictures

  • Normal curve with shaded area
  • Bar charts for group means
  • Scatterplots for correlation

Visuals make ideas stick.

4. Watch Out for Common Mistakes

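  • Mixing up the standard deviation (SD) and the standard error of the mean (SEM)
  • Forgetting to subtract 1 for degrees of freedom (for example, df = n - 1 for a sample variance)
  • Using a one-tailed test when a two-tailed test is needed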
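To see why SD and SEM are easy to mix up but not interchangeable, here is a small Python sketch. It assumes the usual definitions (sample SD with n - 1 in the denominator, and SEM = SD / √n) and uses a made-up dataset.

```python
import math
import statistics

# Hypothetical small dataset, chosen only for illustration.
scores = [2, 4, 6, 8]
n = len(scores)

# SD describes the spread of the individual scores (uses n - 1).
sd = statistics.stdev(scores)

# SEM describes the uncertainty of the sample mean: SD / sqrt(n).
sem = sd / math.sqrt(n)

print(round(sd, 2))   # 2.58
print(round(sem, 2))  # 1.29
```

The SEM is always smaller than the SD when n > 1, and it shrinks as the sample grows, while the SD does not.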

5. Use Short Sessions

  • 10–15 minutes of practice each day beats one long cram.
  • Try one formula or test per session.

6. Check Your Understanding

  • Can you explain in words what the test does?
  • Example: “A t-test compares two means; ANOVA compares three or more.”


Lesson 9 — Correlation

Correlation measures the strength and direction of the relationship between two variables.
It tells us whether high values of one variable go with high (or low) values of another.


Pearson’s r

The most common measure is Pearson’s correlation coefficient, $$r$$.
It ranges from –1 to +1.

  • $$r = +1$$ → perfect positive correlation (as X increases, Y increases).
  • $$r = -1$$ → perfect negative correlation (as X increases, Y decreases).
  • $$r = 0$$ → no linear relationship.
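Note that r = 0 rules out only a straight-line pattern. As a hedged illustration with made-up data, the sketch below uses Y = X², where Y is completely determined by X and yet the sum of cross-products (the numerator of r) is zero.

```python
# Hypothetical data: Y depends perfectly on X, but not linearly.
x = [-2, -1, 0, 1, 2]
y = [value ** 2 for value in x]  # [4, 1, 0, 1, 4]

mean_x = sum(x) / len(x)
mean_y = sum(y) / len(y)

# Numerator of Pearson's r: sum of cross-products of deviations.
cross_products = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))

print(cross_products)  # 0.0, so r = 0 despite the clear curved pattern
```

This is one reason to draw a scatterplot before interpreting r.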

Symbolic formula:
$$r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum (X - \bar{X})^2 \, \sum (Y - \bar{Y})^2}}$$

Formula in words:
$$r = \frac{\text{sum of the cross-products of deviations from the mean}}{\text{square root of (sum of squared deviations in X × sum of squared deviations in Y)}}$$


Example

Suppose study hours (X) and test scores (Y) are:

  • X = [2, 4, 6]
  • Y = [50, 60, 80]

Means:

  • $$\bar{X} = 4$$
  • $$\bar{Y} = 63.3$$

Cross-products of deviations:

  • (2–4)(50–63.3) = (–2)(–13.3) = 26.6
  • (4–4)(60–63.3) = (0)(–3.3) = 0
  • (6–4)(80–63.3) = (2)(16.7) = 33.4

Sum cross-products = 60

Sum squares X = (–2)² + 0² + 2² = 8
Sum squares Y = (–13.3)² + (–3.3)² + 16.7² ≈ 466.7

So:
$$r = \frac{60}{\sqrt{8 \times 466.7}} \approx \frac{60}{\sqrt{3733}} \approx \frac{60}{61.1} \approx 0.98$$

A very strong positive correlation.
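The worked example can be checked with a short Python sketch that follows the symbolic formula term by term. The function name `pearson_r` and the variable names are illustrative choices, not part of the lesson.

```python
import math

# Data from the example: study hours (X) and test scores (Y).
hours = [2, 4, 6]
scores = [50, 60, 80]

def pearson_r(x, y):
    """Pearson's r computed directly from the deviation formula."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Numerator: sum of cross-products of deviations from the means.
    cross_products = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Denominator pieces: sums of squared deviations in X and in Y.
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross_products / math.sqrt(ss_x * ss_y)

r = pearson_r(hours, scores)
print(round(r, 2))  # 0.98
```

Because the code keeps full precision for the means, its intermediate values differ slightly from the rounded hand calculation, but the result agrees to two decimal places.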


Coefficient of Determination

The square of the correlation coefficient, $$r^2$$, is called the coefficient of determination.
It represents the proportion of variance in Y explained by its linear relationship with X.

Example above:
$$r^2 = (0.98)^2 \approx 0.96$$

So about 96% of the variation in scores is explained by study hours.
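As a quick check, r² can also be computed without taking a square root, since squaring the formula for r gives (sum of cross-products)² / (sum of squares X × sum of squares Y). The sketch below applies this to the same study-hours data; the variable names are illustrative.

```python
# Data from the example: study hours (X) and test scores (Y).
hours = [2, 4, 6]
scores = [50, 60, 80]

mean_x = sum(hours) / len(hours)
mean_y = sum(scores) / len(scores)

cross_products = sum((a - mean_x) * (b - mean_y) for a, b in zip(hours, scores))
ss_x = sum((a - mean_x) ** 2 for a in hours)
ss_y = sum((b - mean_y) ** 2 for b in scores)

# r squared = (sum of cross-products)^2 / (SS_X * SS_Y)
r_squared = cross_products ** 2 / (ss_x * ss_y)
print(round(r_squared, 2))  # 0.96
```

With only three data points, this figure describes the sample at hand and should not be read as a general rule about study hours.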


Definitions

  • Correlation: degree of linear relationship between two variables.
  • Pearson’s r: ranges from –1 to +1.
  • Coefficient of determination (r²): proportion of explained variance.

Visual Placeholders

Figure 9.1 — Scatterplot with positive correlation (points rising, line upward).

Figure 9.2 — Scatterplots showing r ≈ +1, r ≈ 0, r ≈ –1.


Why This Matters

Correlation is the first step in studying relationships.
It helps identify whether variables move together, setting the stage for regression analysis.
