Statistics 2nd ed

Lesson 18 — AI and Neural Networks (Intro)

Artificial Intelligence (AI) aims to build systems that can learn, adapt, and make decisions.
One powerful tool is the neural network, a model loosely inspired by how neurons in the brain connect.


From Statistics to AI

  • Regression predicts Y from X
  • Logistic regression predicts probability (0–1)
  • Neural networks generalize this idea: many inputs, many layers, nonlinear patterns

The Structure of a Neural Network

  1. Input layer — variables (X₁, X₂, …)
  2. Hidden layers — units that transform the input
  3. Output layer — prediction or classification

Each connection has a weight (like a slope in regression).


Formula for a Neuron

A single unit in the network:

$$z = \sum w_i X_i + b$$

$$y = f(z)$$

Where:

  • $$w_i$$ = weights
  • $$X_i$$ = inputs
  • $$b$$ = bias (like an intercept)
  • $$f(z)$$ = activation function (e.g., logistic, ReLU)
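The two formulas above can be sketched in a few lines of Python. This is only an illustration: the function names and the input, weight, and bias values are hypothetical, chosen to make the arithmetic easy to follow.

```python
import math

def logistic(z):
    """Logistic (sigmoid) activation: maps any real z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def relu(z):
    """ReLU activation: keeps positive z, replaces negative z with 0."""
    return max(0.0, z)

def neuron(inputs, weights, bias, activation):
    """Compute y = f(z) with z = sum(w_i * X_i) + b for one unit."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)

# Hypothetical values, for illustration only.
X = [1.0, 2.0]
w = [0.5, -0.25]
b = 0.1

# z = 0.5*1.0 + (-0.25)*2.0 + 0.1 = 0.1
print(neuron(X, w, b, logistic))  # about 0.525
print(neuron(X, w, b, relu))      # about 0.1
```

The same weighted sum z gives different outputs depending on the activation function: the logistic squeezes it into a probability-like value, while ReLU passes it through unchanged because it is positive.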

Learning in a Network

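The network predicts outputs and compares them with the true answers.
The prediction error is passed backward through the network, layer by layer, to work out how much each weight contributed to it.
This is called backpropagation; the weights are then adjusted in small steps to reduce the error.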
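A minimal sketch of this learning loop, reduced to a single logistic unit so the backward step fits on one line. It assumes a cross-entropy loss and a fixed learning rate, neither of which is specified in the lesson; the training example and starting weights are hypothetical. In a multi-layer network the same idea is repeated for every layer using the chain rule.

```python
import math

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_step(weights, bias, x, target, learning_rate=0.1):
    """One forward pass and one weight update for a single logistic unit."""
    # Forward pass: predict.
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    y = logistic(z)
    # Backward pass: with cross-entropy loss (an assumption), the gradient
    # of the loss with respect to z is simply (prediction - target).
    error = y - target
    # Update: move each weight a small step against its gradient.
    new_weights = [w - learning_rate * error * xi for w, xi in zip(weights, x)]
    new_bias = bias - learning_rate * error
    return new_weights, new_bias, y

# Hypothetical single training example with true answer 1.
weights, bias = [0.0, 0.0], 0.0
x, target = [1.0, 2.0], 1.0

predictions = []
for _ in range(50):
    weights, bias, y = train_step(weights, bias, x, target)
    predictions.append(y)

# The first prediction is 0.5 (all weights start at zero);
# later predictions move toward the target of 1.
print(predictions[0], predictions[-1])
```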


Example

Predicting if a student will pass or fail based on:

  • Study hours
  • Attendance
  • Practice problems completed

Inputs → combined with weights → logistic activation → output: probability of passing.
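As a rough illustration, the pass/fail example can be run through a small network with one hidden layer (Inputs → Hidden → Output). Every number here is hypothetical: the student's values, the weights, the biases, and the choice of two ReLU hidden units are assumptions made for the sketch, not values from a trained model.

```python
import math

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def relu(z):
    return max(0.0, z)

def layer(inputs, weight_rows, biases, activation):
    """Apply one layer: each row of weights feeds one unit."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weight_rows, biases)
    ]

# Hypothetical student: 6 study hours, 0.9 attendance rate, 20 practice problems.
student = [6.0, 0.9, 20.0]

# Hypothetical weights and biases (not learned from data).
hidden_weights = [[0.2, 1.0, 0.05],
                  [-0.1, 0.5, 0.1]]
hidden_biases = [-1.0, -0.5]
output_weights = [[0.8, 0.6]]
output_biases = [-1.5]

hidden = layer(student, hidden_weights, hidden_biases, relu)
p_pass = layer(hidden, output_weights, output_biases, logistic)[0]

print(hidden)            # outputs of the two hidden units
print(round(p_pass, 2))  # estimated probability of passing
```

With real data, the weights would be learned by training rather than typed in by hand.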


Visuals

Figure 18.1 — Simple Neural Network (Inputs → Hidden → Output)

Activation functions: logistic and ReLU

Figure 18.2 — Activation Functions


Why This Matters

  • Neural networks extend regression and logistic regression.
  • They allow learning from large, complex datasets (images, speech, language).
  • Much of modern AI (translation, image and speech recognition, chatbots) is built on these models.

Lesson 17 — Regression Beyond the Line

Simple regression predicts Y from one X.
But in real life, outcomes often depend on several variables, and the relationship may not be a straight line.

This lesson introduces multiple regression and logistic regression.


Multiple Regression

Formula:

$$\hat{Y} = a + b_1X_1 + b_2X_2 + \dots + b_kX_k$$

In words:
$$\text{Predicted Y} = \text{intercept} + (b_1 \times X_1) + (b_2 \times X_2) + \dots$$

Where:

  • $$X_1, X_2, \dots X_k$$ = predictors
  • $$b_1, b_2, \dots b_k$$ = slopes (weights for each predictor)

Example: Predicting college GPA from:

  • High school GPA ($$X_1$$)
  • Study hours ($$X_2$$)

Equation:
$$\hat{Y} = 1.0 + 0.5X_1 + 0.1X_2$$

Interpretation:

  • For each 1-point increase in HS GPA, predicted college GPA rises by 0.5, holding study hours constant.
  • For each extra study hour, predicted college GPA rises by 0.1, holding HS GPA constant.
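The example equation can be checked with a short Python sketch. The function name and the student's values are hypothetical; the intercept and slopes are the ones from the equation above.

```python
def predict_college_gpa(hs_gpa, study_hours):
    """Y-hat = 1.0 + 0.5*X1 + 0.1*X2, the example equation from this lesson."""
    intercept, b1, b2 = 1.0, 0.5, 0.1
    return intercept + b1 * hs_gpa + b2 * study_hours

# Hypothetical student: HS GPA of 3.0 and 10 study hours.
base = predict_college_gpa(3.0, 10)        # 1.0 + 1.5 + 1.0 = 3.5
one_more_hour = predict_college_gpa(3.0, 11)

print(base)                  # 3.5
print(one_more_hour - base)  # about 0.1, the slope for study hours
```

Changing only the study hours while keeping HS GPA fixed changes the prediction by exactly the study-hours slope, which is what the interpretation of each coefficient describes.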

Coefficient of Determination

In multiple regression, $$R^2$$ tells us the proportion of variance explained by all predictors together.

Example: $$R^2 = 0.65$$ → predictors explain 65% of the outcome’s variability.
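One common way to compute this value is R² = 1 − (residual sum of squares ÷ total sum of squares). The lesson does not give this formula, so treat the sketch below as an illustration under that definition; the observed and predicted values are hypothetical.

```python
def r_squared(observed, predicted):
    """R^2 = 1 - SS_residual / SS_total."""
    mean_y = sum(observed) / len(observed)
    # Total variation of Y around its mean.
    ss_total = sum((y - mean_y) ** 2 for y in observed)
    # Variation left unexplained by the model's predictions.
    ss_residual = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    return 1.0 - ss_residual / ss_total

# Hypothetical data: three observed GPAs and a model's predictions for them.
observed = [2.0, 3.0, 4.0]
predicted = [2.5, 3.0, 3.5]

# SS_total = 2.0, SS_residual = 0.5, so R^2 = 1 - 0.25
print(r_squared(observed, predicted))  # 0.75
```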


Logistic Regression

What if the outcome is a yes/no category rather than a number?
Example: Will a student pass or fail?

We use logistic regression.

Formula:

$$P(Y=1) = \frac{1}{1 + e^{-(a + bX)}}$$

In words:
$$\text{Probability of success} = \frac{1}{1 + e^{-(\text{intercept} + \text{slope} \times X)}}$$

Output: probability between 0 and 1.

Example: Predicting pass/fail from study hours.

  • Equation: $$P = \frac{1}{1 + e^{-( -2 + 0.5X )}}$$
  • If X = 6 hours: $$-2 + 0.5(6) = 1$$, so $$P = \frac{1}{1 + e^{-1}} \approx 0.73$$
  • About 73% chance of passing.
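The same calculation can be repeated for other values of X with a short Python sketch. The function name is hypothetical; the intercept (−2) and slope (0.5) come from the example equation.

```python
import math

def pass_probability(study_hours, intercept=-2.0, slope=0.5):
    """P(Y=1) = 1 / (1 + e^-(a + bX)) with the example's a and b."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * study_hours)))

for hours in (2, 4, 6, 8):
    print(hours, round(pass_probability(hours), 2))
# 2 0.27
# 4 0.5
# 6 0.73
# 8 0.88
```

At 4 hours the exponent is zero, so the probability is exactly 0.5; each extra hour pushes the probability up, but by less and less as it approaches 1.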

Visuals

Figure 17.1 — Multiple regression plane: Y predicted from two predictors.

Figure 17.2 — Logistic regression curve: probability vs. study hours.


Why This Matters

  • Multiple regression = prediction with many factors
  • Logistic regression = prediction when the outcome is categorical
  • $$R^2$$ = proportion of variance explained, a measure of how well the model predicts

These methods expand the power of regression beyond a straight line, preparing for modern predictive modeling.

Lesson 10 — Regression

Correlation tells us the strength and direction of the linear relationship between two variables.
Regression goes one step further: it gives us an equation to predict one variable from another.

 


The Regression Equation

The regression line predicts Y from X.

Symbolic formula:
$$\hat{Y} = a + bX$$

Formula in words:
$$\text{Predicted Y} = \text{intercept} + (\text{slope} \times X)$$

Where:

  • $$\hat{Y}$$ = predicted value of Y
  • $$a$$ = intercept (predicted value of Y when X = 0)
  • $$b$$ = slope (change in Y for each 1-unit change in X)

Slope and Intercept

The slope is calculated as:

$$b = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sum (X - \bar{X})^2}$$

The intercept is:

$$a = \bar{Y} - b\bar{X}$$


Example

Study hours (X) and test scores (Y):

  • X = [2, 4, 6]
  • Y = [50, 60, 80]
  • $$\bar{X} = 4, \quad \bar{Y} \approx 63.3$$

Step 1: Slope

  • Numerator = Σ(X – X̄)(Y – Ȳ) = (–2)(–13.3) + (0)(–3.3) + (2)(16.7) = 60
  • Denominator = Σ(X – X̄)² = (–2)² + 0² + 2² = 8
  • $$b = \tfrac{60}{8} = 7.5$$

Step 2: Intercept

  • $$a = 63.3 - (7.5)(4) = 33.3$$

Regression equation:
$$\hat{Y} = 33.3 + 7.5X$$

Interpretation: each extra study hour adds about 7.5 points to the predicted test score.
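The two steps can be reproduced with a short Python sketch that applies the slope and intercept formulas directly to the example data. The function name `fit_line` is hypothetical.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line Y-hat = a + bX."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # b = sum((X - mean_x)(Y - mean_y)) / sum((X - mean_x)^2)
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    slope = numerator / denominator
    # a = mean_y - b * mean_x
    intercept = mean_y - slope * mean_x
    return intercept, slope

hours = [2, 4, 6]
scores = [50, 60, 80]

a, b = fit_line(hours, scores)
print(round(a, 1), round(b, 1))  # 33.3 7.5

# Predicted score for a student who studies 5 hours.
print(round(a + b * 5, 1))       # 70.8
```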


Coefficient of Determination

The square of the correlation, $$r^2$$, shows the proportion of the variance in Y explained by the regression.

Here: $$r \approx 0.98$$, so $$r^2 \approx 0.96$$, meaning about 96% of the variation in scores is explained by study hours.
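To check the value for the study-hours data, r and r² can be computed directly. This sketch uses the standard Pearson correlation formula, which this lesson does not state, so treat it as an assumption; the function name is hypothetical.

```python
import math

def correlation(xs, ys):
    """Pearson correlation r between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

hours = [2, 4, 6]
scores = [50, 60, 80]

r = correlation(hours, scores)
# Print r and its square, rounded to three decimals.
print(round(r, 3), round(r ** 2, 3))
```

Note that r and r² are different numbers: squaring a correlation below 1 always gives a smaller value.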

 


Definition

  • Regression: predicts one variable from another using a line
  • Slope (b): how much Y changes per unit change in X
  • Intercept (a): expected value of Y when X = 0
  • r²: proportion of variance explained by regression

Visuals

Figure 10.1 — Scatterplot with regression line (Y predicted from X).

Figure 10.2 — Illustration of slope (rise/run) and intercept.


Why This Matters

Regression is a predictive tool.
It connects statistical description to practical forecasting: how much the outcome (Y) is expected to change when the predictor (X) changes.
It is the basis for more advanced models used in science, business, and data analysis.
