
Linear Regression from Scratch — How Models Learn Relationships

1/13/2026
3 min read


Meta Description: Learn how linear regression works from the ground up, how models discover relationships in data, and how training transforms simple math into useful predictions.


Introduction

After understanding vectors, matrices, and the training loop, we are now ready to see learning in action.

Linear regression is the simplest learning model, yet it contains every core idea behind modern machine learning: prediction, error, gradients, and updates.

If you understand linear regression, you understand the foundation of AI.


What is Linear Regression?

Linear regression learns the relationship between an input x and an output y.

```text
y = mx + b
```

Where:

  • x is the input (for example: square feet)
  • y is the output (for example: house price)
  • m is the weight (slope)
  • b is the bias (intercept)

The model’s goal is to learn the best values of m and b from data.
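Before any learning happens, the model is just this formula. As a minimal sketch (the values m = 300 and b = 0 here are illustrative, not learned), prediction is a single multiply-and-add:

```python
def predict(x, m, b):
    """Linear model: predict y from input x using slope m and intercept b."""
    return m * x + b

# With an assumed price of $300 per square foot and no base price:
print(predict(1000, 300, 0))  # → 300000
```

Training is simply the search for the m and b that make this function's outputs match the data.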


A Real-World Example

Suppose we want to predict house prices based on size.

```text
square_feet,price
500,150000
1000,300000
1500,450000
2000,600000
2500,750000
```

Here the relationship is simple: every house costs exactly $300 per square foot, so price grows linearly with size.

But the model doesn’t know this yet.

It must learn it from data.
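You can verify the hidden relationship directly from the table above. This quick check (not part of the training code) divides price by size and shows the ratio is constant:

```python
import numpy as np

square_feet = np.array([500, 1000, 1500, 2000, 2500], dtype=float)
price = np.array([150000, 300000, 450000, 600000, 750000], dtype=float)

# The ratio is the same for every row: $300 per square foot.
# In other words, the data lies exactly on the line y = 300x + 0.
print(price / square_feet)  # → [300. 300. 300. 300. 300.]
```

The model, of course, never sees this division; it has to recover the same number by trial and error.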


How the Model Learns

Training follows the same loop used in all machine learning:

```text
Guess → Measure Error → Adjust → Repeat
```

The model starts with arbitrary initial values for m and b (the code below uses zeros).

Each training step:

  1. Makes a prediction
  2. Measures how wrong it is
  3. Adjusts m and b
  4. Improves slightly

After many repetitions, the model discovers the relationship hidden in the data.
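To make those four steps concrete, here is one hand-traced pass through the loop for a single data point. The starting values and learning rate are illustrative:

```python
x, y_true = 1000.0, 300000.0   # one training example
m, b = 0.0, 0.0                # initial guess: the model knows nothing
lr = 1e-7                      # learning rate (assumed for this sketch)

y_pred = m * x + b             # 1. predict: 0.0 — a terrible first guess
error = y_pred - y_true        # 2. measure: -300000.0

dm = 2 * x * error             # 3. gradient w.r.t. m: -600,000,000
db = 2 * error                 #    gradient w.r.t. b: -600,000

m -= lr * dm                   # 4. update: m jumps from 0 toward 300
b -= lr * db
print(m, b)                    # → 60.0 0.06
```

One step moved m a fifth of the way to the true slope of 300; the remaining steps close the gap gradually.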


Training Linear Regression From Scratch (Python)

```python
import numpy as np

# Training data
X = np.array([500, 1000, 1500, 2000, 2500], dtype=float)
y = np.array([150000, 300000, 450000, 600000, 750000], dtype=float)

# Initialize model parameters
m = 0.0  # weight (slope)
b = 0.0  # bias (intercept)

# The learning rate must be tiny because X is unscaled:
# square-footage values in the thousands produce very large gradients.
learning_rate = 1e-7
epochs = 5000

for epoch in range(epochs):
    # Step 1: Prediction
    y_pred = m * X + b

    # Step 2: Measure error (Mean Squared Error)
    loss = np.mean((y_pred - y) ** 2)

    # Step 3: Compute gradients of the loss w.r.t. m and b
    dm = np.mean(2 * X * (y_pred - y))
    db = np.mean(2 * (y_pred - y))

    # Step 4: Update parameters in the direction that reduces the loss
    m -= learning_rate * dm
    b -= learning_rate * db

    if epoch % 1000 == 0:
        print(f"Epoch {epoch} | Loss: {loss:,.2f}")

print("\nFinal Model:")
print("m =", m)
print("b =", b)
```
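As a sanity check (not in the training loop above), the best-fit line for this dataset can also be computed in closed form with NumPy's least-squares polynomial fit; because the data is perfectly linear, it recovers m = 300 and b = 0 exactly, which the gradient-descent loop approaches step by step:

```python
import numpy as np

X = np.array([500, 1000, 1500, 2000, 2500], dtype=float)
y = np.array([150000, 300000, 450000, 600000, 750000], dtype=float)

# Least-squares fit of a degree-1 polynomial: returns (slope, intercept)
m, b = np.polyfit(X, y, deg=1)
print(m, b)  # ≈ 300.0, ≈ 0.0
```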

What Just Happened?

  • The model started knowing nothing
  • It repeatedly guessed prices
  • It measured how wrong those guesses were
  • It slowly corrected itself
  • The correct relationship emerged from data

This is learning.


Why This Matters for AI Engineering

Every modern AI system — from linear regression to neural networks to LLMs — learns using this same pattern:

```text
Prediction → Loss → Gradient → Update → Repeat
```

Different models, same learning engine.


Final Takeaway

Linear regression is the DNA of machine learning.

If you understand this model, you understand the foundation upon which all modern AI systems are built.


Chalamaiah Chinnam


AI Engineer & Senior Software Engineer

15+ years of enterprise software experience, specializing in applied AI systems, multi-agent architectures, and RAG pipelines. Currently building AI-powered automation at LinkedIn.