Eigenvalues & Eigenvectors in ML
Introduction
Eigenvalues and eigenvectors are fundamental concepts in linear algebra that appear throughout machine learning. They help us understand how linear transformations affect space and are crucial for techniques such as PCA and spectral clustering, as well as for understanding neural network dynamics.
What Are They?
For a matrix A, an eigenvector v is a special vector that doesn't change direction when A is applied to it (a quick NumPy check follows the list below):

A v = λ v

Where:
- v is the eigenvector (a direction that's preserved)
- λ (lambda) is the eigenvalue (how much v is scaled)
- The transformation only scales the eigenvector; it does not rotate it
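As a quick sanity check of this definition, here is a minimal NumPy sketch; the 2×2 matrix is an arbitrary illustrative choice, not one taken from the text above:

```python
import numpy as np

# An arbitrary symmetric 2x2 matrix, chosen only for illustration.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# np.linalg.eig returns the eigenvalues and a matrix whose columns are the eigenvectors.
eigenvalues, eigenvectors = np.linalg.eig(A)

for i, lam in enumerate(eigenvalues):
    v = eigenvectors[:, i]
    # The defining property A v = λ v holds up to floating-point error.
    print(f"λ = {lam:.3f}, A v == λ v: {np.allclose(A @ v, lam * v)}")
```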
Interactive Eigenspace Explorer
Visualize how matrices transform space and identify their eigenvectors (shown in red when they exist):
Explore Different Transformations:
Stretch Transformation: Stretches space along the x-axis by 3× while leaving the y-axis unchanged. The eigenvectors align with the coordinate axes.
This diagonal matrix has clear eigenvectors along the x and y axes: the eigenvalue 3 means vectors along x are tripled, while the eigenvalue 1 means the y-direction is unchanged. This pattern is common in scaling operations (a static sketch of the example follows the visualization note below).
[Interactive visualization: faded arrows show the original vectors, solid arrows the transformed vectors, and red lines the eigenvectors (when they are real). For the stretch transformation, the panel reports eigenvalues 3 and 1 and a determinant of 3.000, the product of the eigenvalues.]
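As a static stand-in for the explorer, here is a minimal NumPy sketch of the stretch example, assuming the transformation matrix is diag(3, 1) as described above:

```python
import numpy as np

# Stretch transformation from the example: 3x along x, unchanged along y.
A = np.array([[3.0, 0.0],
              [0.0, 1.0]])

eigenvalues, eigenvectors = np.linalg.eig(A)
print("eigenvalues:", eigenvalues)        # [3. 1.]
print("eigenvectors (columns):")
print(eigenvectors)                       # the coordinate axes e_x and e_y

# The determinant equals the product of the eigenvalues: 3 * 1 = 3.
print("determinant:", np.linalg.det(A))
```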
Applications in Machine Learning
Principal Component Analysis (PCA)
PCA finds the eigenvectors of the data's covariance matrix; these eigenvectors become the principal components (see the sketch after this list):
- Largest eigenvalue → direction of maximum variance
- Eigenvectors are orthogonal → uncorrelated components
- Used for dimensionality reduction
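A minimal sketch of PCA via eigendecomposition of the covariance matrix; the function name and the synthetic data below are illustrative assumptions, not part of any particular library:

```python
import numpy as np

def pca(X, n_components=2):
    """PCA via eigendecomposition of the covariance matrix (illustrative sketch)."""
    # Center the data so the covariance matrix captures variance around the mean.
    X_centered = X - X.mean(axis=0)
    cov = np.cov(X_centered, rowvar=False)

    # eigh is used because the covariance matrix is symmetric.
    eigenvalues, eigenvectors = np.linalg.eigh(cov)

    # Sort by descending eigenvalue: the largest eigenvalue is the direction of maximum variance.
    order = np.argsort(eigenvalues)[::-1]
    components = eigenvectors[:, order[:n_components]]

    # Project the centered data onto the principal components.
    return X_centered @ components

# Example: 200 correlated 3-D points reduced to 2 dimensions.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3)) @ np.array([[2.0, 0.0, 0.0],
                                          [1.0, 1.0, 0.0],
                                          [0.0, 0.5, 0.2]])
print(pca(X).shape)  # (200, 2)
```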
Spectral Clustering
Spectral clustering uses the eigenvectors of the graph Laplacian matrix (see the sketch after this list):
- Eigenvectors reveal cluster structure
- Works on non-convex clusters
- Connected to graph partitioning
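A minimal sketch of the idea for a two-cluster case, on a toy graph made up for illustration: the sign pattern of the second-smallest eigenvector of the Laplacian (the Fiedler vector) splits the nodes into two groups.

```python
import numpy as np

# Toy graph: two triangles joined by a single edge (adjacency matrix chosen for illustration).
A = np.array([
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
], dtype=float)

# Unnormalized graph Laplacian L = D - A.
D = np.diag(A.sum(axis=1))
L = D - A

# The Laplacian is symmetric, so eigh applies; it returns eigenvalues in ascending order.
eigenvalues, eigenvectors = np.linalg.eigh(L)

# The second-smallest eigenvector (Fiedler vector) reveals the two-cluster structure:
# its sign pattern partitions the nodes.
fiedler = eigenvectors[:, 1]
labels = (fiedler > 0).astype(int)
print(labels)  # nodes 0-2 get one label, nodes 3-5 the other
```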
Neural Network Analysis
Eigenvalues of the Hessian matrix of the loss tell us about (a sketch follows the list):
- Loss landscape curvature
- Optimization difficulty
- Network stability
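A minimal sketch using a linear least-squares loss, where the Hessian has the closed form H = XᵀX / n; the synthetic data and the deliberately bad feature scaling are assumptions made purely to illustrate curvature and conditioning:

```python
import numpy as np

# For the least-squares loss L(w) = (1 / 2n) * ||X w - y||^2,
# the Hessian is constant: H = (1 / n) * X^T X.
rng = np.random.default_rng(0)
n, d = 500, 10
X = rng.normal(size=(n, d))
X[:, 0] *= 10.0  # one badly scaled feature makes the loss landscape ill-conditioned

H = X.T @ X / n
eigenvalues = np.linalg.eigvalsh(H)  # the Hessian is symmetric, so eigvalsh applies

# Curvature along each eigendirection; the condition number hints at optimization difficulty.
print("max curvature:", eigenvalues.max())
print("min curvature:", eigenvalues.min())
print("condition number:", eigenvalues.max() / eigenvalues.min())
```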
PageRank Algorithm
Google's PageRank is based on finding the eigenvector with eigenvalue 1 (see the sketch after this list):
- Models web as a Markov chain
- Stationary distribution = principal eigenvector
- Eigenvalue 1 guaranteed by stochastic matrix
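A minimal sketch on a made-up four-page web, using a column-stochastic link matrix and ignoring the damping factor that the full algorithm adds:

```python
import numpy as np

# Column-stochastic link matrix for a tiny 4-page web (made up for illustration):
# entry [i, j] is the probability of jumping from page j to page i.
P = np.array([
    [0.0, 0.5, 0.5, 0.0],
    [1/3, 0.0, 0.0, 0.5],
    [1/3, 0.0, 0.0, 0.5],
    [1/3, 0.5, 0.5, 0.0],
])

eigenvalues, eigenvectors = np.linalg.eig(P)

# Pick the eigenvector whose eigenvalue is (numerically) 1: the stationary distribution.
idx = np.argmin(np.abs(eigenvalues - 1.0))
rank = np.real(eigenvectors[:, idx])
rank = rank / rank.sum()  # normalize so the scores sum to 1
print(rank)
```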
Key Intuitions
Geometric Interpretation
Eigenvectors point in directions that are only stretched/shrunk by the transformation, not rotated. They reveal the "natural axes" of the transformation.
Why They Matter
Many problems become simpler when viewed in the eigenvector basis. Complex transformations decompose into simple scalings along eigendirections.
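Concretely, a diagonalizable matrix factors as A = V diag(λ) V⁻¹, so in the eigenvector basis it is just a diagonal scaling; a minimal sketch, reusing the symmetric matrix from the earlier check:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

eigenvalues, V = np.linalg.eig(A)

# In the eigenvector basis the transformation is a diagonal scaling.
Lambda = np.diag(eigenvalues)
print(np.allclose(A, V @ Lambda @ np.linalg.inv(V)))  # True

# Powers become trivial in this basis: A^k = V diag(λ^k) V^{-1}.
k = 5
A_to_k = V @ np.diag(eigenvalues**k) @ np.linalg.inv(V)
print(np.allclose(A_to_k, np.linalg.matrix_power(A, k)))  # True
```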
Power Iteration
Repeatedly applying a matrix amplifies the component along the eigenvector with the largest-magnitude eigenvalue. This is one reason initialization matters in neural networks and why repeated multiplication by weight matrices can lead to exploding or vanishing gradients.
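A minimal sketch of power iteration; the convergence tolerance and the random starting vector are illustrative choices:

```python
import numpy as np

def power_iteration(A, num_iters=1000, tol=1e-10):
    """Estimate the dominant eigenvalue/eigenvector of A by repeated multiplication."""
    rng = np.random.default_rng(0)
    v = rng.normal(size=A.shape[0])
    v /= np.linalg.norm(v)
    for _ in range(num_iters):
        w = A @ v
        w_norm = np.linalg.norm(w)
        if w_norm < tol:        # A v ≈ 0: the dominant eigenvalue is (near) zero
            break
        v_new = w / w_norm
        if np.linalg.norm(v_new - v) < tol:
            v = v_new
            break
        v = v_new
    # The Rayleigh quotient gives the corresponding eigenvalue estimate (||v|| = 1).
    return v @ A @ v, v

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
lam, v = power_iteration(A)
print(lam)  # ≈ 3, the largest eigenvalue
print(v)    # ≈ ±[0.707, 0.707], the dominant eigenvector up to sign
```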
Key Takeaways
- Eigenvectors reveal invariant directions under linear transformations
- Eigenvalues tell us how much scaling happens in those directions
- They're fundamental to PCA, spectral methods, and stability analysis
- Complex eigenvalues indicate rotation components in the transformation
- The largest eigenvalue often dominates long-term behavior
- Understanding eigen-decomposition helps debug and design ML algorithms