Eigenvalues & Eigenvectors in ML
Introduction
Eigenvalues and eigenvectors are fundamental concepts in linear algebra that appear throughout machine learning. They help us understand how linear transformations affect space and are crucial for techniques such as PCA and spectral clustering, as well as for understanding neural network dynamics.
What Are They?
For a matrix A, an eigenvector v is a special vector that doesn't change direction when A is applied to it (a quick NumPy check follows the list below):

A v = λ v

Where:
- v is the eigenvector (a direction that's preserved)
- λ (lambda) is the eigenvalue (how much v is scaled)
- The transformation only scales the eigenvector; it does not rotate it
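As a quick sanity check of this definition, here is a minimal NumPy sketch; the 2×2 matrix is an arbitrary illustrative choice, not one taken from the text above:

```python
import numpy as np

# An arbitrary symmetric 2x2 matrix, chosen only for illustration.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# np.linalg.eig returns the eigenvalues and a matrix whose columns are the eigenvectors.
eigenvalues, eigenvectors = np.linalg.eig(A)

for i, lam in enumerate(eigenvalues):
    v = eigenvectors[:, i]
    # The defining property A v = λ v holds up to floating-point error.
    print(f"λ = {lam:.3f}, A v == λ v: {np.allclose(A @ v, lam * v)}")
```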
Interactive Eigenspace Explorer
Visualize how matrices transform space and identify their eigenvectors (shown in red when they exist):
Explore Different Transformations:
Stretch Transformation: Stretches space along the x-axis by 3× while leaving the y-axis unchanged. The eigenvectors align with the coordinate axes.
This diagonal matrix has clear eigenvectors along the x and y axes: the eigenvalue 3 means vectors along x are tripled, while the eigenvalue 1 means the y-direction is unchanged. This pattern is common in scaling operations (a static sketch of the example follows the visualization note below).
[Interactive visualization: faded arrows show the original vectors, solid arrows the transformed vectors, and red lines the eigenvectors (when they are real). For the stretch transformation, the panel reports eigenvalues 3 and 1 and a determinant of 3.000, the product of the eigenvalues.]
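As a static stand-in for the explorer, here is a minimal NumPy sketch of the stretch example, assuming the transformation matrix is diag(3, 1) as described above:

```python
import numpy as np

# Stretch transformation from the example: 3x along x, unchanged along y.
A = np.array([[3.0, 0.0],
              [0.0, 1.0]])

eigenvalues, eigenvectors = np.linalg.eig(A)
print("eigenvalues:", eigenvalues)        # [3. 1.]
print("eigenvectors (columns):")
print(eigenvectors)                       # the coordinate axes e_x and e_y

# The determinant equals the product of the eigenvalues: 3 * 1 = 3.
print("determinant:", np.linalg.det(A))
```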
Applications in Machine Learning
Principal Component Analysis (PCA)
PCA finds the eigenvectors of the data's covariance matrix; these eigenvectors become the principal components (see the sketch after this list):
- Largest eigenvalue → direction of maximum variance
- Eigenvectors are orthogonal → uncorrelated components
- Used for dimensionality reduction
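A minimal sketch of PCA via eigendecomposition of the covariance matrix; the function name and the synthetic data below are illustrative assumptions, not part of any particular library:

```python
import numpy as np

def pca(X, n_components=2):
    """PCA via eigendecomposition of the covariance matrix (illustrative sketch)."""
    # Center the data so the covariance matrix captures variance around the mean.
    X_centered = X - X.mean(axis=0)
    cov = np.cov(X_centered, rowvar=False)

    # eigh is used because the covariance matrix is symmetric.
    eigenvalues, eigenvectors = np.linalg.eigh(cov)

    # Sort by descending eigenvalue: the largest eigenvalue is the direction of maximum variance.
    order = np.argsort(eigenvalues)[::-1]
    components = eigenvectors[:, order[:n_components]]

    # Project the centered data onto the principal components.
    return X_centered @ components

# Example: 200 correlated 3-D points reduced to 2 dimensions.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3)) @ np.array([[2.0, 0.0, 0.0],
                                          [1.0, 1.0, 0.0],
                                          [0.0, 0.5, 0.2]])
print(pca(X).shape)  # (200, 2)
```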
Spectral Clustering
Spectral clustering uses the eigenvectors of the graph Laplacian matrix (see the sketch after this list):
- Eigenvectors reveal cluster structure
- Works on non-convex clusters
- Connected to graph partitioning
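A minimal sketch of the idea for a two-cluster case, on a toy graph made up for illustration: the sign pattern of the second-smallest eigenvector of the Laplacian (the Fiedler vector) splits the nodes into two groups.

```python
import numpy as np

# Toy graph: two triangles joined by a single edge (adjacency matrix chosen for illustration).
A = np.array([
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
], dtype=float)

# Unnormalized graph Laplacian L = D - A.
D = np.diag(A.sum(axis=1))
L = D - A

# The Laplacian is symmetric, so eigh applies; it returns eigenvalues in ascending order.
eigenvalues, eigenvectors = np.linalg.eigh(L)

# The second-smallest eigenvector (Fiedler vector) reveals the two-cluster structure:
# its sign pattern partitions the nodes.
fiedler = eigenvectors[:, 1]
labels = (fiedler > 0).astype(int)
print(labels)  # nodes 0-2 get one label, nodes 3-5 the other
```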
Neural Network Analysis
Eigenvalues of the Hessian matrix of the loss tell us about (a sketch follows the list):
- Loss landscape curvature
- Optimization difficulty
- Network stability
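A minimal sketch using a linear least-squares loss, where the Hessian has the closed form H = XᵀX / n; the synthetic data and the deliberately bad feature scaling are assumptions made purely to illustrate curvature and conditioning:

```python
import numpy as np

# For the least-squares loss L(w) = (1 / 2n) * ||X w - y||^2,
# the Hessian is constant: H = (1 / n) * X^T X.
rng = np.random.default_rng(0)
n, d = 500, 10
X = rng.normal(size=(n, d))
X[:, 0] *= 10.0  # one badly scaled feature makes the loss landscape ill-conditioned

H = X.T @ X / n
eigenvalues = np.linalg.eigvalsh(H)  # the Hessian is symmetric, so eigvalsh applies

# Curvature along each eigendirection; the condition number hints at optimization difficulty.
print("max curvature:", eigenvalues.max())
print("min curvature:", eigenvalues.min())
print("condition number:", eigenvalues.max() / eigenvalues.min())
```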
PageRank Algorithm
Google's PageRank is based on finding the eigenvector with eigenvalue 1 (see the sketch after this list):
- Models web as a Markov chain
- Stationary distribution = principal eigenvector
- Eigenvalue 1 guaranteed by stochastic matrix
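A minimal sketch on a made-up four-page web, using a column-stochastic link matrix and ignoring the damping factor that the full algorithm adds:

```python
import numpy as np

# Column-stochastic link matrix for a tiny 4-page web (made up for illustration):
# entry [i, j] is the probability of jumping from page j to page i.
P = np.array([
    [0.0, 0.5, 0.5, 0.0],
    [1/3, 0.0, 0.0, 0.5],
    [1/3, 0.0, 0.0, 0.5],
    [1/3, 0.5, 0.5, 0.0],
])

eigenvalues, eigenvectors = np.linalg.eig(P)

# Pick the eigenvector whose eigenvalue is (numerically) 1: the stationary distribution.
idx = np.argmin(np.abs(eigenvalues - 1.0))
rank = np.real(eigenvectors[:, idx])
rank = rank / rank.sum()  # normalize so the scores sum to 1
print(rank)
```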
Key Intuitions
Geometric Interpretation
Eigenvectors point in directions that are only stretched/shrunk by the transformation, not rotated. They reveal the "natural axes" of the transformation.
Why They Matter
Many problems become simpler when viewed in the eigenvector basis. Complex transformations decompose into simple scalings along eigendirections.
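Concretely, a diagonalizable matrix factors as A = V diag(λ) V⁻¹, so in the eigenvector basis it is just a diagonal scaling; a minimal sketch, reusing the symmetric matrix from the earlier check:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

eigenvalues, V = np.linalg.eig(A)

# In the eigenvector basis the transformation is a diagonal scaling.
Lambda = np.diag(eigenvalues)
print(np.allclose(A, V @ Lambda @ np.linalg.inv(V)))  # True

# Powers become trivial in this basis: A^k = V diag(λ^k) V^{-1}.
k = 5
A_to_k = V @ np.diag(eigenvalues**k) @ np.linalg.inv(V)
print(np.allclose(A_to_k, np.linalg.matrix_power(A, k)))  # True
```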
Power Iteration
Repeatedly applying a matrix amplifies the component along the eigenvector with the largest-magnitude eigenvalue. This is one reason initialization matters in neural networks and why repeated multiplication by weight matrices can lead to exploding or vanishing gradients.
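A minimal sketch of power iteration; the convergence tolerance and the random starting vector are illustrative choices:

```python
import numpy as np

def power_iteration(A, num_iters=1000, tol=1e-10):
    """Estimate the dominant eigenvalue/eigenvector of A by repeated multiplication."""
    rng = np.random.default_rng(0)
    v = rng.normal(size=A.shape[0])
    v /= np.linalg.norm(v)
    for _ in range(num_iters):
        w = A @ v
        w_norm = np.linalg.norm(w)
        if w_norm < tol:        # A v ≈ 0: the dominant eigenvalue is (near) zero
            break
        v_new = w / w_norm
        if np.linalg.norm(v_new - v) < tol:
            v = v_new
            break
        v = v_new
    # The Rayleigh quotient gives the corresponding eigenvalue estimate (||v|| = 1).
    return v @ A @ v, v

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
lam, v = power_iteration(A)
print(lam)  # ≈ 3, the largest eigenvalue
print(v)    # ≈ ±[0.707, 0.707], the dominant eigenvector up to sign
```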
Key Takeaways
- Eigenvectors reveal invariant directions under linear transformations
- Eigenvalues tell us how much scaling happens in those directions
- They're fundamental to PCA, spectral methods, and stability analysis
- Complex eigenvalues indicate rotation components in the transformation
- The largest eigenvalue often dominates long-term behavior
- Understanding eigen-decomposition helps debug and design ML algorithms