Function Approximation: Neural Network vs Kernel Regression
[Demo panels: True function · Neural Network · Kernel Regression (NTK) · Training Points · Kernel Matrix (NTK) · Training Convergence]
Key Insight: Neural Tangent Kernel (NTK) theory shows that an infinitely wide neural network
trained with gradient descent is equivalent to kernel regression with a deterministic kernel.
As the network width increases, the NTK stays nearly constant during training, and the network's
predictions converge to those of kernel regression with that kernel. This demo visualizes the correspondence.
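The correspondence can be sketched numerically. Below is a minimal NumPy illustration (not the demo's actual code): it computes the empirical NTK of a wide one-hidden-layer ReLU network as the inner product of parameter gradients, then uses that kernel for regression on a toy sine-fitting task. The width `m = 2000`, the target `sin(2*pi*x)`, and the small ridge term are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# One-hidden-layer ReLU network in NTK parameterization:
#   f(x) = (1/sqrt(m)) * sum_j a_j * relu(w_j * x + b_j)
m = 2000                      # hidden width (large, so the empirical NTK is near its limit)
w = rng.normal(size=m)        # input weights
b = rng.normal(size=m)        # biases
a = rng.normal(size=m)        # output weights

def features(x):
    """Gradient of f at scalar inputs x with respect to all parameters (a, w, b),
    stacked into one feature vector; the NTK is the inner product of these."""
    x = np.atleast_1d(x)
    pre = np.outer(x, w) + b            # (n, m) pre-activations
    act = np.maximum(pre, 0.0)          # relu
    d = (pre > 0).astype(float)         # relu derivative
    grad_a = act / np.sqrt(m)
    grad_w = (a * d) * x[:, None] / np.sqrt(m)
    grad_b = (a * d) / np.sqrt(m)
    return np.concatenate([grad_a, grad_w, grad_b], axis=1)   # (n, 3m)

def ntk(x1, x2):
    """Empirical NTK Gram matrix K[i, j] = grad f(x1_i) . grad f(x2_j)."""
    return features(x1) @ features(x2).T

# Toy data: noisy samples of a "true function" sin(2*pi*x)
x_train = np.linspace(-1, 1, 10)
y_train = np.sin(2 * np.pi * x_train) + 0.05 * rng.normal(size=10)

# Kernel regression with the NTK: f*(x) = K(x, X) K(X, X)^{-1} y
K = ntk(x_train, x_train) + 1e-6 * np.eye(10)   # tiny ridge for numerical stability
alpha = np.linalg.solve(K, y_train)

x_test = np.linspace(-1, 1, 50)
y_pred = ntk(x_test, x_train) @ alpha
mse = np.mean((y_pred - np.sin(2 * np.pi * x_test)) ** 2)
```

Training the same finite network with gradient descent from this initialization would give predictions close to `y_pred`, and the agreement tightens as `m` grows, which is exactly what the demo's side-by-side panels show.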
[Controls: Network Architecture · Training · NTK Kernel · Data. Live readouts: NN Loss, Kernel Loss, Prediction MSE, Epoch.]