SVD and PCA

Singular value decomposition and principal component analysis

SVD

For an inverse problem: \(d = Gm\)

  • \(d(l)\) is the data vector
  • \(G(l,n)\) the forward model matrix
  • \(m(n)\) the model vector.

The following decomposition always exists: \(G = USV^T\)

  • \(U(l,l)\) is an orthogonal matrix containing unit basis vectors of the data space
  • \(S(l,n)\) is a diagonal matrix with singular values on the diagonal
  • \(V(n,n)\) is an orthogonal matrix containing unit basis vectors of the model space
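
As a quick dimension check, a minimal Octave/MATLAB sketch (the sizes l = 4 and n = 3 are arbitrary example values):

l = 4; n = 3;      % example sizes: 4 data points, 3 model parameters
G = rand(l,n);     % arbitrary forward model matrix
[U,S,V] = svd(G);  % U is (l,l), S is (l,n), V is (n,n)
norm(G - U*S*V')   % ~0: the factors reconstruct G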

If \(G\) has rank \(p\), \(S\) has only \(p\) nonzero singular values

\(G = U_pS_pV_p^T\). The reduced matrices have the following sizes:

  • unit vectors data space: \(U(l,p)\)
  • singular values: \(S(p,p)\)
  • unit vectors model space: \(V(n,p)\)
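
A self-contained sketch of the compact form, using a deliberately rank-deficient \(G\) so that \(p < \min(l,n)\):

l = 4; n = 3;
G = rand(l,2) * rand(2,n);  % a rank-2 product, so p = 2 here
[U,S,V] = svd(G);
p = rank(G);                % numerical rank
Up = U(:,1:p);              % (l,p)
Sp = S(1:p,1:p);            % (p,p)
Vp = V(:,1:p);              % (n,p)
norm(G - Up*Sp*Vp')         % ~0 even though p < n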

PCA

Preprocessing: each feature has to be centered (zero mean) and scaled (unit variance).

m = 100; % number of samples
n = 21;  % number of features
k = 5;   % number of desired output features
x = rand(m,n);
x = (x - mean(x)) ./ std(x); % center and scale each feature
Sigma = cov(x);              % (n,n) covariance matrix
[U,S,V] = svd(Sigma);
  • Retain only the first \(k\) columns of \(U\).
    U_ = U(:,1:k);
    
  • Apply the transform to the original data to get the new features \(z\)
    z = x * U_;
    
    For a single sample stored as a column vector \(x(n,1)\) this reads \(z = U_{red}^Tx\); for the whole data matrix \(x(m,n)\) with samples as rows, \(z = xU_{red}\).

  • input features \(x(m,n)\)
  • covariance matrix \(\Sigma(n,n)\)
  • data space basis vectors \(U(n,n)\)
  • reduced data space basis vectors \(U_{red}(n,k)\)
  • new features \(z(m,k)\)
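
A common follow-up is choosing \(k\) from the explained variance and mapping \(z\) back to an approximation of \(x\). A sketch building on the snippet above (the 95% threshold is an arbitrary example value):

lambda = diag(S);                          % variances along the principal axes
explained = cumsum(lambda) / sum(lambda);  % cumulative explained variance ratio
k95 = find(explained >= 0.95, 1);          % smallest k retaining 95% of the variance
x_approx = z * U_';                        % (m,n) approximate reconstruction of x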

References and Further Reading

R. Aster, B. Borchers, C. Thurber. Parameter Estimation and Inverse Problems.