Learning kernel matrix by matrix exponential update. Jun Liao. Problem. Given a square matrix that shows the similarities between examples. could be asymmetric could have missing values
symmetric, semi positive definite and no missing values.
Called “Diag Kernel”
the parameter to learn is a symmetric positive definite(SymPD) matrix W instead of a vector w.
When we use von neumann divergence,
Matrix exponential translate a symmetric matrix into a SymPD matrix. As long as W0 is SymPD and Xt is symmetric, Wt is always SymPD. Thus keep the kernel matrix properties.
W is the learned kernel matrix.
Each element in the similarity matrix is an instance for the update: . only has nonzero value 0.5 at (i,j) and (j,i) positions (learning one element of similarity per iteration)
Here, is for all the none-zero elements in the similarity matrix.
Subsample of a drug discovery dataset CDK2. 37 positives and 37 negatives. Similarity matrix is created by FLEXS. It does 3D alignment of two molecules. The similarity value is asymmetric. It distinguishes the reference molecule and the test molecule.
First learn the kernel matrix using the whole similarity matrix by the two approaches. Then do 10 random 50%(training)-50%(testing) split of the dataset. Use SVM classification. The average of 10 runs’ test error rate is reported.
(2) A=A+I λ
Original matrix is made to be symmetric before using algorithm. W0 is the identity matrix.The displayed matrix is learned matrix after 80 iterations.
Matrix Exponential Update is significantly better than the naïve approach.