Machine learning for non-metric proximity data

This blog provides literature, algorithms and data sets for the analysis of (indefinite) proximity data. In machine learning, kernels are given as proximity data. If the proximity measure is non-metric, however, most kernel approaches become inaccurate or fail. This blog shows ways to deal with such so-called indefinite, non-positive or non-psd proximity data, with links to literature and algorithms. The final objective is to provide Probabilistic Models in Pseudo-Euclidean Spaces (ProMoS).
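To make "indefinite" concrete, here is a minimal sketch in Python with NumPy. It checks whether a symmetric similarity matrix has negative eigenvalues and, if so, applies eigenvalue clipping, which is one common correction among several. The toy matrix and the helper names `is_psd` and `clip_correction` are made up for illustration and are not taken from the papers linked on this blog.

```python
import numpy as np

def is_psd(S, tol=1e-10):
    """Return True if the symmetric matrix S has no eigenvalue below -tol."""
    eigvals = np.linalg.eigvalsh(S)
    return bool(eigvals.min() >= -tol)

def clip_correction(S):
    """Set negative eigenvalues to zero and rebuild the matrix (eigenvalue clipping)."""
    eigvals, eigvecs = np.linalg.eigh(S)
    clipped = np.clip(eigvals, 0.0, None)
    return (eigvecs * clipped) @ eigvecs.T

# Hypothetical toy similarities: symmetric with unit diagonal, yet indefinite,
# because items 1 and 3 are both very similar to item 2 but not to each other.
S = np.array([[1.0, 0.9, 0.1],
              [0.9, 1.0, 0.9],
              [0.1, 0.9, 1.0]])

print(is_psd(S))                   # the matrix has one negative eigenvalue
print(is_psd(clip_correction(S)))  # after clipping it is a valid kernel matrix
```

Clipping discards the negative part of the spectrum, which may itself carry information; that trade-off is one reason to look at models that work directly in the pseudo-Euclidean space.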

Saturday, 8 February 2014

Accepted contribution at ESANN 2014

Our paper on Proximity learning for non-standard big data has been accepted for the special session on Learning and Modeling Big Data at ESANN 2014. We discuss supervised learning and embedding of very large indefinite kernel matrices; the approach also generalizes to arbitrary proximities.


Laplacian eigenmap embedding of ~200,000 protein sequences (40 billion proximities). The colors refer to the 21 largest ProSite class labels.
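For readers unfamiliar with the technique, the following is a small dense sketch of a standard Laplacian eigenmap in Python with NumPy. It assumes a symmetric, non-negative affinity matrix `W` in which every point has positive degree. How the affinities are obtained from indefinite proximities, and how the computation is scaled to ~200,000 sequences, is not covered here; a dense eigendecomposition like this one would not be feasible at that size. The function name and the toy data are hypothetical.

```python
import numpy as np

def laplacian_eigenmap(W, n_components=2):
    """Embed points described by a symmetric, non-negative affinity matrix W."""
    degrees = W.sum(axis=1)              # assumed strictly positive
    d_inv_sqrt = 1.0 / np.sqrt(degrees)
    # Symmetric normalized Laplacian: I - D^{-1/2} W D^{-1/2}
    L_sym = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(L_sym)  # eigenvalues in ascending order
    # Skip the trivial eigenvector (eigenvalue ~0) and rescale by D^{-1/2}
    # to obtain solutions of the generalized problem (D - W) v = lambda D v.
    return d_inv_sqrt[:, None] * eigvecs[:, 1:n_components + 1]

# Toy data: two groups of three points, strong affinity within a group,
# weak affinity between groups.
W = np.full((6, 6), 0.01)
W[:3, :3] = 1.0
W[3:, 3:] = 1.0
np.fill_diagonal(W, 0.0)

Y = laplacian_eigenmap(W, n_components=1)
print(Y.ravel())  # the two groups receive coordinates of opposite sign
```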

Related technical reports:

