
Out-of-Sample Extensions for LLE, Isomap, MDS, Eigenmaps, and Spectral Clustering
Yoshua Bengio, Jean-François Paiement and Pascal Vincent
Département d'Informatique et Recherche Opérationnelle
Université de Montréal
Montréal, Québec, Canada, H3C 3J7
{bengioy,paiemeje,vincentp}@iro.umontreal.ca
Technical Report 1238, Département d'Informatique et Recherche Opérationnelle
July 25, 2003

Abstract

Several unsupervised learning algorithms based on an eigendecomposition provide either an embedding or a clustering only for given training points, with no straightforward extension for out-of-sample examples short of recomputing eigenvectors. This paper provides algorithms for such an extension for Local Linear Embedding (LLE), Isomap, Laplacian Eigenmaps and Multi-Dimensional Scaling (all algorithms which provide lower-dimensional embeddings for dimensionality reduction) as well as for Spectral Clustering (which performs non-Gaussian clustering). These extensions stem from a unified framework in which these algorithms are seen as learning eigenfunctions of a kernel. LLE and Isomap pose special challenges as the kernel is training-data dependent. Numerical experiments on real data show that the generalizations performed have a level of error comparable to the variability of the embedding algorithms with respect to the choice of training data.



Introduction

In the last few years, many unsupervised learning algorithms have been proposed which share the use of an eigendecomposition for obtaining a lower-dimensional embedding of the data that characterizes a non-linear manifold near which the data would lie: Local Linear Embedding (LLE) (Roweis and Saul, 2000), Isomap (Tenenbaum, de Silva and Langford, 2000) and Laplacian Eigenmaps (Belkin and Niyogi, 2003). There are also many variants of Spectral Clustering (Weiss, 1999; Ng, Jordan and Weiss, 2002), in which such an embedding is an intermediate step before obtaining a clustering of the data that can capture flat, elongated and even curved clusters. The two tasks (manifold learning and clustering) are linked because the clusters that spectral clustering manages to capture can be arbitrarily curved manifolds (as long as there is enough data to locally capture the curvature of the manifold).


Common Framework

In this paper we consider five types of unsupervised learning algorithms that can be cast in the same framework, based on the computation of an embedding for the training points obtained from the principal eigenvectors of a symmetric matrix.

Algorithm 1

1. Start from a data set $D = \{x_1, \ldots, x_n\}$ with $n$ points in some space. Construct an $n \times n$ "neighborhood" or similarity matrix $M$. Let us denote $K_D(\cdot, \cdot)$ (or $K$ for shorthand) the two-argument function (sometimes dependent on $D$) which produces $M$ by $M_{ij} = K_D(x_i, x_j)$.

2. Optionally transform $M$, yielding a "normalized" matrix $\tilde{M}$. Equivalently, this corresponds to applying a symmetric two-argument function $\tilde{K}_D$ to each pair of examples $(x_i, x_j)$ to obtain $\tilde{M}_{ij}$.

3. Compute the $m$ largest eigenvalues $\lambda_j$ and eigenvectors $v_j$ of $\tilde{M}$. Only positive eigenvalues should be considered.

4. The embedding of each example $x_i$ is the vector $y_i$ with $y_{ij}$ the $i$-th element of the $j$-th principal eigenvector $v_j$ of $\tilde{M}$. Alternatively (MDS and Isomap), the embedding is $e_i$, with $e_{ij} = \sqrt{\lambda_j}\, y_{ij}$. If the first $m$ eigenvalues are positive, then $e_i \cdot e_j$ is the best approximation of $\tilde{M}_{ij}$ using only $m$ coordinates, in the squared error sense.

In the following, we consider the specializations of Algorithm 1 for different unsupervised learning algorithms. Let $S_i$ be the $i$-th row sum of the affinity matrix $M$: $S_i = \sum_j M_{ij}$.
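To make the generic procedure concrete, here is a minimal NumPy sketch of Algorithm 1. The Gaussian similarity and the omitted normalization step are placeholder choices (each algorithm discussed below supplies its own $K_D$ and normalization), and the function and variable names are ours, not the paper's.

```python
import numpy as np

def algorithm1_embedding(X, kernel, m, scale_by_sqrt_eigenvalues=False):
    """Generic Algorithm 1 sketch: eigendecompose the similarity matrix and
    read the embedding off its principal eigenvectors."""
    n = X.shape[0]
    # Steps 1-2: build the n x n similarity matrix (normalization omitted here;
    # it depends on the specific algorithm).
    M = np.array([[kernel(X[i], X[j]) for j in range(n)] for i in range(n)])
    # Step 3: eigendecomposition of the symmetric matrix; keep the m largest
    # eigenvalues, discarding non-positive ones.
    eigvals, eigvecs = np.linalg.eigh(M)           # ascending order
    order = np.argsort(eigvals)[::-1]              # re-sort descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals[:m] > 0
    eigvals, eigvecs = eigvals[:m][keep], eigvecs[:, :m][:, keep]
    # Step 4: y_i is row i of the eigenvector matrix; for the MDS/Isomap
    # variant, each coordinate is additionally scaled by sqrt(lambda_j).
    Y = eigvecs
    if scale_by_sqrt_eigenvalues:
        Y = Y * np.sqrt(eigvals)
    return Y

# Placeholder kernel: a Gaussian similarity, as used in spectral clustering.
gaussian = lambda a, b, sigma=1.0: np.exp(-np.sum((a - b) ** 2) / sigma ** 2)

X = np.random.RandomState(0).randn(50, 3)     # toy data set D
Y = algorithm1_embedding(X, gaussian, m=2)    # 2-dimensional embedding of D
```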


We say that two points (a, b) are k-nearest-neighbors of each other if a is among the k nearest neighbors of b in D ∪ {a} or vice-versa.
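As a brief illustration, here is a literal NumPy sketch of this symmetric neighbor relation; the function name and the handling of a point's zero distance to itself are our choices, not prescribed by the paper.

```python
import numpy as np

def are_k_nearest_neighbors(a, b, D, k):
    """True if a is among the k nearest neighbors of b in D ∪ {a},
    or b is among the k nearest neighbors of a in D ∪ {b}."""
    def among_knn(query, point, data):
        # Candidate neighbors are the data set plus the query point itself.
        candidates = np.vstack([data, query[None, :]])
        dists = np.linalg.norm(candidates - point, axis=1)
        # Indices of the k candidates closest to `point`. If `point` is itself
        # in `data`, it appears at distance 0 and is kept, staying literal to
        # the definition above.
        knn = np.argsort(dists)[:k]
        return len(candidates) - 1 in knn          # is the query one of them?
    return among_knn(a, b, D) or among_knn(b, a, D)

D = np.random.RandomState(1).randn(20, 2)
print(are_k_nearest_neighbors(D[0], D[1], D, k=5))
```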


Multi-Dimensional Scaling

Multi-Dimensional Scaling (MDS) starts from a notion of distance or affinity K that is computed between each pair of training examples.
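The source page cuts the section off at this point. For context, the classical (metric) MDS construction takes the entries of M to be squared pairwise distances and normalizes them by double centering before the eigendecomposition; the sketch below follows that standard recipe under those assumptions (function and variable names are ours).

```python
import numpy as np

def mds_embedding(X, m=2):
    """Classical (metric) MDS sketch: double-center the matrix of squared
    pairwise distances, then embed with the top eigenvectors scaled by
    sqrt(eigenvalue), as in the MDS/Isomap variant of Algorithm 1."""
    n = X.shape[0]
    # Squared Euclidean distances between all pairs of training examples.
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    # Double centering: Mtilde_ij = -1/2 (sq_ij - row mean - column mean + grand mean).
    row_mean = sq.mean(axis=1, keepdims=True)
    col_mean = sq.mean(axis=0, keepdims=True)
    Mtilde = -0.5 * (sq - row_mean - col_mean + sq.mean())
    # Eigendecomposition; keep the m largest (positive) eigenvalues.
    eigvals, eigvecs = np.linalg.eigh(Mtilde)
    order = np.argsort(eigvals)[::-1][:m]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    eigvals = np.clip(eigvals, 0, None)        # guard against round-off noise
    return eigvecs * np.sqrt(eigvals)          # e_ij = sqrt(lambda_j) * y_ij

Y = mds_embedding(np.random.RandomState(0).randn(40, 5), m=2)
```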