Mastering Machine Learning Algorithms

Example of locally linear embedding

We can now apply this algorithm to the Olivetti faces dataset, instantiating the Scikit-Learn class LocallyLinearEmbedding with n_components=2 and n_neighbors=15:

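from sklearn.datasets import fetch_olivetti_faces
from sklearn.manifold import LocallyLinearEmbedding

# faces is assumed to be the Olivetti dataset, loaded as follows
faces = fetch_olivetti_faces()
# Project the samples onto 2 components using 15 neighbors
lle = LocallyLinearEmbedding(n_neighbors=15, n_components=2)
X_lle = lle.fit_transform(faces['data'])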

The result (limited to the first 100 samples) is shown in the following plot:

Locally linear embedding applied to 100 samples drawn from the Olivetti faces dataset
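
A plot of this kind could be produced with a short Matplotlib helper. The following is only a minimal sketch: the function name plot_embedding, the color map, and the figure size are illustrative choices, not taken from the original example.

```python
import matplotlib.pyplot as plt
import numpy as np


def plot_embedding(X_2d, labels, n_samples=100):
    """Scatter the first n_samples rows of a 2D embedding, annotated with their labels.

    Hypothetical helper: the name and the styling are assumptions.
    """
    X_sub = np.asarray(X_2d)[:n_samples]
    y_sub = np.asarray(labels)[:n_samples]

    fig, ax = plt.subplots(figsize=(10, 8))
    # Color each point according to its label
    ax.scatter(X_sub[:, 0], X_sub[:, 1], c=y_sub, cmap="tab20", s=40)

    # Write the label next to each point
    for (x, y), label in zip(X_sub, y_sub):
        ax.annotate(str(label), (x, y), xytext=(3, 3),
                    textcoords="offset points", fontsize=8)

    ax.set_xlabel("Component 1")
    ax.set_ylabel("Component 2")
    return fig, ax
```

Assuming X_lle and faces are defined as in the previous snippet, calling plot_embedding(X_lle, faces['target']) followed by plt.show() displays the first 100 projected samples labeled with their subject ids.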

Although the strategy differs from that of Isomap, some coherent clusters are still recognizable. In this case, similarity emerges from the conjunction of small linear blocks; for the faces, these can represent particular micro-features, such as the shape of the nose or the presence of glasses, that remain invariant across the different portraits of the same person. In general, LLE is preferable when the original dataset is intrinsically locally linear, possibly lying on a smooth manifold. In other words, LLE is a reasonable choice when small regions of the dataset are structured so that each point can be reconstructed from its neighbors and the corresponding weights. This is often true for images, but it can be difficult to determine for a generic dataset. When the result doesn't reproduce the original clustering, it's possible to employ the next algorithm or t-SNE, which is one of the most advanced methods.
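
The idea of reconstructing a point from its neighbors and the weights can be illustrated with a few lines of NumPy. This is only a sketch of the standard weight computation, not the Scikit-Learn implementation: the function lle_weights, the regularization value, and the synthetic surface are assumptions made for the example.

```python
import numpy as np


def lle_weights(X, i, n_neighbors=15, reg=1e-3):
    """Return the neighbor indices and the reconstruction weights of sample i.

    Hypothetical helper: the name and the default reg value are assumptions.
    """
    # Squared Euclidean distances from x_i to every sample
    d2 = np.sum((X - X[i]) ** 2, axis=1)
    d2[i] = np.inf  # exclude the point itself
    idx = np.argsort(d2)[:n_neighbors]

    # Local Gram matrix of the neighbors, centered on x_i
    Z = X[idx] - X[i]
    G = Z @ Z.T

    # Regularize G, which is singular when n_neighbors exceeds the input dimensionality
    trace = np.trace(G)
    G = G + reg * (trace if trace > 0 else 1.0) * np.eye(n_neighbors)

    # Solve G w = 1 and rescale so that the weights sum to 1
    w = np.linalg.solve(G, np.ones(n_neighbors))
    return idx, w / w.sum()


# Synthetic data (an assumption): 500 points on a smooth 2D surface embedded in 3D
rng = np.random.default_rng(0)
uv = rng.uniform(-1.0, 1.0, size=(500, 2))
X = np.column_stack([uv[:, 0], uv[:, 1], uv[:, 0] ** 2 - uv[:, 1] ** 2])

idx, w = lle_weights(X, i=0)
x_hat = w @ X[idx]                     # weighted combination of the neighbors
error = np.linalg.norm(X[0] - x_hat)   # reconstruction error for sample 0
print("Sum of weights:", w.sum())
print("Reconstruction error:", error)
```

When the error is small compared with the distances to the neighbors, the neighborhood is well approximated by a linear patch, which is the situation where LLE is expected to work well.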