Performs clustering according to the spectral clustering algorithm
Source: R/sklearn-cluster.R
This is a wrapper around the Python class sklearn.cluster.SpectralClustering.
Super classes
rgudhi::PythonClass -> rgudhi::SKLearnClass -> rgudhi::BaseClustering -> SpectralClustering
Methods
Method new()
The SpectralClustering class constructor.
Usage
SpectralClustering$new(
  n_clusters = 8L,
  eigen_solver = c("arpack", "lobpcg", "amg"),
  n_components = NULL,
  random_state = NULL,
  n_init = 10L,
  gamma = 1,
  affinity = c("rbf", "nearest_neighbors", "precomputed",
    "precomputed_nearest_neighbors"),
  n_neighbors = 10L,
  eigen_tol = "auto",
  assign_labels = c("kmeans", "discretize", "cluster_qr"),
  degree = 3L,
  coef0 = 1,
  kernel_params = NULL,
  n_jobs = 1L,
  verbose = FALSE
)
Arguments
n_clusters
An integer value specifying the number of clusters to form; this is also the dimension of the projection subspace when n_components is NULL. Defaults to 8L.
eigen_solver
A string specifying the eigenvalue decomposition strategy to use. Choices are c("arpack", "lobpcg", "amg"). The "amg" solver requires pyamg to be installed; it can be faster on very large, sparse problems, but may also lead to instabilities. Defaults to "arpack".
n_components
An integer value specifying the number of eigenvectors to use for the spectral embedding. Defaults to NULL, in which case n_clusters is used.
random_state
An integer value specifying the seed of the pseudo-random number generator used for the initialization of the lobpcg eigenvector decomposition when eigen_solver == "amg", and for the k-means initialization. Defaults to NULL, which uses clock time.
n_init
An integer value specifying the number of times the k-means algorithm will be run with different centroid seeds. The final result will be the best output of n_init consecutive runs in terms of inertia. Only used if assign_labels == "kmeans". Defaults to 10L.
gamma
A numeric value specifying the kernel coefficient for rbf, poly, sigmoid, laplacian and chi2 kernels. Ignored for affinity == "nearest_neighbors". Defaults to 1.0.
affinity
Either a string or an object coercible to a function via rlang::as_function() specifying how to construct the affinity matrix:
  - "nearest_neighbors": construct the affinity matrix by computing a graph of nearest neighbors;
  - "rbf": construct the affinity matrix using a radial basis function (RBF) kernel;
  - "precomputed": interpret X as a precomputed affinity matrix, where larger values indicate greater similarity between instances;
  - "precomputed_nearest_neighbors": interpret X as a sparse graph of precomputed distances, and construct a binary affinity matrix from the n_neighbors nearest neighbors of each instance;
  - one of the kernels supported by pairwise_kernels.
Only kernels that produce similarity scores (non-negative values that increase with similarity) should be used. This property is not checked by the clustering algorithm.
Defaults to "rbf".
n_neighbors
An integer value specifying the number of neighbors to use when constructing the affinity matrix using the nearest neighbors method. Ignored for affinity == "rbf". Defaults to 10L.
eigen_tol
Either a numeric value or the string "auto" specifying the stopping criterion for the eigen-decomposition of the Laplacian matrix. If eigen_tol == "auto", then the passed tolerance will depend on the eigen_solver:
  - If eigen_solver == "arpack", then eigen_tol = 0.0;
  - If eigen_solver == "lobpcg" or eigen_solver == "amg", then eigen_tol = NULL, which configures the underlying lobpcg solver to automatically resolve the value according to its heuristics.
Note that when using eigen_solver == "lobpcg" or eigen_solver == "amg", values of eigen_tol < 1e-5 may lead to convergence issues and should be avoided. Defaults to "auto".
assign_labels
A string specifying the strategy for assigning labels in the embedding space. There are three ways to assign labels after the Laplacian embedding. k-means ("kmeans") is a popular choice, but it can be sensitive to initialization. Discretization ("discretize") is another approach that is less sensitive to random initialization. The cluster_qr method ("cluster_qr") directly extracts clusters from the eigenvectors in spectral clustering. In contrast to k-means and discretization, cluster_qr has no tuning parameters and runs no iterations, yet may outperform k-means and discretization in terms of both quality and speed. Defaults to "kmeans".
degree
An integer value specifying the degree of the polynomial kernel. Ignored by other kernels. Defaults to 3L.
coef0
A numeric value specifying the value of the zero coefficient for polynomial and sigmoid kernels. Ignored by other kernels. Defaults to 1.0.
kernel_params
A named list specifying extra arguments to the kernels passed as functions. Ignored by other kernels. Defaults to NULL.
n_jobs
An integer value specifying the number of parallel jobs to run for neighbors search. Defaults to 1L. A value of -1L means using all processors.
verbose
A boolean value specifying the verbosity mode. Defaults to FALSE.
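The following is a minimal sketch, not taken from the package itself, of how the gamma and affinity arguments relate. It builds an RBF affinity matrix by hand in base R, using the standard definition exp(-gamma * squared Euclidean distance), and then passes affinity = "precomputed" to the constructor documented above. The helper rbf_affinity() and the toy data X are hypothetical names introduced for illustration only. The constructor call is guarded because it assumes that rgudhi, reticulate and the Python module sklearn.cluster are all available.

```r
# Toy data (illustrative only): two well-separated groups of three 2-D points.
X <- rbind(
  c(0.0, 0.0), c(0.1, 0.0), c(0.0, 0.1),
  c(5.0, 5.0), c(5.1, 5.0), c(5.0, 5.1)
)

# Hypothetical helper: RBF (Gaussian) affinity, exp(-gamma * ||x - y||^2).
# Larger values mean greater similarity, as "precomputed" expects.
rbf_affinity <- function(X, gamma = 1) {
  d2 <- as.matrix(stats::dist(X))^2  # squared Euclidean distances
  exp(-gamma * d2)
}

A <- rbf_affinity(X, gamma = 1)

# Construct the clusterer only if the assumed dependencies are present.
if (requireNamespace("rgudhi", quietly = TRUE) &&
    requireNamespace("reticulate", quietly = TRUE) &&
    reticulate::py_module_available("sklearn.cluster")) {
  cl <- rgudhi::SpectralClustering$new(
    n_clusters = 2L,            # two groups in the toy data
    affinity = "precomputed",   # A would be interpreted as an affinity matrix
    assign_labels = "kmeans",
    random_state = 1234L        # fixed seed for the k-means initialization
  )
  # Fitting methods are inherited from rgudhi::BaseClustering and are not
  # documented on this page, so none is called here.
}
```

In this sketch, points in the same group get affinities close to 1 while points in different groups get affinities close to 0, which is the kind of similarity score the note under affinity asks for.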