This clustering algorithm needs a neighborhood graph on the points and an estimate of the density at each point. A few graph constructions and density estimators are provided for convenience, but you can also supply your own.

## Methods

### Method `new()`

The `Tomato` constructor.

#### Arguments

`graph_type`

A string specifying the method used to compute the neighborhood graph. Choices are `"knn"`, `"radius"` or `"manual"`. Defaults to `"knn"`.

`density_type`

A string specifying the choice of density estimator. Choices are `"logDTM"`, `"DTM"`, `"KDE"`, `"logKDE"` or `"manual"`. When you have many points, `"KDE"` and `"logKDE"` tend to be slower. Defaults to `"logDTM"`.

`n_clusters`

An integer value specifying the number of clusters. Defaults to `NULL`, i.e. no merging occurs and we get the maximal number of clusters.

`merge_threshold`

A numeric value specifying the minimum prominence a cluster needs in order not to be merged. Defaults to `NULL`, i.e. no merging occurs and we get the maximal number of clusters.

`...`

Extra parameters passed to `KNearestNeighbors` and `DTMDensity`.

### Method `fit()`

Runs the Tomato algorithm on the provided data.

#### Arguments

`X`

Either a numeric matrix specifying the coordinates (in columns) of each point (in rows), a **full** distance matrix if `metric == "precomputed"`, or a list of neighbors for each point if `graph_type == "manual"`. The number of points is currently limited to about 2 billion.

`y`

Not used; present by convention for API consistency with **scikit-learn**.

`weights`

A numeric vector specifying a density estimate at each point. Used only if `density_type == "manual"`.
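The `"manual"` options expect a neighbor list and a density vector that you compute yourself. The following base R sketch shows one way such inputs could be built from toy data. The inverse mean k-nearest-neighbor distance used here is only a crude stand-in for a density estimate, not the `"DTM"` or `"KDE"` estimators of the library, and the names `k`, `D` and `neighbors` are illustrative. The indices are 1-based, as is usual in R; whether the class expects this convention is an assumption to check.

```r
set.seed(1)

# Toy data: two well-separated groups of 20 points each in the plane.
X <- rbind(
  matrix(rnorm(40, mean = 0, sd = 0.3), ncol = 2),
  matrix(rnorm(40, mean = 5, sd = 0.3), ncol = 2)
)

k <- 5                      # number of neighbors per point (illustrative choice)
D <- as.matrix(dist(X))     # full pairwise Euclidean distance matrix

# Neighbor list: for each point, the indices of its k nearest other points.
neighbors <- lapply(seq_len(nrow(D)), function(i) {
  setdiff(order(D[i, ]), i)[seq_len(k)]
})

# Crude density estimate: inverse of the mean distance to the k nearest neighbors.
weights <- vapply(seq_len(nrow(D)), function(i) {
  1 / mean(D[i, neighbors[[i]]])
}, numeric(1))
```

With `graph_type = "manual"` and `density_type = "manual"`, `neighbors` would play the role of `X` and `weights` the role of the `weights` argument.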

### Method `fit_predict()`

Runs the Tomato algorithm on the provided data **and** returns the class memberships.

#### Arguments

`X`

Either a numeric matrix specifying the coordinates (in columns) of each point (in rows), a **full** distance matrix if `metric == "precomputed"`, or a list of neighbors for each point if `graph_type == "manual"`. The number of points is currently limited to about 2 billion.

`y`

Not used; present by convention for API consistency with **scikit-learn**.

`weights`

A numeric vector specifying a density estimate at each point. Used only if `density_type == "manual"`.

### Method `set_merge_threshold()`

Sets the threshold for merging clusters, which automatically adjusts the class memberships.

### Method `plot_diagram()`

Computes the persistence diagram of the merge tree of the initial clusters. This is a convenient graphical tool to help decide how many clusters we want.
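To make the link between the diagram and the merge threshold concrete, here is a minimal base R sketch of the underlying idea: sweep the points from highest to lowest density, and each time two groups meet, record the lower peak as dying at the current density. It is an illustration under simplifying assumptions (a hypothetical helper `peak_persistence()`, 1-based neighbor lists, no path compression), not the implementation used by the class.

```r
# Birth/death pairs of density peaks on a neighborhood graph (base R only).
# `neighbors[[i]]` holds the 1-based indices of the points adjacent to point i.
peak_persistence <- function(density, neighbors) {
  n <- length(density)
  parent <- rep(NA_integer_, n)        # NA marks a point not processed yet
  find <- function(i) {                # root of a group = its highest peak
    while (parent[i] != i) i <- parent[i]
    i
  }
  births <- numeric(0)
  deaths <- numeric(0)
  for (v in order(density, decreasing = TRUE)) {
    parent[v] <- v
    for (u in neighbors[[v]]) {
      if (is.na(parent[u])) next       # u has lower density, handled later
      ru <- find(u)
      rv <- find(v)
      if (ru == rv) next
      if (density[ru] < density[rv]) {
        lo <- ru; hi <- rv
      } else {
        lo <- rv; hi <- ru
      }
      if (lo != v) {                   # a genuine peak dies at the level of v
        births <- c(births, density[lo])
        deaths <- c(deaths, density[v])
      }
      parent[lo] <- hi                 # the higher peak stays the root
    }
  }
  roots <- which(parent == seq_len(n)) # peaks that never get merged
  data.frame(birth = c(births, density[roots]),
             death = c(deaths, rep(-Inf, length(roots))))
}

# Toy example: five points on a path, with peaks of height 3 and 2.
dens <- c(0.5, 3, 1, 2, 0.2)
nbrs <- list(2L, c(1L, 3L), c(2L, 4L), c(3L, 5L), 4L)
diagram <- peak_persistence(dens, nbrs)
prominence <- diagram$birth - diagram$death

# Peaks whose prominence reaches a given threshold would remain separate clusters.
n_kept <- sum(prominence >= 1.5)
```

In this toy example the lower peak has prominence 1, so a threshold of 1.5 leaves a single cluster, while any threshold below 1 keeps two. How ties at exactly the threshold are treated by the class is not specified here.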

## Examples

```
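# Run only when the Python module gudhi is available.
if (reticulate::py_module_available("gudhi")) {
  X <- seq_circle(100)     # sample points on a circle
  cl <- Tomato$new()       # default graph and density estimator
  cl$fit_predict(X)        # fit and return the class memberships
  cl$set_n_clusters(2)     # merge down to two clusters
  cl$get_labels()          # updated class memberships
}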
```