Distance based on separation of clusters

The separation of a cluster is the minimum distance from a point in that cluster to a point in any other cluster. The cluster memberships can be provided; if they are not, hierarchical clustering is used to obtain the clusters. The cluster separations are calculated for dataset X, and the same is done for dataset PX. The Euclidean distance between the separations for X and the separations for PX is then returned.
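The calculation described above can be sketched in base R. This is a minimal illustration, not the package's implementation: the helper names `cluster_separation` and `sep_dist_sketch` are hypothetical, complete-linkage clustering (the `hclust` default) is assumed, and sorting the separations before comparing them is an assumption made so that the result does not depend on how the clusters happen to be labelled.

```r
# Hypothetical helper: separation of each cluster, i.e. the smallest distance
# from any point in the cluster to any point outside it.
cluster_separation <- function(d, cl) {
  m <- as.matrix(d)
  sapply(sort(unique(cl)), function(k) {
    inside <- cl == k
    min(m[inside, !inside, drop = FALSE])
  })
}

# Hypothetical stand-in for the steps described above (not the package code).
sep_dist_sketch <- function(X, PX, nclust = 3) {
  dX <- dist(X[, 1:2])
  dPX <- dist(PX[, 1:2])
  # Hierarchical clustering (complete linkage by default), cut into nclust groups.
  clX <- cutree(hclust(dX), k = nclust)
  clPX <- cutree(hclust(dPX), k = nclust)
  # Assumption: sort so the comparison ignores arbitrary cluster labels.
  sX <- sort(cluster_separation(dX, clX))
  sPX <- sort(cluster_separation(dPX, clPX))
  # Euclidean distance between the two vectors of separations.
  sqrt(sum((sX - sPX)^2))
}

set.seed(1)  # arbitrary seed, only to make the permutation repeatable
X <- with(mtcars, data.frame(wt, mpg))
PX <- with(mtcars, data.frame(wt = sample(wt), mpg))
sep_dist_sketch(X, X)   # identical datasets give 0
sep_dist_sketch(X, PX)  # a non-negative number
```

Comparing a dataset with itself gives zero because both vectors of separations are identical; any difference in cluster structure between X and PX moves the value away from zero.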

sep_dist(X, PX, clustering = FALSE, nclust = 3, type = "separation")

Arguments

X: a data.frame with two or three columns, the first two columns providing the dataset
PX: a data.frame with two or three columns, the first two columns providing the dataset
clustering: logical; if TRUE, the third column of X and PX is used as the clustering variable. By default FALSE
nclust: the number of clusters to be obtained by hierarchical clustering when clustering = FALSE, by default nclust = 3
type: character string specifying which cluster measure to use for the distance, by default "separation"; see ?cluster.stats in the fpc package for details

Value

the distance between X and PX, a single non-negative number

Examples

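# Clusters supplied in the third column (cyl recoded as 1, 2, 3).
# wt is permuted at random, so the value varies between runs.
if (require('fpc')) {
  with(mtcars, sep_dist(data.frame(wt, mpg, as.numeric(as.factor(cyl))),
                        data.frame(sample(wt), mpg, as.numeric(as.factor(cyl))),
                        clustering = TRUE))
}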
#> Loading required package: fpc
#> [1] 0.4600815

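# clustering = FALSE (the default): the third column is not used and
# the clusters come from hierarchical clustering with nclust = 3.
if (require('fpc')) {
  with(mtcars, sep_dist(data.frame(wt, mpg, as.numeric(as.factor(cyl))),
                        data.frame(sample(wt), mpg, as.numeric(as.factor(cyl))),
                        nclust = 3))
}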
#> [1] 0.05397337
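
Because both examples permute wt with sample(), the printed distances change from run to run. A small sketch of how the permutation could be made repeatable by fixing the seed is given below. The seed value is arbitrary, and the guarded call assumes that sep_dist is provided by the nullabor package, which this page does not state explicitly.

```r
# Fixing the seed makes the permutation, and hence the distance, repeatable.
set.seed(2024)  # arbitrary seed, chosen only for illustration
perm1 <- sample(mtcars$wt)
set.seed(2024)
perm2 <- sample(mtcars$wt)
identical(perm1, perm2)  # the two permutations match

# Assumption: sep_dist() comes from nullabor and needs fpc at run time,
# so the call is skipped when either package is missing.
if (requireNamespace("nullabor", quietly = TRUE) &&
    requireNamespace("fpc", quietly = TRUE)) {
  X <- with(mtcars, data.frame(wt, mpg))
  PX <- with(mtcars, data.frame(wt = perm1, mpg))
  nullabor::sep_dist(X, PX, nclust = 3)
}
```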