Data

Sample data sets for examples.

aflw
AFLW player statistics
box
3D plane in 5D
bushfires
Australian bushfires 2019-2020
c1 c2 c3 c4 c5 c6 c7
Challenge data sets
clusters_nonlin
Four unusually shaped clusters in 4D
clusters
Three clusters in 5D
multicluster
Multiple clusters of different sizes, shapes and distances from each other
pisa
PISA scores
plane_nonlin
Non-linear relationship in 5D
plane
2D plane in 5D
simple_clusters
Two clusters in 2D
sketches_test
Images of sketches for testing
sketches_train
Images of sketches for training

Utility

Useful functions

calc_mv_dist()
Compute Mahalanobis distances between all pairs of observations
calc_norm()
Calculate the norm of a vector
convert_proj_tibble()
Convert a projection sequence into a tibble
gen_vc_ellipse()
Generate points on the surface of an ellipse
gen_xvar_ellipse()
Ellipse matching data center and variance
norm_vec()
Normalise a vector to have length 1
pooled_vc()
Compute pooled variance-covariance matrix
rmvn()
Generate a sample from a multivariate normal
ggslice()
Generate an axis-parallel slice display
ggslice_projection()
Generate a slice display
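
As a rough illustration of what the vector and distance helpers above compute, here is a minimal base-R sketch. It does not call the package functions, whose exact signatures are not shown on this page. The variable names (vec_norm, x_unit, Z, d) are hypothetical, and the pairwise Mahalanobis step assumes the sample covariance of the data is used, which may differ from what calc_mv_dist() does.

```r
# Base-R sketch only; not the package's implementation.
x <- c(3, 4)

# Euclidean norm (length) of a vector
vec_norm <- sqrt(sum(x^2))

# Rescale the vector so that it has length 1
x_unit <- x / vec_norm

# Pairwise Mahalanobis distances (assumption: sample covariance of the data).
# Whitening with the Cholesky factor of S turns Mahalanobis distance
# into ordinary Euclidean distance, so dist() can do the pairing.
X <- as.matrix(iris[, 1:4])
S <- cov(X)
Z <- X %*% solve(chol(S))
d <- dist(Z)
```

The whitening step avoids looping over pairs: each entry of d agrees with the square root of stats::mahalanobis() evaluated for the corresponding pair of rows.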

Principal Component Analysis

Useful functions for PCA

ggscree()
Produce a simple scree plot
pca_model()
Create wire frame of PCA model

Clustering

Useful functions for cluster analysis

ggmcbic()
Produce an mclust summary plot with ggplot
hierfly()
Generate a dendrogram to be added to data
mc_ellipse()
Compute the ellipses of an mclust model
som_model()
Process the output from SOM to display the map and data