14 Linear discriminant analysis
Linear discriminant analysis (LDA) dates to the early 1900s. It’s one of the most elegant and simple techniques for modeling the separation between groups and, as an added bonus, for producing a low-dimensional representation of the differences between groups. LDA has two strong assumptions: the groups are samples from multivariate normal distributions, and they all have the same variance-covariance matrix. If the latter assumption is relaxed, a slightly less elegant solution results: quadratic discriminant analysis.
Useful explanations can be found in Venables & Ripley (2002) and Ripley (1996). A good general treatment of parametric methods for supervised classification can be found in R. A. Johnson & Wichern (2002) or another similar multivariate analysis textbook. It’s also useful to know that hypothesis testing for the difference in multivariate means using multivariate analysis of variance (MANOVA) has similar assumptions to LDA. Model-based clustering also assumes that each cluster arises from a multivariate normal distribution, and so is related to LDA. The checks described here can be used to assess the assumptions when applying those methods, too.
Because LDA is a parametric model it is important to check that these assumptions are reasonably satisfied:
- the shape of each cluster is elliptical, and
- the spread (variance-covariance) of the observations is the same for each cluster.
14.1 Extracting the key elements of the model
LDA builds the model on the between-group sum-of-squares matrix
\[B=\sum_{k=1}^g n_k(\bar{X}_k-\bar{X})(\bar{X}_k-\bar{X})^\top\] which measures the differences between the class means \(\bar{X}_k\) and the overall data mean \(\bar{X}\), and the within-group sum-of-squares matrix,
\[ W = \sum_{k=1}^g\sum_{i=1}^{n_k} (X_{ki}-\bar{X}_k)(X_{ki}-\bar{X}_k)^\top \]
which measures the variation of values around each class mean. The linear discriminant space is generated by computing the eigenvectors (canonical coordinates) of \(W^{-1}B\), and this is the \((g-1)\)-D space where the group means are most separated with respect to the pooled variance-covariance. For each class we compute
\[ \delta_k(x) = \left(x-\frac{\mu_k}{2}\right)^\top W^{-1}\mu_k + \log \pi_k \]
where \(\pi_k\) is a prior probability for class \(k\) that might be based on unequal sample sizes, or cost of misclassification. The LDA classifier rule is to assign a new observation to the class with the largest value of \(\delta_k(x)\).
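To make the computation concrete, below is a minimal sketch (not from the original text) that builds \(B\), \(W\), the discriminant space and \(\delta_k\) directly from a numeric data matrix and a class factor. The eigenvectors will differ from the scaling returned by MASS::lda() only by normalisation, and with equal priors using \(W\) rather than the pooled variance-covariance does not change which class is assigned.
# Sketch: compute B, W, the discriminant space and delta_k by hand
lda_elements <- function(X, cl, prior = rep(1/nlevels(cl), nlevels(cl))) {
  X <- as.matrix(X)
  xbar <- colMeans(X)
  p <- ncol(X)
  B <- W <- matrix(0, p, p)
  mu <- matrix(NA, nlevels(cl), p,
               dimnames = list(levels(cl), colnames(X)))
  for (k in seq_len(nlevels(cl))) {
    Xk <- X[cl == levels(cl)[k], , drop = FALSE]
    mu[k, ] <- colMeans(Xk)
    B <- B + nrow(Xk) * tcrossprod(mu[k, ] - xbar)   # between-group SSQ
    W <- W + crossprod(sweep(Xk, 2, mu[k, ]))        # within-group SSQ
  }
  # Discriminant space: eigenvectors of W^{-1}B
  evec <- Re(eigen(solve(W) %*% B)$vectors[, 1:(nlevels(cl) - 1), drop = FALSE])
  # Classification rule: delta_k(x) for a single observation x
  delta <- function(x) sapply(seq_len(nlevels(cl)), function(k)
    drop((x - mu[k, ] / 2) %*% solve(W) %*% mu[k, ]) + log(prior[k]))
  list(means = mu, B = B, W = W, scaling = evec, delta = delta)
}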
We can fit an LDA model using the lda() function from the MASS package. Here we have used the penguins data, assuming equal prior probabilities, to illustrate.
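The fitted model object p_lda used below is not created in the displayed code; a minimal sketch of the call, assuming the standardised penguins_sub data with variables bl, bd, fl, bm and the class factor species, is:
# Fit the LDA model with equal priors (sketch)
library(MASS)
p_lda <- lda(species ~ bl + bd + fl + bm,
             data = penguins_sub,
             prior = c(1/3, 1/3, 1/3))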
Because there are three classes, the discriminant space is 2D. We can easily extract the group means from the model.
# Extract the sample means
p_lda$means
bl bd fl bm
Adelie -0.95 0.60 -0.78 -0.62
Chinstrap 0.89 0.64 -0.37 -0.59
Gentoo 0.65 -1.10 1.16 1.10
The coefficients that project the data into the discriminant space, that is, the eigenvectors of \(W^{-1}B\), are:
# Extract the discriminant space
p_lda$scaling
LD1 LD2
bl -0.24 -2.31
bd 2.04 0.19
fl -1.20 0.08
bm -1.22 1.24
and the predicted values, which include the class predictions and the coordinates in the discriminant space, are generated with:
# Extract the fitted values
p_lda_pred <- predict(p_lda, penguins_sub)
The best separation between classes can be viewed from this object, and it can be shown to match the original data projected using the scaling component of the model object (see Figure 14.1).
Code to generate LDA plots
# Check calculations from the fitted model, and equations
library(colorspace)
library(ggplot2)
library(dplyr)
library(ggpubr)
# Using the predicted values from the model object
p_lda_pred_x1 <- data.frame(p_lda_pred$x)
p_lda_pred_x1$species <- penguins_sub$species
p_lda1 <- ggplot(p_lda_pred_x1,
aes(x=LD1, y=LD2,
colour=species)) +
geom_point() +
xlim(-6, 8) + ylim(-6.5, 5.5) +
scale_color_discrete_divergingx("Zissou 1") +
ggtitle("(a)") +
theme_minimal() +
theme(aspect.ratio = 1, legend.title = element_blank())
# matches the calculations done manually
p_lda_pred_x2 <- data.frame(as.matrix(penguins_sub[,1:4]) %*%
p_lda$scaling)
p_lda_pred_x2$species <- penguins_sub$species
p_lda2 <- ggplot(p_lda_pred_x2,
aes(x=LD1, y=LD2,
colour=species)) +
geom_point() +
xlim(-6, 8) + ylim(-7, 5.5) +
scale_color_discrete_divergingx("Zissou 1") +
ggtitle("(b)") +
theme_minimal() +
theme(aspect.ratio = 1, legend.title = element_blank())
ggarrange(p_lda1, p_lda2, ncol=2,
common.legend = TRUE, legend = "bottom")

The \(W\) and \(B\) matrices cannot be extracted from the model object, so we need to compute them separately. Actually, we only need \(W\). It is useful to think of this as the pooled variance-covariance matrix. Because LDA assumes that the population group variance-covariances are identical, we estimate it by computing the variance-covariance for each class and then averaging these to get the pooled variance-covariance matrix. It’s laborious, but easy.
# Compute pooled variance-covariance
p_vc_pool <- mulgar::pooled_vc(penguins_sub[,1:4],
penguins_sub$species)
p_vc_pool
bl bd fl bm
bl 0.31 0.18 0.13 0.18
bd 0.18 0.32 0.14 0.20
fl 0.13 0.14 0.23 0.16
bm 0.18 0.20 0.16 0.31
This can be used to draw an ellipse corresponding to the pooled variance-covariance that is used by the LDA model.
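If the mulgar package is not available, a rough equivalent is the standard degrees-of-freedom-weighted pooled estimator sketched below; the weighting may differ slightly from what pooled_vc() uses, so the values could differ marginally.
# Pooled variance-covariance by hand (sketch): weight each class
# covariance by (n_k - 1) and divide by (n - g)
p_vc_list <- lapply(split(penguins_sub[,1:4], penguins_sub$species),
                    function(x) (nrow(x) - 1) * cov(x))
p_vc_pool_manual <- Reduce(`+`, p_vc_list) /
  (nrow(penguins_sub) - nlevels(penguins_sub$species))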
14.2 Checking assumptions
This LDA approach is widely applicable, but it is useful to check the underlying assumptions on which it depends: (1) that the cluster structure corresponding to each class forms an ellipse, showing that the class is consistent with a sample from a multivariate normal distribution, and (2) that the variance of values around each mean is nearly the same. Figure 14.2 and Figure 14.3 illustrate two datasets, of which only one is consistent with these assumptions. Other parametric models, such as quadratic discriminant analysis or logistic regression, also depend on assumptions about the data that should be validated.
To check the equal and elliptical variance-covariance assumption, generate points on the surface of an ellipse corresponding to the variance-covariance for each group. When watching these ellipses in a tour, they should look similar in all projections.
Code
# Generate ellipses for each group's variance-covariance
library(mulgar)
p_ell <- NULL
for (i in unique(penguins_sub$species)) {
x <- penguins_sub |> dplyr::filter(species == i)
e <- gen_xvar_ellipse(x[,1:2], n=150, nstd=1.5)
e$species <- i
p_ell <- bind_rows(p_ell, e)
}
Code for penguins data and ellipse plots
lda1 <- ggplot(penguins_sub, aes(x=bl,
y=bd,
colour=species)) +
geom_point() +
scale_color_discrete_divergingx("Zissou 1") +
xlim(-2.5, 3) + ylim(-2.5, 2.5) +
ggtitle("(a)") +
theme_minimal() +
theme(aspect.ratio = 1)
lda2 <- ggplot(p_ell, aes(x=bl,
y=bd,
colour=species)) +
geom_point() +
scale_color_discrete_divergingx("Zissou 1") +
xlim(-2.5, 3) + ylim(-2.5, 2.5) +
ggtitle("(b)") +
theme_minimal() +
theme(aspect.ratio = 1)
ggarrange(lda1, lda2, ncol=2,
common.legend = TRUE, legend = "bottom")

Code for bushfires data and ellipse plots
# Now repeat for a data set that violates assumptions
data(bushfires)
lda3 <- ggplot(bushfires, aes(x=log_dist_cfa,
y=log_dist_road,
colour=cause)) +
geom_point() +
scale_color_discrete_divergingx("Zissou 1") +
xlim(6, 11) + ylim(-1, 10.5) +
ggtitle("(a)") +
theme_minimal() +
theme(aspect.ratio = 1)
b_ell <- NULL
for (i in unique(bushfires$cause)) {
x <- bushfires |> dplyr::filter(cause == i)
e <- gen_xvar_ellipse(x[,c(57, 59)], n=150, nstd=2)
e$cause <- i
b_ell <- bind_rows(b_ell, e)
}
lda4 <- ggplot(b_ell, aes(x=log_dist_cfa,
y=log_dist_road,
colour=cause)) +
geom_point() +
scale_color_discrete_divergingx("Zissou 1") +
xlim(6, 11) + ylim(-1, 10.5) +
ggtitle("(b)") +
theme_minimal() +
theme(aspect.ratio = 1)
ggarrange(lda3, lda4, ncol=2,
common.legend = TRUE, legend = "bottom")

The equal and elliptical variance-covariance assumption is reasonable for the penguins data because the ellipse shapes roughly match the spread of the data. It is not a suitable assumption for the bushfires data, because the spread is not elliptical and it varies in size between groups.
This approach extends to any dimension. We would use the same projection sequence to view both the data and the variance-covariance ellipses, as in Figure 14.4. It can be seen that there is some difference in the shape and size of the ellipses between species in some projections, and also in the spread of points in the projected data. However, the differences are small, so it would be safe to assume that the population variance-covariances are equal.
Code for making animated gifs
library(tourr)
p_ell <- NULL
for (i in unique(penguins_sub$species)) {
x <- penguins_sub |> dplyr::filter(species == i)
e <- gen_xvar_ellipse(x[,1:4], n=150, nstd=1.5)
e$species <- i
p_ell <- bind_rows(p_ell, e)
}
p_ell$species <- factor(p_ell$species)
load("data/penguins_tour_path.rda")
animate_xy(p_ell[,1:4], col=factor(p_ell$species))
render_gif(penguins_sub[,1:4],
planned_tour(pt1),
display_xy(half_range=0.9, axes="off", col=penguins_sub$species),
gif_file="gifs/penguins_lda1.gif",
frames=500,
loop=FALSE)
render_gif(p_ell[,1:4],
planned_tour(pt1),
display_xy(half_range=0.9, axes="off", col=p_ell$species),
gif_file="gifs/penguins_lda2.gif",
frames=500,
loop=FALSE)


As a further check, we could generate three ellipses corresponding to the pooled variance-covariance matrix, as would be used in the model, centered at each of the means, and overlay them with the data, as done in Figure 14.5. The spread of the observations can then be compared with the elliptical shape of the pooled variance-covariance. If they match reasonably well, we can safely use LDA. This can also be done group by group, which helps when multiple groups make it difficult to view everything together.
To check the fit of the equal variance-covariance assumption, simulate points on the ellipse corresponding to the pooled sample variance-covariance matrix. Generate one for each group centered at the group mean, and compare with the data.
Code for adding ellipses to data
# Create an ellipse corresponding to pooled vc
pool_ell <- gen_vc_ellipse(p_vc_pool,
xm=rep(0, ncol(p_vc_pool)))
# Add means to produce ellipses for each species
p_lda_pool <- data.frame(rbind(
pool_ell +
matrix(rep(p_lda$means[1,],
each=nrow(pool_ell)), ncol=4),
pool_ell +
matrix(rep(p_lda$means[2,],
each=nrow(pool_ell)), ncol=4),
pool_ell +
matrix(rep(p_lda$means[3,],
each=nrow(pool_ell)), ncol=4)))
# Create one data set with means, data, ellipses
p_lda_pool$species <- factor(rep(levels(penguins_sub$species),
rep(nrow(pool_ell), 3)))
p_lda_pool$type <- "ellipse"
p_lda_means <- data.frame(
p_lda$means,
species=factor(rownames(p_lda$means)),
type="mean")
p_data <- data.frame(penguins_sub[,1:5],
type="data")
p_lda_all <- bind_rows(p_lda_means,
p_data,
p_lda_pool)
p_lda_all$type <- factor(p_lda_all$type,
levels=c("mean", "data", "ellipse"))
shapes <- c(3, 4, 20)
p_pch <- shapes[p_lda_all$type]
Code to generate animated gifs
# Code to run the tour
animate_xy(p_lda_all[,1:4], col=p_lda_all$species, pch=p_pch)
load("data/penguins_tour_path.rda")
render_gif(p_lda_all[,1:4],
planned_tour(pt1),
display_xy(col=p_lda_all$species, pch=p_pch,
axes="off", half_range = 0.7),
gif_file="gifs/penguins_lda_pooled1.gif",
frames=500,
loop=FALSE)
# Focus on one species
render_gif(p_lda_all[p_lda_all$species == "Gentoo",1:4],
planned_tour(pt1),
display_xy(col="#F5191C",
pch=p_pch[p_lda_all$species == "Gentoo"],
axes="off", half_range = 0.7),
gif_file="gifs/penguins_lda_pooled2.gif",
frames=500,
loop=FALSE)


From the tour, we can see that equal elliptical variance-covariance is a reasonable assumption for the penguins data. In all projections the ellipses reasonably match the spread of the observations.
14.3 Examining model fit
The boundaries for a classification model can be examined by:
- generating a large number of points in the domain of the data
- predicting the class for each point
We’ll look at this in 2D using the LDA model fitted to bl and bd of the penguins data.
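The 2D model object p_bl_bd_lda used in the code below is not shown; a minimal sketch, again assuming equal priors, is:
# Fit the LDA model on bl and bd only (sketch)
p_bl_bd_lda <- MASS::lda(species ~ bl + bd,
                         data = penguins_sub,
                         prior = c(1/3, 1/3, 1/3))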
The fitted model means \(\bar{x}_{Adelie} = (-0.95, 0.60)^\top\), \(\bar{x}_{Chinstrap} = (0.89, 0.64)^\top\), and \(\bar{x}_{Gentoo} = (0.65, -1.10)^\top\) can be added to the plots.
The boundaries can be examined using the explore() function from the classifly package, which generates observations in the range of all values of bl and bd and predicts their class. Figure 14.6 shows the resulting prediction regions, with the observed data and the sample means overlaid.
# Compute points in domain of data and predict
library(classifly)
p_bl_bd_lda_boundaries <- explore(p_bl_bd_lda, penguins_sub)
p_bl_bd_lda_m1 <- ggplot(p_bl_bd_lda_boundaries) +
geom_point(aes(x=bl, y=bd,
colour=species,
shape=.TYPE), alpha=0.8) +
scale_color_discrete_divergingx("Zissou 1") +
scale_shape_manual(values=c(46, 16)) +
theme_minimal() +
theme(aspect.ratio = 1, legend.position = "none")
p_bl_bd_lda_means <- data.frame(p_bl_bd_lda$means,
species=rownames(p_bl_bd_lda$means))
p_bl_bd_lda_m1 +
geom_point(data=p_bl_bd_lda_means,
aes(x=bl, y=bd),
colour="black",
shape=3,
size=3)

This approach can be readily extended to higher dimensions. One first fits the model with all four variables, and uses explore() to generate points in the 4D space with predictions, giving a representation of the prediction regions. Figure 14.7 (a) shows the results using a slice tour (Laa et al., 2020a). Points inside the slice are shown at a larger size. The slice is made at the centre of the data, to show the boundaries in this neighbourhood. As the tour progresses we see a thin slice through the centre of the data, parallel with the projection plane. In most projections there is some small overlap of points between groups, which happens because we are examining a 4D object in 2D. The slice helps to alleviate this, allowing a focus on the boundaries at the centre of the cube. In all projections the boundaries between groups are linear, as would be expected from LDA. We can also see that the model roughly divides the cube into three relatively equally-sized regions.
Figure 14.7 (b) shows the three prediction regions, represented by points in 4D, projected into the discriminant space. Linear boundaries neatly divide the full space, which is to be expected because the LDA model computes its classification rule in this 2D space. In this view the data has been overlaid, and it can be seen that the boundaries produced by LDA quite nicely separate the species, albeit with some confusion between Adelie and Chinstrap. Overlaying the data on the slice tour in (a) could be useful, but it would add too many elements to the plot, possibly making it more difficult to focus on the boundary.
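The object p_lda_boundaries used in the following code is not created in the displayed chunks; a minimal sketch, applying explore() to the four-variable model fitted earlier, would be:
# Generate points in the 4D domain of the data with predicted classes (sketch)
library(classifly)
p_lda_boundaries <- explore(p_lda, penguins_sub)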
Code for generating slice tour
# Code to run the tour
animate_slice(p_lda_boundaries[
p_lda_boundaries$.TYPE == "simulated", 1:4],
col=p_lda_boundaries$species[
p_lda_boundaries$.TYPE == "simulated"],
v_rel=0.8,
axes="bottomleft")
render_gif(p_lda_boundaries[p_lda_boundaries$.TYPE == "simulated",1:4],
planned_tour(pt1),
display_slice(v_rel=0.8,
col=p_lda_boundaries$species[
p_lda_boundaries$.TYPE == "simulated"],
axes="bottomleft"),
gif_file="gifs/penguins_lda_boundaries.gif",
frames=500,
loop=FALSE
)
Code for projecting into LDA space
# Project the boundaries into the 2D discriminant space
p_lda_b_sub <- p_lda_boundaries[
p_lda_boundaries$.TYPE == "simulated",
c(1:4, 6)]
p_lda_b_sub_ds <- data.frame(as.matrix(p_lda_b_sub[,1:4]) %*%
p_lda$scaling)
p_lda_b_sub_ds$species <- p_lda_b_sub$species
p_lda_b_sub_ds_p <- ggplot(p_lda_b_sub_ds,
aes(x=LD1, y=LD2,
colour=species)) +
geom_point(alpha=0.5) +
geom_point(data=p_lda_pred_x1, aes(x=LD1,
y=LD2,
shape=species),
inherit.aes = FALSE) +
scale_color_discrete_divergingx("Zissou 1") +
scale_shape_manual(values=c(1, 2, 3)) +
theme_minimal() +
theme(aspect.ratio = 1,
legend.position = "bottom",
legend.title = element_blank())


From the tour, we can see that the LDA boundaries divide the classes only in the discriminant space; the model does not use the space orthogonal to the 2D discriminant space. You can see this because the boundary is sharp in just one 2D projection, while most projections show some overlap of regions.
14.4 Interpretability
Because the variables are standardised before computing LDA in the examples above, we can examine the magnitude and sign of the coefficients for the discriminant space to assess the importance of variables (Table 14.1). In the direction of LD1, the variables bd, fl and bm have much larger magnitudes than bl. This says that LD1 is a combination of these three variables, actually a contrast of bd against fl and bm because of the differing signs. In the direction of LD2, the variables bl and bm have the largest magnitudes and opposing signs, which says that it is a contrast of these two variables that produces the second axis of the linear discriminant space.
The next step is to match LD1 and LD2 with the differences between the clusters. From Figure 14.7 (b) we can see that in the direction of LD1 the Gentoo penguins (red) are distinct from the other two species, and in the direction of LD2 we can see the slightly overlapping Adelie and Chinstrap clusters. Then we can conclude that bl and bm are important for distinguishing Adelie from Chinstrap. We can also conclude that bl is not important for distinguishing Gentoo from the other species.
Interpretation using the magnitude and sign of the coefficients is accurate only if there is no association between the variables. If the variables are associated, then one variable may still be important for the structure seen, but masked by the other variables. For example, if bl were strongly correlated with bd (which it isn’t), then it is possible that although bl does not contribute to LD1, it is still important for distinguishing Gentoo penguins. This type of higher-level association, and its impact on interpretability, can be explored using the radial tour.
|    | LD1 | LD2 | LD1 (orthonormalised) | LD2 (orthonormalised) |
|---|---|---|---|---|
| bl | -0.24 | -2.31 | -0.09 | -0.89 |
| bd | 2.04 | 0.19 | 0.76 | 0.14 |
| fl | -1.20 | 0.08 | -0.45 | -0.01 |
| bm | -1.22 | 1.24 | -0.45 | 0.43 |
The discriminant space can be considered to be a 2D projection basis, except that it needs to be orthonormalised to be an orthonormal basis. This orthonormal basis provides the initial projection for a radial tour where each variable can be rotated out of the projection to assess its importance to the visible structure. It can also be used as the anchor basis for a local tour, where small steps to and from this projection to randomly selected nearby projections are shown. This allows the inspection of the local neighbourhood of the discriminant space, to assess the effect of small changes in the projection basis on the visible structure.
The radial tour can be used to assess variable importance to the structure present, especially when predictors are strongly associated with each other.
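A minimal sketch of how these tours could be set up with the tourr package is given below, assuming the p_lda model from earlier; the exact settings used for Figure 14.8 may differ.
# Orthonormalise the discriminant space to use as the starting projection
library(tourr)
lda_basis <- orthonormalise(p_lda$scaling)
# Radial tour: rotate bl (variable 1) out of the projection and back in
animate_xy(penguins_sub[,1:4],
           radial_tour(lda_basis, mvar = 1),
           col = penguins_sub$species)
# Local tour: small moves to random projections near the discriminant space
animate_xy(penguins_sub[,1:4],
           local_tour(lda_basis, angle = pi/6),
           col = penguins_sub$species)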
Figure 14.8 illustrates how the radial tour (a, b) and the local tour (c, d) can be used to assess the importance of variables to structure of interest in a plot. In the radial tour, the interpolation runs from the projection corresponding to the discriminant space to a projection where one of the variables (here bl) has been removed. The purpose of examining bl is that it has a small contribution to LD1, and thus may not be important for separating Gentoo from the other species. We would also be assessing its large contribution to LD2, and the separation between Adelie and Chinstrap, where we expect that it is indispensable. This is indeed what we learn. Removing bl makes no change to the separation of Gentoo, so a simpler projection, with the coefficient for bl in LD1 set to 0, might be just as good. However, for LD2, removing bl results in completely overlapping clusters for Adelie and Chinstrap, which confirms that this is the main variable distinguishing the two species.
The local tour is used here to explore the neighbourhood of the discriminant space. We anchor the tour at the discriminant space and interpolate to many random target planes within a specified distance. It can be used to examine how the structure changes with small differences in the combination of variables. Here we can see some small improvements in distinguishing the Adelie (blue) from Chinstrap (yellow) penguins. In the discriminant space there are about three Adelie penguins mixed among a group of Chinstrap penguins. In one change of the projection basis this can be reduced to one Adelie penguin mixed among the Chinstrap, and otherwise a neatly linear boundary could divide the clusters. This is a small improvement, but it does illustrate how visualisation in high dimensions can help to refine an automated rule.
In the penguins data the small contribution of bl in the discriminant space for separating Gentoo can be ignored, or changed to zero, without affecting the separation between classes.
Exercises
- For the simple_clusters data, compute the LDA model, and make a plot of the data, with points coloured by the true class. Overlay variance-covariance ellipses, and a \(+\) indicating the sample mean for each class. Is it reasonable to assume that the two classes are sampled from populations with the same variance-covariance?
- Examine the clusters corresponding to the true classes in the clusters data set, using a tour. Based on the shape of the data, is the assumption of equal variance-covariance reasonable?
- Examine the pooled variance-covariance for the clusters data, overlaid on the data in a tour of the 5D space. Does it fit the variance of each cluster nicely?
- There are several interesting data sets with class variables available on the GGobi website. Examine the differences between type of music, based on the variables lvar, lave, lmax, lfener, lfreq. (It is best to remove the “New wave” class because there are too few observations.) Examine the pooled variance-covariance for the two types of music. Does it fit the variance of each cluster nicely? The music data can be read using:
- Fit an LDA model to the simple_clusters data, using the cl variable as the class. Examine the boundaries produced by the model, in 2D.
- Fit an LDA model to the clusters data, using the cl variable as the class. Examine the boundaries produced by the model in 5D.
- Assess the LDA assumptions for the multicluster data. Is LDA an appropriate model? Do you think it is still possible to produce a useful classification for this data? Fit it to check.
- Compute the first 12 PCs of the sketches data. Check the assumption of equal, elliptical variance-covariance of the 6 groups. Regardless of whether you decide that the assumption is satisfied or not, fit an LDA to the 12 PCs. Extract the discriminant space (the x component of the predict object), and examine the separation (or not) of the 6 groups in this 5D space. Is LDA providing a good classification model for this data?
- Even though the bushfires data does not satisfy the assumptions for LDA, fit LDA to the first five PCs. Examine the class differences in the 3D discriminant space.
- Compute the boundary between classes, for the LDA model where the prior probability reflects the sample size, and the LDA model where the priors are equal for all groups. How does the boundary between lightning-caused fires and the other groups change?
- Using the aflw data (using the same subset of variables and players as used in Q3 of Chapter 13) compute:
  - a 2D representation using UMAP. (Make sure position is removed for the calculation.)
  - an LDA model, with the observations predicted into the discriminant space. (Make sure to use standardised variables so the coefficients of the discriminant space can be interpreted. You may receive a warning that variables are collinear, but the model should still be usable.)
  - a plot of the UMAP layout with points coloured by position.
  - a plot of the data in the LDA discriminant space coloured by position.
  Discuss what is learned from each representation of the data, relative to the differences in the skills of the players. Also, examine the coefficients of the discriminant space to explain how the skill sets of the player types differ. View the most important variables contributing to the discriminant space in a tour. Is it possible to distinguish the player types in this subset of variables? You can use this code to subset the data:
- Starting with the projection created from the discriminant space of the LDA on the aflw data, use a radial tour to explore the contributions of the variables that contribute the least. The purpose is to determine whether a simpler projection can be used that distinguishes the player types equally well.