Appendix D — Glossary

Table D.1: List of common terms used in the book and their synonyms used elsewhere.
| Term | Synonyms | Description |
|---|---|---|
| variable | feature, attribute | a characteristic, number or quantity that can be measured |
| observations | cases, items, experimental units, observational units, records, statistical units, instances, examples | individuals on which the measurements are made |
| data set | data, file | collection of observations made on one or more variables |
| response | target | variable that one wishes to predict |
| predictor | independent variable, feature | variable used to produce a model to predict the response |
| similarity | correlation | a measure ranging between 0 and 1, where a value nearer 1 means the cases are closer |
| dissimilarity | distance | a measure where a smaller number means the cases are closer |
| principal component analysis (PCA) | empirical orthogonal functions, eigenvalue decomposition | summarise a high-dimensional variance-covariance matrix using an orthonormal matrix and a set of variances. Related methods include factor analysis and multidimensional scaling. |
| linear discriminant analysis (LDA) | Fisher's linear discriminant | reduce the dimension to the space where the class means are most separated relative to the pooled variance-covariance matrix. |
| self-organising map (SOM) | Kohonen map | use a grid-constrained set of means to cluster high-dimensional data, and also provide a 2D view of the clusters. |
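
The similarity and dissimilarity entries in Table D.1 point in opposite directions, which a small sketch may make concrete. The Python below is illustrative only: the function names, the three example cases, and the mapping 1 / (1 + d) from a distance to a similarity are assumptions chosen for this example, not definitions taken from the book.

```python
import math


def euclidean_distance(a, b):
    """Dissimilarity: a smaller value means the two cases are closer."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def similarity_from_distance(d):
    """Map a distance d >= 0 into (0, 1]; 1 means the cases coincide.

    This particular mapping is an assumption made for illustration;
    many other conversions from distance to similarity exist.
    """
    return 1.0 / (1.0 + d)


# Three hypothetical cases measured on two variables.
case_a = [1.0, 2.0]
case_b = [4.0, 6.0]
case_c = [1.0, 2.5]

d_ab = euclidean_distance(case_a, case_b)  # far apart
d_ac = euclidean_distance(case_a, case_c)  # close together

s_ab = similarity_from_distance(d_ab)
s_ac = similarity_from_distance(d_ac)

print(f"distance a-b = {d_ab}, similarity a-b = {s_ab:.3f}")
print(f"distance a-c = {d_ac}, similarity a-c = {s_ac:.3f}")
```

Cases a and c are closer than cases a and b, so their distance is smaller while their similarity is larger.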