PLS-DA (Partial Least Squares Discriminant Analysis)
Discriminate groups in omics or high-dimensional data.
Definition
PLS-DA is a supervised method that combines dimensionality reduction (as in PCA) with group discrimination: it extracts latent components that maximize the covariance between the variables and the group labels. It is widely used in metabolomics, proteomics, and genomics, where the number of variables far exceeds the number of observations.
When to use it
Discriminate groups in high-dimensional data (p >> n)
Omics data: metabolomics, proteomics, genomics
Identify the most important discriminating variables
When PCA does not separate the groups sufficiently
Requirements
Categorical dependent variable (2 or more groups)
Numerous continuous independent variables
Standardized variables recommended (mean-centered and scaled to unit variance)
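The requirements above can be sketched in a few lines of Python. This is a minimal illustration assuming NumPy, not a description of how StatsLab prepares data internally: the function names `autoscale` and `dummy_code`, the toy data, and the choice of the sample standard deviation (`ddof=1`) are all assumptions made for the example.

```python
import numpy as np

def autoscale(X):
    """Center each column to mean 0 and scale it to unit variance."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)
    std[std == 0] = 1.0  # constant columns become all zeros instead of dividing by 0
    return (X - mean) / std

def dummy_code(labels):
    """Turn class labels into a 0/1 indicator matrix, one column per group."""
    classes, idx = np.unique(labels, return_inverse=True)
    Y = np.zeros((len(idx), len(classes)))
    Y[np.arange(len(idx)), idx] = 1.0
    return Y, classes

# Hypothetical toy data: 6 samples, 4 continuous variables, 3 groups
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 4))
labels = np.array(["A", "A", "B", "B", "C", "C"])

Xs = autoscale(X)                  # standardized independent variables
Y, classes = dummy_code(labels)    # categorical dependent variable as indicators
```

Standardizing puts all variables on the same scale, so that variables measured in large units do not dominate the latent components.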
What StatsLab computes
Scores plots (latent components)
Loadings plots (variable contributions)
VIP scores (Variable Importance in Projection)
% variance explained per component
Visual separation of groups
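To make these outputs concrete, here is a minimal NumPy sketch of how scores, loadings, VIP scores, and variance explained per component can be obtained. It assumes a standard PLS2 fit with SVD-based weights on a centered indicator matrix; the algorithm StatsLab actually uses is not documented here, and the name `plsda_fit` and the synthetic data are hypothetical.

```python
import numpy as np

def plsda_fit(X, Y, n_components=2):
    """PLS2 on standardized X (n x p) and centered indicator matrix Y (n x k)."""
    X = np.asarray(X, dtype=float).copy()
    Y = np.asarray(Y, dtype=float).copy()
    n, p = X.shape
    total_ss = (X ** 2).sum()
    T = np.zeros((n, n_components))            # scores
    P = np.zeros((p, n_components))            # X loadings
    W = np.zeros((p, n_components))            # X weights (unit length)
    Q = np.zeros((Y.shape[1], n_components))   # Y loadings
    for a in range(n_components):
        # Weight vector: dominant left singular vector of X'Y
        u, _, _ = np.linalg.svd(X.T @ Y, full_matrices=False)
        w = u[:, 0]
        t = X @ w
        tt = t @ t
        p_a = X.T @ t / tt
        q_a = Y.T @ t / tt
        # Deflate X and Y before extracting the next component
        X -= np.outer(t, p_a)
        Y -= np.outer(t, q_a)
        T[:, a], P[:, a], W[:, a], Q[:, a] = t, p_a, w, q_a
    ss_t = (T ** 2).sum(axis=0)
    # Fraction of the X sum of squares captured by each component
    x_var = ss_t * (P ** 2).sum(axis=0) / total_ss
    # VIP: squared weights, weighted by the Y sum of squares each component explains
    ss_y = ss_t * (Q ** 2).sum(axis=0)
    vip = np.sqrt(p * (W ** 2 @ ss_y) / ss_y.sum())
    return T, P, vip, x_var

# Synthetic data for illustration only: 30 samples, 20 variables, 3 groups
rng = np.random.default_rng(0)
labels = np.repeat([0, 1, 2], 10)
X = rng.normal(size=(30, 20))
X[:, 0] += 3 * (labels == 1)   # variable 0 carries a shift for group 1
X[:, 1] += 3 * (labels == 2)   # variable 1 carries a shift for group 2
X = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
Y = np.eye(3)[labels]
Y = Y - Y.mean(axis=0)

scores, loadings, vip, x_var = plsda_fit(X, Y)
print("X variance per component:", np.round(x_var, 3))
print("Variables ranked by VIP:", np.argsort(vip)[::-1][:5])
```

With this definition the squared VIP scores average to 1 across variables, which is why VIP > 1 is a common starting threshold for "above-average" importance. Plotting the two columns of `scores` against each other, colored by group, gives the scores plot.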
Worked example
Context: Discriminating 3 cancer types (n=90) from 150 blood metabolites.
Result: LV1: 28% of variance · LV2: 19% · Clear separation of the 3 groups on the scores plot
Interpretation: PLS-DA clearly separates the 3 cancer types on the first 2 components, which together explain 47% of the variance. The 15 metabolites with VIP > 1.5 are the most relevant candidate biomarkers.
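The VIP cutoff used in this interpretation amounts to a simple filter and ranking. The sketch below shows the idea in plain Python; the function name `select_by_vip`, the metabolite names, and the VIP values are made up for illustration and are not taken from the example above.

```python
def select_by_vip(names, vip, threshold=1.5):
    """Return (name, VIP) pairs strictly above the threshold, highest VIP first."""
    kept = [(name, value) for name, value in zip(names, vip) if value > threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)

# Hypothetical metabolite names and VIP values, for illustration only
names = ["met_A", "met_B", "met_C", "met_D"]
vip = [0.8, 1.9, 1.5, 2.3]

print(select_by_vip(names, vip))
# [('met_D', 2.3), ('met_B', 1.9)]
```

Note that a value exactly equal to the threshold (here `met_C` at 1.5) is excluded, matching the strict "VIP > 1.5" wording.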