Is an R² Value Greater Than 0.2 Acceptable in Large-Sample PLS-DA/OPLS-DA Analyses? Are There Supporting References?

    Whether an R² value above 0.2 is acceptable in a large-sample analysis depends on the research context and on disciplinary standards. In fields such as biology and chemometrics, where datasets are often high-dimensional and noisy, an R² above 0.2 may be considered acceptable. A higher R² indicates a better fit to the data used to build the model, but it does not by itself guarantee better predictive performance, which is why R² should be read together with cross-validated statistics.

     

    There is no universally accepted threshold for R². Model evaluation should instead combine complementary metrics, such as the predicted residual error sum of squares (PRESS), the Q² statistic (cross-validated explained variance), model interpretability, and overall stability. Robust models generally show high R² and Q² values together with a low PRESS, and a Q² that falls far below R² is a common sign of overfitting.
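    To make the relationship between these metrics concrete, here is a minimal Python sketch. It assumes a single response column coded 0/1 for class membership, and it uses made-up fitted and cross-validated predictions (`y_fit`, `y_cv`) rather than the output of a real PLS-DA model. All function and variable names are illustrative. Q² is computed here as 1 − PRESS/TSS with the total sum of squares taken around the overall mean of y; some software uses slightly different conventions.

```python
# Illustrative only: the predictions below are invented, not model output.

def total_sum_of_squares(y):
    """Sum of squared deviations of y from its mean (TSS)."""
    mean_y = sum(y) / len(y)
    return sum((v - mean_y) ** 2 for v in y)

def residual_sum_of_squares(y, y_pred):
    """Sum of squared differences between observed and predicted values."""
    return sum((obs - pred) ** 2 for obs, pred in zip(y, y_pred))

def r2(y, y_fit):
    """R²: explained variance based on predictions for the training data."""
    return 1.0 - residual_sum_of_squares(y, y_fit) / total_sum_of_squares(y)

def press(y, y_cv):
    """PRESS: squared error of predictions made for held-out samples."""
    return residual_sum_of_squares(y, y_cv)

def q2(y, y_cv):
    """Q²: cross-validated explained variance, 1 - PRESS / TSS."""
    return 1.0 - press(y, y_cv) / total_sum_of_squares(y)

# Class labels coded as 0/1 (assumed dummy coding for a two-class problem).
y = [0, 0, 0, 1, 1, 1]
# Hypothetical predictions from a model fitted on all samples.
y_fit = [0.2, 0.1, 0.3, 0.8, 0.7, 0.9]
# Hypothetical predictions for each sample while it was held out.
y_cv = [0.3, 0.2, 0.5, 0.6, 0.5, 0.8]

r2_value = r2(y, y_fit)
press_value = press(y, y_cv)
q2_value = q2(y, y_cv)

print(f"R2 = {r2_value:.3f}, PRESS = {press_value:.3f}, Q2 = {q2_value:.3f}")
```

    In this toy example R² is noticeably higher than Q², which is the usual pattern: the cross-validated predictions are less accurate than the fitted ones, so PRESS exceeds the training residual sum of squares.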

     

    When interpreting PLS-DA or OPLS-DA score plots, additional factors should also be considered, including the proportion of variance explained by each component, variable loading plots, and the spatial distribution of samples in the score space. These factors provide a more comprehensive assessment of model validity and discriminative capacity.

     

    In short, whether R² > 0.2 is acceptable in a large-sample study depends on the analytical objectives and on the expectations of the field. Consult the relevant literature in your field to identify appropriate R² benchmarks.

     

    Relevant references include the following:

    1. Trygg, J., & Wold, S. (2002). Orthogonal projections to latent structures (O-PLS). Journal of Chemometrics, 16(3), 119–128. https://doi.org/10.1002/cem.695

    An introduction to the theoretical framework and application of the OPLS method.

     

    2. Bylesjö, M., et al. (2006). OPLS discriminant analysis: combining the strengths of PLS-DA and SIMCA classification. Journal of Chemometrics, 20(8–10), 341–351. https://doi.org/10.1002/cem.1006

    A comparative study of OPLS-DA, PLS-DA, and SIMCA highlighting the advantages of OPLS-DA.

     

    3. Eriksson, L., et al. (2008). CV-ANOVA for significance testing of PLS and OPLS® models. Journal of Chemometrics, 22(11–12), 594–600. https://doi.org/10.1002/cem.1187

    A method for evaluating the statistical significance of latent variable models.

     

    4. Worley, B., & Powers, R. (2013). Multivariate Analysis in Metabolomics. Current Metabolomics, 1(1), 92–107. https://doi.org/10.2174/2213235X11301010092

    An overview of multivariate modeling techniques, including PLS-DA and OPLS-DA, in metabolomics research.

     

    MtoZ Biolabs, an integrated chromatography and mass spectrometry (MS) services provider.

