IKPLS: Improved Kernel Partial Least Squares and Fast Cross-Validation Algorithms for Python with CPU and GPU Implementations Using NumPy and JAX

Research output: Contribution to journal › Journal article › Research › peer-review

Documents

  • Fulltext

    Final published version, 509 KB, PDF document

The ikpls software package provides fast and efficient tools for PLS (Partial Least Squares)
modeling. This package is designed to help researchers and practitioners handle PLS modeling
faster than previously possible, particularly on large datasets. The PLS implementations in
ikpls use the fast IKPLS (Improved Kernel PLS) algorithms (Dayal & MacGregor, 1997),
providing a substantial speedup compared to scikit-learn’s (Pedregosa et al., 2011) PLS
implementation, which is based on NIPALS (Nonlinear Iterative Partial Least Squares) (H.
Wold, 1966). The ikpls package also offers an implementation of IKPLS combined with
the fast cross-validation algorithm by O.-C. G. Engstrøm (2024), significantly accelerating
cross-validation of PLS models, especially when using a large number of cross-validation splits.
ikpls offers NumPy-based CPU and JAX-based CPU/GPU/TPU implementations. The
JAX implementations are also differentiable, allowing seamless integration with deep learning
techniques. This versatility enables users to handle diverse data dimensions efficiently.
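
To make the kernel idea concrete, the following is a minimal NumPy sketch of IKPLS Algorithm 1 as described by Dayal & MacGregor (1997). It is not the ikpls package's code: the function name `ikpls_algorithm1` and its signature are hypothetical, and the sketch assumes X and Y have already been centered (and scaled, if desired). The key point it illustrates is that only the small cross-product X^T Y is deflated in each iteration, never X itself.

```python
import numpy as np


def ikpls_algorithm1(X, Y, n_components):
    """Illustrative IKPLS Algorithm 1 (hypothetical helper, not the ikpls API).

    Assumes X (N x K) and Y (N x M, or length N) are already centered.
    Returns the K x M regression coefficient matrix B = R Q^T.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    if Y.ndim == 1:
        Y = Y[:, None]
    K, M = X.shape[1], Y.shape[1]

    W = np.zeros((K, n_components))  # X weights
    P = np.zeros((K, n_components))  # X loadings
    R = np.zeros((K, n_components))  # rotations, so that scores are T = X R
    Q = np.zeros((M, n_components))  # Y loadings

    XtY = X.T @ Y  # the only matrix that is deflated
    for a in range(n_components):
        if M == 1:
            w = XtY[:, 0].copy()
        else:
            # Dominant eigenvector of the small M x M matrix XtY^T XtY.
            _, eigvecs = np.linalg.eigh(XtY.T @ XtY)
            w = XtY @ eigvecs[:, -1]
        w /= np.linalg.norm(w)

        # r_a = w_a - sum_{j<a} (p_j^T w_a) r_j
        r = w.copy()
        for j in range(a):
            r -= (P[:, j] @ w) * R[:, j]

        t = X @ r                # score vector
        tt = t @ t
        p = (X.T @ t) / tt       # X loading
        q = (XtY.T @ r) / tt     # Y loading
        XtY -= tt * np.outer(p, q)  # deflate the cross-product only

        W[:, a], P[:, a], R[:, a], Q[:, a] = w, p, r, q

    return R @ Q.T
```

With centered data, predictions follow as `X @ ikpls_algorithm1(X, Y, n_components)`. When the number of components equals the number of features and X has full column rank, the coefficients coincide with the ordinary least-squares solution, which gives a convenient sanity check.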
In conclusion, ikpls empowers researchers and practitioners in machine learning, chemometrics,
and related fields with efficient, scalable, and end-to-end differentiable tools for PLS modeling,
facilitating optimal component selection and preprocessing decisions by offering implementations
of:
1. both variants of IKPLS for CPUs;
2. both variants of IKPLS for GPUs, each end-to-end differentiable, allowing
   integration with deep learning models;
3. IKPLS combined with a cross-validation algorithm that yields a substantial speedup
   compared to the classical cross-validation algorithm.
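
The abstract does not spell out how the cross-validation speedup is obtained. One way such a speedup can arise for kernel-based PLS is sketched below, under the simplifying assumption that no centering or scaling is applied. The cross-products X^T X and X^T Y are computed once on the full data, and each training partition's cross-products are obtained by subtracting the contribution of the held-out rows. The function name `cv_cross_products` and the integer `folds` vector are hypothetical, and the sketch should not be read as the cited algorithm in full.

```python
import numpy as np


def cv_cross_products(X, Y, folds):
    """Yield (fold, XtX_train, XtY_train) for every fold (illustrative only).

    Assumes no centering or scaling. `folds` is a length-N integer array
    assigning each row of X and Y to a validation fold.
    """
    XtX = X.T @ X  # computed once on the full dataset
    XtY = X.T @ Y
    for fold in np.unique(folds):
        val = folds == fold
        X_val, Y_val = X[val], Y[val]
        # Training cross-products: full-data products minus the
        # contribution of the held-out validation rows.
        yield fold, XtX - X_val.T @ X_val, XtY - X_val.T @ Y_val
```

Each fold then costs work proportional to the size of its validation partition rather than its training partition, which matters most when there are many splits.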
Original language: English
Article number: 6533
Journal: The Journal of Open Source Software
Volume: 9
Issue number: 99
Number of pages: 6
ISSN: 2475-9066
Publication status: Published, 2024
