Accelerating Model Development with Diviner, Ensembles and Model Optimizer

Course Description

Diviner, our semi-automated machine learning (semi-autoML) tool for developing calibration models, was introduced with PLS_Toolbox and Solo 9.5 in 2024. (Divine: to discover or locate something by intuition, insight, or supernatural means.) Unlike conventional autoML methods that yield a single optimal model, Diviner creates a diverse family of models and ranks them by their cross-validation performance, degree of overfitting, and prediction error on validation sets. These models may be further refined to produce a single “optimal” model. Alternatively, a group of candidate models can be combined into an ensemble model using voting regression. The ensemble harnesses the diversity of the candidates: it leverages their complementary strengths and reduces the impact of individual model weaknesses, enhancing overall predictive accuracy and robustness. Our Model Optimizer tool can also be used to fine-tune models and to explore non-linear models. In this hands-on class, participants will learn to use Diviner and Model Optimizer to create single models and ensembles. The course includes computer time for participants to work example problems using PLS_Toolbox or Solo.
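For readers unfamiliar with voting regression, the idea can be sketched in a few lines of Python. This is a minimal illustration only, not the PLS_Toolbox or Solo implementation: the function names (`rmse`, `voting_predict`), the model labels, and all numbers are hypothetical, and the sketch assumes a plain (optionally weighted) average of the candidate models' predictions.

```python
from statistics import mean


def rmse(y_true, y_pred):
    """Root-mean-square error between two equal-length sequences."""
    return mean((t - p) ** 2 for t, p in zip(y_true, y_pred)) ** 0.5


def voting_predict(candidate_predictions, weights=None):
    """Combine per-model predictions by (weighted) averaging, sample by sample."""
    n_models = len(candidate_predictions)
    if weights is None:
        weights = [1.0] * n_models  # equal votes by default
    total = sum(weights)
    n_samples = len(candidate_predictions[0])
    return [
        sum(w * preds[i] for w, preds in zip(weights, candidate_predictions)) / total
        for i in range(n_samples)
    ]


# Hypothetical validation-set reference values (made-up numbers).
y_val = [1.0, 2.0, 3.0, 4.0]

# Hypothetical predictions from three candidate models on the same samples.
candidates = {
    "model_a": [1.2, 2.1, 2.7, 4.3],
    "model_b": [0.9, 1.8, 3.2, 3.8],
    "model_c": [1.1, 2.3, 3.1, 3.9],
}

ensemble = voting_predict(list(candidates.values()))

# Compare each candidate's validation error with the ensemble's.
for name, preds in candidates.items():
    print(name, round(rmse(y_val, preds), 3))
print("ensemble", round(rmse(y_val, ensemble), 3))
```

In this toy data the candidates' errors partly cancel, so the averaged prediction has a lower validation error than any single model. Real candidates will not always cancel so neatly, which is why selecting and weighting ensemble members matters.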

Prerequisites

Chemometrics II–Regression and PLS or equivalent experience.

Course Outline

  1. The Problem of Calibration Model Development
  2. The Promise and Problems of Automated Machine Learning (AutoML)
  3. Diviner Approach – Semi-AutoML
    • A perfect model?
    • Overfit and cross-validation error
    • User input
  4. Diviner Workflow
    • Setting preprocessing options
    • Cross-validation and test sets
    • Outlier detection
    • Selecting models for further refinement
    • Refining models
    • Inspecting individual models
    • Saving models and ensembles
    • Hands-on examples
  5. Improving Prediction Error and Robustness with Ensemble Models
    • Model ambiguity
    • Optimizing ensembles
    • Hands-on examples
  6. Model Optimizer
    • Model snapshots
    • Parameter search
    • Non-linear models (ANN, SVM, LWR, etc.)
    • Comparing models
    • Hands-on examples
  7. Conclusions
  8. Wrap-up and Discussion