Eigenvector University returns to Seattle May 15-20, 2022.

Non-linear Machine Learning for Calibration and Classification

Course Description

While linear machine learning methods, such as PLS regression, work well across a very wide range of problems of chemical and biological interest, there are times when the relationships between variables are complex and require non-linear modeling methods. Many non-linear machine learning methods have been developed; we will focus on a few that we have found quite useful. The course begins with a discussion of linearizing transforms, followed by augmenting with non-linear transforms, e.g. polynomials. Locally Weighted Regression (LWR), Artificial Neural Networks (ANNs) and Support Vector Machines (SVMs) are then considered, with SVMs covered for both regression and classification. Boosted regression and classification trees (XGBoost) are then covered. The course concludes with segments on how to choose a method and how to implement models online. The course includes hands-on computer time for participants to work example problems using PLS_Toolbox or Solo.
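As a rough illustration of augmenting with non-linear transforms, the sketch below fits the same least-squares model twice, once on the original variable and once with a squared term appended. It is not taken from the course materials or from PLS_Toolbox; the synthetic data, the helper name `fit_rmse`, and the use of plain NumPy least squares are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (an assumption for illustration): y depends on x quadratically.
x = np.linspace(-2.0, 2.0, 50)
y = 1.0 + 0.5 * x + 2.0 * x**2 + rng.normal(scale=0.1, size=x.size)

def fit_rmse(X, y):
    """Least-squares fit of y on X; returns the root-mean-square residual."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return float(np.sqrt(np.mean((y - X @ coef) ** 2)))

# Linear model: intercept and x only.
X_linear = np.column_stack([np.ones_like(x), x])
# Augmented model: the same columns plus a squared term.
X_augmented = np.column_stack([X_linear, x**2])

rmse_linear = fit_rmse(X_linear, y)
rmse_augmented = fit_rmse(X_augmented, y)
print(f"linear RMSE:    {rmse_linear:.3f}")
print(f"augmented RMSE: {rmse_augmented:.3f}")
```

The model stays linear in its coefficients; only the input matrix changes, which is why the same solver handles both cases.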

Prerequisites

Linear Algebra for Machine Learning and Chemometrics, MATLAB for Machine Learning and Chemometrics, Chemometrics I–PCA, and Chemometrics II–Regression and PLS, or equivalent experience.

Course Outline

  1. Introduction 
    – Why non-linear methods?
    – How linear methods deal with non-linear data
  2. Variable Transformations
    – Log, sqrt, etc.
    – Augmenting with non-linear transforms
  3. Factor-Based Transforms
    – PCA Scores and Augmenting
    – Polynomial PLS
  4. Locally Weighted Regression
    – Weighted Regression
    – Distance Measures
    – Basing Models on PCA Scores
  5. Support Vector Machines
    – SVM basics, shattering theorem
    – Kernel functions
    – Classification Models
    – Regression Models
  6. Artificial Neural Networks
    – ANN structures
    – Training procedures
    – Avoiding overfitting
  7. Gradient Boosted Decision Trees
    – Intro to decision trees
    – Classification and Regression Ensemble Models
    – XGBoost
  8. Choosing the Right Method
    – Prediction skill
    – Computational performance
    – Deployment options
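
To give a flavor of item 4 in the outline, the following is a minimal sketch of locally weighted regression: for each query point, pick the nearest training samples by a distance measure, weight them by closeness, and fit a small weighted linear model. It is not the PLS_Toolbox implementation; the function name `lwr_predict`, the Euclidean distance, the tricube weighting, and the `n_local` parameter are assumptions chosen for illustration.

```python
import numpy as np

def lwr_predict(X_train, y_train, x_query, n_local=15):
    """Predict y at x_query from a weighted linear fit on its nearest neighbours."""
    # Distance measure: Euclidean distance from the query to each training sample.
    dist = np.linalg.norm(X_train - x_query, axis=1)
    nearest = np.argsort(dist)[:n_local]
    d = dist[nearest]
    # Tricube weights: the closest samples count most, the farthest selected one ~0.
    w = (1.0 - (d / (d.max() + 1e-12)) ** 3) ** 3
    # Weighted least squares via row scaling by sqrt(w), with an intercept column.
    A = np.column_stack([np.ones(n_local), X_train[nearest]])
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(A * sw[:, None], y_train[nearest] * sw, rcond=None)
    return float(np.concatenate([[1.0], x_query]) @ coef)

# Synthetic non-linear example (an assumption): y = sin(x) on [0, 2*pi].
X = np.linspace(0.0, 2.0 * np.pi, 200).reshape(-1, 1)
y = np.sin(X[:, 0])
query = np.array([1.0])
lwr_estimate = lwr_predict(X, y, query)
print(f"LWR estimate at x=1: {lwr_estimate:.3f}, true value: {np.sin(1.0):.3f}")
```

Basing the local models on PCA scores, as the outline mentions, would amount to replacing the raw columns of `X_train` and `x_query` with their scores before computing distances.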