Join us for the 18th Eigenvector University in Seattle, May 6-10, 2024.

Common Mistakes in Machine Learning (and how not to make them)

Course Description

Machine learning tools for chemometrics make it possible to handle complex data and extract useful information. Moving from univariate to multivariate analysis, however, does not mean there are fewer pitfalls or potential problems in the data analysis. In this course, we will go through many of the problems that occur when analyzing and interpreting multivariate data. The examples focus mainly on PCA and PLS, but most of the conclusions are generally applicable.

The course includes hands-on computer time for participants to work example problems using PLS_Toolbox or Solo.

Prerequisites

Chemometrics I–PCA and Chemometrics II–Regression and PLS, or equivalent experience.

Course Outline

  1. Chemometrics – the basic idea
  2. Variability is not information
  3. How to measure variability – misuse of correlations
  4. Interpreting scores and loadings
  5. Interpreting biplots
  6. How to optimize a model
  7. Interpreting a regression model
  8. Interpreting a regression vector
  9. Understanding correlation and causality
  10. How to determine the number of components