## CHEMOMETRICS: THE APPLICATION OF MATHEMATICAL AND STATISTICAL METHODS FOR ANALYSIS OF CHEMICAL DATA

### 1 Chemometrics background

1.1 Definition

1.2 Origins in Psychology and Economics

1.3 Early applications in Chemistry

### 2 Terms, definitions and nomenclature

2.1 Scalars, vectors and matrices

2.2 Vector and matrix multiplication

2.3 Matrix inverses

### 3 Introduction to the MATLAB computational environment

3.1 The MATLAB workspace

3.2 Entering data

3.3 Saving, Clearing and Loading Data

3.4 Simple operations on the command line

3.5 Functions and scripts

3.6 Demonstrations

### 4 Statistics used in chemometrics

4.1 Distributions, t-tests, F-tests

4.2 The Central Limit Theorem

4.3 Analysis of Variance (ANOVA)

4.4 Experimental design

4.5 MATLAB examples and homework problems

### 5 Principal Components Analysis (PCA)

5.1 Nomenclature and conventions

5.2 Data transformation: Linearization

5.3 Data centering and scaling

5.4 The PCA decomposition

5.5 Examples: Wine and Arch data sets

5.6 Scores and loadings plots

5.7 Q and T² statistics

5.8 Outliers

5.9 Determination of number of factors to keep

5.10 Example: Chemical process monitoring

5.11 MATLAB examples and homework problems

### 6 Building Predictive Models: Regression

6.1 Nomenclature and Conventions

6.2 Classical Least Squares (CLS)

6.3 Inverse Least Squares (ILS) models

6.4 Multiple Linear Regression (MLR)

6.5 Ridge Regression (RR)

6.6 Principal Components Regression (PCR)

6.7 Determination of the number of PCs: Cross-Validation

6.8 Partial Least Squares (PLS)

6.9 Outlier detection and model diagnostics

6.10 A unifying theme: Continuum Regression (CR)

6.11 Summary

6.12 MATLAB examples and homework problems

### 7 Non-Linear Modeling Methods

7.1 Fitting Polynomials to data

7.2 Artificial Neural Networks (ANNs)

7.3 Non-linear versions of PCR and PLS

7.4 Locally Weighted Regression (LWR)

7.5 Hybrid models: NN-PLS

7.6 Genetic algorithms for structure selection

7.7 Comparison of methods: a non-linear modeling problem

7.8 Summary: The importance of model structure

### 8 Dealing with laboratory or process instrument drift

8.1 Baselining

8.2 Use of derivatives

8.3 Instrument Standardization

8.4 MATLAB examples

### 9 Supervised Pattern Recognition: Classification

9.1 Classes of classification methods

9.2 Linear Discriminant Analysis (LDA)

9.3 K-means and K-nearest neighbor (KNN)

9.4 Modeling classes with unequal variance: UNEQ

9.5 Soft Independent Modeling of Class Analogy (SIMCA)

9.6 MATLAB examples and homework problems

### 10 New Frontiers in Chemometrics

10.1 Definition of instrument order

10.2 PCA on batch data

10.3 Multi-way PCA

10.4 Bilinear data

10.5 Generalized Rank Annihilation Method (GRAM)

10.6 Evolving Factor Analysis (EFA)

10.7 Multivariate Curve Resolution (MCR)

10.8 Parallel Factor Analysis (PARAFAC)

10.9 Tucker Models

10.10 Multi-way and Multi-block PLS

10.11 Rational sensor design

10.12 The future

### 11 Possible Additional Topics

11.1 Process optimization

11.2 Numerical methods

11.3 Fundamentals of chemical process control

11.4 Dynamic model identification