
Near Infrared Spectra of Diesel Fuels



These data consist of NIR spectra of diesel fuels along with various properties of those fuels, including:

  • bp50 – boiling point at 50% recovery, deg C (ASTM D 86)
  • CN – cetane number (the diesel analogue of octane number, ASTM D 613)
  • d4052 – density, g/mL, @ 15 deg C, (ASTM D 4052)
  • freeze – freezing temperature of the fuel, deg C
  • total – total aromatics, mass% (ASTM D 5186)
  • visc – viscosity, cSt, @ 40 deg C

These data are available in three formats: Matlab DataSet objects, standard Matlab variables, and CSV files. The data were obtained at Southwest Research Institute (SWRI) on a project sponsored by the U.S. Army. Many thanks to them for letting us post it here!

DataSet Object Format

The file “SWRI_Diesel_NIR.zip” contains a .mat file which can be loaded into MATLAB. This .mat file contains two DataSet objects: one holds all the raw, unpreprocessed spectra (diesel_spec) and the other holds all the properties (diesel_prop). Not every property was measured on every sample, so diesel_prop contains some missing values (NaNs). The wavelength axis is included as the axisscale in diesel_spec. If you don’t have PLS_Toolbox or our freeware for the DataSet Object, these two variables should turn into structures when you load them into MATLAB.

Name                  Size     Kind      Last Modified
SWRI_Diesel_NIR.zip   1,443K   document  Mon, Nov 28, 2005, 11:28 AM

Standard Matlab Variable Format

The following are .zip files of separate .mat files, each with standard Matlab variables containing the same data as above. There are 6 workspace variables in each file: 3 for the spectra and 3 matching ones for the property values. In each case the data include 20 high-leverage samples (_hl), and the remaining samples are split into two random groups (_ll_a and _ll_b). These spectra can be used to test variable selection and calibration algorithms. For instance, you can use the high-leverage samples and one of the other sets to make a calibration model (say the _hl and _ll_a), then test it on the third set (the _ll_b). In all cases the data have been pretty thoroughly weeded: outliers have been removed, and all samples belong to the same class (all summer fuels, no winter fuels).
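
The calibrate-then-test workflow described above can be sketched in Python using only the standard library. This is an illustration under stated assumptions, not the method used to prepare these files: the function names are hypothetical, the spectra are assumed to be already loaded as lists of rows (one row per sample), and a single-wavelength least-squares line stands in for a real multivariate calibration model such as PLS.

```python
import math


def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept (one predictor)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("predictor has no variance; cannot fit a line")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rmsep(y_true, y_pred):
    """Root mean square error of prediction on the held-out samples."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def holdout_rmsep(spec_hl, prop_hl, spec_ll_a, prop_ll_a,
                  spec_ll_b, prop_ll_b, channel):
    """Calibrate on _hl plus _ll_a, then test on _ll_b.

    `channel` is the index of the single wavelength used as predictor
    (a stand-in for a full multivariate model).
    """
    # Calibration set: high-leverage samples plus one random group.
    spec_cal = list(spec_hl) + list(spec_ll_a)
    prop_cal = list(prop_hl) + list(prop_ll_a)
    slope, intercept = fit_line([row[channel] for row in spec_cal], prop_cal)

    # Test set: the remaining random group, never seen during calibration.
    predicted = [slope * row[channel] + intercept for row in spec_ll_b]
    return rmsep(prop_ll_b, predicted)


# Toy numbers only (not taken from the diesel data): property = 2 * x + 1.
toy_error = holdout_rmsep(
    [[0.0, 5.0], [4.0, 1.0]], [1.0, 9.0],   # _hl
    [[1.0, 2.0], [2.0, 3.0]], [3.0, 5.0],   # _ll_a
    [[3.0, 0.0]], [7.0],                    # _ll_b
    channel=0,
)
```

Swapping the roles of the two random groups (calibrating on _hl and _ll_b, testing on _ll_a) gives a second, independent estimate of prediction error.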

All of the file names end in “gatest” because we’ve used the data to test genetic algorithms for variable selection.

Name               Size   Kind      Last Modified
bp50gatest.zip     720K   document  Tue, Jan 26, 1999, 01:47 PM
cngatest.zip       718K   document  Tue, Jan 26, 1999, 01:47 PM
d4052gatest.zip    770K   document  Tue, Jan 26, 1999, 01:48 PM
freezegatest.zip   735K   document  Tue, Jan 26, 1999, 01:49 PM
totalgatest.zip    749K   document  Tue, Jan 26, 1999, 01:49 PM
viscgatest.zip     738K   document  Tue, Jan 26, 1999, 01:50 PM

CSV File Format

The file “SWRI_Diesel_NIR_CSV.zip” contains two .csv files: one holds all the raw, unpreprocessed spectra (diesel_spec) and the other holds all the properties (diesel_prop). Not every property was measured on every sample, so diesel_prop contains some missing values (NaNs). The wavelength axis is included as the axisscale in diesel_spec.

Name                      Size     Kind      Last Modified
SWRI_Diesel_NIR_CSV.zip   1,005K   document  Mon, Nov 28, 2005, 11:28 AM
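
Because the properties file contains NaNs, any single property usually has to be filtered down to the samples on which it was actually measured before modeling. The following Python sketch shows one way to do that with the standard library. The exact layout of the .csv files is not described here, so the header row and the column names used below (taken from the property list above) are assumptions, and the function names are hypothetical.

```python
import csv
import io
import math


def parse_value(text):
    """Convert one CSV field to float, treating an empty field as missing."""
    text = text.strip()
    if not text:
        return math.nan
    return float(text)  # float("NaN") yields nan


def measured_rows(lines, column):
    """Return (row_index, value) pairs for samples where `column` is not NaN.

    Assumes the first line is a header row naming the properties.
    """
    kept = []
    for index, row in enumerate(csv.DictReader(lines)):
        value = parse_value(row[column])
        if not math.isnan(value):
            kept.append((index, value))
    return kept


# Toy stand-in for the properties file (values are made up).
sample = io.StringIO("bp50,CN\n250.1,NaN\n260.3,48.2\nNaN,51.0\n")
cn_rows = measured_rows(sample, "CN")
```

The returned row indices can then be used to pick out the matching rows of the spectra file, so that spectra and property values stay aligned sample by sample.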