Interactive Series Baseline Correction Algorithm

Andrey Bogomolov (a), Willem Windig (b), Susan M. Geer (c), Debra B. Blondell (c), and Mark J. Robbins (c). (a) ACD/Labs, Russian Chemometrics Society, Moscow, Russia; (b) Eigenvector Research Inc., Rochester, NY, USA; (c) Eastman Kodak Company, Rochester, NY, USA


Presentation Transcript


  1. Interactive Series Baseline Correction Algorithm
     Andrey Bogomolov (a), Willem Windig (b), Susan M. Geer (c), Debra B. Blondell (c), and Mark J. Robbins (c)
     (a) ACD/Labs, Russian Chemometrics Society, Moscow, Russia
     (b) Eigenvector Research Inc., Rochester, NY, USA
     (c) Eastman Kodak Company, Rochester, NY, USA

  2. Baseline (Background) Problem
     • The baseline is an “eternal” issue in analytical data processing
     • “Baseline” or “background”?
       • There is no clear distinction
       • “Baseline” is associated with a smooth line reflecting a “physical” interference
       • “Background” tends to be used in a more general sense to designate ANY unwanted signal, including noise and chemical components
     • We prefer the term “baseline” because smoothness of the background signal is the main assumption of the proposed correction algorithm

  3. Classical Approach to the Baseline Correction Problem
     • Classical baseline correction algorithms for a single curve have been treated almost exhaustively in the literature
     • The baseline to be subtracted is obtained by fitting a linear (polynomial) function to nodes that lie in signal-free regions
     • The nodes can be detected automatically by the software or placed manually by the user
     • These methods are well suited to semi-automatic processing, where software-generated results need to be revised by a human expert
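The classical single-curve approach can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the simplest case of a straight-line baseline fitted by least squares, and the function names (`fit_linear_baseline`, `subtract`) and the synthetic data are hypothetical.

```python
def fit_linear_baseline(x, y, node_indices):
    """Least-squares straight line through the points at node_indices.

    Assumption: the nodes lie in signal-free regions and have at least
    two distinct x values.
    """
    xs = [x[i] for i in node_indices]
    ys = [y[i] for i in node_indices]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((xi - mean_x) ** 2 for xi in xs)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Evaluate the fitted line over the whole x axis
    return [slope * xi + intercept for xi in x]


def subtract(y, baseline):
    """Point-wise subtraction of the baseline from the curve."""
    return [yi - bi for yi, bi in zip(y, baseline)]


# Synthetic example: a small peak on top of a sloping baseline
x = [float(i) for i in range(11)]
true_baseline = [0.5 * xi + 2.0 for xi in x]
peak = [0, 0, 0, 0, 1, 3, 1, 0, 0, 0, 0]
y = [b + p for b, p in zip(true_baseline, peak)]

nodes = [0, 1, 2, 8, 9, 10]  # indices chosen in the signal-free regions
corrected = subtract(y, fit_linear_baseline(x, y, nodes))
```

Because the nodes here sit exactly on the synthetic baseline, the corrected curve reproduces the peak; with real data the result depends on how well the chosen nodes avoid signal.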

  4. Serial (Batch) Methods
     • The development of two-dimensional spectroscopy and hyphenated techniques demanded new methods applicable to data matrices
     • Early work in this direction applied automated baseline correction algorithms to each individual curve in a matrix dataset
     • The main problem with this approach is that it neglects internal (inter-spectral) correlations
     • Instead of the expected rank reduction, it may introduce additional variance into the dataset
     • It is a “black-box” routine that is difficult to control

  5. Multivariate Background Correction
     • Multivariate data analysis has had a revolutionary impact on the baseline problem in general
     • The paradigm shift from hard (knowledge-driven) to soft or self- (data-driven) modeling has opened new horizons and introduced new concepts
     • PLS (partial least squares) provides a means to handle the background in the calibration context without subtracting it
     • OSC (orthogonal signal correction) by S. Wold turns the problem inside out, eliminating from the data (X) the variance that is irrelevant for calibration (orthogonal to Y)
     • A number of other excellent algorithms…

  6. Our Objectives
     • Researchers typically concentrate on developing fully automated background correction methods
     • Statement: the fuzzy character of the baseline problem in general casts doubt on the feasibility of automated (expert-free) baseline correction routines
     • In contrast, we present an alternative approach that aims to maximize the means of control available to a human operator:
       • simplicity
       • visualization
       • interactive stepwise algorithm

  7. The Method
     • The method is applied to a series of curves (e.g., spectra or chromatograms)
     • The method consists of two distinct steps
       • First, a prototype baseline is constructed from linear segments by selecting a set of nodes
       • To aid node selection, mean values are calculated to represent the entire series
       • Second, the prototype baseline is used to construct the individual baselines to be subtracted from the series curves, by adjusting the nodes vertically to the curve being corrected
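The two steps above can be sketched in a few lines of Python. This is an illustrative reading of the slide, not the authors' code. It assumes that "adjusting the nodes vertically" means setting each node's height to the value of the individual curve at the node position (the "snapping" shown on slide 13), and that the baseline is held flat outside the outermost nodes. All function names and the synthetic series are hypothetical.

```python
def mean_curve(series):
    """Point-wise mean of all curves; shown to the operator for node picking."""
    n = len(series)
    return [sum(column) / n for column in zip(*series)]


def piecewise_linear(node_idx, node_vals, length):
    """Baseline built from linear segments between nodes.

    node_idx must be strictly increasing integer positions.
    Assumption: the baseline is held constant outside the outermost nodes.
    """
    baseline = [0.0] * length
    for i in range(length):
        if i <= node_idx[0]:
            baseline[i] = node_vals[0]
        elif i >= node_idx[-1]:
            baseline[i] = node_vals[-1]
    for k in range(len(node_idx) - 1):
        i0, i1 = node_idx[k], node_idx[k + 1]
        v0, v1 = node_vals[k], node_vals[k + 1]
        for i in range(i0, i1 + 1):
            baseline[i] = v0 + (v1 - v0) * (i - i0) / (i1 - i0)
    return baseline


def correct_series(series, node_idx):
    """Apply the same node positions to every curve in the series."""
    corrected = []
    for curve in series:
        # Assumed "snapping": node heights are read from this curve
        node_vals = [curve[i] for i in node_idx]
        baseline = piecewise_linear(node_idx, node_vals, len(curve))
        corrected.append([c - b for c, b in zip(curve, baseline)])
    return corrected


# Synthetic series: one peak shape, with a different scale and a different
# linear baseline (offset, slope) in every curve
peak = [0, 0, 0, 0, 2, 5, 2, 0, 0, 0]
params = [(1.0, 0.1, 1.0), (2.0, 0.2, 0.5), (0.5, -0.1, 2.0)]  # offset, slope, scale
series = [[offset + slope * i + scale * p for i, p in enumerate(peak)]
          for offset, slope, scale in params]

mean = mean_curve(series)   # step 1: inspect the mean to choose nodes
nodes = [0, 3, 7, 9]        # node positions picked in signal-free regions
corrected = correct_series(series, nodes)  # step 2: snap and subtract
```

The node positions are chosen once for the whole series, which is what lets the operator treat the matrix as if it were a single curve; only the node heights change from curve to curve.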

  8. HPLC/DAD: Sample Data (figure panels: Calculating the mean; Selecting nodes; Subtracting the baseline; Raw; Corrected)

  9. 2nd Derivative for Node Selection
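The slide gives only its title, so the following is one plausible reading rather than the authors' procedure: positions where the second derivative of the mean curve is close to zero are locally straight and therefore candidate baseline nodes. The function names, the unit point spacing, and the tolerance `tol` are assumptions made for illustration.

```python
def second_derivative(y):
    """Central second difference, assuming unit spacing between points.

    The two end points have no central difference and are left at 0.0.
    """
    d2 = [0.0] * len(y)
    for i in range(1, len(y) - 1):
        d2[i] = y[i - 1] - 2 * y[i] + y[i + 1]
    return d2


def candidate_nodes(y, tol):
    """Interior indices where the curve is locally straight (|d2| <= tol)."""
    d2 = second_derivative(y)
    return [i for i in range(1, len(y) - 1) if abs(d2[i]) <= tol]


# Mean curve: a small peak on a sloping linear baseline
mean = [2.0, 2.5, 3.0, 3.5, 5.0, 7.5, 6.0, 5.5, 6.0, 6.5, 7.0]
print(candidate_nodes(mean, tol=1e-9))
```

A sloping baseline has zero curvature, so it passes this test where a simple intensity threshold would fail. Inflection points on peak flanks can also have near-zero curvature, and noise inflates the second difference, so the candidates would still need review by the operator.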

  10. Baseline Correction for Curve Resolution
     • Baseline correction is an application-specific preprocessing technique
     • The present baseline correction algorithm was developed to improve the performance of the SIMPLISMA (SIMPLe-to-use Interactive Self-modeling Mixture Analysis) curve resolution technique
     • The algorithm has been used at Eastman Kodak Company for over 10 years for routine analysis of TGA/IR data, which represent a challenging case for curve resolution:
       • many components
       • high degree of overlap
       • intense background signal

  11. TGA/IR Sample Data. Reprinted with permission from Eastman Kodak Company, 2005.

  12. Baseline Nature in TGA/IR
     • The most common reasons for TGA/IR baseline drift:
       • Temperature fluctuations over time
       • Instrument drift
       • Material scattering
       • Impurities
       • Inappropriate background, etc.
     • In the present dataset: miscellaneous reasons
     • The spectral domain is more suitable for series baseline correction because of its narrow peaks and explicit baseline areas

  13. TGA/IR: Baseline Correction (figure panels: Raw spectral series; “Snapping” the baseline; Subtracting; Calculating the mean; Raw; Corrected). Reprinted with permission from Eastman Kodak Company, 2005.

  14. TGA/IR: Corrected Data Map. Reprinted with permission from Eastman Kodak Company, 2005.

  15. TGA/IR: SIMPLISMA Curve Resolution. Reprinted with permission from Eastman Kodak Company, 2005.

  16. IR Library Identification. Reprinted with permission from Eastman Kodak Company, 2005.

  17. Conclusions
     • A new interactive approach to the baseline correction problem has been proposed
     • It makes it possible to adapt traditional automated single-scan baseline correction routines, or to perform manual correction, on matrix data as if they were a single curve
     • Advantages of the method include the “transparency” of the process and the means for extensive operator interaction
     • The method has passed long-term testing in an industrial laboratory and has been integrated into a professional software package
     • Despite the simplicity of the algorithm, it successfully eliminates baselines, even in complex cases such as TGA/IR data

  18. Acknowledgements
     • Antony Williams for his friendly support, and
     • Michel Hachey for his help and valuable ideas
