
INVESTIGATION OF MAIN CONTAMINATION SOURCES OF HEAVY METAL IONS IN FISH, SEDIMENTS, AND WATERS FROM CATALONIA RIVERS USING DIFFERENT MULTIWAY DATA ANALYSIS METHODS

Emma Peré-Trepat1 and Romà Tauler 2*

1 Dept. of Analytical Chemistry, Universitat de Barcelona, Diagonal 647, 08028 Barcelona, Spain

2 IIQAB-CSIC, Jordi Girona 18-26, 08034 Barcelona, Spain

* e-mail: [email protected]


  • Outline:

  • Introduction and motivations of this work

  • Environmental data tables and chemometrics models and methods

  • Example of application: metal contamination sources in fish, sediment and surface water river samples.

  • Conclusions


  • Introduction and motivations of this work

  • Pollution and toxic chemical compounds are a threat to the environment and to human health, and they call for urgent measures and actions

  • Environmental monitoring studies produce huge amounts of multivariate data ordered in large data tables (data matrices)

  • The bottleneck in the study of these environmental data tables is their analysis and interpretation

  • There is a need for chemometric analysis (statistical and numerical analysis of multivariate chemical data) of these data tables!


  • What kind of information can be obtained from chemometric analysis of environmental multivariate data tables?

    • Detection, identification, interpretation and resolution of the main sources of contamination

    • Distribution of these contamination sources in the environment: geographically, temporally, by environmental compartment (air, water, sediments, biota,...),…

    • Distinction between point and diffuse contamination sources

    • Quantitative apportionment of these sources .....


  • Introduction and motivations of this work

  • In this work, different chemometric multiway data analysis methods are compared for the resolution of the environmental sources of 11 metal ions in 17 river samples of fish, sediment and water at the same site locations of Catalonia (NE Spain).

  • Two-way bilinear model based methods

    • MA-PCA Matrix Augmentation Principal Component Analysis

    • MA-MCR-ALS Matrix Augmentation Multivariate Curve Resolution Alternating Least Squares

  • Three-way trilinear model based methods

    • PARAFAC

    • TUCKER3

    • MCR-ALS trilinear

    • MCR-ALS TUCKER3


    • Introduction and motivations of this work

    • Special attention will be paid to:

    • Finding ways to compare results obtained using bilinear and trilinear models for three-way data: getting profiles in three modes from bilinear models of three-way data

    • Adaptation of MCR-ALS to the fulfillment of PARAFAC and TUCKER3 trilinear models

    • Reliability of solutions: calculation of boundaries of bands of feasible solutions

    • Integration of Geostatistics and Chemometrics in the investigation of environmental data




    Environmental data tables (two-way data)

    [Figure: data table or matrix X with I samples (rows) and J variables (columns). Variables: conc. of chemicals, physical properties, biological properties, other. Entries may be below the limit of detection (<LOD) or missing (‘m’). Panels: plot of variables (columns) and plot of samples (rows).]


    Environmental three-way data sets

    Measured data usually consist of concentrations of different chemical compounds (variables) measured in different samples at different times/situations/conditions/compartments.

    Data are ordered in a two-way or a three-way data table according to their structure.

    • 3-way data sets have three measurement modes:
      • variables mode (conc. of chemical compounds)
      • samples mode
      • times/situations/conditions/compartments mode
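The three-way arrangement can be illustrated with a minimal sketch. It assumes Python with NumPy (the slides name no software), uses the dimensions of this study (17 sites, 11 metals, 3 compartments) and fills the tables with random placeholder numbers, not measured concentrations. The names `tables` and `D` are illustrative only.

```python
import numpy as np

# Dimensions of the study design: 17 sites, 11 metals, 3 compartments.
I, J, K = 17, 11, 3
rng = np.random.default_rng(0)

# One (I x J) two-way table per compartment; values are random placeholders.
tables = {name: rng.random((I, J)) for name in ("fish", "sediment", "water")}

# Stack the tables along a third axis: samples x variables x compartments.
D = np.stack([tables["fish"], tables["sediment"], tables["water"]], axis=2)

print(D.shape)  # (17, 11, 3)
```

Each slice `D[:, :, k]` is then the two-way table of one compartment.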


    Chemometric models to describe environmental measurements

    • Models for what?

    • Models for:

    • identification of contamination sources?

    • exploration of contamination sources?

    • interpretation of contamination sources?

    • resolution of environmental sources?

    • apportionment/quantitation of environmental sources?

    • ??????..............................


    Chemometric models to describe environmental measurements

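    Bilinear models for two-way data:

    $d_{ij} = \sum_{n=1}^{N} x_{in}\, y_{nj} + e_{ij}$

    $d_{ij}$ is the concentration of chemical contaminant $j$ in sample $i$ (an element of the I × J data matrix D)

    $n = 1, \dots, N$ are a reduced number of independent environmental sources

    $x_{in}$ is the amount of source $n$ in sample $i$

    $y_{nj}$ is the amount of contaminant $j$ in source $n$

    $e_{ij}$ is the residual not explained by the $N$ sources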


    Chemometric models to describe environmental measurements

    Bilinear models for two-way data:

    $\mathbf{D} = \mathbf{X}\mathbf{Y}^{T} + \mathbf{E}$, with D (I × J), X (I × N), YT (N × J), E (I × J) and N << I or J

    PCA
    • X orthogonal, YT orthonormal
    • YT in the direction of maximum variance
    • Unique solutions, but without physical meaning
    • Identification and interpretation!

    MCR-ALS
    • X and YT non-negative
    • X or YT normalization
    • other constraints (unimodality, local rank, …)
    • Non-unique solutions, but with physical meaning
    • Resolution and apportionment!
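The contrast between the two decompositions can be sketched in a few lines. This is a simplified illustration and not the authors' implementation: it assumes NumPy, synthetic non-negative data, and a bare-bones alternating least squares loop in which non-negativity is imposed by clipping (MCR-ALS software typically uses proper non-negative least squares and further constraints). The function names `pca_svd` and `mcr_als_nn` are hypothetical.

```python
import numpy as np

def pca_svd(D, n):
    """PCA (no mean-centering) via SVD: D ~ X @ Yt, X orthogonal, Yt orthonormal."""
    U, s, Vt = np.linalg.svd(D, full_matrices=False)
    return U[:, :n] * s[:n], Vt[:n, :]

def mcr_als_nn(D, n, n_iter=200, seed=0):
    """Simplified MCR-ALS: alternating least squares, with non-negativity
    imposed by clipping negative values to zero after each update."""
    rng = np.random.default_rng(seed)
    Yt = rng.random((n, D.shape[1]))                     # random non-negative start
    for _ in range(n_iter):
        X = np.clip(D @ np.linalg.pinv(Yt), 0.0, None)   # update source amounts
        Yt = np.clip(np.linalg.pinv(X) @ D, 0.0, None)   # update source compositions
    # Normalise each composition profile to unit length; move the scale into X.
    norms = np.linalg.norm(Yt, axis=1)
    norms[norms == 0] = 1.0
    return X * norms, Yt / norms[:, None]

# Synthetic example: 2 non-negative sources, 17 samples, 11 variables.
rng = np.random.default_rng(42)
D = rng.random((17, 2)) @ rng.random((2, 11))

X_pca, Yt_pca = pca_svd(D, 2)
X_mcr, Yt_mcr = mcr_als_nn(D, 2)
print("PCA profiles contain negative values:", (X_pca < 0).any() or (Yt_pca < 0).any())
print("MCR profiles are non-negative:", (X_mcr >= 0).all() and (Yt_mcr >= 0).all())
```

The PCA profiles are fixed by orthogonality and maximum variance, while the non-negative profiles depend on the starting point and constraints, which is the trade-off the slide summarises.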


    Chemometric models to describe environmental measurements

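    Extension of bilinear models for simultaneous analysis of multiple two-way data sets

    Matrix augmentation strategy: each data matrix Dk (I × J) is modelled with its own scores Xk (I × n) and with loadings YT (n × J) common to all matrices:

    $\mathbf{D}_k = \mathbf{X}_k \mathbf{Y}^{T} + \mathbf{E}_k$

    Stacking the individual matrices gives the augmented model:

    $\mathbf{D}_{aug} = \mathbf{X}_{aug} \mathbf{Y}^{T} + \mathbf{E}_{aug}$

    PCA: orthogonality; max. variance

    MCR: non-negativity, natural constraints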
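A minimal NumPy sketch of the matrix augmentation strategy, assuming three noise-free synthetic matrices that share the same loadings. All variable names are illustrative, and the data are random placeholders.

```python
import numpy as np

I, J, n = 17, 11, 2
rng = np.random.default_rng(1)

Yt = rng.random((n, J))                        # loadings common to all matrices
X_k = [rng.random((I, n)) for _ in range(3)]   # one scores matrix per data set
D_k = [X @ Yt for X in X_k]                    # individual (I x J) data matrices

# Column-wise augmentation: stack the matrices on top of each other.
D_aug = np.vstack(D_k)                         # (3I x J)
X_aug = np.vstack(X_k)                         # (3I x n)

print(D_aug.shape, X_aug.shape)
print(np.allclose(D_aug, X_aug @ Yt))
```

Because the loadings are shared, one bilinear decomposition of `D_aug` describes all the stacked matrices at once.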


    Environmental data sets


    Chemometric models to describe environmental measurements

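    Trilinear models for three-way data:

    $d_{ijk} = \sum_{n=1}^{N} x_{in}\, y_{nj}\, z_{nk} + e_{ijk}$, with $i = 1, \dots, I$; $j = 1, \dots, J$; $k = 1, \dots, K$

    $d_{ijk}$ is the concentration of chemical contaminant $j$ in sample $i$ at time (condition) $k$ (an element of the data matrix Dk)

    $n = 1, \dots, N$ are a reduced number of independent environmental sources

    $x_{in}$ is the amount of source $n$ in sample $i$

    $y_{nj}$ is the amount of contaminant $j$ in source $n$

    $z_{nk}$ is the contribution of source $n$ to compartment $k$

    $e_{ijk}$ is the residual not explained by the $N$ sources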
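The trilinear model can be written directly with `numpy.einsum`. The sketch below assumes NumPy and random placeholder profiles; `X`, `Yt` and `Zt` hold the elements x_in, y_nj and z_nk, and the residual term is left out.

```python
import numpy as np

I, J, K, N = 17, 11, 3, 2
rng = np.random.default_rng(2)

X = rng.random((I, N))    # x_in: amount of source n in sample i
Yt = rng.random((N, J))   # y_nj: amount of contaminant j in source n
Zt = rng.random((N, K))   # z_nk: contribution of source n to compartment k

# d_ijk = sum_n x_in * y_nj * z_nk  (noise-free trilinear model)
D = np.einsum("in,nj,nk->ijk", X, Yt, Zt)

# Every slice D[:, :, k] is bilinear with the same X and Yt,
# only rescaled per source by the k-th column of Zt.
k = 2
slice_k = X @ np.diag(Zt[:, k]) @ Yt
print(np.allclose(D[:, :, k], slice_k))
```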


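    Three-way data models

    [Figure: data cube D (I, J, K) with three modes: X-mode (samples, Ni components), Y-mode (variables, Nj components) and Z-mode (conditions, Nk components), described by the loading matrices X, Y and Z.]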


    PARAFAC (trilinear model)

    [Figure: data cube D decomposed into the loading matrices X, YT and Z.]

    • The same number of components in the three modes: Ni = Nj = Nk = N
    • No interactions between components
    • Different slices Xk are decomposed into bilinear profiles having the same shape!


    Tucker3 models

    [Figure: data cube D decomposed into the loading matrices X, YT and Z and the core array G.]

    • Different numbers of components in the different modes: Ni ≠ Nj ≠ Nk
    • Interaction between components in different modes is possible

    In PARAFAC, Ni = Nj = Nk = N and the core array G is a superdiagonal identity cube
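The relation between the two models can be checked numerically. The sketch below assumes NumPy and random placeholder loadings. The helper `tucker3` is a hypothetical name, and reading a model such as [1 2 2] as (sites, metals, compartments) is an assumption made for the example.

```python
import numpy as np

def tucker3(G, X, Y, Z):
    """Tucker3 model: d_ijk = sum_pqr g_pqr * x_ip * y_jq * z_kr."""
    return np.einsum("pqr,ip,jq,kr->ijk", G, X, Y, Z)

I, J, K = 17, 11, 3
rng = np.random.default_rng(5)

# Tucker3 with a different number of components per mode, here [1 2 2].
G = rng.random((1, 2, 2))
D_tucker = tucker3(G, rng.random((I, 1)), rng.random((J, 2)), rng.random((K, 2)))
print(D_tucker.shape)  # (17, 11, 3)

# PARAFAC as the special case with a superdiagonal identity core.
N = 2
X, Y, Z = rng.random((I, N)), rng.random((J, N)), rng.random((K, N))
G_id = np.zeros((N, N, N))
G_id[np.arange(N), np.arange(N), np.arange(N)] = 1.0
D_parafac = np.einsum("in,jn,kn->ijk", X, Y, Z)
print(np.allclose(tucker3(G_id, X, Y, Z), D_parafac))  # True
```

Off-superdiagonal entries of the core are what allow components of different modes to interact.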


    Guidelines for method selection

    (resolution purposes)

    [Figure: method selection chart. Axes: deviations from trilinearity (mild, medium, strong) and array size (small, medium, large). Methods placed on the chart: PARAFAC, PARAFAC2, TUCKER, MCR, PCA, SVD, ...]

    Journal of Chemometrics, 2001, 15, 749-771


    INTEGRATION OF CHEMOMETRICS-GEOSTATISTICS (Geographical Information Systems, GIS)




    METAL CONTAMINATION SOURCES IN SEDIMENTS, FISH AND WATERS FROM CATALONIA RIVERS USING MULTIWAY DATA ANALYSIS METHODS

    Emma Peré-Trepat (UB), Mónica Flo, Montserrat Muñoz, Antoni Ginebreda (ACA), Marta Terrado, Romà Tauler (CSIC)

    [Map of Catalonia showing the 17 numbered sampling sites. Map labels: France, Pyrenees, Aragón, Barcelona, Mediterranean Sea.]

    1. RIU MUGA Castelló d´Empúries J052
    2. RIU FLUVIÀ Besalú J022
    3. RIU FLUVIÀ L´Armentera J011
    4. RIU TER Manlleu J034
    5. RIU TERRI Sant Julià de Ramis J028
    6. RIU TER Colomers J112
    7. RIU TORDERA Fogars de Tordera J062
    8. RIU CONGOST La Garriga J037
    9. RIU LLOBREGAT El Pont de Vilomara J031
    10. RIU CARDENER Castellgalí J002
    11. RIU LLOBREGAT Abrera J084
    12. RIU LLOBREGAT Martorell J005
    13. RIU LLOBREGAT Sant Joan Despí J049
    14. RIU FOIX Castellet J008
    15. RIU FRANCOLÍ La Masó J059
    16. RIU EBRE Flix J056
    17. RIU SEGRE Térmens J207

    17 river sampling sites, 11 metals (As, Ba, Cd, Co, Cu, Cr, Fe, Mn, Ni, Pb, Zn), 3 environmental compartments: fish (barb, ‘bagra comuna’, bleak, carp and trout), sediment and water samples


    • Missing data (‘m’)
      • Unknown values produce empty holes in data matrices
      • When they are few and evenly distributed, they may be estimated by PCA imputation (or another method)
    • Below-LOD values (<LOD)
      • This is a common problem in environmental data tables
      • If most of the values are below the LOD, data matrices are sparse
      • For calculations, it is better either to use the experimental values or to set them to LOD/2, rather than to zero or to the LOD
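Both treatments can be sketched as follows. The code assumes NumPy, that missing values are stored as NaN, and that below-LOD entries are flagged by a boolean mask. `replace_below_lod` and `pca_impute` are hypothetical helper names, and the imputation shown is one simple iterative variant (repeated truncated-SVD reconstruction), not necessarily the procedure used by the authors.

```python
import numpy as np

def replace_below_lod(D, below_lod, lod):
    """Set entries flagged as <LOD to LOD/2 (`lod` holds one value per column)."""
    D = D.astype(float).copy()
    half = np.broadcast_to(np.asarray(lod, dtype=float) / 2.0, D.shape)
    D[below_lod] = half[below_lod]
    return D

def pca_impute(D, n_components=2, n_iter=100):
    """Fill NaN entries by iterating a truncated-SVD (PCA) reconstruction."""
    D = D.astype(float).copy()
    missing = np.isnan(D)
    # Start from the column means of the observed values.
    col_means = np.nanmean(D, axis=0)
    D[missing] = np.broadcast_to(col_means, D.shape)[missing]
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(D, full_matrices=False)
        D_hat = (U[:, :n_components] * s[:n_components]) @ Vt[:n_components]
        D[missing] = D_hat[missing]      # only the missing cells are updated
    return D

# Below-LOD example: zeros stand for "<LOD" in a toy 2 x 2 table.
raw = np.array([[0.0, 5.0], [3.0, 0.0]])
cleaned = replace_below_lod(raw, raw == 0.0, lod=[0.2, 0.4])

# Missing-value example: a rank-1 table with one empty cell.
A = np.outer(np.arange(1.0, 7.0), np.arange(1.0, 5.0))
A_missing = A.copy()
A_missing[2, 1] = np.nan
filled = pca_impute(A_missing, n_components=1)
print(cleaned)
print(filled[2, 1], "vs true value", A[2, 1])
```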


    • Preliminary data description: Use of descriptive statistics

    • Individual sample plots

    • Individual variable plots

    • Descriptive statistics (Excel Statistics)

    • Histograms/Box plots

    • Binary correlation between variables

    • 5) .............................................................

    [Figure: box plot of values against column number (1 to 50), annotated with outliers, upper whisker, upper quartile, median, lower quartile and lower whisker.]


    Effect of different data pre-treatments: Sediment samples

    [Figure: sediment data for the metals As, Ba, Cd, Co, Cu, Cr, Fe, Mn, Ni, Pb and Zn after different pre-treatments: raw, mean-centred, auto-scaled and scaled. Mo is eliminated.]


    Data Pretreatment

    • No mean-centering was applied to allow an improved physical interpretation of factors (application of non-negativity constraints instead of orthogonality constraints) and the comparison of results using MCR-ALS methods

    • Two scaling possibilities:

      • First, data matrix augmentation and then column scaling to equal variance (each column element divided by the standard deviation of its column)

      • First, column scaling of each data matrix separately and then data matrix augmentation

    • Variables showing almost no change, with values equal or close to their limit of detection, were excluded from scaling and instead divided by 20 (to avoid overweighting them)
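The two scaling options can be sketched as below. The code assumes NumPy and random placeholder matrices for fish, sediment and water. The helper name `scale_columns`, the use of the sample standard deviation (`ddof=1`) and the column indices chosen for downweighting are assumptions made for illustration.

```python
import numpy as np

def scale_columns(D, downweight=(), factor=20.0):
    """Divide each column by its standard deviation (no mean-centering);
    columns listed in `downweight` are divided by `factor` instead."""
    D = D.astype(float)
    divisor = D.std(axis=0, ddof=1)
    idx = np.asarray(list(downweight), dtype=int)
    divisor[idx] = factor
    return D / divisor

rng = np.random.default_rng(3)
F, S, W = (rng.random((17, 11)) + 0.1 for _ in range(3))   # placeholder data

# Option 1: augment first, then scale the columns of the augmented matrix.
option_1 = scale_columns(np.vstack([F, S, W]))

# Option 2: scale each matrix separately, then augment.
# Columns 2 and 3 of the water matrix are downweighted as an illustration.
option_2 = np.vstack([
    scale_columns(F),
    scale_columns(S),
    scale_columns(W, downweight=(2, 3)),
])
print(option_1.shape, option_2.shape)
```

Option 1 preserves the relative concentration levels between compartments, whereas option 2 gives every compartment the same weight for each metal.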


    Description of scaled data

    [Figure: metal distribution in the three compartments; x-axis: metals (variables).]

    Cd, Co and Ld in water were not scaled, only downweighted


    Description of scaled data: different sites in the three compartments

    [Figure: scaled data for the sample sites in the three compartments; sites labelled by river (Llobregat, Tordera, Segre, Ter, Foix, Congost, Cardener, Fluvià, Muga, Terri, Ebre, Francolí).]


    Unit variance scaled concentrations boxplot

    [Figure: box plots of the unit variance scaled concentrations of the 11 metals (As, Ba, Cd, Co, Cu, Cr, Fe, Mn, Ni, Pb, Zn) in three panels: Fish, Sediment and Water.]


    THREE-WAY DATA ARRAY MATRICIZING or MATRIX AUGMENTATION

    [Figure: the data cube (sites × contaminants × compartments) is unfolded into augmented matrices built from the Fish, Sediment and Water slices in the column, row and tube directions.]

    How many components are needed to explain each mode?

    SVD of augmented data matrices in the three directions

    AUGMENTATION direction
    Singular value   column     row        tube
    s1               40.2619    43.2553    41.3302
    s2               16.7504     9.2823    19.4850
    s3                9.4963     8.5312    14.3739

    [Figure: singular values against component number (1 to 10) for the SVD column-wise (variables), row-wise (samples) and tube-wise (type); the 2nd component is highlighted.]
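The singular values of the three augmented matrices can be obtained from the three unfoldings of the data cube. The sketch assumes NumPy and a random placeholder array. `unfold` and `singular_values_per_mode` are hypothetical helpers, and matching the unfoldings to the slide's column, row and tube directions is an assumption: since singular values are unchanged by transposition and by reordering of columns, the column-wise augmented matrix (51 × 11) has the same singular values as the metals-mode unfolding (11 × 51).

```python
import numpy as np

def unfold(D, mode):
    """Matricize a three-way array so that `mode` indexes the rows."""
    return np.moveaxis(D, mode, 0).reshape(D.shape[mode], -1)

def singular_values_per_mode(D):
    """Singular values of the unfoldings along modes 0, 1 and 2."""
    return [np.linalg.svd(unfold(D, m), compute_uv=False) for m in range(3)]

I, J, K = 17, 11, 3
rng = np.random.default_rng(4)
D = rng.random((I, J, K))                 # placeholder for the scaled data

sv_sites, sv_metals, sv_compartments = singular_values_per_mode(D)

# Column-wise augmentation as used in MA-PCA / MA-MCR-ALS: F, S, W stacked.
D_aug = np.vstack([D[:, :, k] for k in range(K)])          # (51 x 11)
sv_aug = np.linalg.svd(D_aug, compute_uv=False)
print(np.allclose(sv_aug, sv_metals))

for name, sv in [("sites", sv_sites), ("metals", sv_metals),
                 ("compartments", sv_compartments)]:
    print(name, np.round(sv[:3], 4))
```

Comparing how fast the singular values decay in each direction gives a first indication of the number of components needed per mode.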


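    Bilinear modelling of three-way data
    (Matrix augmentation or matricizing, stretching, unfolding)

    [Figure: the three-way array (sites × contaminants × compartments) is unfolded into the augmented data matrix Daug by stacking the Fish (F), Sediment (S) and Water (W) slices. MA-PCA or MA-MCR-ALS decomposes Daug into the augmented scores matrix Xaug (sites in F, S and W; score blocks 1 to 6) and the loadings Y (contaminants).]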


    Explained variances using bilinear models

    (profiles in two modes)


    MA-PCA of scaled data without scores refolding

    [Figure: MA-PCA loadings for the 11 metals (As, Ba, Cd, Co, Cu, Cr, Fe, Mn, Ni, Pb, Zn) and augmented scores for the stacked samples. Annotations: water samples; sediment and fish samples; Ba, As, Cu, Zn; water soluble metal ions.]


    MA-MCR-ALS of scaled data with non-negativity (nn) and without scores refolding

    [Figure: MA-MCR-ALS loadings for the 11 metals (As, Ba, Cd, Co, Cu, Cr, Fe, Mn, Ni, Pb, Zn) and augmented scores for the stacked samples, compared with MA-PCA. Annotations: sediment and fish samples; water samples; Ba, Zn, Cu, As.]

    More easily interpretable!!!


    Calculation of the boundaries of feasible band solutions

    (Journal of Chemometrics, 2001, 15, 627-646)

    [Figure: maximum (max) and minimum (min) boundaries of the bands of feasible solutions.]

    Nearly no rotation ambiguities are present in non-negative environmental profiles calculated by MCR-ALS

    (very different to spectroscopy!!!!!)


    Bilinear modelling of three-way data
    (Matrix augmentation or matricizing, stretching, unfolding)

    [Figure: PCA or MCR-ALS decomposes Daug into the augmented scores Xaug (sites in F, S and W; blocks 1 to 6 for two components) and the loadings Y (contaminants). Each column of Xaug is refolded into a sites × compartments matrix and decomposed by SVD, giving the sites profiles (xi, xii) of X and the compartments profiles (zi, zii) of Z.]

    Scores refolding strategy!!! (applied only to the final augmented scores)

    Loadings recalculation in two modes from augmented scores
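The scores refolding step can be sketched as follows. The code assumes NumPy and an augmented scores matrix whose rows are stacked compartment by compartment (all sites for fish, then sediment, then water). The function name `refold_scores`, the sign convention and the scaling of the profiles are choices made for this illustration.

```python
import numpy as np

def refold_scores(X_aug, I, K):
    """For each component, fold its augmented score (length K*I) into an
    (I x K) matrix and take the first SVD component as site (X) and
    compartment (Z) profiles."""
    N = X_aug.shape[1]
    X = np.empty((I, N))
    Z = np.empty((K, N))
    for n in range(N):
        folded = X_aug[:, n].reshape(K, I).T           # sites x compartments
        U, s, Vt = np.linalg.svd(folded, full_matrices=False)
        sign = 1.0 if U[:, 0].sum() >= 0 else -1.0     # fix the SVD sign ambiguity
        X[:, n] = sign * s[0] * U[:, 0]
        Z[:, n] = sign * Vt[0]
    return X, Z

# Synthetic check with augmented scores that follow the trilinear model exactly.
I, K, N = 17, 3, 2
rng = np.random.default_rng(6)
X_true = rng.random((I, N)) + 0.1
Z_true = rng.random((K, N)) + 0.1
X_aug = np.vstack([X_true * Z_true[k] for k in range(K)])   # (K*I x N)

X_est, Z_est = refold_scores(X_aug, I, K)
print(np.allclose(np.outer(X_est[:, 0], Z_est[:, 0]),
                  np.outer(X_true[:, 0], Z_true[:, 0])))
```

When the augmented scores are not exactly trilinear, the first SVD component is only the best rank-one summary of each folded score, so the recovered profiles are approximations.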


    Explained variances using trilinear models

    (profiles in three modes)


    MA-PCA of scaled data with nn and scores refolding

    [Figure: loadings for the 11 metals (As, Ba, Cd, Co, Cu, Cr, Fe, Mn, Ni, Pb, Zn) obtained by MA-PCA + refolding, compared with MA-PCA.]

    Little difference in the samples mode!!!


    MA-MCR-ALS of scaled data with scores refolding

    [Figure: profiles obtained by MA-MCR-ALS + refolding, compared with MA-MCR-ALS.]


    Trilinear modelling of three-way data

    [Figure: PARAFAC decomposes the data cube D (sites × contaminants × compartments) into the loading matrices X (sites), Y (metals) and Z (compartments F, S, W).]


    PARAFAC of scaled data

    [Figure: PARAFAC profiles compared with MA-PCA (bilinear) profiles.]


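    MA-MCR-ALS: TRILINEARITY CONSTRAINT (ALS iteration step)

    [Figure: within the MCR-ALS decomposition of Daug into Xaug (sites in F, S and W) and Y (contaminants), one species profile (a column of Xaug with blocks 1, 2, 3) is selected, folded into a sites × compartments matrix and decomposed by SVD. The augmented score is rebuilt from the first SVD component (blocks 1’, 2’, 3’) and substituted for the original species profile. The profiles X (sites) and Z (compartments F, S, W) are obtained in the same step.]

    • Selection of species profile
    • Folding
    • SVD
    • Rebuilding augmented scores
    • Substitution of species profile

    This constraint is applied at each step of the ALS optimization, independently and individually for each component.

    Every augmented score that is required to follow the trilinear model is refolded.

    Loadings recalculation in two modes from augmented scores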
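The constraint can be sketched as a projection applied to the augmented scores inside each ALS iteration. The code assumes NumPy and scores stacked compartment by compartment. `trilinearity_constraint` is a hypothetical name, and the `components` argument only illustrates that the constraint may be applied to some components and not to others.

```python
import numpy as np

def trilinearity_constraint(X_aug, I, K, components=None):
    """Replace each selected augmented score by its best rank-one
    (trilinear) approximation: fold, SVD, rebuild, substitute."""
    X_new = X_aug.astype(float).copy()
    comps = range(X_aug.shape[1]) if components is None else components
    for n in comps:
        folded = X_new[:, n].reshape(K, I).T                 # selection and folding (I x K)
        U, s, Vt = np.linalg.svd(folded, full_matrices=False)
        rank1 = s[0] * np.outer(U[:, 0], Vt[0])              # first SVD component only
        X_new[:, n] = rank1.T.reshape(-1)                    # rebuild and substitute
    return X_new

I, K, N = 17, 3, 2
rng = np.random.default_rng(7)
X_aug = rng.random((K * I, N))                # placeholder augmented scores

X_tri = trilinearity_constraint(X_aug, I, K)                    # all components
X_mixed = trilinearity_constraint(X_aug, I, K, components=[0])  # first one only

sv = np.linalg.svd(X_tri[:, 0].reshape(K, I).T, compute_uv=False)
print(sv[1] < 1e-10 * sv[0])                  # folded score is now rank one
print(np.array_equal(X_mixed[:, 1], X_aug[:, 1]))
```

Leaving some components unconstrained corresponds to the intermediate situations between pure bilinear and pure trilinear models.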


    MA-MCR-ALS of scaled data with nn, trilinearity (without scores refolding)

    [Figure: loadings for the 11 metals (As, Ba, Cd, Co, Cu, Cr, Fe, Mn, Ni, Pb, Zn) and augmented scores for the stacked samples obtained by MA-MCR-ALS nn + trilinear, compared with MA-MCR-ALS nn.]


    Calculation of the boundaries of feasible band solutions

    (Journal of Chemometrics, 2001, 15, 627-646)

    No rotation ambiguities are present in trilinear non-negative environmental profiles calculated by MCR-ALS

    (very different to spectroscopy!!!!!)


    MA-MCR-ALS of scaled data with nn, trilinearity and with scores refolding

    [Figure: profiles obtained by MA-MCR-ALS nn + trilinear, compared with PARAFAC nn.]


    Comparison PARAFAC vs MCR-ALS (trilinearity)


    Tucker3 modelling of three-way data

    [Figure: TUCKER3 model (1,2,2) of the data cube D (sites × metals × compartments), with the loading matrices X (sites), Y (metals) and Z (compartments F, S, W) and the core array G.]


    Tucker Models with non-negativity

    constraints

    [Figure: comparison of the Tucker models [1 2 2], [1 2 3], [1 3 3], [2 2 2], [2 2 3], [2 3 3], [3 2 3] and [3 3 3].]

    Parsimonious model: [1 2 2]


    Tucker3 of scaled data

    [Figure: profiles in the three modes (17 sites, 11 metals, 3 compartments) for TUCKER3, model [1 2 2], and for PARAFAC, model [2 2 2].]


    MA-MCR-ALS: Tucker3 CONSTRAINT (ALS iteration step)

    [Figure: within the MCR-ALS decomposition of Daug into Xaug (sites in F, S and W; blocks 1 to 6 for two components) and Y (metals), the interacting augmented scores are folded together and decomposed by SVD. The rebuilt blocks 1’ to 6’ replace the original ones, and the profiles X (sites) and Z (compartments F, S, W) are obtained in the same step.]

    This constraint is applied at each step of the ALS optimization, independently and individually for each component.

    Interacting augmented scores are folded together.

    Loadings recalculation in two modes from augmented scores


    MA-MCR-ALS of scaled data with nn, tucker3 (without scores refolding)

    [Figure: augmented scores for the stacked samples obtained by MA-MCR-ALS nn + Tucker3 (model [1 2 2]), compared with MA-MCR-ALS nn + PARAFAC (model [2 2 2]).]


    MA-MCR-ALS of scaled data with nn, tucker3 and with scores refolding

    [Figure: profiles obtained by MA-MCR-ALS nn + Tucker3 (model [1 2 2]), compared with Tucker3 (model [1 2 2]).]


    Summary of Results


    INTEGRATION OF CHEMOMETRICS-GEOSTATISTICS (Geographical Information Systems, GIS)

    [Figure: GIS maps; panel labels (67.3%) and (13.2%).]






    Conclusions

    Chemometric methods allow resolution of environmental sources of chemical contaminants

    However, we should be aware of how each method displays the information, because the mathematical properties of the methods differ (e.g. orthogonality vs non-negativity, bilinearity vs trilinearity, number of components...)

    The interpretation and resolution of environmental sources is not easy, because contamination sources in the real world are correlated and because of experimental data limitations (environmental sources should show variation in the investigated data set).

    Bilinear PCA and MCR-ALS can be used to study multiway data sets and compared with multiway methods (like PARAFAC and Tucker) if appropriate scores refolding is performed

    Bilinear non-negative MCR-ALS solutions may provide a good approximation of the real sources, because non-negative environmental profiles have little rotation ambiguity


    Conclusions

    PARAFAC and Tucker3 may provide simpler models, and they are especially useful for trilinear data or when the number of components differs between modes.

    Intermediate situations between pure bilinear and pure trilinear models can be easily implemented in MCR-ALS

    Bilinear based models are more flexible than trilinear based models to resolve ‘true’ sources of data variation

    Different numbers of components and interactions between components in different modes (constraint under development) can be considered in mixed bilinear-trilinear-Tucker MA-MCR models

    For an optimal RESOLUTION, the model should be in accordance with the 'true' data structure

    Integration of Chemometrics-GIS results may facilitate geographical and temporal interpretation of contamination sources and their correlation with land uses, population and industrial activities


    Acknowledgements

    • The Catalan Water Agency is acknowledged for its financial support and for providing the experimental data sets

    • Research grant Project MCYT, Nr. BQU2003-00191, Spain

