How to use this Self-guided Tour
  • This self-guided tour is designed for you to work through at your own pace. It also allows you to review or skip a section.
  • The following Features are included to help you navigate through this exercise:

At any point throughout the tour, you may jump to the beginning of any one of the six Sections below.

Underlined words take you directly to the respective subject.

The Arrow Buttons allow you to move forward or backwards in the exercise.

Aim: Background and Objectives
  • The low-flow regime of a river controls industrial, agricultural and domestic water resources. In this context, low flows are critical for maintaining surface-water abstraction, dilution of effluents, and hydropower and for providing an adequate freshwater habitat for a wide range of flora and fauna.
  • For integrated catchment management, low-flow indices are needed not only at gauged sites but also at ungauged sites. At an ungauged site, the low-flow indices must be estimated with appropriate methods (BECKER 1992).
  • A widely-used method for this purpose applies multiple regression analysis to estimate low-flow indices at the ungauged site, taking catchment descriptors as independent variables.
  • This self-guided tour demonstrates a practical procedure to estimate low-flow indices at the ungauged site through the learning-by-screening method.
  • You will learn:
      • to develop a conceptual model,
      • to translate the conceptual model into a mathematical model,
      • to calibrate the necessary mathematical transfer function by means of linear multiple regression analysis between catchment descriptors and low-flow indices,
      • to evaluate the model based on prior assumptions and the goodness of fit,
      • to validate the model with the help of a separate data set, and
      • to apply the model to a real case.

Aim

Aim: Problem Statement (1)
  • The map on the right shows the State of Baden-Württemberg in Southwest Germany.
  • You will be introduced to the study area in the next section.
  • Let us consider the following problem statement: Suppose a hydro-power station is planned at the outlet of the Wiese catchment. For design purposes, we are asked to characterise the low-flow behaviour of the Wiese by determining the Q90 value. Unfortunately, no runoff data are available for this catchment.
  • Is there a way we could still give an estimate of the Q90?

Fig. 1.1 Map of the study area (labelled cities: Karlsruhe, Stuttgart, Freiburg, Strasbourg (France), Basel (Switzerland); scale bar: 100 km)

Aim: Problem Statement (2)
  • Even though runoff data are unavailable, we do have access to information on the catchment itself. Some of this information can be deduced from maps (e.g. catchment area and mean elevation), while other information must be acquired in the field (e.g. precipitation).
  • Click here to learn more about the available information on this catchment
  • From historical research we can be quite certain that runoff processes are related to a certain set of catchment attributes, which we will call catchment descriptors. However, we do not know how they are related.

How can we gain information on the expected relationship between the Q90 and the catchment descriptors?

Fig. 1.2 Wiese catchment (No. 532), showing the site of the proposed hydro-power plant, where the Q90 is unknown

Aim: Available Catchment Descriptors

Soil
  • Percentage of soils with high infiltration capacity: 0.19%
  • Percentage of soils with medium infiltration capacity: 0%
  • Percentage of soils with low infiltration capacity: 99.81%
  • Percentage of soils with very low infiltration capacity: 0%
  • Mean hydraulic conductivity of the soils: 201.69 cm/d
  • Percentage of soils with low hydraulic conductivity: 0%
  • Percentage of soils with high water-holding capacity in the effective root zone: 0%
  • Mean water-holding capacity in the effective root zone: 109 mm

Morphometry
  • Catchment area: 206.28 km²
  • Drainage density: 1.31 km/km²
  • Highest elevation: 1485.5 m a.m.s.l.
  • Lowest elevation: 423.6 m a.m.s.l.
  • Average elevation: 898.74 m a.m.s.l.
  • Maximum slope: 45.54%
  • Minimum slope: 0%
  • Average slope: 18.08%

Climate
  • Annual precipitation: 1891 mm

Land Use
  • Percentage of urbanisation: 2%
  • Percentage of forested area: 63%

Hydrogeology
  • Percentage of rock formations with a very low hydraulic permeability: 0%
  • Weighted mean of hydraulic conductivity: 9.62 × 10⁻⁴ m/s

Aim: Problem Statement (3)

Information on the relationship between the Q90 and a certain set of catchment descriptors can be gained by looking at other catchments in the same region for which both flow data and catchment descriptors are available.

By means of multiple regression analysis among the other qualifying catchments in this region, we may be able to find a common regional pattern which describes the relationship between the catchment descriptors and the Q90. The equation expressing this pattern is called the regional transfer function.

Assuming that the same relationship holds for the Wiese catchment, we can use the regional transfer function to estimate the desired Q90 value at our ungauged site from the respective Wiese catchment descriptors.

Fig. 1.2 Wiese catchment (No. 532)
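The idea of a regional transfer function can be illustrated with a minimal sketch. Everything numerical here is an assumption made for illustration: the "gauged" catchments are synthetic, the choice of two descriptors (catchment area and annual precipitation) is arbitrary, and the generating coefficients are invented. Only the two Wiese descriptor values are taken from the tour. The resulting estimate is therefore not a real Q90 for the Wiese.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "gauged" catchments (NOT the tour's data): two descriptors each.
n_gauged = 30
area = rng.uniform(20.0, 400.0, n_gauged)      # catchment area [km2]
precip = rng.uniform(700.0, 2000.0, n_gauged)  # annual precipitation [mm]

# Invented relationship, used only to generate example Q90 values [m3/s].
q90_gauged = 0.002 * area + 0.0005 * precip + rng.normal(0.0, 0.05, n_gauged)

# Design matrix with a leading column of ones for the intercept b0.
X = np.column_stack([np.ones(n_gauged), area, precip])

# Least-squares estimate of the coefficients (b0, b1, b2).
coeffs, *_ = np.linalg.lstsq(X, q90_gauged, rcond=None)


def estimate_q90(area_km2, precip_mm):
    """Apply the fitted regional transfer function to one ungauged site."""
    return float(coeffs @ np.array([1.0, area_km2, precip_mm]))


# Area and annual precipitation of the Wiese catchment as listed in the tour.
wiese_q90 = estimate_q90(206.28, 1891.0)
print("fitted coefficients:", coeffs)
print("illustrative (synthetic) Q90 estimate:", round(wiese_q90, 2), "m3/s")
```

The sketch fits the function on catchments where both flow and descriptors are known and then evaluates it at a site where only descriptors are known, which is the step the text describes.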

Aim: Problem Statement (4)
  • In the Appendices we have provided additional information on the catchment descriptors, data sources, and references.
  • Let us now move on to get acquainted with the Study Area we will work with.
  • After a more theoretical section on the basics of the regression analysis Procedure, we will come back to the Wiese catchment in the Application part of this self-guided tour in order to solve the stated problem.
  • We also encourage you to use the Data provided to reproduce the regression analysis on your own.

Study Area: Overview
  • The general study area is the State of Baden-Württemberg. Baden-Württemberg is located in the Southwest of the Federal Republic of Germany and shares a border with France in the West and Switzerland in the South. The region encompasses several landscapes, which exhibit a wide range of morphometry, hydro-geology, soil, land use, and climate.
  • You may click on either one of the light bulbs on the map to receive more information on the specific landscapes,
  • or choose from the following categories
  • Climate
  • Hydrology

Fig. 1.1 Map of the study area

Study Area

Study Area: Rhine Rift Valley

The Oberrheinische Tiefebene (Rhine Rift Valley) is a 300 km long and 20-30 km wide tectonic rift, which is filled with fluvio-glacial deposits. The river Rhine flows through the valley from south to north. It interacts with the sediments to form terraces, alluvial fans, gravel bars, etc. It is here that the lowest elevations of the study area, 85 m to 250 m a.m.s.l., are found.

The region is among the warmest in Central Europe, with mean air temperatures around 10 °C, and it receives 600 to 900 mm of rainfall per year (BORCHERDT 1991). The favourable climate and the fertile soil on extensive loess deposits are the basis for the high agricultural productivity of this region, where wine and fruit are grown (MOHR 1992).

Fig. 1.1 Map of the study area

Study Area: Black Forest

The Schwarzwald (Black Forest) is a mountain range characterized by steep valleys on the western side towards the river Rhine and gentler slopes on the eastern side towards the Danube.

The Northern and Eastern Schwarzwald has an average elevation of 600 to 800 m a.m.s.l. (highest peak: Hornisgrinde, 1164 m a.m.s.l.) and is dominated by New Red Sandstone. Due to the relatively permeable bedrock, the drainage network is not particularly dense.

The Southern and Western part of the Schwarzwald is the most elevated part of the study area, with mean elevations of 1000 m a.m.s.l. Feldberg is the highest point in the study area at 1493 m a.m.s.l., with an average air temperature of 3.2 °C (BORCHERDT 1991). This part also receives the most precipitation in the study area, up to 2100 mm/year. Since the top bedrock is composed of granite and gneiss with relatively low permeability, a significant amount of water drains on the surface, and a dense drainage network with a mean drainage density of 1.94 km/km² and a maximum of 5.0 km/km² (WUNDT 1953) has developed.

Fig. 1.1 Map of the study area

Study Area: Südwestdeutsches Schichtstufenland

The Südwestdeutsches Schichtstufenland (literally: “Southwest German step-layered land”) is characterized by a relatively level to rolling topography, which is slightly tilted towards the south-east. Its elevation ranges from 700 to 1000 m a.m.s.l. Mean annual air temperatures range from 6 to 9 °C. The region receives between 650 and 900 mm of rainfall per year.

The bedrock is composed of layers of sedimentary rocks, such as New Red Sandstone, Coquina, Keuper, and Jurassic, which exhibit karstic phenomena, such as dolines and sinkholes. Dry valleys are relics from periods of colder climate, when the ground was frozen so that more water drained on the surface. Drainage density today may be as low as 0.03 km/km² (WUNDT 1953).

In some areas, limestone is covered with loess, which causes an increase in drainage density. New Red Sandstone is also found in this region in alternating layers with Marl. Due to differential erosion and the inclination of these layers, a sequence of steps has formed in the landscape.

Fig. 1.1 Map of the study area

Study Area: Pre-Alps and Lake Constance

The Alpenvorland (Pre-Alps) is an area where unconsolidated sediments have been re-arranged by glaciers.

The area around Bodensee (Lake Constance) was affected by the most recent (Würm) ice age and has a quite pronounced relief with drumlins, lakes, and bogs of glacial origin. For the most part the area drains to Lake Constance, which is part of the river Rhine system. The lake is the result of glacial scouring. With a surface area of 538 km² and a maximum depth of 254 m, it is the largest German lake, and it supplies Stuttgart and several other cities with drinking water.

The Northern part of this landscape is a relic of the preceding (Riss) ice age and is therefore more levelled. Along the Danube, gravel with loess deposits can be found. The region lies at a mean elevation of 600 m a.m.s.l. It receives 750 to 1400 mm of precipitation, and its mean annual air temperature is between 6 and 7 °C.

Fig. 1.1 Map of the study area

Study Area: Climate (1)
  • The climate in the study area is the result of the interaction of oceanic and continental influences. While the latter is responsible for seasonality (with cold winters and hot summers), the dominating impact of the former leads to a more temperate climate. July is usually the warmest and January the coldest month.
  • The mean annual air temperature ranges from 3.2 °C at the Feldberg (highest elevation of the study area) to above 10°C in the Rhine Rift Valley (HUTTENLOCHER 1972).

Fig. 2.2 Mean annual precipitation (1961-90) [mm]

Study Area: Climate (2)
  • Precipitation in this area is predominantly caused by frontal (cyclonic) storms. This pattern is modified by orographic lifting. Therefore, the amount of precipitation is mostly a function of elevation and exposure. It ranges from 2100 mm/year (in the Western part of the Black Forest, such as the Feldberg) to 600 mm/year (in the sheltered areas of the Rhine Rift Valley).
  • During the summer, convective lifting may induce short-duration, high-intensity precipitation. The study area receives precipitation throughout the year, with a maximum in the summer (June to August) and a minimum in the late winter (February and March). There is snow on the ground for up to 150 days in the Black Forest (HUTTENLOCHER 1972).

Fig. 2.2 Mean annual precipitation (1961-90) [mm]

Study Area: Hydrology
  • Three quarters of the study area is drained by the river Rhine (the only alpine river flowing to the North Sea) and one quarter by the Danube. Since the area draining into the river Rhine falls more steeply, headward erosion allows the headwaters of the Rhine to tap into the Danube catchments.
  • It is difficult to map out the exact position of the European groundwater divide in this area since a significant amount of water drains in the karst system, part of which is diverted from the Danube into the Rhine (VILLINGER 1982). During low-flow periods, all the water from the Danube leaves the river bed between Immendingen and Fridingen through sink holes and continues to flow underground. Two thirds of this water ends up in the river Rhine system (BORCHERDT 1991).
  • With the lack of substantial tributaries and the loss of water to the river Rhine, the Danube remains a relatively small river until its alpine tributaries add to its flow in Bavaria, east of the study area.

Fig. 2.3 Catchments in Southwest Germany (Rhine and Danube)

Study Area: Drainage Network and Human Impact
  • The previously discussed variety of landscapes in Southwest Germany is reflected by the regional distribution of drainage density. It is easy to spot the low-lying Rhine Rift Valley as well as the Black Forest and the Pre-Alps, which receive the highest precipitation amounts in the study area. The high drainage density in these regions is indicated by the blue colours. In contrast, the Swabian Alb, part of the “Südwestdeutsches Schichtstufenland”, is easily distinguishable by the white shading. It has a very low drainage density due to widespread karstic phenomena.
  • Water management measures, such as water diversions and exports, stormwater ponds, reservoirs for the augmentation of low flows and groundwater extraction, are examples of how the hydrological cycle is being quantitatively impacted by human activity in this area. For our example only catchments with little human impact on the flow regime have been selected.

Fig. 2.4 Drainage density in Southwest Germany


Study Area: Runoff Regimes
  • The runoff regime in this region is dominated by the effects of rainfall and modified to some degree by snow melt. The highest flows usually occur between February and April and the lowest in August or September due to a summer maximum of evapotranspiration.

Figure 2.5 shows the Pardé coefficients (long-term mean monthly flow divided by the long-term mean annual flow) for two catchments in our region, ranging between 0.5 in late summer and 2.0 in the spring.

Fig. 2.5 Pardé coefficients for the Breg (at Hammereisenbach) and the Elz (at Mosbach)
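As a small worked illustration of the Pardé coefficient, the sketch below divides each mean monthly flow by the mean annual flow. The twelve monthly values are hypothetical and do not belong to the Breg or the Elz; weighting the annual mean by the number of days per month is also a choice made here, not something the tour specifies.

```python
import numpy as np

# Hypothetical long-term mean monthly flows [m3/s], January to December.
mean_monthly_flow = np.array(
    [4.0, 5.0, 6.0, 5.5, 4.0, 3.0, 2.5, 2.0, 2.0, 2.5, 3.0, 3.5]
)

# Days per month (non-leap year), so that the annual mean is time-weighted.
days = np.array([31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31])

mean_annual_flow = np.average(mean_monthly_flow, weights=days)

# Pardé coefficient: mean monthly flow relative to the mean annual flow.
parde = mean_monthly_flow / mean_annual_flow

for month, coefficient in enumerate(parde, start=1):
    print(f"month {month:2d}: {coefficient:.2f}")
```

Values above 1 mark months with above-average flow and values below 1 mark the low-flow season, which is how the regime plot in Figure 2.5 is read.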

Procedure: Outline

Data Acquisition (Step 1)
  • Catchment Selection
  • Selection of Catchment Descriptors
  • Deduction of Catchment Descriptors
  • Selection of Low-Flow Indices
  • Calculation of Low-Flow Indices
  • Data Splitting
      • Calibration Data Set (56 Stations): Catchment Descriptors (independent variables) and Low-Flow Indices (dependent variables)
      • Validation Data Set (27 Stations): Catchment Descriptors (independent variables) and Low-Flow Indices (dependent variables)

Model Design (Step 2)
  • Model Selection
  • Assumptions and Requirements
  • Multiple Linear Regression Model: $Y_i = b_0 + \sum_j b_j X_{ij} + e_i$

Model Calibration (Step 3)
  • Selection of Algorithms to depict the low-flow indices
  • Computation of Regional Transfer Functions:
      • $\mathrm{BASE} = b_0 + \sum_j b_j X_{ij} + e_i$
      • $\mathrm{MAM}(10) = b_0 + \sum_j b_j X_{ij} + e_i$
      • $\mathrm{Q90} = b_0 + \sum_j b_j X_{ij} + e_i$

Model Evaluation (Step 4)
  • Check for Sensibleness
  • Model Requirements

Model Validation (Step 5)
  • Check for agreement between observed and estimated values

Model Application

You may choose a specific step of the procedure or click on the arrows (bottom right) to proceed in sequence.

Procedure
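The multiple linear regression model in the outline can also be written in matrix form, which is how it is usually handled in computation. The block below is the standard textbook formulation, not taken from the tour: it assumes n catchments in the calibration data set (n = 56 here), p catchment descriptors, and an ordinary least-squares estimate of the coefficients. The tour does not state which estimator is used, so the last line is an assumption.

```latex
\begin{aligned}
Y_i &= b_0 + \sum_{j=1}^{p} b_j X_{ij} + e_i, \qquad i = 1, \dots, n \\
\mathbf{Y} &= \mathbf{X}\,\mathbf{b} + \mathbf{e}, \qquad
\mathbf{X} \in \mathbb{R}^{n \times (p+1)} \text{ with a leading column of ones} \\
\hat{\mathbf{b}} &= \left(\mathbf{X}^{\top}\mathbf{X}\right)^{-1}\mathbf{X}^{\top}\mathbf{Y}
\end{aligned}
```

Here $Y_i$ is the low-flow index of catchment $i$ (BASE, MAM(10), or Q90), $X_{ij}$ is its $j$-th catchment descriptor, $b_j$ are the regression coefficients, and $e_i$ is the residual.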


Data Acquisition: Overview
  • This preparatory step is foundational to the success of the whole analysis and estimation process. Our results can only be as good as the data we use as the basis of our calculations. Therefore, adequate resources and attention should be given to this crucial step.
  • The data used in the self-guided tour have been provided by different project groups and institutions, e.g. WaBoA, RIPS-Pool, LfU, LGRB (which are all part of the European Water Archive EWA), and the KLIWA project group. The applicable data associated with each of the respective catchments were entered into a two-dimensional spreadsheet, which can be accessed through the Data Section.
  • Click here to receive an explanation of the administrative acronyms

Steps of the data acquisition:
  • Catchment Selection
  • Selection of Catchment Descriptors
  • Deduction of Catchment Descriptors
  • Selection of Low-Flow Indices
  • Calculation of Low-Flow Indices
  • Data Splitting into Calibration Data Sets (56 Stations) and Validation Data Sets (27 Stations), each consisting of Catchment Descriptors (independent variables) and Low-Flow Indices (dependent variables)

Data Acquisition: Acronyms
  • The data used in the self-guided tour were provided by the following data pools, projects, and organisations:
  • Data Pools
      • RIPS-Pool – Räumliches Informations- und Planungssystem (Spatial Information and Planning System, State of Baden-Württemberg)
      • EWA – European Water Archive of the Northern European FRIEND project (Flow Regimes from International and Experimental Data)
  • Projects
      • WaBoA – Wasser und Boden Atlas von Baden-Württemberg (Water and Soil Atlas of the State of Baden-Württemberg)
      • KLIWA – Projekt Klimaänderung und Konsequenzen für die Wasserwirtschaft (Climatic Change and Impact on Water Resources Management)
  • Organisations
      • LfU – Landesanstalt für Umweltschutz (Environmental Agency, Regional Office, State of Baden-Württemberg)
      • LGRB – Landesanstalt für Geologie, Rohstoffe und Bergbau Baden-Württemberg (Regional Office for Geology, Commodities, and Mining, State of Baden-Württemberg)

Data Acquisition: Catchments
  • In a first step, the catchments to be considered for the analysis must be selected. The catchments for this study were selected based on the following FRIEND EWA criteria:
      • Availability of continuous runoff data
      • Precision in gauging low-water runoff. Accurate low-flow measurements are difficult to attain. According to MORGENSCHWEIS (1990), gauging errors of 10% are common and may, in the case of heavy vegetation in the river bed, even reach 30% (GLOS & LAUTERBACH 1972)
      • Negligible influence of human activity on low-water runoff
      • Negligible influence of glacial runoff on total streamflow
      • Availability of catchment descriptors
  • Based on these criteria, 83 medium-scale catchments were selected for this study (Figure 3.1).

Fig. 3.1 Spatial distribution of selected catchments (scale bar: 0 to 160 km)

Data Acquisition: Catchment Descriptors - Overview
  • The catchment descriptors are the independent variables in the model to be established. They were selected based on the following criteria (HAAS 2000):
      • agreement with hydrological principles
      • spatial representation with respect to climate, land use, morphometry, soil, and hydrogeology
      • experience in using these independent variables in other studies
      • availability for the study area
      • relatively easy calculations
      • possible interpretation as areal means
  • Click here to receive more information on the catchment descriptors

Tab. 1 Selected catchment descriptors

Soil
  • Percentage of soils with high infiltration capacity [%]
  • Percentage of soils with medium infiltration capacity [%]
  • Percentage of soils with low infiltration capacity [%]
  • Percentage of soils with very low infiltration capacity [%]
  • Mean hydraulic conductivity of the soils [cm/d]
  • Percentage of soils with low hydraulic conductivity [%]
  • Percentage of soils with high water-holding capacity in the effective root zone [%]
  • Mean water-holding capacity in the effective root zone [mm]

Hydrogeology
  • Percentage of rock formations with a very low hydraulic permeability [%]
  • Weighted mean of hydraulic conductivity [m/s]

Climate
  • Average annual precipitation [mm]

Land Use
  • Percentage of urbanisation [%]
  • Percentage of forest [%]

Morphometry
  • Catchment area [km²]
  • Drainage density [km/km²]
  • Highest elevation [m a.m.s.l.]
  • Average elevation [m a.m.s.l.]
  • Lowest elevation [m a.m.s.l.]
  • Maximum slope [%]
  • Average slope [%]
  • Minimum slope [%]

  • You may click on any of the categories (Climate, Morphology and Morphometry, Soil, Land Use, Hydrogeology) to receive more information on the catchment descriptors.

Fig. 3.2 Catchment Descriptors (PLATE 1992)

Data Acquisition: Morphology and Morphometry
  • AREA – Catchment area [km²]
  • The catchment area is defined as the “area having a common outlet for its surface runoff” (IHP/OHP 1998).
  • The descriptor was deduced from a 1:50,000 scale map of catchment boundaries provided by the Water and Soil Atlas of the State of Baden-Württemberg (WaBoA) and the RIPS-Pool.
  • DD – Drainage density [km/km²]
  • Drainage density is the “total channel-segment length, accumulated for all [stream] orders within a drainage area, divided by the area” (IHP/OHP 1998). For the deduction procedure, 1:50,000 scale maps of catchment boundaries and drainage network (WaBoA and RIPS-Pool) were combined.
  • HMIN – Lowest elevation [m a.m.s.l.]
  • HMAX – Highest elevation [m a.m.s.l.]
  • HMEAN – Average elevation [m a.m.s.l.]
  • The elevation data are based on a digital elevation model (50 m by 50 m cells), provided by the Water and Soil Atlas of the State of Baden-Württemberg (WaBoA) and the RIPS-Pool.
  • SLOPEMIN – Minimum slope [%]
  • SLOPEMAX – Maximum slope [%]
  • SLOPEMEAN – Mean slope [%]
  • Minimum, maximum and mean slopes were deduced using a digital elevation model.
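The morphometric descriptors are simple ratios and summary statistics, as the following minimal sketch shows. The helper name `drainage_density` and the tiny elevation grid are hypothetical. The channel length is back-calculated from the drainage density (1.31 km/km²) and catchment area (206.28 km²) given in the tour's descriptor list, so it is an implied value rather than a measured one.

```python
import numpy as np


def drainage_density(total_channel_length_km, area_km2):
    """Total channel length of all stream orders divided by catchment area."""
    if area_km2 <= 0:
        raise ValueError("catchment area must be positive")
    return total_channel_length_km / area_km2


# Channel length implied by DD = 1.31 km/km2 and AREA = 206.28 km2.
implied_length_km = 1.31 * 206.28
dd = drainage_density(implied_length_km, 206.28)

# Hypothetical elevation grid [m a.m.s.l.]; NaN marks cells outside the catchment.
dem = np.array(
    [
        [np.nan, 900.0, 1100.0],
        [650.0, 800.0, 1000.0],
        [500.0, 700.0, np.nan],
    ]
)

# Elevation descriptors computed over the cells inside the catchment only.
hmin = float(np.nanmin(dem))
hmax = float(np.nanmax(dem))
hmean = float(np.nanmean(dem))

print(f"implied channel length: {implied_length_km:.1f} km, DD: {dd:.2f} km/km2")
print(f"HMIN: {hmin}, HMAX: {hmax}, HMEAN: {hmean:.1f}")
```

Slope descriptors follow the same pattern once a slope value has been derived for every grid cell. That derivation is omitted here because the tour does not describe the slope algorithm.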

Data Acquisition: Land Use and Hydrogeology
  • Remote sensing was used to derive land use for the area (Landsat TM, 30 x 30 m grid, 1993). It was classified into 16 classes, which were aggregated to four groups: forest, farmland, grassland and settlements/urban areas.
  • Only the relative proportions of forest and urban areas were chosen to be included in this self-guided tour.
  • URBAN – Percentage of urbanisation [%]
  • FOREST – Percentage of forest [%]
  • URBAN is an aggregation of settlement areas and areas with large-scale surface sealing due to industry. The latter cover 0.8% of the study area. Settlements comprise loose (1.9%) and dense (4.6%) settlements.
  • FOREST is a combination of deciduous (7.8%) and coniferous (21.4%) forest and other forested areas (10.0%).
  • GEOHCMEAN – Weighted mean of hydraulic conductivity [m/s]
  • GEOVLHP – Percentage of rock formations with a very low hydraulic permeability [%]
  • From a 1:350,000 scale map produced by the Regional Authority for Geology, Commodities, and Mining of Baden-Württemberg (LGRB), 98 geological classes were reduced to 54 hydro-geological classes and aggregated to eight groups.
  • Each group was associated with a mean hydraulic conductivity of the upper hydro-geological unit. From these values, a weighted mean was produced for each catchment. From the same data, the proportion of rock formations with a mean hydraulic conductivity of less than 10⁻⁵ m/s was derived.
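The two hydrogeological descriptors can be sketched as follows. The group areas and conductivities are invented for illustration, and the use of an area-weighted arithmetic mean is an assumption: the tour only says that a "weighted mean" was produced. The 10⁻⁵ m/s threshold is the one stated in the text.

```python
import numpy as np

# Hypothetical areas [km2] of the eight hydro-geological groups in one catchment.
group_area_km2 = np.array([12.0, 30.0, 8.0, 0.0, 50.0, 20.0, 5.0, 25.0])

# Hypothetical mean hydraulic conductivity of each group [m/s].
group_k = np.array([1e-3, 5e-4, 1e-4, 5e-5, 1e-5, 5e-6, 1e-6, 1e-7])

# Area fractions serve as weights (assumption: arithmetic, area-weighted mean).
weights = group_area_km2 / group_area_km2.sum()

# GEOHCMEAN: weighted mean of hydraulic conductivity [m/s].
geohcmean = float(np.sum(weights * group_k))

# GEOVLHP: share of the catchment with conductivity below 1e-5 m/s [%].
very_low = group_k < 1e-5
geovlhp = float(100.0 * weights[very_low].sum())

print(f"GEOHCMEAN = {geohcmean:.3e} m/s, GEOVLHP = {geovlhp:.1f} %")
```

Because conductivities span several orders of magnitude, an arithmetic mean is dominated by the most permeable groups. A geometric mean would behave differently, which is one reason the weighting scheme is flagged as an assumption.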

Data Acquisition: Soil (1)
  • The classification of the soil water regime was based on a study by the Regional Authority for Geology, Commodities, and Mining of Baden-Württemberg (LGRB). They produced a 1:350,000 scale map of 29 soil water regime classes based on soil type, humus content, packing, slope, and geology.
  • These classes were aggregated to four groups of soil types based predominantly on their infiltration capacity, which is defined as the “maximum rate at which water can be absorbed by a given soil per unit area under given conditions” (IHP/OHP 1998).
  • SOILH – Percentage of soils with high infiltration capacity [%]
  • These soils, such as sand and gravel soils, exhibit a high infiltration capacity even under conditions of high antecedent soil water content.
  • SOILM – Percentage of soils with medium infiltration capacity [%]
  • Examples of soils with a medium infiltration capacity are loamy soils and loess of medium depth.
  • SOILL – Percentage of soils with low infiltration capacity [%]
  • The low infiltration capacity of these soils is due to their fine texture and/or the impermeability of one or more layers, as found in shallow sandy and loamy soils.
  • SOILVL – Percentage of soils with very low infiltration capacity [%]
  • The infiltration capacity of these soils is very low because they are shallow, are composed of hardly permeable material (such as clay), or have a high groundwater level.

Data Acquisition: Soil (2)
  • SOILHCMEAN – Mean hydraulic conductivity of the soils [cm/d]
  • SOILLHC – Percentage of soils with low hydraulic conductivity [%]
  • Hydraulic conductivity is a “property of a saturated porous medium which determines the relationship, called Darcy’s law, between the specific discharge and the hydraulic gradient causing it” (IHP/OHP 1998).
  • From a 1:200,000 scale map with 9 classes, areal means were deduced. The lowest two classes (with a mean hydraulic conductivity of less than 2.3 × 10⁻⁶ m/s) were combined for the calculation of the percentage of soils with low hydraulic conductivity.
  • ROOTSMEAN – Mean water-holding capacity in the effective root zone [mm]
  • ROOTSHIGH – Percentage of soils with high water-holding capacity in the effective root zone [%]
  • The data for these descriptors are based on a map produced by the Regional Authority for Geology, Commodities, and Mining of Baden-Württemberg (LGRB), which shows the distribution of water-holding capacity for a theoretical soil depth of 100 cm.
  • Water-holding capacity is defined as “water in the soil available to plants. It is normally taken as the water in the soil between wilting point and field capacity. In this context water-holding capacity is used and is identical to the available water” (IHP/OHP 1998).
  • Based on information on soil type, land use, root depth, and water-logging conditions, the water-holding capacity values were adjusted to the estimated effective root zone. These values were then used to compute the areal mean. A threshold mean water-holding capacity was set at 200 mm. All classes above this threshold were aggregated to “soils with high water-holding capacity in the effective root zone”, and their proportion was calculated.

Data Acquisition: Climate
  • AAR – Average annual precipitation [mm]
  • The data for the average annual precipitation were derived from a digital map provided by the Water and Soil Atlas of the State of Baden-Württemberg (WaBoA) and the RIPS-Pool. It shows average annual precipitation for the period 1961-1990 on a 500 m grid.
  • For this map, average annual precipitation had been calculated from the relationship between precipitation depth and altitude, combined with distance-weighting from the points of measurement. The raw data for the production of this map were provided by the German Weather Service (DWD).

Fig. 2.2 Mean annual precipitation (1961-90) [mm]
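One common way to combine a precipitation-altitude relationship with distance-weighting is sketched below. The tour does not say how the two were combined for the WaBoA map, so the scheme here (a linear altitude regression plus inverse-distance-weighted station residuals) is an assumption, and the station values and the helper names `idw` and `estimate_precip` are hypothetical.

```python
import numpy as np


def idw(xy_known, values, xy_target, power=2.0):
    """Inverse-distance-weighted estimate at one target point."""
    distances = np.linalg.norm(xy_known - xy_target, axis=1)
    if np.any(distances == 0.0):
        # Target coincides with a station: return that station's value.
        return float(values[np.argmin(distances)])
    w = 1.0 / distances**power
    return float(np.sum(w * values) / np.sum(w))


# Hypothetical stations: position [km], altitude [m a.m.s.l.], precipitation [mm].
station_xy = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0]])
station_alt = np.array([200.0, 400.0, 800.0, 1200.0])
station_precip = np.array([700.0, 900.0, 1350.0, 1800.0])

# Step 1: linear relationship between precipitation depth and altitude.
slope, intercept = np.polyfit(station_alt, station_precip, 1)

# Step 2: station residuals, to be spread over the grid by distance-weighting.
residuals = station_precip - (intercept + slope * station_alt)


def estimate_precip(xy_cell, alt_cell):
    """Altitude trend plus the distance-weighted residual at one grid cell."""
    trend = intercept + slope * alt_cell
    return trend + idw(station_xy, residuals, np.asarray(xy_cell, dtype=float))


print(f"estimate at (5, 5) km, 600 m: {estimate_precip([5.0, 5.0], 600.0):.0f} mm")
```

With this scheme the estimate reproduces each station's observed value at the station itself, while altitude controls the estimate away from the stations.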

data acquisition low flow indices

1

Data AcquisitionLow-Flow Indices

4

  • Several low-flow indices have been developed to describe the statistical distribution of flow. The low-flow indices are the independent variables in our model. The estimation procedure was performed for the following three low-flow indices.
  •  the mean base flow, BASE,
  •  the mean annual 10-day-minimum flow, MAM(10), and
  •  the 90 percentile runoff, i.e. the runoff to be equalled or exceeded 90% of the time, Q90.
  • The low-flow indices are calculated from daily flow data for the entire data set. Our regression analysis will be performed separately for all three dependent variables.

Data Acquisition: BASE

  • The method of base flow estimation from daily flow data was developed by WUNDT (1958) and KILLE (1970) and modified by DEMUTH (1993). It serves as an example of how more complex indices can be obtained in an automated and objective way.
  • The approach is based on the analysis of monthly minimum flows. It assumes that for the most part monthly minimum flows are equivalent to the mean base flows of the respective months. Monthly minimum runoff values are extracted from a time series of at least ten years, and the individual values are ranked in an ascending order and plotted (Figure 3.3).
  • The points of the ranked flow data are similar to a flow duration curve with the lower values arranged approximately along a straight line. At the critical point the slope of the curve sharply increases. It is assumed that flows beyond the critical point are not “pure” base flow.

Fig. 3.3 Monthly minimum runoff values, ranked in ascending order (Elsenz at Meckesheim, No. 460, 1966-90). Axes: Rank, Streamflow [m3/s].

  • A stepped linear regression is computed to find the line which separates the flow values which are influenced by surface and subsurface flow from “pure” base flow values (Figure 3.4).
  • The step regression starts with the values between the 5% and the 50% mark. Successively, values beyond the 50% mark are included in the regression and the correlation coefficient is re-computed until it reaches a maximum. This value is called the critical point.
  • Between the 5% value and the new critical point a straight line is interpolated and extended in both directions to correct the higher flows to “true” base flow. Finally, all flow values are adjusted to the straight line and the mean base flow is calculated (yellow arrows).
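The stepped regression described above can be sketched in Python as follows. This is a minimal illustration, not the DEMUTH (1993) implementation: the function names are hypothetical, the critical point is taken as the window end with the highest correlation coefficient over all candidate windows, the fitted regression line stands in for the interpolated straight line, and the sample series is synthetic and much shorter than the ten years the method calls for.

```python
def fit_line(xs, ys):
    """Least-squares line y = intercept + slope * x and Pearson r."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / (sxx * syy) ** 0.5
    return intercept, slope, r


def mean_base_flow(monthly_minima):
    """Return (mean base flow, rank of the critical point)."""
    flows = sorted(monthly_minima)            # ranked in ascending order
    n = len(flows)
    ranks = list(range(1, n + 1))
    start = int(0.05 * n)                     # skip values below the 5% mark
    stop = int(0.50 * n)                      # first window ends at the 50% mark
    best = None
    for end in range(stop, n + 1):            # extend the window value by value
        intercept, slope, r = fit_line(ranks[start:end], flows[start:end])
        if best is None or r > best[0]:
            best = (r, end, intercept, slope)
    _, critical_rank, intercept, slope = best
    # Adjust every rank to the straight line and average the adjusted values
    adjusted = [intercept + slope * k for k in ranks]
    return sum(adjusted) / n, critical_rank


# Synthetic series: 20 values on a straight line, then a sharp rise
minima = [1.0 + 0.1 * k for k in range(1, 21)] + [5.0, 8.0, 12.0, 20.0]
base, critical_rank = mean_base_flow(minima)
print(base, critical_rank)
```

In this synthetic case the four steep values at the upper end lower the correlation as soon as they enter the window, so they are excluded from the line and replaced by the extrapolated values.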

Fig. 3.4 Monthly minimum runoff values, ranked in ascending order, with the 5% value, the 50% value, the critical value, and BASE marked (Elsenz at Meckesheim, No. 460, 1966-90). Axes: Rank, Streamflow [m3/s].

  • The DEMUTH procedure can only be applied to the S-shaped curve (Type I). The parabolic curve (Type II) does not allow a linear reduction procedure.
  • In our data sets, all flow data belonged to Type I and could be used for the deduction of BASE for the respective catchment.

Fig. 3.5 Type I and Type II curves. Panels: Type I, Type II. Axes: Rank, Streamflow [m3/s].

Data Acquisition: MAM(10)

  • The MAM(10) value is calculated by selecting the annual ten-day minimum values (AM(10)) of discharge from each year of the observation period and computing the arithmetic mean of this set of values (Figure 3.6).
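A minimal Python sketch of this calculation follows. It assumes that the annual ten-day minimum AM(10) is the lowest 10-day moving-average flow within each year, which the slide does not state explicitly; the function name, the dictionary layout, and the flow values are hypothetical.

```python
def mean_annual_minimum(daily_flows_by_year, window=10):
    """Mean of the annual minima of the `window`-day moving-average flow."""
    annual_minima = []
    for year in sorted(daily_flows_by_year):
        flows = daily_flows_by_year[year]
        # Moving averages over every run of `window` consecutive days
        moving_means = [sum(flows[i:i + window]) / window
                        for i in range(len(flows) - window + 1)]
        annual_minima.append(min(moving_means))   # AM(window) of this year
    return sum(annual_minima) / len(annual_minima)


# Two short hypothetical "years" of daily flow [m3/s]
flows_by_year = {
    1966: [5.0] * 20 + [1.0] * 10 + [5.0] * 20,
    1967: [4.0] * 15 + [2.0] * 10 + [6.0] * 15,
}
mam10 = mean_annual_minimum(flows_by_year)
print(mam10)
```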

Fig. 3.6 Annual 10-day minimum values of discharge and their arithmetic mean (Elsenz at Meckesheim, No. 460, 1966-90). Axes: Year, Streamflow [m3/s].

Data Acquisition: Q90

  • In a Flow Duration Curve (FDC) the observed flow data is ranked in descending order. It displays the relationship between a discharge value and the percentage of time during which it is equalled or exceeded.
  • The Q90 is the value which is equalled or exceeded 90% of the time, in this case 0.9 m3/s (Figure 3.7).
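Reading a percentile off the flow duration curve can be sketched in Python as below. The function name and the sample values are hypothetical, and the sketch uses the simplest possible rule (rank divided by sample size, no plotting-position formula or interpolation), which is an assumption rather than the procedure used in the tour.

```python
def flow_exceeded(daily_flows, percent=90):
    """Flow that is equalled or exceeded in `percent` % of the records."""
    ranked = sorted(daily_flows, reverse=True)   # descending, as in an FDC
    n = len(ranked)
    # Smallest 1-based rank k with k / n >= percent / 100 (integer ceiling)
    k = max(1, -(-percent * n // 100))
    return ranked[k - 1]


# Ten hypothetical daily flows [m3/s]
sample = [float(v) for v in range(1, 11)]
q90 = flow_exceeded(sample, percent=90)
print(q90)
```

For the ten sample values, nine of them are greater than or equal to the returned flow, i.e. it is equalled or exceeded 90% of the time.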

Fig. 3.7 Flow Duration curve and deduction of the 90 percentile (Elsenz at Meckesheim, No. 460, 1966-90). Axes: Percentiles, Streamflow [m3/s].

Data Acquisition: Data Splitting

  • The final step in preparing the data is to split the acquired data set arbitrarily in order to produce two sets for model calibration and validation, respectively. This is called Data Splitting.
  • It is advisable to split the data with a ratio of about 2 to 1, ensuring that both data sets reflect the physiographic properties of the region under study.
  • The Baden-Württemberg data set was split into 56 stations for calibration and 27 stations for validation.
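A simple way to produce such a split is a seeded random shuffle, sketched below in Python. The station numbers, the function name, and the seed are hypothetical. Note that a purely random split does not by itself guarantee that both subsets reflect the physiographic properties of the region, so the result would still need to be checked by the modeller.

```python
import random


def split_stations(station_ids, n_calibration, seed=0):
    """Randomly split station ids into calibration and validation sets."""
    rng = random.Random(seed)          # fixed seed keeps the split reproducible
    shuffled = list(station_ids)
    rng.shuffle(shuffled)
    return sorted(shuffled[:n_calibration]), sorted(shuffled[n_calibration:])


stations = list(range(1, 84))          # 83 hypothetical station numbers
calibration, validation = split_stations(stations, n_calibration=56)
print(len(calibration), len(validation))
```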

[Diagram: Data Splitting. The Catchment Descriptors (independent variables) and Low-Flow Indices (dependent variables) are divided into a Calibration Data Set (56 Stations) and a Validation Data Set (27 Stations).]

Procedure: Outline

Data Acquisition (Step 1)
  • Catchment Selection
  • Selection of Catchment Descriptors
  • Deduction of Catchment Descriptors
  • Selection of Low-Flow Indices
  • Calculation of Low-Flow Indices
  • Data Splitting into a Calibration Data Set (56 Stations) and a Validation Data Set (27 Stations), each with Catchment Descriptors (independent variables) and Low-Flow Indices (dependent variables)

Model Design (Step 2)
  • Model Selection: Multiple Linear Regression Model $Y_i = b_0 + \sum_{j=1}^{P} b_j X_{ij} + e_i$
  • Assumptions and Requirements

Model Calibration (Step 3)
  • Selection of Algorithms to depict the low-flow indices
  • Computation of Regional Transfer Functions of the form $Y_i = b_0 + \sum_{j=1}^{P} b_j X_{ij} + e_i$ for BASE, MAM(10), and Q90

Model Evaluation (Step 4)
  • Check for Sensibleness
  • Model Requirements

Model Validation (Step 5)
  • Check for agreement between observed and estimated values

Model Application

Model Design: Model Selection

  • In the self-guided tour, the multiple regression approach is chosen since it is easy to handle, produces fast results, and is efficiently implemented in most statistics programs.
  • The purpose of multiple regression analysis, as defined by HOLDER (1985), is to “assess the combined effect of several variables on a single variable.” The regression analysis thereby allows for the recognition and interpretation of statistical relationships.
  • The understanding gained from this analysis can be used to estimate a dependent variable based on several independent variables. In our model, the independent variables are the catchment descriptors, and the dependent variables are the low-flow indices.

By applying the regression approach we assume that the relationship between a low-flow index Y and its catchment descriptors X can be expressed as follows:

$$Y_i = b_0 + \sum_{j=1}^{P} b_j X_{ij} + e_i$$

with i = 1, ..., N and j = 1, ..., P,

where Yi is the dependent variable, b0 is the constant, and the bj are the regression coefficients. Xij signifies the catchment descriptor j of the catchment i. N is the total number of data sets (samples) and P is the total number of independent variables; finally, ei is the error term (DEMUTH 1993).
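To make the model equation concrete, the following Python sketch estimates b0 and the bj by ordinary least squares with NumPy (the tour itself uses SPSS). The function name and the small data set are hypothetical; the data were constructed without an error term so that the fitted coefficients can be checked by eye.

```python
import numpy as np


def fit_mlr(X, y):
    """Least-squares estimates [b0, b1, ..., bP] and the residuals e_i."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # A leading column of ones carries the constant b0
    design = np.column_stack([np.ones(len(y)), X])
    coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coeffs
    return coeffs, residuals


# N = 6 hypothetical catchments, P = 2 hypothetical descriptors
X = [[1, 2], [2, 1], [3, 5], [4, 3], [5, 8], [6, 4]]
y = [1 + 2 * x1 - 0.5 * x2 for x1, x2 in X]    # exact relationship, no noise
coeffs, residuals = fit_mlr(X, y)
print(coeffs)
```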

Model Design: Assumptions and Requirements

  • In order for the multiple regression model to yield the “best linear unbiased estimates” (LEWIS-BECK 1986), six assumptions have to be made. They become requirements when we want to make predictions based on this analysis:

1. The model is free of specification error

2. The data set is free of measurement error

3. Homoscedasticity: The variance of the error term is constant for all values of the independent variables

4. The error term is neither auto-correlated nor correlated with the independent variables

5. The error term follows normal distribution

6. The model is free of multicollinearity


  • 1. The model is free of specification error
  • We must assume that
  • the independent variables Xi (e.g. catchment size, areal precipitation) are linearly related to the dependent variable Y (e.g. Q90), and their effect on Y is truly additive or multiplicative (depending on the model chosen), and
  • all relevant independent variables have been included in the model while all irrelevant independent variables have been excluded (LEWIS-BECK 1986).
  • It is the responsibility of the modeller to use all available statistical and physical knowledge to minimize specification error. Tests for statistical significance aid in identifying variables that should not be in the model.

2. The data set is free of measurement error

The model relies on the quality of the data. We must be confident that the variables Xi and Yi have been measured accurately.

The fulfilment of this condition is problematic, particularly since low-flow measurements are usually associated with an error on the order of 10 to 30% (GLOS & LAUTERBACH 1972).


  • 3. Homoscedasticity: The variance of the error term is constant for all values of the independent variables
  • The assumption of homoscedasticity is true when a plot of residuals versus predicted values of Y produces a horizontal band with uniform width (Figure 3.8).
  • If this condition is not met, the estimated indices will not have a minimal variance. Consequently, the general procedures related to t-test, F-test, and confidence intervals will not be valid anymore. Therefore, the evaluation of a regression model implies a proper investigation of the residuals or estimation error.

Fig. 3.8 Check for Homoscedasticity. Axes: Predicted Values of Y, Standardised Residuals.

  • If the plot is a tilted band with equal width (Figure 3.9) either an error has occurred in the calculations or the model fails to accurately model changes in Y.
  • In such a case, a transformation of Y or the inclusion of polynomial terms of X in the model may serve as a remedy (HOLDER 1985).

Fig. 3.9 Check for Homoscedasticity. Axes: Predicted Values of Y, Standardised Residuals.

  • If the band does not have equal width (Figure 3.10), the variance of the error term is not constant. Reasons for this deviation may include the increase of variability with increasing Y or the increase of errors for greater Y.
  • This can be corrected through
  • the application of a weighted least squares procedure (LEWIS-BECK 1986),
  • a transformation of the variables,
  • elimination of part of the values, or
  • fitting several models to different ranges of values (HOLDER 1985).
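The first remedy in the list above, weighted least squares, can be sketched with NumPy as follows. The function name and the data are hypothetical, and the weights are simply supplied by the caller; a common choice, assumed here, is to weight each observation by the inverse of its error variance, so that observations with larger scatter count less.

```python
import numpy as np


def weighted_least_squares(X, y, weights):
    """Estimates [b0, b1, ..., bP] minimising sum(w_i * residual_i ** 2)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    root_w = np.sqrt(np.asarray(weights, dtype=float))
    design = np.column_stack([np.ones(len(y)), X])
    # Scaling each row by sqrt(w_i) turns the weighted problem into ordinary least squares
    coeffs, *_ = np.linalg.lstsq(design * root_w[:, None], y * root_w, rcond=None)
    return coeffs


# Hypothetical example: error variance assumed to grow with x squared
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0 + 3.0 * v for v in x]                  # exact line, no noise
w = [1.0 / v ** 2 for v in x]                   # inverse-variance weights
wls_coeffs = weighted_least_squares([[v] for v in x], y, w)
print(wls_coeffs)
```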

Fig. 3.10 Check for Homoscedasticity. Axes: Predicted Values of Y, Standardised Residuals.

  • 4. The error term is neither auto-correlated nor correlated with the independent variables
  • If this condition is not met, significance tests and confidence intervals will be invalid (HOLDER 1985).
  •   In cases where measurements were collected in a sequence or as part of a time series, it is possible that time (even though it is not specified as a separate independent variable) has an effect on the error term. This can be checked when error is plotted versus time or sequence number of measurements. If an (auto-)correlation is detected, the acquired data needs to be corrected with respect to time.
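As a numerical companion to the plot of error versus time, the Python sketch below computes two simple indicators from a sequence of residuals: their correlation with the sequence number (a trend) and their lag-1 autocorrelation. Both indicators and the function names are assumptions of this sketch; the tour itself relies on visual inspection of the plot.

```python
def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / (sxx * syy) ** 0.5


def time_checks(residuals):
    """Return (correlation with sequence number, lag-1 autocorrelation)."""
    sequence = list(range(len(residuals)))
    trend_r = pearson_r(sequence, residuals)
    lag1_r = pearson_r(residuals[:-1], residuals[1:])
    return trend_r, lag1_r


# Hypothetical residuals that drift steadily upwards over time
drifting = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
trend_r, lag1_r = time_checks(drifting)
print(trend_r, lag1_r)
```

Values close to zero for both indicators would be consistent with the assumption; values close to 1 or -1, as in this constructed example, point to an effect of time on the error term.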

Fig. 3.11 Assessment of the effect of time on the error term. Panels: a, b. Axes: Time, Standardised Residuals.

  • If a trend is visible (Figure 3.11-a) then time has explanatory value and should be included in the model. Possibly the measurement procedure has induced a systematic error over time or the property to be measured is undergoing a change.
  • It is also possible that the plot changes in width over time (Figure 3.11-b). This variability of the variance of the error can be a result of increased precision of the measuring technique over time (HOLDER 1985).

  • A correlation between the error term and an independent variable (Figure 3.12) may occur when a significant variable has been left out of the model and is now accounted for partially by the error term and partially by the other independent variables (LEWIS-BECK 1986).
  • If an independent variable and the error term are correlated “the least squares parameter estimates will be biased” (LEWIS-BECK 1986).

Fig. 3.12 Assessment of correlation between the error term and an independent variable. Axes: Values of X, Standardised Residuals.

  • 5. The error term follows normal distribution
  • The fulfilment of this requirement can be assessed visually by comparing a histogram of the residuals (Figure 3.13) or a cumulative distribution of the error term (Figure 3.14) to the expected normal distribution, or mathematically by computing skewness.
  • Since the X values are fixed, a normal distribution of the error term corresponds to a normal distribution of Y (LEWIS-BECK 1986). This means that, for the assumption to be fulfilled, the data used in the regression analysis also need to follow a near-normal distribution.
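The skewness mentioned above can be computed directly from the residuals. The Python sketch below uses the moment coefficient of skewness (third central moment divided by the 1.5th power of the second); the tour does not say which skewness formula was used, so this choice, the function name, and the sample values are assumptions.

```python
def skewness(values):
    """Moment coefficient of skewness; 0 for a symmetric sample."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n   # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n   # third central moment
    return m3 / m2 ** 1.5


symmetric = [-2.0, -1.0, 0.0, 1.0, 2.0]     # hypothetical residuals
right_skewed = [1.0, 1.0, 1.0, 1.0, 10.0]   # one large positive residual
print(skewness(symmetric), skewness(right_skewed))
```

A value near zero is consistent with a symmetric (e.g. normal) distribution, while a clearly positive or negative value indicates a long tail on one side.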

Fig. 3.13 Frequency distribution of residuals. Axes: Residuals, Frequency.

Fig. 3.14 Probability plot of residuals. Axes: Observed cumulative probability of residuals, Predicted cumulative probability of residuals.

  • If the error term is not normally distributed tests of significance and confidence interval statements will become questionable. However, “the tests of significance appear to be insensitive to non-normality in Y whenever the Xs themselves come from a near-normal distribution. On the other hand, if the Xs themselves do not come from a near-normal distribution and if some X values are very different in magnitude from the remainder, then the tests of significance are very sensitive to non-normality in Y” (LEWIS-BECK 1986).


  • 6. The model is free of multicollinearity
  • Multicollinearity means that one independent variable can be expressed, exactly or nearly, as a linear combination of the remaining independent variables in the model.
  • This is problematic because it produces large variances for the slope estimates, resulting in large standard errors, so that parameter estimates become unreliable (LEWIS-BECK 1986). Furthermore, multicollinearity complicates the interpretation of the regression equation (SCHREIBER 1996).
  • To detect multicollinearity, each independent variable is regressed on all other independent variables of the model (LEWIS-BECK 1986). DEMUTH (1993) uses 0.8 as the upper limit for the coefficient of determination. Variable combinations that are more strongly inter-correlated must be rejected.
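The detection procedure described above can be sketched with NumPy: each independent variable is regressed on all the others, and the resulting coefficients of determination are compared with the 0.8 limit. The function name and the sample data are hypothetical; in the example, the second variable is an exact linear function of the first, so both are flagged.

```python
import numpy as np


def r2_against_others(X):
    """R2 of each column of X regressed on all remaining columns."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    results = []
    for j in range(p):
        target = X[:, j]
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coeffs, *_ = np.linalg.lstsq(others, target, rcond=None)
        ss_res = np.sum((target - others @ coeffs) ** 2)
        ss_tot = np.sum((target - target.mean()) ** 2)
        results.append(float(1.0 - ss_res / ss_tot))
    return results


x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
x2 = 2.0 * x1 + 1.0                              # exact linear function of x1
x3 = np.array([3.0, 1.0, 4.0, 1.0, 5.0, 9.0])
r2_values = r2_against_others(np.column_stack([x1, x2, x3]))
flagged = [j for j, r2 in enumerate(r2_values) if r2 > 0.8]
print(r2_values, flagged)
```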

The problem of multicollinearity can be addressed by enlarging the sample size or by combining the problematic variables to form a single indicator (e.g. through principal component analysis).

The third option, excluding the problematic variable, introduces specification error to the model (see assumption 1)! Comparing the new (reduced) model with the original model can help in the assessment of the significance of this error (LEWIS-BECK 1986).

The seriousness of violations of the above assumptions is debated in the scientific literature. What can be said is that the above conditions differ in their robustness. For example, while the assumption of normality (5) is relatively robust for large samples, specification errors (1) generally cause grave problems (LEWIS-BECK 1986).

Model Calibration: Selection of independent variables

  • In the process of applying an objective procedure to select those variables which correlate highly with the target value, but not with each other, several considerations must be made:
  • First of all, it is clear that the goal of the modeller should be to explain the observed phenomena as accurately as possible.

However, it must be kept in mind that adding further variables to the model increases the risk of introducing variables whose correlation with the target value is coincidental.

  • Furthermore, the specific explanatory value of the model may decrease with increasing degrees of freedom. In an extreme case, if the number of variables equals or exceeds the number of samples, the coefficient of determination is always 1 and the amount of information gained virtually zero. This is called overfitting.
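The extreme case of overfitting can be reproduced with a few lines of NumPy. In this sketch both the independent variables and the target are random noise (hypothetical data with a fixed seed), yet the coefficient of determination comes out as 1 because there are as many variables as samples.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples = 5
X = rng.normal(size=(n_samples, n_samples))    # as many variables as samples
y = rng.normal(size=n_samples)                 # target is pure noise

design = np.column_stack([np.ones(n_samples), X])
coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
ss_res = float(np.sum((y - design @ coeffs) ** 2))
ss_tot = float(np.sum((y - y.mean()) ** 2))
r2 = 1.0 - ss_res / ss_tot
print(r2)
```

The fit is perfect although the variables carry no information about the target, which is why such a model has no explanatory value.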

[Diagram: Pool of independent variables; Multiple Linear Regression Model $Y_i = b_0 + \sum_{j=1}^{P} b_j X_{ij} + e_i$; Regional Transfer Function]


  • Including fewer variables in the model can help in understanding the dominant processes. As a rule of thumb, the number of independent variables should not exceed a third of the sample size (BACKHAUS et al. 1996).
  • Computations are performed with SPSS, a statistical computing package. Automated selection procedures are based on statistical indices, such as the coefficient of determination and significance.
  • While these procedures help identify potential variables to be included in the model, statistical significance must always be balanced with scientific knowledge. It is the responsibility of the modeller to deduce equations that have both statistical and physical significance.


  • The program selects variables from a given pool based on their contribution towards the explanation of the variance of the target value. Applying the least-squares method ensures that the sum of squared residuals (the part of the variation in the target value that cannot be explained by the function) is minimized.
  • The Stepwise selection technique (with F-values of 0.1 and 0.2 for accepting and rejecting variables into the model, respectively) is chosen for the selection of independent variables. This procedure helps the modeller understand how the coefficient of determination increases as more variables are accepted into the model.
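The way the coefficient of determination grows as variables enter the model can be illustrated with a simplified forward selection in Python. This is not the SPSS Stepwise procedure: it only adds variables (it never removes them) and it ranks candidates by R2 instead of applying F-based acceptance and rejection criteria. The function names, the variable names A, B, C, and the data are hypothetical.

```python
import numpy as np


def r2_of_fit(X, y):
    """R2 of a least-squares fit of y on the columns of X plus a constant."""
    design = np.column_stack([np.ones(len(y)), X])
    coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
    ss_res = np.sum((y - design @ coeffs) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


def forward_selection(pool, y, max_vars):
    """Add, step by step, the variable that raises R2 the most."""
    y = np.asarray(y, dtype=float)
    chosen, history = [], []
    while len(chosen) < max_vars:
        candidates = [name for name in pool if name not in chosen]
        if not candidates:
            break
        scored = [(r2_of_fit(np.column_stack([pool[v] for v in chosen + [c]]), y), c)
                  for c in candidates]
        best_r2, best_name = max(scored)
        chosen.append(best_name)
        history.append((list(chosen), best_r2))
    return history


pool = {
    "A": [1, 2, 3, 4, 5, 6],
    "B": [2, 1, 4, 3, 6, 6],
    "C": [1, 0, 1, 0, 1, 0],
}
y = [3 * a + c for a, c in zip(pool["A"], pool["C"])]   # depends on A and C only
history = forward_selection(pool, y, max_vars=2)
for names, r2 in history:
    print(names, round(r2, 3))
```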


  • Test model runs produced negative values for the constant b0 in the model equations. As a result, the predicted values for low-flow indices in the lower range were mostly too low and often negative. To reduce the occurrence of negative estimates of the low-flow indices the regression was forced through the origin, which means that the constant b0 was set to zero.
  • Tables 2 to 4 show the results of the selection procedures for each of the three low-flow indices BASE, MAM(10), and Q90. All three procedures show a similar pattern: as more variables are included in the model, the goodness of the model, represented by the coefficient of determination R2, increases, but with diminishing returns.
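Forcing the regression through the origin, as described above, amounts to fitting the model without a constant term. A minimal NumPy sketch follows; the function name and the data are hypothetical, and the data were constructed without an error term so that the result is easy to verify.

```python
import numpy as np


def fit_through_origin(X, y):
    """Least-squares coefficients [b1, ..., bP] with the constant b0 fixed at zero."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # No column of ones is added, so the fitted surface passes through the origin
    coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coeffs


X = [[1, 0], [0, 1], [1, 1], [2, 1]]
y = [2 * x1 + 3 * x2 for x1, x2 in X]
origin_coeffs = fit_through_origin(X, y)
print(origin_coeffs)
```

Setting b0 to zero removes a negative constant, but negative estimates can still occur when some of the remaining coefficients are negative.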

Model Calibration: Regional Transfer Function

  • BASE
  • A simple linear regression for BASE with only one independent variable, AREA, explains 81% of the variation in BASE. Adding the next three most suitable independent variables (SOILVL, SLOPEMEAN, and ROOTSMEAN) raises the coefficient of determination to 87%. After the fourth step, the procedure terminated because the requirements of significance were not met for any additional variable.
  • Our preliminary model after the fourth step of the selection procedure is the model with the highest coefficient of determination that includes only statistically significant variables.

Table 2 Results of the regression analysis for BASE

Independent variables                   corrected R2
AREA                                    0.81
AREA, SOILVL                            0.82
AREA, SOILVL, SLOPEMEAN                 0.84
AREA, SOILVL, SLOPEMEAN, ROOTSMEAN      0.87

AREA        Catchment area [km2]
SOILVL      Percentage of soils with very low infiltration capacity [%]
SLOPEMEAN   Mean slope [%]
ROOTSMEAN   Mean water-holding capacity in the effective root zone [mm]

$$\mathrm{BASE} = 7.3 \cdot 10^{-3}\,\mathrm{AREA} - 0.416\,\mathrm{SOILVL} + 3.9 \cdot 10^{-2}\,\mathrm{SLOPEMEAN} - 2.5 \cdot 10^{-3}\,\mathrm{ROOTSMEAN}$$

R2 = 0.87, s.e. = 0.33
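Applying the regional transfer function for BASE to a catchment is a direct evaluation of the equation above, as the short Python sketch below shows. The coefficients are those of the equation; the function name and the descriptor values are hypothetical and do not describe a real catchment.

```python
def estimate_base(area, soilvl, slopemean, rootsmean):
    """Evaluate the regional transfer function for BASE.

    area: catchment area [km2]; soilvl: soils with very low infiltration
    capacity [%]; slopemean: mean slope [%]; rootsmean: mean water-holding
    capacity in the effective root zone [mm].
    """
    return (7.3e-3 * area
            - 0.416 * soilvl
            + 3.9e-2 * slopemean
            - 2.5e-3 * rootsmean)


# Hypothetical descriptor values for an ungauged catchment
base_estimate = estimate_base(area=500.0, soilvl=2.0, slopemean=10.0, rootsmean=150.0)
print(round(base_estimate, 3))
```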


Model Calibration: Coefficient of Determination
  •   The coefficient of determination (R2) is an indicator for the strength of the relationships represented in the regression model.
  • With the regression model we seek to provide a better prediction for the scatter plot than the arithmetic mean. The total deviation of each value from the mean can be split into one portion, which is explained by our regression line, and the other portion, which remains unexplained, also called error. Figure 3.15 provides a simplified two-dimensional illustration of the components of the variation in Y.
  • The coefficient of determination is defined as the sum of squared explained deviations divided by the sum of squared total deviations. It can range from 0 to 1; 1 being a perfect fit where all deviations from the mean can be explained through the regression line.  
  • For each set of independent variables, the statistics program computes the regression equation for which the sum of squared unexplained deviations reaches a minimum. Then, R2 is calculated.

Fig. 3.15 Components of deviation from Y: explained deviation and unexplained deviation. Axes: X, Y.

$$R^2 = \frac{\text{sum of squared explained deviations}}{\text{sum of squared total (explained and unexplained) deviations}}$$
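The ratio defined above can be traced step by step for a simple regression with one independent variable, as in the following Python sketch. The function name and the five data points are hypothetical; the sketch also returns the three sums of squares so that the split of the total deviation into an explained and an unexplained part can be checked.

```python
def r_squared(xs, ys):
    """Return (R2, explained SS, unexplained SS, total SS) for a fitted line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    predicted = [intercept + slope * x for x in xs]
    explained = sum((p - mean_y) ** 2 for p in predicted)
    unexplained = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    total = sum((y - mean_y) ** 2 for y in ys)
    return explained / total, explained, unexplained, total


xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]
r2, explained, unexplained, total = r_squared(xs, ys)
print(round(r2, 3), round(explained, 3), round(unexplained, 3), round(total, 3))
```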

Model Calibration: Standard Error
  • Besides the coefficient of determination, the standard error is the second statistic that describes the quality of the regression equation.
  • It is an estimate of the standard deviation of the actual Y from the predicted Y and gives an idea of the average error that goes along with predicting Y on the basis of the given regression equation.

The standard error of estimate of Y is defined as follows (LEWIS-BECK 1986):

$$\text{s.e.} = \sqrt{\frac{\sum_{i=1}^{n} (Y_{i,obs} - Y_{i,pred})^2}{n-2}}$$

where Yi,obs is the observed value of the dependent variable Y and Yi,pred is the predicted value of the dependent variable Y. The difference between Yi,obs and Yi,pred is also called the prediction error.
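A direct Python transcription of this definition is given below, keeping the n-2 denominator of the formula as stated. The function name and the observed and predicted values are hypothetical.

```python
import math


def standard_error(observed, predicted):
    """Standard error of estimate with the n - 2 denominator used above."""
    n = len(observed)
    squared_errors = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared_errors / (n - 2))


observed = [2.0, 4.0, 5.0, 4.0, 5.0]
predicted = [2.8, 3.4, 4.0, 4.6, 5.2]
se = standard_error(observed, predicted)
print(round(se, 4))
```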

Model Calibration: Regional Transfer Function

  • MAM(10)
  • The regression analysis for MAM(10) produces a pattern similar to that for BASE. Again, four variables were included before the requirements for accepting further variables could no longer be met.

Table 3 Results of the regression analysis for MAM(10)

Independent variables                   corrected R2
AREA                                    0.84
AREA, SOILVL                            0.87
AREA, SOILVL, SLOPEMEAN                 0.88
AREA, SOILVL, SLOPEMEAN, DD             0.89

AREA        Catchment area [km2]
SOILVL      Percentage of soils with very low infiltration capacity [%]
SLOPEMEAN   Mean slope [%]
DD          Drainage density [km/km2]

$$\mathrm{MAM(10)} = 4.5 \cdot 10^{-3}\,\mathrm{AREA} - 0.4\,\mathrm{SOILVL} + 2.1 \cdot 10^{-2}\,\mathrm{SLOPEMEAN} - 0.1\,\mathrm{DD}$$

R2 = 0.89, s.e. = 0.19


  • Q90
  • As with the previous low-flow indices, the regression analysis produced a model with four independent variables.

Table 4 Results of the regression analysis for Q90

Independent variables                   corrected R2
AREA                                    0.81
AREA, SOILVL                            0.83
AREA, SOILVL, SLOPEMEAN                 0.86
AREA, SOILVL, SLOPEMEAN, ROOTSMEAN      0.88

AREA        Catchment area [km2]
SOILVL      Percentage of soils with very low infiltration capacity [%]
SLOPEMEAN   Mean slope [%]
ROOTSMEAN   Mean water-holding capacity in the effective root zone [mm]

$$\mathrm{Q90} = 4.9 \cdot 10^{-3}\,\mathrm{AREA} - 0.5\,\mathrm{SOILVL} + 2.5 \cdot 10^{-2}\,\mathrm{SLOPEMEAN} - 1.5 \cdot 10^{-3}\,\mathrm{ROOTSMEAN}$$

R2 = 0.88, s.e. = 0.21

Procedure

model calibration coefficient of determination2

3

Model Calibration Coefficient of Determination
  •   The coefficient of determination (R2) is an indicator for the strength of the relationships represented in the regression model.
  • With the regression model we seek to provide a better prediction for the scatter plot than the arithmetic mean. The total deviation of each value from the mean can be split into one portion, which is explained by our regression line, and the other portion, which remains unexplained, also called error. Figure 3.15 provides a simplified two-dimensional illustration of the components of the variation in Y.
  • The coefficient of determination is defined as the sum of squared explained deviations divided by the sum of squared total deviations. It can range from 0 to 1; 1 being a perfect fit where all deviations from the mean can be explained through the regression line.  
  • For each set of independent variables, the statistics program computes the regression equation for which the sum of squared unexplained deviations reaches a minimum. R² is then calculated from this equation.

Fig. 3.15 Components of deviation from Y (explained and unexplained deviation; axes: X, Y)

$$ R^2 = \frac{\text{sum of squared explained deviations}}{\text{sum of squared total (explained and unexplained) deviations}} $$
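A minimal sketch of this ratio, taking the explained deviation as the distance of each predicted value from the mean of the observed values. The function name is an assumption; for a least-squares fit with an intercept the result equals one minus the unexplained share of the total.

```python
def coefficient_of_determination(observed, predicted):
    """Sum of squared explained deviations over sum of squared total deviations."""
    mean_observed = sum(observed) / len(observed)
    # Explained: predicted value minus mean of the observations
    explained = sum((p - mean_observed) ** 2 for p in predicted)
    # Total: observed value minus mean of the observations
    total = sum((o - mean_observed) ** 2 for o in observed)
    if total == 0:
        raise ValueError("observed values show no variation")
    return explained / total
```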

Model Calibration Standard Error
  • Besides the coefficient of determination, the standard error is the second statistic that describes the quality of the regression equation.
  • It estimates the standard deviation of the observed Y about the predicted Y and thus indicates the average error incurred when Y is predicted from the given regression equation.

The standard error of estimate of Y is defined as follows (LEWIS-BECK 1986):

$$ s.e. = \sqrt{\frac{\sum_{i=1}^{n} \left(Y_{i,obs} - Y_{i,pred}\right)^2}{n-2}} $$

where Y_i,obs is the observed value of the dependent variable Y, Y_i,pred is the predicted value of the dependent variable Y, and n is the number of observations. The difference between Y_i,obs and Y_i,pred is also called the prediction error.

Model Calibration Review: Q90

  • In a flow duration curve (FDC), the observed flow data are ranked in descending order. The curve displays the relationship between a discharge value and the percentage of time during which it is equalled or exceeded.
  • The Q90 is the discharge that is equalled or exceeded 90% of the time, in this case 0.9 m³/s (Figure 3.7).

Fig. 3.7 Flow duration curve and derivation of the 90th percentile (Elsenz at Meckesheim, No. 460, 1966-90). Axes: percentiles, streamflow [m³/s].
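The ranking idea behind the flow duration curve can be sketched as follows. The function name and the simple rank-based rule (no plotting-position formula or interpolation) are assumptions made for illustration; statistical packages may use slightly different conventions.

```python
def flow_exceeded(flows, percent=90):
    """Discharge equalled or exceeded in `percent` % of the records."""
    if not flows:
        raise ValueError("no flow data")
    ranked = sorted(flows, reverse=True)  # descending, as in a flow duration curve
    n = len(ranked)
    # Smallest rank i (1-based) with i / n >= percent / 100, in integer arithmetic
    rank = max(1, (percent * n + 99) // 100)
    return ranked[rank - 1]
```

Calling `flow_exceeded(daily_flows, 90)` on a daily discharge record would give an estimate of Q90 under these assumptions.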

Procedure Outline

Data Acquisition (Step 1)
  • Catchment Selection
  • Selection of Catchment Descriptors
  • Deduction of Catchment Descriptors
  • Selection of Low-Flow Indices
  • Calculation of Low-Flow Indices
  • Data Splitting into a Calibration Data Set (56 Stations) and a Validation Data Set (27 Stations), each with Catchment Descriptors (independent variables) and Low-Flow Indices (dependent variables)

Model Design (Step 2)
  • Model Selection
  • Assumptions and Requirements
  • Multiple Linear Regression Model: $Y_i = b_0 + \sum_j b_j X_{ij} + e_i$

Model Calibration (Step 3)
  • Selection of Algorithms to depict the low-flow indices
  • Computation of Regional Transfer Functions for BASE, MAM(10) and Q90, each of the form $b_0 + \sum_j b_j X_{ij} + e_i$

Model Evaluation (Step 4)
  • Check for Sensibleness
  • Model Requirements

Model Validation (Step 5)
  • Check for agreement between observed and estimated values

Model Application

Model Evaluation Consideration of Sensibleness

  • After having established the model, it is of utmost importance to critically review and evaluate its quality and applicability. We will do so in this self-guided tour by first looking at the established model itself (which we call evaluation) and then performing a validation procedure on the basis of the validation data set, which has been set aside for this purpose.
  • Both the evaluation and the validation procedures will be shown for the Q90 model. The same procedures would have to be applied to the BASE and the MAM(10) model to check their validity. 
  • First, we can make a general statement about the model by looking at the coefficient of determination, R2. With an R2 of 0.88 the model accounts well for the variation of the low-flow parameter Q90.

$$ Q90 = 4.9 \cdot 10^{-3}\,\mathrm{AREA} - 0.5\,\mathrm{SOILVL} + 2.5 \cdot 10^{-2}\,\mathrm{SLOPEMEAN} - 1.5 \cdot 10^{-3}\,\mathrm{ROOTSMEAN} $$

R² = 0.88, s.e. = 0.21

Model Evaluation Consideration of Sensibleness

  • A second check of the model is to evaluate its sensibleness, that is, to check whether the effect of each independent variable on the target value agrees with our (current) hydrological knowledge and understanding.
  • AREA
  • The regression analysis has shown that catchment area has a positive effect on the Q90. This result agrees with our current hydrological understanding, since the amount of precipitation received and the storage volume are generally proportional to catchment size.

$$ Q90 = 4.9 \cdot 10^{-3}\,\mathrm{AREA} - 0.5\,\mathrm{SOILVL} + 2.5 \cdot 10^{-2}\,\mathrm{SLOPEMEAN} - 1.5 \cdot 10^{-3}\,\mathrm{ROOTSMEAN} $$

R² = 0.88, s.e. = 0.21

Model Evaluation Consideration of Sensibleness

  • SOILVL
  • Where the percentage of soils with very low infiltration capacity is high, only a small portion of the received precipitation can infiltrate and percolate through the soil to recharge groundwater, which predominantly feeds the base flow in our regions.
  • As the percentage of soils with very low infiltration capacity increases, low flows will decrease. Therefore the negative effect of SOILVL on the Q90 seems sensible.

$$ Q90 = 4.9 \cdot 10^{-3}\,\mathrm{AREA} - 0.5\,\mathrm{SOILVL} + 2.5 \cdot 10^{-2}\,\mathrm{SLOPEMEAN} - 1.5 \cdot 10^{-3}\,\mathrm{ROOTSMEAN} $$

R² = 0.88, s.e. = 0.21

Model Evaluation Consideration of Sensibleness

  • SLOPEMEAN
  • From a process-oriented angle, one would argue that steeper slopes are characterized by shallower soils and faster flows. As a result, such catchments would have a lower retention potential and lower base flows, whereas the model assigns SLOPEMEAN a positive effect on the Q90.
  • At this point, knowledge of our study area becomes important. It is possible that in our study area 'slope' is closely correlated with climatic characteristics. The dominant mountain range in the region is the Black Forest, which is steepest on its western edge. Westerly winds and steep slopes give rise to a significant amount of orographic precipitation in the western parts of the Black Forest.
  • Therefore, it makes sense that slope is an indirect descriptor of precipitation. It is interesting to note, however, that the annual precipitation AAR was not included in the model.

$$ Q90 = 4.9 \cdot 10^{-3}\,\mathrm{AREA} - 0.5\,\mathrm{SOILVL} + 2.5 \cdot 10^{-2}\,\mathrm{SLOPEMEAN} - 1.5 \cdot 10^{-3}\,\mathrm{ROOTSMEAN} $$

R² = 0.88, s.e. = 0.21

Model Evaluation Consideration of Sensibleness

  • ROOTSMEAN
  • The mean water-holding capacity of the effective root zone can be regarded as a storage compartment from which evapotranspiration losses occur. Hence, the more water is stored in this compartment, the less water will be available as base flow. The negative effect of ROOTSMEAN on Q90 thus reflects a known hydrological principle.
  • We can conclude that the statistical findings of the multiple regression analysis do not contradict our hydrological experience, and we can accept the model as being statistically and physically sound.

$$ Q90 = 4.9 \cdot 10^{-3}\,\mathrm{AREA} - 0.5\,\mathrm{SOILVL} + 2.5 \cdot 10^{-2}\,\mathrm{SLOPEMEAN} - 1.5 \cdot 10^{-3}\,\mathrm{ROOTSMEAN} $$

R² = 0.88, s.e. = 0.21

Model Evaluation Check for Model Requirements

  • Another way of evaluating the model is to compare the observed Q90 values with the values predicted by the established model.
  • Since both sets of values stem from the same (calibration) data set, this comparison is not sufficient as a validation of the model. However, it gives us an idea of how well the observed data have been reproduced by the calibration process.

Fig. 3.16 Predicted vs. observed values of Q90

Model Evaluation Check for Model Requirements

  • A first brief look at the plot of predicted vs. observed values indicates two things:
  •  First, we can see that even though the regression was forced to go through the origin the predicted Q90 values for part of the data set are still negative. These results are, of course, nonsensical.
  •  Secondly, the linear trend shows a deviation from the perfect fit (1:1 line). It seems like our model will tend to underestimate the Q90 for higher flows.
  • The same observations are true for the other two models and will have to be kept in mind during further analysis.

Fig. 3.16 Predicted vs. observed values of Q90
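The two observations above (negative predictions and a trend line flatter than the 1:1 line) can be checked numerically. The sketch below is only an illustration: the function names are assumptions, and any numbers passed to it would have to come from the calibration spreadsheet, which is not reproduced here.

```python
def count_negative(predicted):
    """Number of physically meaningless (negative) predicted flows."""
    return sum(1 for p in predicted if p < 0)


def slope_predicted_on_observed(observed, predicted):
    """Least-squares slope of predicted against observed values.

    A slope below 1 means the trend line is flatter than the 1:1 line,
    i.e. high flows tend to be underestimated.
    """
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    sxy = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    sxx = sum((o - mean_o) ** 2 for o in observed)
    return sxy / sxx
```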

Model Evaluation Check for Model Requirements

  • Next, we will evaluate the validity of the model by examining whether the previously established requirements of a valid multiple regression model hold for our model:

1. The model is free of specification error

2. The data set is free of measurement error

3. Homoscedasticity: The variance of the error term is constant for all values of the independent variables

4. The error term is neither auto-correlated nor correlated with the independent variables

5. The error term follows a normal distribution

6. The model is free of multicollinearity

Model Evaluation Check for Model Requirements

  • (1) The model is free of specification error
  • Statistical significance tests were already performed when the independent variables were selected. In addition, we reviewed the model with regard to current knowledge of hydrological processes and approved the equation given by the regression analysis. We can therefore be confident that our model does not contain a variable that does not belong in it.
  • It is, however, impossible to gather and process data on every conceivable catchment characteristic. Nevertheless, the pool of available descriptors included a variety of properties, so we trust that the model does not lack any variable that should have been included.
  • Finally, we trust that an additive model gives us the best approximation of the natural relationships. A multiplicative model was also tested but gave less significant results.

(2) The data set is free of measurement error

In Step 1 we checked the data for inconsistencies and excluded those catchments for which the data did not comply with our requirements.

However, we must be aware that even consistent data are subject to error, since our low-flow measurements could only be obtained with limited precision.

Furthermore, the independent variables contain error as well, all of which is, by the nature of regression analysis, attributed to the error associated with Y.

Model Evaluation Check for Model Requirements

  • (3) Homoscedasticity: The variance of the error term is constant for all values of the independent variables
  • In Figure 3.17 the residuals are plotted against the values of Y. The points cluster around the X-axis except for a few outliers, which are unevenly distributed: for the four largest Q90 values the model predicts significantly lower flows. This reflects the tendency of our procedure to underestimate the Q90 at higher flows.
  • It should be noted, however, that this figure shows absolute deviations. In relative terms, these deviations amount to only 15 to 30% of the observed values.

Fig. 3.17 Residuals of Y as a function of values of Y (for the Q90 regression)

Model Evaluation Check for Model Requirements

  • (3) Homoscedasticity: The variance of the error term is constant for all values of the independent variables (continued)
  • The catchments which produced the largest residuals in the calibration data set for Q90 are distributed across several regions of the study area. This pattern for Q90 is similar to those of the other two low-flow indices.
  • However, all four of the respective catchments are situated in areas where karst phenomena are quite frequent. In such areas, it is often very difficult to determine the actual size of the catchment. AREA is the most significant independent variable in all three models, which makes them very sensitive to inaccuracies in this descriptor.

Fig. 3.18 Catchments that produced "outliers" (map of the study area)

Model Evaluation Check for Model Requirements

  • (3) Homoscedasticity: The variance of the error term is constant for all values of the independent variables (continued)
  • To improve the model with regard to homoscedasticity, we could either eliminate the seemingly problematic data sets or accept this weakness of the model and regard those four points as valid and valuable outliers.
  • In this case we refrain from further reducing our data set and accept all remaining data. The outliers can be regarded as a reflection of the natural variance in our study area. Omitting those values would narrow the range of values used for the calibration of our model and would therefore limit its applicability.

Fig. 3.18 Catchments that produced "outliers" (map of the study area)

Model Evaluation Check for Model Requirements

  • (4) The error term is neither auto-correlated nor correlated with the independent variables.
  • When the low-flow indices were derived from the discharge time series, those data sets that exhibited inconsistencies (which are an indicator that auto-correlation exists) were already excluded from further analysis.
  • Time is not among the variables in our pool of independent variables. Since the other measurements are not based on a sequence over time that could be traced back, we concluded that auto-correlation is not a problem in this analysis.

Model Evaluation Check for Model Requirements

  • (4) The error term is neither auto-correlated nor correlated with the independent variables (continued)
  • In our test of homoscedasticity we have observed four outliers. The fact that three of these outliers correspond to catchments which are among the largest in the study area suggests that there may be a correlation between catchment size and the error term.
  • In Figure 3.19, residuals are plotted versus catchment area. We can see that the data points more or less crowd around the x-axis forming a horizontal band. Catchment area does therefore not seem to be significantly correlated with the error term.

Fig. 3.19 Residuals as a function of catchment area [km²] (for the Q90 regression)
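The visual impression from the residual plot could be complemented by a correlation coefficient. The sketch below computes Pearson's r between catchment area and residuals; the function name is an assumption, and the tour itself relies on the graphical check only.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)
```

A value close to zero for `pearson_r(areas, residuals)` would support the conclusion that catchment area is not linearly correlated with the error term.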

Model Evaluation Check for Model Requirements

  • (5) The error term follows a normal distribution
  • Two graphical procedures were employed for a visual check of normality: the frequency distribution of residuals (Figure 3.20) and the probability plot (Figure 3.21).
  • We can infer that the distribution of the error term, though far from perfect, is reasonably symmetrical around zero, unimodal, and not significantly skewed. It comes acceptably close to a normal distribution, so the above assumption is justified.

Fig. 3.20 Frequency distribution of residuals (Q90 regression). Axes: residuals, frequency.

Fig. 3.21 Probability plot of standardized residuals (Q90 regression). Axes: observed cumulative probability, predicted cumulative probability (both from 0 to 1).

Model Evaluation Check for Model Requirements

  • (6) The model is free of multicollinearity
  • When each of the four independent variables in our model is expressed as a linear combination of the other three, the following results are obtained for the calibration data set (Table 5):
  • The selected catchment descriptors exhibit different degrees of inter-dependence, with the coefficient of determination ranging from 0.04 to 0.29, which is far below the previously set upper limit of 0.8. We will accept the variables as being practically independent of each other.

Table 5 Multicollinearity of catchment descriptors

Variable combination                         Corrected R²
AREA = f (SOILVL, SLOPEMEAN, ROOTSMEAN)      0.04
SOILVL = f (AREA, SLOPEMEAN, ROOTSMEAN)      0.08
SLOPEMEAN = f (AREA, SOILVL, ROOTSMEAN)      0.29
ROOTSMEAN = f (AREA, SOILVL, SLOPEMEAN)      0.25
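The R² values in Table 5 can also be expressed as variance inflation factors, VIF = 1 / (1 - R²), a common alternative way of reporting the same check. This is a sketch added for illustration, not part of the tour; note that it uses the corrected R² values from the table, whereas the VIF is normally based on the uncorrected R².

```python
# R2 of each descriptor regressed on the other three (values from Table 5)
TABLE_5_R2 = {"AREA": 0.04, "SOILVL": 0.08, "SLOPEMEAN": 0.29, "ROOTSMEAN": 0.25}
R2_LIMIT = 0.8  # upper limit stated in the tour


def variance_inflation_factor(r2):
    """VIF = 1 / (1 - R2); grows without bound as R2 approaches 1."""
    if not 0 <= r2 < 1:
        raise ValueError("R2 must lie in [0, 1)")
    return 1.0 / (1.0 - r2)


for name, r2 in TABLE_5_R2.items():
    status = "ok" if r2 < R2_LIMIT else "collinear"
    print(f"{name:10s} R2={r2:.2f} VIF={variance_inflation_factor(r2):.2f} {status}")
```

Under this conversion the limit of 0.8 corresponds to a VIF of about 5, and all four descriptors stay well below it.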

Model Evaluation Regional Transfer Functions

  • After having undergone the equivalent considerations and tests as the model for Q90, the regression equations for BASE and MAM(10) were also accepted as regional transfer functions for the estimation of low flows at the ungauged site.
  • This concludes the model evaluation.
  • However, the hardest test of all is still ahead: The validation of the model by means of the previously isolated validation data.


Model Validation Observed vs. Predicted Q90

  • The second data set, which had so far been left out of consideration, is now used to validate our transfer function.
  • For the validation data set the respective three low-flow indices are predicted on the basis of the catchment descriptors using the established regional transfer functions. Figure 3.23 is a visual comparison between the observed and the predicted Q90.
  • Two effects can be noticed here:
  • 1. The points scatter around the perfect-fit line, which means that individual values are overestimated or underestimated to varying degrees.
  • 2. The blue line exhibits a slightly steeper slope than the perfect-fit line. This is probably due to the outlier in the top right-hand corner.

Fig. 3.23 Validation: predicted Q90 vs. observed Q90

Model Validation Model Bias

  • Figure 3.24 shows the relative deviation of the predicted Q90 values from the observed ones, calculated as the model bias:

$$ \text{Deviation of the predicted from the observed Q90 [\%]} = \frac{Q90_{predicted} - Q90_{observed}}{Q90_{observed}} \cdot 100\% $$

Fig. 3.24 Validation: Relative deviation of the predicted from the observed Q90. Axes: observed Q90 [m³/s], deviation of the predicted from the observed Q90 [%].
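The model bias formula translates directly into code. The function name below is an assumption chosen for illustration.

```python
def relative_deviation(predicted, observed):
    """Deviation of the predicted from the observed value in percent."""
    if observed == 0:
        raise ValueError("observed value must be non-zero")
    return (predicted - observed) / observed * 100.0
```

A positive result indicates overestimation and a negative result underestimation; because the observed value is the denominator, small observed flows can produce very large percentages.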

Model Validation Relative Deviation

  • It is apparent that the model fails for a few values, giving relative deviations of several hundred percent.
  • Based on the previous evaluation this is something that we should almost expect, given the limitations we have already discovered. Going back to the raw data confirms that the deviation is not due to descriptors lying significantly outside the calibration range.
  • Rather, it is a reflection of the natural variability of hydrological processes in the catchment, which cannot be characterized completely with those independent variables. Another effect to be taken into consideration is the accumulation of inaccuracies, both in the data set and in the model.

Fig. 3.24 Validation: Relative deviation of the predicted from the observed Q90. Axes: observed Q90 [m³/s], deviation of the predicted from the observed Q90 [%].

Model Validation Summary

  • For a more specific validation, we have now zoomed into the plot and look only at those values that lie within the ±50% range. Table 6 gives the statistics of this plot.
  • For Q90, close to half of the values (45%) were predicted with a deviation of less than 50%.

Table 6 Summary of the validation results

Evaluation        Deviation [%]    Share of data sets [%]
                                   Q90    MAM(10)    BASE
Very good         <10              15     11         7
Good              10-30            15     26         15
Satisfactory      30-50            15     11         11
Unsatisfactory    >50              56     52         67

Fig. 3.25 Validation: Relative deviation of the predicted from the observed Q90 (+/- 50% range shown). Axes: observed Q90 [m³/s], deviation of the predicted from the observed Q90 [%].
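The classes of Table 6 can be assigned programmatically, as sketched below. The handling of the class boundaries (a deviation of exactly 30% counts as "Good", exactly 50% as "Satisfactory") is an assumption, since the table does not say which side the limits belong to; the function names are illustrative as well.

```python
def classify_deviation(deviation_percent):
    """Map a relative deviation [%] to the evaluation classes of Table 6."""
    magnitude = abs(deviation_percent)  # over- and underestimation treated alike
    if magnitude < 10:
        return "Very good"
    if magnitude <= 30:
        return "Good"
    if magnitude <= 50:
        return "Satisfactory"
    return "Unsatisfactory"


def class_shares(deviations):
    """Percentage of data sets falling into each evaluation class."""
    counts = {"Very good": 0, "Good": 0, "Satisfactory": 0, "Unsatisfactory": 0}
    for deviation in deviations:
        counts[classify_deviation(deviation)] += 1
    return {name: 100.0 * count / len(deviations) for name, count in counts.items()}
```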

Model Validation Conclusion

  • This concludes the validation process. In light of the limitations we discovered, we can say that the established regional transfer functions can produce good results but should be applied with caution.
  • If the validation had failed, we would have had to conclude that our multiple regression approach is altogether inadequate.

Application Original Problem Statement
  • We now come back to our original problem statement: We desired to determine the Q90 for the Wiese catchment without relying on local flow data.
  • By means of multiple regression analysis among other catchments in the region we were able to find a common regional pattern which describes this relationship between catchment descriptors and the Q90.

Fig. 1.2 Wiese catchment (No. 532), showing the site of the proposed hydro-power plant for which the Q90 is to be estimated

Application Original Problem Statement
  • Assuming that the same relationship is true for the Wiese catchment, we can use the previously established regional transfer function and estimate the desired Q90 value at our ungauged site based on the respective Wiese catchment descriptors.
  • Before applying the model we made sure that the Wiese catchment descriptors were within the range of the catchment descriptors of the calibration data set.
  • The following two slides bring our self-guided tour to a conclusion by showing the final calculation process.

Fig. 1.2 Wiese catchment, showing the site of the proposed hydro-power plant for which the Q90 is to be estimated

Application Available Data on the Wiese Catchment

Soil
  Percentage of soils with high infiltration capacity: 0.19%
  Percentage of soils with medium infiltration capacity: 0%
  Percentage of soils with low infiltration capacity: 99.81%
  Percentage of soils with very low infiltration capacity: 0%
  Mean hydraulic conductivity of the soils: 201.69 cm/d
  Percentage of soils with low hydraulic conductivity: 0%
  Percentage of soils with high water-holding capacity in the effective root zone: 0%
  Mean water-holding capacity in the effective root zone: 109 mm

Morphometry
  Catchment area: 206.28 km²
  Drainage density: 1.31 km/km²
  Highest elevation: 1485.5 m a.m.s.l.
  Lowest elevation: 423.6 m a.m.s.l.
  Average elevation: 898.74 m a.m.s.l.
  Maximum slope: 45.54%
  Minimal slope: 0%
  Average slope: 18.08%

Climate
  Annual precipitation: 1891 mm

Land Use
  Percentage of urbanisation: 2%
  Percentage of forested area: 63%

Hydrogeology
  Percentage of rock formations with a very low hydraulic permeability: 0%
  Weighted mean of hydraulic conductivity: 0.000962 m/s

Application Estimation of the Q90

Of the available data on the Wiese catchment (see the previous slide), four descriptors enter the regional transfer function:

AREA        Catchment area: 206.28 km²
SOILVL      Percentage of soils with very low infiltration capacity: 0%
SLOPEMEAN   Average slope: 18.08%
ROOTSMEAN   Mean water-holding capacity in the effective root zone: 109 mm

$$ Q90 = 4.879 \cdot 10^{-3}\,\mathrm{AREA} - 0.457\,\mathrm{SOILVL} + 2.506 \cdot 10^{-2}\,\mathrm{SLOPEMEAN} - 1.540 \cdot 10^{-3}\,\mathrm{ROOTSMEAN} $$

$$ Q90 = 4.879 \cdot 10^{-3} \cdot 206.28 - 0.457 \cdot 0 + 2.506 \cdot 10^{-2} \cdot 18.08 - 1.540 \cdot 10^{-3} \cdot 109 \approx 1.006 - 0 + 0.453 - 0.168 $$

Q90 = 1.29 m³/s
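The final calculation can be reproduced with a few lines of code. The coefficients and the Wiese descriptor values are those shown on the slides; the function and variable names are assumptions made for this sketch.

```python
# Coefficients of the Q90 regional transfer function as given on the slide
Q90_COEFFICIENTS = {
    "AREA": 4.879e-3,       # catchment area [km2]
    "SOILVL": -0.457,       # soils with very low infiltration capacity [%]
    "SLOPEMEAN": 2.506e-2,  # mean slope [%]
    "ROOTSMEAN": -1.540e-3, # mean water-holding capacity in the root zone [mm]
}

# Descriptors of the Wiese catchment as listed in the tour
WIESE = {"AREA": 206.28, "SOILVL": 0.0, "SLOPEMEAN": 18.08, "ROOTSMEAN": 109.0}


def estimate_q90(descriptors, coefficients=Q90_COEFFICIENTS):
    """Evaluate the linear transfer function (no intercept) for one catchment."""
    return sum(coefficients[name] * descriptors[name] for name in coefficients)


print(f"Q90 = {estimate_q90(WIESE):.2f} m3/s")
```

As the tour points out, such an estimate should only be made when the descriptors lie within the range of the calibration data set.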

Conclusion

In this self-guided tour you have learned in which hydrological context you may use a multiple linear regression procedure to estimate low-flow indices at the ungauged site.

The learning-by-screening method (step by step) gave you the opportunity not only to learn at your own pace but also to simultaneously apply the method to your own data. You have seen an example of how to develop a conceptual model, translate a conceptual model into a statistical model, calibrate, evaluate, and validate the model.

You should note that the self-guided tour is a practical introduction towards the design and application of multiple regression models. A detailed discussion about the theoretical background is found in the appropriate literature. For your own exercise or review we have included the data sets used in this self-guided tour, both low-flow indices and catchment descriptors.

  • We welcome your comments and suggestions which should be submitted to the following address:
  • Prof. Dr. Siegfried Demuth
  • IHP/OHP-Sekretariat
  • International Hydrological and Operational Programme of UNESCO and WMO
  • Mainzer Tor 1
  • 59068 Koblenz, Germany
  • Demuth@bafg.de

Data Overview
  • These EXCEL spreadsheets contain the data used for calibration and validation respectively as well as summary statistics and the plots shown in this self-guided tour.
  • Spreadsheet 1: Calibration
  • Spreadsheet 2: Validation
  • The documents below are in SPSS format. The Calibration Data sheet can be used directly for regression analysis of the data while the Results document is an output summary.
  • Regression: Calibration Data
  • Regression: Results
  • The original flow data can be found on the CD-ROM under Data/Regional Data Set.

Appendices Overview
  • You may choose from the following categories:
  • Catchment Descriptors: acronyms, means of deduction, units
  • Data Sources: data pools, projects, and organisations
  • References: background and previous research
  • Acknowledgements: thanks to the numerous contributors
  • Contact Information: we appreciate your feedback

Appendices - Catchment Descriptors Overview

  • The catchment descriptors are grouped into the following categories:
  • Climate
  • Morphology and Morphometry
  • Soil
  • Land Use
  • Hydrogeology

Fig. 3.2 Catchment Descriptors (PLATE 1992)

Appendices - Catchment Descriptors Morphology and Morphometry

  • AREA - Catchment area [km²]
  • The catchment area is defined as the “area having a common outlet for its surface runoff” (IHP/OHP 1998).
  • The descriptor was deduced from a 1:50,000 scale map of catchment boundaries provided by the Water and Soil Atlas of the State of Baden-Württemberg (WaBoA) and the RIPS-Pool.
  • DD - Drainage density [km/km²]
  • Drainage density is the “total channel-segment length, accumulated for all [stream] orders within a drainage area, divided by the area” (IHP/OHP 1998). For the deduction procedure, 1:50,000 scale maps of catchment boundaries and drainage network (WaBoA and RIPS-Pool) were combined.

HMIN – Lowest elevation [m a.m.s.l.]

HMAX – Highest elevation [m a.m.s.l.]

HMEAN – Average elevation [m a.m.s.l.]

The elevation data are based on a digital elevation model (50 m by 50 m cells), provided by the Water and Soil Atlas of the State of Baden-Württemberg (WaBoA) and the RIPS-Pool.

SLOPEMIN - Minimal slope [%]

SLOPEMAX - Maximum slope [%]

SLOPEMEAN - Mean slope [%]

Minimum, maximum and mean slopes were deduced using a digital elevation model.

Appendices - Catchment Descriptors Land Use and Hydrogeology

  • Remote sensing was used to derive land use for the area (Landsat TM, 30 x 30 m grid, 1993). It was classified into 16 classes, which were aggregated into four groups: forest, farmland, grassland and settlements/urban areas.
  • Only the relative proportions of forest and urban areas were chosen to be included in this self-guided tour.
  • URBAN - Percentage of urbanisation [%]
  • FOREST - Percentage of forest [%]
  • URBAN is an aggregation of settlement areas and areas with large-scale surface sealing due to industry. The latter cover 0.8% of the study area. Settlements comprise loose (1.9%) and dense (4.6%) settlements.
  • FOREST is a combination of deciduous (7.8%) and coniferous (21.4%) forest and other forested areas (10.0%).

GEOHCMEAN – Weighted mean of hydraulic conductivity [m/s]

GEOVLHP – Percentage of rock formations with a very low hydraulic permeability [%]

From a 1:350,000 scale map produced by the Regional Authority for Geology, Commodities, and Mining of Baden-Württemberg (LGRB), 98 geological classes were reduced to 54 hydro-geological classes and aggregated to eight groups.

Each group was associated with a mean hydraulic conductivity of the upper hydro-geological unit. From these values, a weighted mean was produced for each catchment. From the same data, the proportion of rock formations with a mean hydraulic conductivity of less than 10⁻⁵ m/s was derived.

Appendices - Catchment Descriptors Soil (1)

  • The classification of the soil water regime was based on a study by the Regional Authority for Geology, Commodities, and Mining of Baden-Württemberg (LGRB). The LGRB produced a 1:350,000 scale map of 29 soil water regime classes based on soil type, humus content, packing, slope, and geology.
  • These classes were aggregated into four groups of soil types based predominantly on their infiltration capacity, which is defined as the “maximum rate at which water can be absorbed by a given soil per unit area under given conditions” (IHP/OHP 1998).
  • SOILH – Percentage of soils with high infiltration capacity [%]
  • These soils, such as sand and gravel soils, exhibit a high infiltration capacity even under conditions of high antecedent soil water content.

SOILM – Percentage of soils with medium infiltration capacity [%]

Examples of soils which feature a medium infiltration capacity are loamy soils and loess of medium depth.

SOILL – Percentage of soils with low infiltration capacity [%]

The low infiltration capacity of these soils is due to their fine texture and/or the impermeability of one or more layers, as found in shallow sandy and loamy soils.

SOILVL – Percentage of soils with very low infiltration capacity [%]

The infiltration capacity of these soils is very low because they are shallow, are composed of material of low permeability (such as clay), or have a high groundwater level.

Appendices - Catchment Descriptors Soil (2)
  • SOILHCMEAN - Mean hydraulic conductivity of the soils [cm/d]
  • SOILLHC - Percentage of soils with low hydraulic conductivity [%]
  • Hydraulic conductivity is a “property of a saturated porous medium which determines the relationship, called Darcy’s law, between the specific discharge and the hydraulic gradient causing it” (IHP/OHP 1998).
  • Areal means were derived from a 1:200,000 scale map with 9 classes. The lowest two classes (with a mean hydraulic conductivity of less than 2.3 x 10^-6 m/s, roughly 20 cm/d) were combined to calculate the percentage of soils with low hydraulic conductivity.
  • ROOTSMEAN - Mean water-holding capacity in the effective root zone [mm]
  • ROOTSHIGH - Percentage of soils with high water-holding capacity in the effective root zone [%]

The data for these descriptors are based on a map produced by the Regional Authority for Geology, Commodities, and Mining of Baden-Württemberg (LGRB), which shows the distribution of water-holding capacity for a theoretical soil depth of 100 cm.

Water-holding capacity is defined as “water in the soil available to plants. It is normally taken as the water in the soil between wilting point and field capacity. In this context water-holding capacity is used and is identical to the available water” (IHP/OHP 1998).

Based on information on soil type, land use, root depth, and waterlogging conditions, the water-holding capacity values were adjusted to the estimated effective root zone. These values were then used to compute the areal mean. A threshold for the mean water-holding capacity was set at 200 mm. All classes above this threshold were aggregated to “soils with high water-storage capacity in the effective root zone”, and their proportion was calculated.

Appendices - Catchment Descriptors Climate
  • AAR – Average annual precipitation [mm]
  • The average annual precipitation data were derived from a digital map provided by the Water and Soil Atlas of the State of Baden-Württemberg (WaBoA) and the RIPS-Pool. The map shows average annual precipitation for the period 1961-1990 at a grid resolution of 500 m.
  • For this map, average annual precipitation had been calculated from the relationship between precipitation depth and altitude, combined with distance-weighting from the points of measurement. The raw data for the production of this map were provided by the German Weather Service (DWD).
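
The exact interpolation procedure behind the map is not given here. One common way to combine an altitude relationship with distance-weighting is to fit a linear precipitation-altitude trend and then interpolate the station residuals by inverse distance weighting. The Python sketch below illustrates that idea only; the function names, the power of 2 and the station values are all assumptions.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return mean_y - b * mean_x, b


def idw(target_xy, points, values, power=2.0):
    """Inverse-distance-weighted mean of values observed at points."""
    num = den = 0.0
    for (x, y), value in zip(points, values):
        dist = math.hypot(x - target_xy[0], y - target_xy[1])
        if dist == 0.0:
            return value  # target coincides with a station
        weight = dist ** -power
        num += weight * value
        den += weight
    return num / den


def estimate_precipitation(target_xy, target_alt, stations):
    """stations: list of (x_m, y_m, altitude_m, precipitation_mm)."""
    alts = [s[2] for s in stations]
    precs = [s[3] for s in stations]
    a, b = fit_line(alts, precs)  # altitude trend
    residuals = [p - (a + b * z) for z, p in zip(alts, precs)]
    points = [(s[0], s[1]) for s in stations]
    # Trend at the target altitude plus distance-weighted station residuals
    return a + b * target_alt + idw(target_xy, points, residuals)


# Hypothetical stations lying exactly on P = 600 mm + 1 mm per metre of altitude
stations = [
    (0.0, 0.0, 200.0, 800.0),
    (10000.0, 0.0, 600.0, 1200.0),
    (0.0, 10000.0, 1000.0, 1600.0),
]
estimate = estimate_precipitation((5000.0, 5000.0), 500.0, stations)
print(round(estimate, 1))
```

Because the made-up stations follow the altitude trend exactly, all residuals are zero and the estimate for a grid cell at 500 m is simply the trend value of about 1100 mm. With real station data the residual term would pull the estimate towards nearby measurements.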

Fig. 2.2 Mean annual precipitation (1961-1990) [mm]

Appendices - Data Sources
  • The data used in the self-guided tour were provided by the following data pools, projects, and organisations:
  • Data Pools
  •  RIPS-Pool – Räumliches Informations- und Planungssystem (Spatial Information and Planning System, State of Baden-Württemberg)
  •  EWA - European Water Archive of the Northern European FRIEND project (Flow Regimes from International and Experimental Data)
  • Projects
  •  WaBoA – Wasser und Boden Atlas von Baden-Württemberg (Water and Soil Atlas of the State of Baden-Württemberg)
  •  KLIWA – Projekt Klimaänderung und Konsequenzen für die Wasserwirtschaft (Climatic Change and Impact on Water Resources Management)
  • Organisations
  •  LfU – Landesanstalt für Umweltschutz (Environmental Agency, Regional Office, State of Baden-Württemberg)
  •  LGRB – Landesanstalt für Geologie, Rohstoffe und Bergbau Baden-Württemberg (Regional Authority for Geology, Commodities, and Mining, State of Baden-Württemberg)

Appendices - References (1)
  • BACKHAUS, K., Erichson, B., Plinke, W. & Weiber, R. (1996): Multivariate Analysemethoden. Eine anwendungsorientierte Einführung. Springer, Berlin, Heidelberg, New York.
  • BORCHERDT, C. (1985): Baden-Württemberg. Eine geographische Landeskunde. Wissenschaftliche Länderkunde. Bd. 12. Stuttgart.
  • BECKER, A. (1992): Methodische Aspekte der Regionalisierung, in: Regionalisierung in der Hydrologie, ed. by H.-B. Kleeberg, DFG-Mitt. XI, VCH-Verl. Ges. Weinheim, pp. 16-33, in German.
  • DEMUTH, S. (1993): Untersuchungen zum Niedrigwasser in West-Europa. Freiburger Schriften zur Hydrologie, Band 1. Freiburg, Germany. 
  • IHP/OHP (1998): WMO Technical Regulations, Volume III - Hydrology.

  • GLOS, E. & LAUTERBACH, D. (1972): Regionale Verallgemeinerung von Niedrigwasserdurchflüssen mit Wahrscheinlichkeitsaussage. Mitteilungen des Institutes für Wasserwirtschaft. H. 37. VEB Verlag für Bauwesen, Berlin.
  • HAAS, M. (2000): Regionalisierung des Quotienten Basisabfluss/Gesamtabfluss (Qbas/Qges) für Einzugsgebiete Baden-Württemberg. Freiburg, Germany.
  • HOLDER, R. L. (1985): Multiple Regression in Hydrology. Institute of Hydrology, Wallingford, Great Britain.
  • HUTTENLOCHER, F. (1972): Baden-Württemberg. Kleine geographische Landeskunde. Schriftenreihe der Kommission für geschichtliche Landeskunde, H. 2. Karlsruhe.

Appendices - References (2)
  • KILLE, K. (1970): Das Verfahren MoMNQ, ein Beitrag zur Berechnung der mittleren langjährigen Grundwasserneubildung mit Hilfe der monatlichen Niedrigwasserabflüsse. Zeitschrift der deutschen Geologischen Gesellschaft, Sonderheft Hydrogeologie Hydrogeochemie, 89-95.
  • LEWIS-BECK, S. M. (1986): Applied Regression – An Introduction. Series: Quantitative Applications in the Social Sciences. Sage University Paper 22.
  • MOHR, B. (1992): Die natürliche Raumausstattung. Südbaden. Schriften zur politischen Landeskunde Baden-Württembergs, 25-35.
  • MORGENSCHWEIS, G. (1990): Zur Ungenauigkeit von Durchflussmessungen mit hydrometrischen Flügeln. DGM 34, H. 1/2, 16-21.

  • PLATE, E. J. (1992): Regionalisierung in der Hydrologie. Deutsche Forschungsgemeinschaft. Mitteilung XI der Senatskommission für Wasserforschung. Hrsg. KLEEBERG, H.-B. Cambridge, NY.
  • SCHREIBER, P. (1996): Regionalisierung des Niedrigwassers mit statistischen Verfahren. Freiburger Schriften zur Hydrologie, Band 4. Freiburg, Germany.
  • VILLINGER, E. (1982): Hydrogeologische Aspekte zur geothermischen Anomalie im Gebiet Urach-Boll am Nordrand der Schwäbischen Alb (Südwestdeutschland). Geologisches Jahrbuch, H. 32, 3-42.
  • WUNDT, W. (1953): Gewässerkunde. Berlin, Göttingen, Heidelberg.

Appendices - Acknowledgements
  • First of all, I would like to thank Falk Scissek, my co-author, for transferring the ideas to a PowerPoint presentation and for his continuous engagement during the writing process.
  • The work and co-operation with my students Christian Birkel, Uli Nädelin, Anne Thormählen, and others from the Institute of Hydrology, University of Freiburg, are gratefully acknowledged.

Thanks are due to the following individuals for their support in producing the self-guided tour:

Volker Abraham, for contributing cartographic skills, maps, and the layout of the front page of the self-guided tour.

Kerstin Stahl, for calculating the flow regimes.

Helmut Straub, Environmental Agency, Regional Office, State of Baden-Württemberg, Germany, for providing the flow data of Baden-Württemberg and for permission to use the data on this CD-ROM.

Appendices - Contact Information
  • Prof. Dr. Siegfried Demuth
  • formerly
  • Institute of Hydrology
  • University of Freiburg
  • Fahnenbergplatz
  • 79098 Freiburg, Germany
  • currently
  • IHP/OHP-Sekretariat
  • International Hydrological and Operational Programme of UNESCO and WMO
  • Mainzer Tor 1
  • 56068 Koblenz, Germany
  • Demuth@bafg.de
