Grand Challenges in Global Remote Sensing

John Townshend

The stimulus from Paul Mather
• A man called Hilbert wrote a seminal paper in 1900 that contained a list of problems that had to be overcome if maths was to develop.
• This provided a research focus for mathematicians around the world.
• Given the range of uses of RS data and the inadequacies of many of the techniques used to extract information from that data, I suggest that the RS community … needs a remote sensing Hilbert to write a paper that focuses on land cover extraction (from a range of data of different scales and coverages, and the use to which this remotely-sensed information is put).
• To put it bluntly, would you be willing to write such a paper for PIPG (of which I'm an editor)?

Examples of David Hilbert’s 23 Problems
• The continuum hypothesis (that is, there is no set whose size is strictly between that of the integers and that of the real numbers).
• The Riemann hypothesis (the real part of any non-trivial zero of the Riemann zeta function is ½) and Goldbach's conjecture (every even number greater than 2 can be written as the sum of two prime numbers).
• Solve all 7th-degree equations using functions of two parameters.

Some hard nuts to crack

Outline
• Making progress, but:
• Validation is grossly unsatisfactory.
• Classification issues.
• Separating emissivity and temperature.
• Over-fitting.
• Failure to repudiate nonsense.
• Formats.
• Data policy.
• Research to operations.

Making progress

GlobCover (300m product)

Landsat-5: Atmospheric Correction (Masek et al)
[Figure: 1990s Landsat-5 mosaic of the BOREAS study region, shown as TOA reflectance and as surface reflectance; scale bar 100 km]

[Figure: MODIS SR vs. Landsat SR and modeled forest % vs. Landsat forest %, compared by R² and RMSD at 500 m or 1 km and at 5 km; TM mosaic (current), bands 3, 2, 1, stretched (0-1200)->(0,512). Masek et al]

LAI distribution around August 12, 2000: MODIS product (A) and processed product (B)
Fang, H., S. Liang, J. Townshend, and R. Dickinson (2008). Spatially and temporally continuous LAI data sets based on a new filtering method: Examples from North America. Remote Sensing of Environment, 112, 75-93.

Optimizing the combined use of MODIS and GOES fire detection data for Amazonia
Monitoring Vegetation Fires in Amazonia (Schroeder et al)
Publications:
• Schroeder, W., Prins, E., Giglio, L., Csiszar, I., Schmidt, C., Morisette, J., and Morton, D. (2008). Validation of GOES and MODIS active fire detection products using ASTER and ETM+ data. Remote Sensing of Environment, 112 (5), 2711-2726, doi:10.1016/j.rse.2008.01.005.
• Schroeder, W., Csiszar, I., and Morisette, J. (2008). Quantifying the impact of cloud obscuration on remote sensing of active fires in the Brazilian Amazon. Remote Sensing of Environment, 112, 456-470, doi:10.1016/j.rse.2007.05.004.
• Schroeder, W., Morisette, J. T., Csiszar, I., Giglio, L., Morton, D., and Justice, C. (2005). Characterizing vegetation fire dynamics in Brazil through multisatellite data: Common trends and practical issues. Earth Interactions, 9, Paper 13.
• Morisette, J. T., Giglio, L., Csiszar, I., Setzer, A., Schroeder, W., Morton, D., and Justice, C. (2005). Validation of MODIS active fire detection products derived from two algorithms. Earth Interactions, 9, Paper 9.
Figure: Integrated fire product for Brazilian Amazonia using 2005 MODIS and GOES data, showing average number of detection days per year.

Making data available through the web in standard formats makes an enormous difference
MODIS Global Web/GIS Fire Maps
Example: wildfires in California. MODIS active fire detections are superimposed with USFS park boundaries, hydrology, and roads. The user can query for fire detection attribute information.
From Chris M Mayfield, NORTHCOM COP/GIS Manager: “Long time NASA MODIS users, we were unaware of the FIRMS resource until that mid morning, but now, I can assure you that FIRMS is very much a part of the NORTHCOM Team in protecting the homeland. Again, our many thanks and a very big BRAVO ZULU to all of you on the FIRMS Team.”
Davies et al., UMd

SRTM/DEM and PRISM/DSM (1/2)
[Figure panels: SRTM DEM; ALOS PRISM DSM]

Validation: What is truth?

Operational LC validation framework
[Diagram labels: validation of new products; primary validation; updated validation/change; comparative validation; product synergy; data reprocessing; link to regional datasets; degree of usability and flexibility; legend translations; in-situ global; updated interpretations; LCCS-based interpretation (regional networks); reference database (statistically robust, consistent, harmonized, updated, and accessible); design-based sample of reference sites; existing global LC products; time axis]

International consensus on technical issues: “Best Practices Document” (Strahler et al., 2006)

Validation is really hard.
• Scale matters a lot.
• Making ground measurements and relating them to even 30 m or 250 m pixels is hard work and expensive.
• When inherent spatial variability is high relative to pixel size and locational RMS errors are large, you never know where your ground observations are in relation to the pixels.
• Some areas cannot be validated.
• Not to mention MTF/PSF.
• Timing (or lack of it) is usually also an issue.
• In rugged terrain we are usually screwed.
• Validation of change detection is really, really hard.
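The point about locational RMS error relative to pixel size can be made concrete with a small calculation. The sketch below is illustrative only and rests on assumptions that are not in the slides: location errors are independent, zero-mean and Gaussian with the same standard deviation `sigma_m` on each axis, the pixel is square, and the ground point sits at the pixel centre. The function name is hypothetical.

```python
import math

def prob_in_intended_pixel(pixel_size_m, sigma_m):
    """Probability that a point at the pixel centre is still located inside
    that pixel, given independent Gaussian location errors per axis."""
    half_width = pixel_size_m / 2.0
    # P(|error| < half_width) along one axis for a zero-mean Gaussian.
    p_axis = math.erf(half_width / (sigma_m * math.sqrt(2.0)))
    # Both axes must stay inside the pixel; the errors are assumed independent.
    return p_axis ** 2

# Assumed per-axis location error of 15 m, for three pixel sizes.
for pixel_size in (30.0, 250.0, 500.0):
    p = prob_in_intended_pixel(pixel_size, sigma_m=15.0)
    print(f"{pixel_size:5.0f} m pixel: P(same pixel) = {p:.3f}")
```

Under these assumptions a 15 m per-axis error leaves a centre point inside its own 30 m pixel less than half the time. Points nearer the pixel edge fare worse.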

Validation
• We have failed to make the case for validation so that enough funds are available!
• Scarce funds mean that validation of all products is inadequate.
• Stage 1 Validation: Product accuracy has been estimated using a small number of independent measurements from selected locations and time periods.
• Stage 2 Validation: Product accuracy has been assessed by a number of independent measurements, at a number of locations or times representative of the range of conditions portrayed by the product, e.g. EOS Land Validation Core Sites, Fluxnet sites, Aeronet sites.
• Stage 3 Validation: Product accuracy has been assessed by independent measurements in a systematic and statistically robust way representing global conditions, e.g. IGBP DISCover Project. Suggest that this be undertaken.
• For any product, can we truthfully give the errors in space and time to our own satisfaction?
• Sometimes there are no funds and no validation.
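The difference between "a small number of independent measurements" and a statistically robust assessment shows up directly in the uncertainty of an accuracy estimate. The sketch below is a minimal illustration, assuming a simple random sample of reference sites and the normal (Wald) approximation to the binomial; the function name and the sample counts are hypothetical and not taken from any product.

```python
import math

def accuracy_interval(n_correct, n_total, z=1.96):
    """Overall accuracy with an approximate 95% interval (normal approximation,
    simple random sampling assumed)."""
    p = n_correct / n_total
    half_width = z * math.sqrt(p * (1.0 - p) / n_total)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)

# Same 80% agreement, very different confidence depending on sample size.
for n_correct, n_total in ((40, 50), (400, 500), (4000, 5000)):
    p, low, high = accuracy_interval(n_correct, n_total)
    print(f"n={n_total:5d}: accuracy {p:.2f}, interval {low:.3f} to {high:.3f}")
```

A stratified or clustered design would need a different variance estimator; this only shows how slowly the interval narrows as reference sites are added.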

Does validation allow us to assess value?
“The widely used leaf area products derived from satellite-observed surface reflectances contain substantial erratic fluctuations in time due to inadequate atmospheric corrections and observational and retrieval uncertainties. These fluctuations are inconsistent with the seasonal dynamics of leaf area, known to be gradual. Use in process-based terrestrial carbon models corrupts model behavior, making diagnosis of model performance difficult. We propose a data assimilation approach that combines the satellite observations of Moderate Resolution Imaging Spectroradiometer (MODIS) albedo with a dynamical leaf model. Its novelty is that the seasonal cycle of the directly retrieved leaf areas is smooth and consistent with both observations and current understandings of processes controlling leaf area dynamics.” (Liu et al 2008)
The point is that any sort of generic validation might not identify this problem. We should assess value not in the abstract but in terms of usefulness.

Classification
• Classification often does not work well.
• Many reasons.
• Some arise because we still don’t know how to classify:
  • Robustness to error in training data.
  • Class proportions.

Dealing with training site errors
• Training sets always contain errors.
• Can we overcome this problem in classification?
• Test the classifiers with varying amounts of error introduced into the training set.
• Support Vector Machine (SVM) and Kernel Perceptron (KP) outperform Maximum Likelihood, Decision Tree, and ARTMAP Neural Network.
• SVM can tolerate errors of as much as 30% in the training set.
• The soft-boundary design of modern SVM allows a proportion of errors to exist in the training set.
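The test protocol in the bullets above (corrupt a known fraction of training labels, retrain, score on clean data) can be sketched in a few lines. This is not the experiment behind the slide: the data are synthetic two-band samples, the class names and centres are made up, and a nearest-centroid classifier stands in for SVM, KP, and the other classifiers so that the example needs only the standard library.

```python
import random

def corrupt_labels(labels, fraction, classes, rng):
    """Return a copy of labels with `fraction` of them switched to another class."""
    noisy = list(labels)
    n_flip = int(round(fraction * len(noisy)))
    for i in rng.sample(range(len(noisy)), n_flip):
        noisy[i] = rng.choice([c for c in classes if c != noisy[i]])
    return noisy

def fit_centroids(samples, labels):
    """Stand-in classifier: one mean vector per class."""
    sums, counts = {}, {}
    for x, y in zip(samples, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for j, value in enumerate(x):
            acc[j] += value
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}

def predict(centroids, x):
    """Assign x to the class with the nearest centroid."""
    return min(centroids,
               key=lambda y: sum((a - b) ** 2 for a, b in zip(x, centroids[y])))

def make_data(n_per_class, rng):
    """Synthetic two-band reflectances around made-up class centres."""
    centres = {"forest": (0.2, 0.6), "nonforest": (0.6, 0.3)}
    samples, labels = [], []
    for name, (c1, c2) in centres.items():
        for _ in range(n_per_class):
            samples.append((rng.gauss(c1, 0.08), rng.gauss(c2, 0.08)))
            labels.append(name)
    return samples, labels

def accuracy_under_noise(fraction, seed=0):
    """Train on corrupted labels, score against clean test labels."""
    rng = random.Random(seed)
    train_x, train_y = make_data(200, rng)
    test_x, test_y = make_data(200, rng)
    noisy_y = corrupt_labels(train_y, fraction, ["forest", "nonforest"], rng)
    model = fit_centroids(train_x, noisy_y)
    hits = sum(predict(model, x) == y for x, y in zip(test_x, test_y))
    return hits / len(test_y)

for fraction in (0.0, 0.1, 0.2, 0.3):
    print(f"{fraction:.0%} corrupted labels -> "
          f"test accuracy {accuracy_under_noise(fraction):.3f}")
```

To compare real classifiers, `fit_centroids` and `predict` would be swapped for each algorithm while the corruption step and the clean test set stay fixed.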

A. Overall condition of the experiment site
B. Change detection result of DT and SVM using 10% corrupted training data
C. Change detection result of DT and SVM using 20% corrupted training data
D. Change detection result of DT and SVM using 30% corrupted training data
SVM is robust against subjective errors.

Early Work on Training Design
• Class proportions impact on a priori probabilities
  • Identified by Strahler in 1980
  • Part of the Maximum Likelihood Classifier (MLC) framework
  • Usage: multiplied with the probability of each pixel
  • Contribution: introduced the concept of “Class Prior”
  • Issue: the concept was not used in training design
• Class proportions in the population
  • Identified by Hagner in 2001 and 2005
  • Estimated using MLC
  • Usage: to adjust the proportions in the training set for iterative MLC
  • Contribution: adaptive training design using “Class Prior”
  • Issue: it is not MLC that needs training set design. MLC is actually largely invariant to training sets of different proportions, as is shown in Hagner’s own results.
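How a class prior enters the MLC decision rule can be shown with a one-band toy case: each class likelihood is multiplied by its prior and the largest product wins. The sketch below is a simplified illustration with a single band and Gaussian class statistics; the class names, means, standard deviations, and priors are all hypothetical.

```python
import math

def gaussian_pdf(x, mean, std):
    """Likelihood of x under a 1-D Gaussian class model."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def classify(x, class_stats, priors):
    """Pick the class with the largest prior-weighted likelihood."""
    scores = {
        name: priors[name] * gaussian_pdf(x, mean, std)
        for name, (mean, std) in class_stats.items()
    }
    return max(scores, key=scores.get)

# Hypothetical single-band statistics (mean, std) for two classes.
class_stats = {"forest": (0.30, 0.05), "nonforest": (0.40, 0.05)}
pixel = 0.36  # closer to the nonforest mean

equal = {"forest": 0.5, "nonforest": 0.5}
skewed = {"forest": 0.9, "nonforest": 0.1}

print("equal priors: ", classify(pixel, class_stats, equal))
print("skewed priors:", classify(pixel, class_stats, skewed))
```

With equal priors the pixel goes to the class whose mean is nearer; with a strong forest prior the same pixel is assigned to forest, which is the effect the class prior is meant to have.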

The Over/Under-Estimation Problem (Song et al)
Modern algorithms such as SVM are very susceptible to this problem, but MLC is largely unaffected.

The Over/Under-Estimation Problem
• Many methods need the class prior of the population to resample the training dataset.
• The class prior of the population might be estimated through MLC.
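Resampling a training set so that its class shares follow an estimated population prior might look like the sketch below. It is one simple way to do it (sampling with replacement within each class), not the method of Song et al; the function name, sample values, and prior are hypothetical, and the prior is assumed to come from elsewhere, for example a first-pass MLC map.

```python
import random
from collections import Counter

def resample_to_prior(samples, labels, priors, size, seed=0):
    """Draw a training set of about `size` samples whose class shares follow `priors`."""
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    new_samples, new_labels = [], []
    for name, share in priors.items():
        n_wanted = int(round(share * size))
        # Sampling with replacement lets a small class be up-sampled.
        new_samples.extend(rng.choices(by_class[name], k=n_wanted))
        new_labels.extend([name] * n_wanted)
    return new_samples, new_labels

# Hypothetical training set collected with equal effort per class.
samples = [0.28, 0.31, 0.33, 0.30, 0.41, 0.39, 0.44, 0.40]
labels = ["forest"] * 4 + ["nonforest"] * 4

# Hypothetical population prior.
priors = {"forest": 0.8, "nonforest": 0.2}

new_samples, new_labels = resample_to_prior(samples, labels, priors, size=100)
print(Counter(new_labels))
```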

Almost impossible to separate surface emissivity and temperature accurately (Liang)
Surface-leaving radiance is the sum of the surface emitted radiance and the reflected downward atmospheric radiation:
$$L = \varepsilon\, B(T) + (1 - \varepsilon)\,\frac{F_d}{\pi}$$
where $\varepsilon$ is surface emissivity, $B(\cdot)$ is the Planck function, $T$ is the surface temperature, and $F_d$ is the downward flux. For most surfaces, since emissivity is close to 1, the reflected radiance is quite small. Thus
$$L \approx \varepsilon\, B(T)$$
It is almost impossible to separate two multiplied components, so we cannot determine emissivity $\varepsilon$ and temperature $T$ accurately. The alternative solution is to estimate upwelling radiation from thermal IR observations for initialization/calibration/validation of land surface models.
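A short numerical check makes the ambiguity visible. The sketch below neglects the reflected term, so radiance is taken as emissivity times the Planck function, and works at a single wavelength. The 11 µm wavelength, the 300 K surface, and the emissivity values are illustrative assumptions, and the function names are hypothetical.

```python
import math

H = 6.62607015e-34  # Planck constant, J s
C = 2.99792458e8    # speed of light, m/s
K = 1.380649e-23    # Boltzmann constant, J/K

def planck(wavelength_m, temp_k):
    """Blackbody spectral radiance B(T) in W m^-2 sr^-1 m^-1."""
    a = 2.0 * H * C ** 2 / wavelength_m ** 5
    b = H * C / (wavelength_m * K * temp_k)
    return a / math.expm1(b)

def inverse_planck(wavelength_m, radiance):
    """Temperature of a blackbody that emits `radiance` at this wavelength."""
    a = 2.0 * H * C ** 2 / wavelength_m ** 5
    return H * C / (wavelength_m * K * math.log1p(a / radiance))

def temperature_for(wavelength_m, radiance, emissivity):
    """Temperature that reproduces `radiance` for an assumed emissivity
    (reflected downward radiation neglected)."""
    return inverse_planck(wavelength_m, radiance / emissivity)

wavelength = 11e-6  # assumed thermal IR wavelength
observed = 0.97 * planck(wavelength, 300.0)  # assumed "true" surface

# Every emissivity guess has a temperature that fits the observation exactly.
for eps in (0.95, 0.97, 0.99):
    t = temperature_for(wavelength, observed, eps)
    print(f"emissivity {eps:.2f} -> temperature {t:.2f} K")
```

One measurement constrains only the product, so each assumed emissivity returns a different but equally consistent temperature. Adding spectral bands adds one unknown emissivity per band, which is why the problem stays underdetermined without extra constraints.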

Some other issues
• The history of remote sensing information extraction is largely the history of over-fitting.
• Those working on identification of spam have a one-shot externally organized test.
• Hyper-spectral RS.
• Something is almost bound to be related to something.
• How do we begin to move towards standard products?
• Where is the underlying theory to determine them?
• Disparities in resolution of reanalysis products and typical land cover variability.
• Difficulty of getting global biomass at time and space resolutions appropriate for REDD and conservation.
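"Something is almost bound to be related to something" is easy to demonstrate: with many bands and few field samples, some band will correlate well with the target by chance alone. The sketch below uses pure random numbers for both the "bands" and the "field variable", so any correlation it finds is spurious by construction. The sample and band counts are arbitrary choices, and the function names are hypothetical.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def best_spurious_correlation(n_samples, n_bands, seed=0):
    """Largest |r| between a random target and any of n_bands random 'bands'."""
    rng = random.Random(seed)
    target = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    best = 0.0
    for _ in range(n_bands):
        band = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
        best = max(best, abs(pearson(band, target)))
    return best

# Ten field samples, increasing numbers of unrelated bands.
for n_bands in (5, 50, 200):
    r = best_spurious_correlation(n_samples=10, n_bands=n_bands)
    print(f"{n_bands:3d} random bands: best |r| = {r:.2f}")
```

Picking the best band after the fact and reporting its fit on the same samples is the over-fitting pattern; an independent, externally held test set is what exposes it.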

Standing up for what we believe in.
• 159 scientific papers have been found to base their conclusions heavily on FAO Forest Resources Assessment (FRA) statistics (Grainger, 2008).
• We know FRA is garbage for land cover change, so why don’t we say so? This should not be a challenge.

Land cover and land use change.
• FRA problems are twofold:
  • Having to deal with individual countries.
  • Confusion between land cover and land use.
• “Where part of a forest is cut down but replanted (reforestation), or where the forest grows back on its own within a relatively short period (natural regeneration), there is no change in forest area.”
• But for those concerned with land cover these differences are real.

The curious case of Canada in FRA 2005
• Forest area 1990: 310,134,000 ha*
• Forest area 2000: 310,134,000 ha*
• Forest area 2005: 310,134,000 ha
“Canada reports only productive forest land; unproductive forests are classified as ‘other wooded land’ even though many of them meet the FAO definition of forest land. This results in underreporting of more than 170 million hectares, or 40 percent of Canadian forest land.” (Matthews 2000)
* Note: in FRA 2000 Canada reported only 244,571,000 hectares for both 1990 and 2000!

Issues with FRA
• Assuming we are interested in land cover and not land use:
  • Global rates are wrong (much too low).
  • Changes in rates (by decade and half-decade) are wrong (tropical deforestation rates from the 1980s to the 1990s are reported as declining when they were increasing).
  • Inter-continental variations are seriously mistaken (South America vs Africa).
  • Considerable inconsistencies between countries.

The importance of formats and data policy

How to ensure data are used
• On December 8, 2008, the USGS made the entire 36-year-long Landsat archive available to anyone via the Internet at no cost.
• GeoTIFF format.
• Orthorectified, “GIS-ready”.
• Calibrated across missions and instruments.

Questions for space agencies
• Why don’t you always provide the following:
  • User-friendly formats allowing immediate ingestion into GISs.
  • Standardized metadata.
  • Rapid response systems.
  • Orthorectified data for all resolutions 500 m and below.
  • Atmospherically corrected data.
  • Up-to-date calibration data.
  • Validation data for all products.

Six Problems with RS data policies
1. If people want to use remotely sensed data then they should pay.
   They already have, as citizens. Plus, the driving force for most environmental remote sensing data is scientific or policy driven.
2. Making data available has an incremental cost.
   Resources raised are a tiny fraction of the total cost of the system.
3. There is a commercial future for all environmental remote sensing data.
   No evidence for mid and coarser resolution data.
4. Restrictive data policy is OK because remote sensing data is made available free to scientists.
   Why should scientists have preferential access compared with those in developing countries alleviating poverty?
5. Principal Investigators need an extended period of exclusive use.
   Only to make sure the products are characterized so that “health warnings” can be attached.
6. Tell us why you want to use the data before we will let you have it.
   Otherwise known as the “Papa ESA knows best” policy.

GEO Halls of Fame and Shame for Agencies
Hall of Fame:
• Free and open data policy
• Data easily accessible online
• Community-specified formats
• Orthorectified
• Validated data sets
Hall of Shame:
• Restrictive data policy with charging
• Not online: difficult to order
• Non-standard agency-specified formats
• Not orthorectified
• Unvalidated data sets

“Valley of death”
From Research to Operations in Weather Satellites and Numerical Weather Prediction: Crossing the Valley of Death (Board on Atmospheric Sciences and Climate)
The term “Crossing the Valley of Death” is sometimes used in industry to describe a fundamental challenge for research and development (R&D) programs. For technology investments, the transitions from development to implementation are frequently difficult, and, if done improperly, these transitions often result in “skeletons in Death Valley.”

Successful transitions from R&D to operational implementation
• Understanding of the importance (and risks) of the transition,
• Development and maintenance of appropriate transition plans,
• Adequate resource provision,
• Continuous feedback (in both directions) between the R&D and operational activities.
“In the case of the atmospheric and climate sciences, inadequacies in transition planning and resource commitment can seriously inhibit the implementation of good research leading to useful societal benefits.” (NRC)
The Landsat to LDCM and MODIS to VIIRS transitions clearly demonstrate the enormous difficulties that can occur.

Fire (Justice)
• A near-term major challenge for the international community will be to develop the best available, validated Fire Disturbance Essential Climate Variables (ECVs).
• The Grand Challenge will be to secure the satellite fire observing system that is needed, consisting of:
  1) operational polar orbiters with appropriate saturation for fire characterization,
  2) an operational global geostationary network with 500 m resolution and 30-minute repeat,
  3) operational global Landsat-class observations with 3-5 day repeat.

Who has the responsibility for doing things operationally?
• Broad consensus on methods to achieve operational monitoring.
• But we must adapt to rapidly changing technologies and data availability (Google and radar).
• Need to ensure commitment to:
  • Supply of remote sensing data
  • Generation of terrestrial products
  • Operational validation process
• More broadly, who will commit to generation of operational products such as ECVs?
• Which international body will oversee the work?
• Who has both the formal responsibility and the scientific and technical capacity?
• Cannot simply be left to agencies. Agencies are starting to lay claim to certain ECVs but with little oversight.
• Urgent need to establish roles and responsibilities.

GEO and CEOS
• Internationally, we are highly dependent on them.
• But both are “best efforts” organizations.
• Much talk about cooperation, but concepts such as virtual constellations will be very difficult without:
  • Agreements on data policy
  • Agreements on formats and pre-processing
  • Common portals that work.
• Perhaps the greatest challenge is to get these organizations acting in an integrated, coordinated fashion, responding to user needs.

Thank you

Time Series for Amazon Forests
[Figure: time series of solar radiation (W/m²), precipitation (mm/mo), and Leaf Area Index, 2000-2006. Dry seasons are shown as grey shaded bars.]
The phase shift between LAI and solar radiation suggests rainforests’ adaptation to anticipating more sunlight.

Transitioning to operational capabilities
• Get the data policy right
• Standardization of formats
• Orthorectification
• Atmospheric correction
• Use of improved algorithms

Summary
• Performance of remote sensing studies in the real world largely relies on two factors:
  1. How well algorithms can handle unknown errors.
  2. How to adaptively design the training set so that we can balance the overestimation/underestimation problem.
[Diagram labels: Training, Algorithm]

Global Agricultural Fires (Korontzi et al. 2007)
