
A FRAMEWORK FOR GEOSPATIAL MODELING FROM SPARSE FIELD MEASUREMENTS USING IMAGE PROCESSING AND MACHINE LEARNING

Peter Bajcsy (1), Chulyun Kim (1), Jihua Wang (2) and Yu-Feng Lin (2)
(1) National Center for Supercomputing Applications (NCSA)
(2) Illinois State Water Survey (ISWS)



  1. A FRAMEWORK FOR GEOSPATIAL MODELING FROM SPARSE FIELD MEASUREMENTS USING IMAGE PROCESSING AND MACHINE LEARNING • Peter Bajcsy (1), Chulyun Kim (1), Jihua Wang (2) and Yu-Feng Lin (2) • (1) National Center for Supercomputing Applications (NCSA) • (2) Illinois State Water Survey (ISWS) • University of Illinois at Urbana-Champaign (UIUC)

  2. Outline • Introduction • Problems Addressed by Spatial Pattern To Learn (SP2Learn) • SP2Learn Architecture and Functionality Overview • Running SP2Learn • Summary

  3. Introduction

  4. General Problem • Compute a set of geo-spatially dense accurate predictions of variables • given a set of direct geo-spatially sparse point measurements and • auxiliary variables with implicit relationships with respect to the predicted variable • Motivation: • minimize cost of taking direct point measurements • maximize accuracy of predictions and • automate discovering relationships among direct field measurements and indirect variables

  5. Formulation • Input: sets of geo-spatially sparse variables {V_i{p_ij}} & dense auxiliary variables & a priori tacit knowledge of experts • Output: geo-spatially dense (raster) {O_k} • Unknown: selection of methods & workflow of operations/methods & parameters of methods & relationships of auxiliary variables w.r.t. O_k & quantitative metric of output goodness • [Figure labels: p_1j, p_2j, V_1 & V_2, Interpolations, Mathematical models, Auxiliary Variables & Tacit Knowledge, O_1]

  6. Applied Problem • Recharge and Discharge Rate Prediction • [Figure labels: Water table elevation, Bedrock elevation, Recharged, Discharged]

  7. Interdisciplinary Objectives • Ground Water (Hydrologic Science) View: • Evaluation of Alternative Conceptual (implicit relationships) and Mathematical Models (explicit relationships) • Accurate Prediction of Groundwater Recharge and Discharge Rates from Limited Number of Field Measurements • Computer Science View: • Computer-Assisted Learning to Assess Alternative Conceptual and Mathematical Models • Optimization of Prediction Models From a Set of Geo-Spatially Sparse Point Measurements DIALOG

  8. State-of-the-Art Results • Limited Spatial Resolution and Accuracy • [Figure labels: Uniform Grid: 80 m x 80 m; Min. Grid: 805 m x 805 m; Recharge zone; Discharge zone; Noisy pattern or weak R/D; Recharge; Discharge]

  9. Existing Software for Groundwater and Surface Water Modeling • MODFLOW is a three-dimensional finite-difference ground-water model • http://water.usgs.gov/nrp/gwsoftware/modflow2005/modflow2005.html - freeware (2005) • PEST is software for model calibration, parameter estimation and predictive uncertainty analysis • http://www.sspa.com/pest/ - freeware (2007); University of Queensland, Australia • Precipitation-Runoff Modeling System (PRMS) is a deterministic, distributed-parameter modeling system developed to evaluate the impacts of various combinations of precipitation, climate, and land use on streamflow, sediment yields, and general basin hydrology • http://water.usgs.gov/software/prms.html - freeware (1996); USGS • Deep Percolation Model (DPM) facilitates estimation of ground-water recharge under a wide range of climatic, landscape, and land-use and land-cover conditions • http://pubs.usgs.gov/sir/2006/5318/; USGS

  10. Related Work • Singh A. et al. “Expert-Driven ‘Perceptive’ Models for Reducing User Fatigue in an Interactive Hydrologic Model Calibration Framework” • [Figure: Conductivity (K) and Hydraulic heads (H) for the hypothetical aquifer]

  11. Motivation • Ground Water (Hydrologic) Science: • Currently, no single method can estimate R/D rates and patterns for all practical applications. • Therefore, cross-analyzing results from various estimation methods and related field information is likely to be superior to using a single estimation method. • Computer Science: • It is currently impossible • (a) to replace an expert who holds a lot of tacit domain knowledge with computer algorithms or • (b) for an expert to learn new I/O relationships from a plethora of possible variables and an extremely large space of processing methods and their parameters • Thus, assisting experts to discover, evaluate and validate new relationships in an iterative way will likely enable • (a) better understanding of the underlying phenomena, and • (b) more automated and cost-efficient predictions

  12. Problems Addressed by Spatial Pattern To Learn

  13. Our Approach • Data-Driven Analyses to Test Alternative Models, and to Search the Space of Processing Operations and Their Parameters • Interpolation methods • Mathematical models • Image processing algorithms • Machine learning algorithms • Scalability of algorithms with large size data • Computer-Assisted Comparisons and Evaluations of Multiple Models and Sub-Optimal Solutions • Model/Solution Representation • Closed Loop (Iterative) Workflows • Human Computer Interfaces • Overall Approach: An Exploration Framework for a Class of Alternative Models/Hypotheses and Optimal Solutions

  14. SP2Learn Problem Formulation • Given a set of geo-spatially sparse field measurements and auxiliary variables, derive an accurate, spatially dense R/D rate map by • (a) using a physics-based model • (b) incorporating boundary conditions and • (c) exploring auxiliary variables that represent prior knowledge about R/D patterns but are missing from the physics-based model

  15. Challenges • (1) How to Recognize a ‘Meaningful’ Pattern in the Predicted Map? • (2) How to Quantify the Goodness of the Pattern? • Approach: • (1a) Recognize patterns by applying multiple image enhancement and segmentation techniques to R/D rate predictions • (1b) Introduce a relationship between the R/D pattern and auxiliary (a priori reference) information • (2a) Define goodness w.r.t. reference information using the expert’s selection of ‘meaningful’ relationships • (2b) Define goodness w.r.t. reference information using complexity of machine learning

  16. Using Physics-Based Model • Ground water flux = hydraulic conductivity * cell area * gradient of water table elevation (head) over cell distance • [Figure labels: R/D Rate Prediction, Field Measurements (+), Water table elevation, Hydraulic conductivity, Bedrock elevation, Incoming water, Outgoing water, Recharged, Discharged]
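
The flux relation on this slide can be illustrated with a minimal Python sketch. The function names, the uniform cell spacing, and the sign convention (positive means water leaving the cell) are assumptions made for illustration; the slides do not specify how SP2Learn implements this step.

```python
def cell_flux(conductivity, area, head_a, head_b, distance):
    """Flux from cell A toward cell B: K * area * (head difference / distance).

    Positive when head_a > head_b (assumed convention: outgoing water).
    """
    if distance <= 0:
        raise ValueError("distance must be positive")
    gradient = (head_a - head_b) / distance
    return conductivity * area * gradient


def net_outgoing_flux(conductivity, area, head, neighbor_heads, distance):
    """Sum of the fluxes from one cell toward each of its neighbours."""
    return sum(
        cell_flux(conductivity, area, head, h, distance)
        for h in neighbor_heads
    )


# Hypothetical numbers: K = 2.0, area = 10.0, heads 5.0 and 4.0, spacing 2.0
print(cell_flux(2.0, 10.0, 5.0, 4.0, 2.0))
```

A non-zero net flux for a cell indicates that water must be entering or leaving it vertically; whether a given sign is read as recharge or discharge depends on the convention chosen.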

  17. Incorporating Spatial Boundary Conditions • BC: R/D rate prediction could have smooth transitions and recharge & discharge regions (contiguous pixels) should be clearly delineated • Approach: Apply Image Restoration and De-noising Techniques • Moving average based low pass filter • TVL (Total Variation regularized L1-norm function) based filter • Morphological operation based filter • Using multiple techniques multiple times • [Figure labels: Recharged, Discharged]
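
As an illustration of the first technique listed, here is a minimal pure-Python sketch of a moving-average low-pass filter on a raster stored as a list of rows. The function name and the border handling (windows are clipped at the map edge) are assumptions, not details taken from SP2Learn; the sketch also assumes odd kernel dimensions so the window is centred on each pixel.

```python
def moving_average(grid, kernel_rows, kernel_cols):
    """Replace each pixel by the mean of its kernel_rows x kernel_cols window."""
    rows, cols = len(grid), len(grid[0])
    half_r, half_c = kernel_rows // 2, kernel_cols // 2
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Clip the window at the raster border (assumed behaviour).
            window = [
                grid[i][j]
                for i in range(max(0, r - half_r), min(rows, r + half_r + 1))
                for j in range(max(0, c - half_c), min(cols, c + half_c + 1))
            ]
            out[r][c] = sum(window) / len(window)
    return out


# A single noisy spike is spread out and damped by a 3 x 3 kernel.
noisy = [[0, 0, 0], [0, 9, 0], [0, 0, 0]]
smoothed = moving_average(noisy, 3, 3)
```

Applying the filter repeatedly smooths the map further, which corresponds to the "multiple techniques multiple times" item on the slide.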

  18. Exploring Auxiliary Variables Driving R/D Patterns • Prior Tacit Knowledge about R/D and Auxiliary Variables • Soil Type: P(R or D area | Soil = Clay) ~ low • Proximity to River: P(R or D area | River is close) ~ high • Slope: P(R or D area | Slope = high) ~ low • [Figure panels: moving average; normalization + TVL; normalization + TVL; moving average]

  19. From Auxiliary Variables To Knowledge and Accurate R/D • Load R/D Map • Load Variables • Integrate Maps • Define ROI • Create Decision Tree • Apply Rules

  20. SP2Learn Output • A set of rules that define relationships between predicted (R/D rate) variable and auxiliary variables • Modified (more accurate) predictions according to the user selected rules defining relationships of predicted and auxiliary variables • Sensitivity analysis results with respect to • Methods (interpolations, image enhancement, …) • Models • Parameters

  21. Example Results • <RULE ID=138 NUM_OF_CASES=3975 SUPPORT=32.65%> <IF>Elevation is not in {330-344} AND Soil type is in {Rm=Roscommon muck} AND Proximity to water body is not {near_water} AND Slope is in {0-0.9}</IF> <THEN>R/D rate is -0.004,-0.002</THEN> • [Figure label: ROI]

  22. SP2Learn Architecture and Functionality

  23. Underlying SP2Learn Technology

  24. SP2Learn Functionality Overview • Load Raster Step • Integration Step • Create Mask Step • Attribute Selection Step • Rules Step • Apply Rule Step

  25. SP2Learn Workflow

  26. On-Line Help

  27. Software and Test Data Download • Download web page of Image Spatial Data Analysis group at NCSA: http://isda.ncsa.uiuc.edu/download/

  28. Running SP2Learn

  29. Input Data to SP2Learn • Raster files (maps) • Predicted R/D rate models • Auxiliary variables • For mask creation • Tables with geo-points • Vector files with boundaries • Raster files of categorical or continuous variables

  30. Image Processing • Filtering Methods • Low pass (moving average) filters • Morphological filters • TVL1 (Total Variation regularized L1 function) • Using multiple techniques multiple times • Parameters • Kernel size (row dimension, column dimension)
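
The morphological filters and the kernel-size parameter can be sketched as follows. This is a minimal grayscale version with a flat rectangular kernel: erosion takes the window minimum, dilation the window maximum, opening is erosion followed by dilation, and closing is dilation followed by erosion. Function names and the clipped-border handling are assumptions for illustration, and odd kernel dimensions are assumed.

```python
def _window_reduce(grid, kernel_rows, kernel_cols, reduce_fn):
    """Apply reduce_fn (min or max) over each pixel's kernel window."""
    rows, cols = len(grid), len(grid[0])
    half_r, half_c = kernel_rows // 2, kernel_cols // 2
    return [
        [
            reduce_fn(
                grid[i][j]
                for i in range(max(0, r - half_r), min(rows, r + half_r + 1))
                for j in range(max(0, c - half_c), min(cols, c + half_c + 1))
            )
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def erode(grid, kernel_rows, kernel_cols):
    return _window_reduce(grid, kernel_rows, kernel_cols, min)


def dilate(grid, kernel_rows, kernel_cols):
    return _window_reduce(grid, kernel_rows, kernel_cols, max)


def opening(grid, kernel_rows, kernel_cols):
    """Erosion then dilation: removes bright specks smaller than the kernel."""
    return dilate(erode(grid, kernel_rows, kernel_cols), kernel_rows, kernel_cols)


def closing(grid, kernel_rows, kernel_cols):
    """Dilation then erosion: fills dark holes smaller than the kernel."""
    return erode(dilate(grid, kernel_rows, kernel_cols), kernel_rows, kernel_cols)
```

A larger kernel removes larger isolated features, so the kernel size effectively sets the smallest recharge or discharge region that survives filtering.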

  31. Example Input Maps • [Figure panels: Low Pass Filter, Morphological Closing, Morphological Opening; each shown with Kernel = (10,10) and Kernel = (5,5)]

  32. Example Auxiliary Maps • Slope • DEM • Soil • River Stream

  33. Loading Files • Load R/D rate models (maps) • Load auxiliary maps to explore alternative models • Proximity to water • Soil type • Slope • …

  34. Mosaic Maps • Large spatial coverage – a set of tiles • Out-of-core representation

  35. Viewing Images • Right mouse click • Image information • Zoom • Check boxes • Pseudo-color • Auto-fit images

  36. Registration • Integration of all maps (raster images) to a common projection and spatial resolution • [Figure panels: Before “Convert”, After “Convert”]
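
To give a feel for the resolution part of this step, the sketch below resamples a raster to a target grid size with nearest-neighbour lookup. It is an illustrative assumption only: the function name is hypothetical, re-projection between coordinate systems is not handled, and the slides do not say which resampling method SP2Learn uses.

```python
def resample_nearest(grid, out_rows, out_cols):
    """Resample a raster (list of rows) to out_rows x out_cols.

    Each output pixel takes the value of the input pixel that contains
    the output pixel's centre.
    """
    in_rows, in_cols = len(grid), len(grid[0])
    return [
        [
            grid[min(in_rows - 1, int((r + 0.5) * in_rows / out_rows))]
                [min(in_cols - 1, int((c + 0.5) * in_cols / out_cols))]
            for c in range(out_cols)
        ]
        for r in range(out_rows)
    ]


coarse = [[1, 2], [3, 4]]
fine = resample_nearest(coarse, 4, 4)   # each input pixel becomes a 2 x 2 block
```

Nearest-neighbour lookup keeps categorical maps such as soil type valid, since it never averages labels; continuous maps could use a smoother interpolation instead.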

  37. Create Mask • A: Mask Parameters • B: Mask Operations • C: Visualization Panel

  38. Mask Creation Options in SP2Learn

  39. User Defined Mask Creation • Set Parameter: User defined • Mouse click-and-drag selection of region • Click Paint and Show • Click Apply

  40. Label Editor • Assign categorical labels to colors

  41. Attribute Selection • Output: Predicted Variable • Input: Auxiliary Variables • Check-boxes • Show Table • Prune Tree

  42. Decision Tree Based Modeling • Tree structure can be represented as a set of rules • [Figure: example decision tree with tests “Soil Type is {sand}?” and “Distance from river ≤ 100 ft?”, yes/no branches, and leaves labeled Recharge or Discharge (Case A.., Case E.., Case J..)]
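
The statement that a tree can be represented as a set of rules can be sketched as below: each root-to-leaf path becomes one rule whose conditions are the tests along the path. The tuple encoding and the example tree are hypothetical; in particular, the assignment of Recharge and Discharge to branches here is made up, because the layout of the slide's figure is not recoverable from the transcript.

```python
def tree_to_rules(node, conditions=()):
    """Return a list of (conditions, class_label) pairs, one per leaf.

    A leaf is a string (the class); an internal node is a tuple
    (test, yes_branch, no_branch).
    """
    if isinstance(node, str):
        return [(list(conditions), node)]
    test, yes_branch, no_branch = node
    return (
        tree_to_rules(yes_branch, conditions + (test,))
        + tree_to_rules(no_branch, conditions + ("NOT " + test,))
    )


# Hypothetical tree using the two tests that appear on the slide.
example_tree = (
    "soil type is sand",
    ("distance from river <= 100 ft", "Discharge", "Recharge"),
    "Discharge",
)

for conds, label in tree_to_rules(example_tree):
    print("IF " + " AND ".join(conds) + " THEN " + label)
```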

  43. Rules from Decision Tree • Num: Node number in the decision tree • Support (%): Among all cases satisfying the conditions, the percentage of cases that have the rule's class (conclusion) • # of cases: The number of cases satisfying the conditions • Class: Conclusion of the rule • Conditions: Conditions of the rule • MDL Score: MDL (minimum description length) score of the decision tree. The lower the score, the better the tree
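
The "# of cases" and "Support (%)" columns can be computed as in the following sketch, which follows the definitions on this slide. The data layout (one dict per pixel), the `rd_class` key, and the function name are assumptions for illustration.

```python
def rule_stats(cases, conditions, predicted_class, target_key="rd_class"):
    """Return (number of cases matching the conditions, support in percent).

    conditions is a list of predicates, each taking a case dict.
    Support is the share of matching cases whose class equals predicted_class.
    """
    matching = [case for case in cases if all(cond(case) for cond in conditions)]
    if not matching:
        return 0, 0.0
    agreeing = sum(1 for case in matching if case[target_key] == predicted_class)
    return len(matching), 100.0 * agreeing / len(matching)


# Hypothetical cases: four sand pixels (three recharge) and one clay pixel.
cases = [
    {"soil": "sand", "rd_class": "Recharge"},
    {"soil": "sand", "rd_class": "Recharge"},
    {"soil": "sand", "rd_class": "Recharge"},
    {"soil": "sand", "rd_class": "Discharge"},
    {"soil": "clay", "rd_class": "Discharge"},
]
n_cases, support = rule_stats(cases, [lambda c: c["soil"] == "sand"], "Recharge")
```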

  44. Show Decision Tree • Show Tree Option

  45. Export Rules • XML format • Export Rules Option

  46. Apply Rules • Visualization of • Modified output variable • Changed pixels • Magnitude of changes (differences)
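
The two derived views listed here, changed pixels and magnitude of changes, can be sketched as a comparison of the R/D map before and after rules are applied. The function name and the list-of-rows raster layout are assumptions for illustration, not SP2Learn's actual API.

```python
def change_maps(before, after):
    """Return (changed, magnitude) rasters for two equally sized maps.

    changed[r][c] is True where the value differs;
    magnitude[r][c] is the absolute difference.
    """
    if len(before) != len(after) or any(
        len(rb) != len(ra) for rb, ra in zip(before, after)
    ):
        raise ValueError("maps must have the same shape")
    changed = [[a != b for b, a in zip(rb, ra)] for rb, ra in zip(before, after)]
    magnitude = [[abs(a - b) for b, a in zip(rb, ra)] for rb, ra in zip(before, after)]
    return changed, magnitude


before = [[0.0, 1.0], [2.0, 3.0]]
after = [[0.0, 1.5], [2.0, 1.0]]
changed, magnitude = change_maps(before, after)
```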

  47. Summary • Novel Frameworks and Methodologies for Exploratory Data-Driven Modeling and Scientific Discoveries • Problems addressed in the prototype SP2Learn solution: • Prediction accuracy improvement by a combination of mathematical models and data-driven (knowledge based) models, supervised and unsupervised iterative model optimization • Better Data Utilization!

  48. Extra Information • A stack of informatics and cyber-infrastructure software is open source • Other software of potential interest: • GeoLearn is an exploratory framework for extracting information and knowledge from remote sensing imagery • CyberIntegrator to support creation of exploratory workflows, reuse of workflows, remote server execution, data and process provenance tracking and analysis, streaming data support • Image Provenance to Learn (IP2Learn) to support decision processes based on visual inspection of images • Load Estimation (work in progress) to support optimal sampling of sediment loads using several sediment-discharge rating curves, bias correction factors and Monte Carlo simulations to predict confidence limits • Download web page of Image Spatial Data Analysis group at NCSA: http://isda.ncsa.uiuc.edu/download/

  49. Acknowledgement • Funding Agencies: • NASA, NARA, NSF, NIH, NAVY, DARPA, ONR, NCSA Industrial Partners, NCSA Internal, COM UIUC, State of Illinois • Full Time Employees: • Peter Bajcsy, Rob Kooper, Sang-Chul Lee, Luigi Marini • Students: • Shadi Ashnai, Melvin Casares, Miles Johnson, Chulyun Kim, Qi Li, Tim Nee, Arlex Torres, Ryo Kondo, Henrik Lomotan, James Rapp • Collaborators: • College of Applied Health Sciences UIUC, Kinesiology Dept. UIUC, CEE UIUC, CS UIUC, GISLIS UIUC • UIC, UC Berkeley, Univ. of Texas at Austin, Univ. of Iowa • ISWS, NARA, Nielsen, State Farm • Instituto Tecnológico de Costa Rica, UNESCO-IHE Netherlands

  50. Thank you! • Questions: • Peter Bajcsy pbajcsy@ncsa.uiuc.edu • Need More Details • Publications: http://isda.ncsa.uiuc.edu
