On February 25, 2008, key contributors from OHSU, Utah, and Portland State gathered for the CMOP All-Hands Meeting to discuss innovations in cyber-infrastructure to support scientific exploration. The meeting highlighted developments including the Data Mart, VisTrails, Quarry, and RoboCMOP, emphasizing principles of data visibility, product generation, and strategic sensor deployment. Key topics included data policies, network optimization, and collaboration tools designed to enhance data accessibility and research outputs. This meeting aimed to strengthen synergy among institutions for collaborative scientific endeavors.
Cyber-Infrastructure Activities CMOP All-Hands Meeting 25 February 2008
Cyber-Folk • OHSU: Bill Howe, Charles Seaton, Paul Turner, Antonio Baptista • Utah: Juliana Freire, Claudio Silva • Portland State: David Maier, Nirupama Bulusu, Wu-Chi Feng • + grad & undergrad students
Activities • Data Mart • VisTrails • Quarry • RoboCMOP • Network Optimization • Cruise Dashboard • Ocean Appliance • Data Policies
CMOP Data Mart http://www.stccmop.org/datamart
Data Mart Design Principles • 100% visibility of data assets • On-demand generation of products • Can always download data behind a product • Highly configurable: navigation, data selection, products, product parameters Have a look, leave comments
The VisTrails Project (Utah) • Vision: Provenance-enable the world • Comprehensive provenance infrastructure for computational tasks • Captures provenance transparently • Provides intuitive query interfaces for exploring provenance data • Supports collaboration • Designed to support exploratory tasks such as visualization and data mining • Task specification iteratively refined as users generate and test hypotheses • VisTrails system is open source: www.vistrails.org
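To make "captures provenance transparently" concrete, here is a minimal sketch of the idea in plain Python. It is not the VisTrails API: the `track` decorator, the `PROVENANCE_LOG` list, the example `threshold` step, and the `runs_of` query are all hypothetical names introduced for illustration.

```python
import functools
import time

PROVENANCE_LOG = []  # hypothetical in-memory store, for illustration only


def track(func):
    """Record each call's name, arguments, and result without changing the call site."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        PROVENANCE_LOG.append({
            "step": func.__name__,
            "args": args,
            "kwargs": kwargs,
            "result": result,
            "timestamp": time.time(),
        })
        return result
    return wrapper


@track
def threshold(values, cutoff):
    """Example analysis step: keep values at or above a cutoff."""
    return [v for v in values if v >= cutoff]


def runs_of(step):
    """Query the log for every recorded run of a named step."""
    return [entry for entry in PROVENANCE_LOG if entry["step"] == step]
```

Calling `threshold(data, cutoff=5)` and then `threshold(data, cutoff=9)` leaves two log entries, so the iterative refinement of a task can be queried afterwards with `runs_of("threshold")`.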
Keeping Scientific Exploration Trails Workflows Data Products Trail
Integrating Tools and Libraries SCIRun Workflow that combines 5 different libraries Value added: provenance, query, parameter-space exploration, easier sharing & collaboration
Quarry Structured browse capability for model products • Harvest fine-grained meta-data • Automatically design efficient database schema based on data patterns • Can explore space of products via alternating property, value selections http://www.stccmop.org/quarry
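The alternating property, value selection can be sketched as a faceted filter over product metadata. This is an illustrative sketch only, not Quarry's implementation: the `PRODUCTS` records, their property names, and both helper functions are assumptions made up for the example.

```python
from collections import defaultdict

# Hypothetical product metadata; property names and values are made up.
PRODUCTS = [
    {"id": "p1", "variable": "salinity", "region": "estuary", "year": 2007},
    {"id": "p2", "variable": "salinity", "region": "plume", "year": 2007},
    {"id": "p3", "variable": "temperature", "region": "estuary", "year": 2008},
]


def matching(products, selections):
    """Return products whose metadata agrees with every selected (property, value)."""
    return [p for p in products
            if all(p.get(prop) == val for prop, val in selections.items())]


def available_properties(products, selections):
    """Unselected properties and the values they take among the remaining products."""
    values = defaultdict(set)
    for p in matching(products, selections):
        for prop, val in p.items():
            if prop != "id" and prop not in selections:
                values[prop].add(val)
    return {prop: sorted(vals) for prop, vals in values.items()}
```

A browse session alternates the two calls: pick a property offered by `available_properties`, choose one of its values, add the pair to `selections`, and repeat until `matching` returns the products of interest.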
Our Trajectory: RoboCMOP Vision: Lift scientific C-I to an active participant in the scientific process, acting autonomously to provide the data, products, and context you need, right when needed. Stages • Locate existing products (based on “cues” in conversation) • Instantiate existing product types on demand • Propose new product variants (Cf. VisTrails “Creating workflows by analogy”) • Task observatory systems to collect relevant data (serendipitous gap-filling, active direction of assets)
Network Optimization: Nirupama Bulusu • Sensor stations are deployed based on • Physical Intuition: Sensing coverage, Flow dynamics • Physical Limitation: Power and Communication wiring • Little understanding of which sensors are important • Is the current deployment optimal? • If not, which sensors should we remove, and which should we keep? • If we want to deploy more sensors, where should we deploy them?
Sensor Selection Problem • Find the network configuration that yields the largest error reduction in the data assimilation process • Set of all sensor configurations • Sensor configuration • type: salinity, elevation, temperature • x, y, z: sensor location • δ: sensor standard deviation • Error reduction in data assimilation
Results Exploring a genetic-algorithms approach • Reducing the number of sensors by 26% reduces accuracy by 1.55%
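For readers unfamiliar with the approach, here is a minimal genetic-algorithm sketch for choosing which sensors to keep. It is not the method used in the study: the `SENSOR_VALUE` numbers, the `COST_PER_SENSOR` penalty, and the toy `fitness` function are assumptions standing in for the error reduction that the data assimilation process would actually report.

```python
import random

# Hypothetical per-sensor contribution to error reduction (made-up numbers).
# A real study would score each configuration with the data assimilation system.
SENSOR_VALUE = [5.0, 0.2, 3.1, 0.1, 4.4, 0.3, 2.8, 0.2]
COST_PER_SENSOR = 1.0  # assumed penalty for keeping a sensor deployed


def fitness(config):
    """Toy objective: summed error reduction minus a cost for each active sensor."""
    gain = sum(v for v, on in zip(SENSOR_VALUE, config) if on)
    return gain - COST_PER_SENSOR * sum(config)


def crossover(a, b, rng):
    """Single-point crossover of two parent configurations."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def mutate(config, rng, rate=0.1):
    """Flip each sensor on/off with a small probability."""
    return [(not on) if rng.random() < rate else on for on in config]


def evolve(generations=40, pop_size=20, seed=0):
    """Return the best configuration found and the best fitness per generation."""
    rng = random.Random(seed)
    n = len(SENSOR_VALUE)
    pop = [[rng.random() < 0.5 for _ in range(n)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        history.append(fitness(pop[0]))
        parents = pop[: pop_size // 2]   # truncation selection
        children = [pop[0]]              # elitism: keep the best unchanged
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b, rng), rng))
        pop = children
    return max(pop, key=fitness), history
```

Each configuration is a list of on/off flags, one per sensor. Because the best individual is carried over unchanged each generation, the best fitness in `history` never decreases.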
Cruise Dashboard Project of Nick Hagerty, summer REU • Fast visibility of collected data • With appropriate information context One of the drivers of pluggable products
Interface • Cast-specific interface fully functional • First deployed (successfully) on July 2007 cruise • Useful simply as convenient grouping of relevant data, graphs, information • Hope to link with workflow
Ocean Appliance • We must “IOOS-enable” local data providers • Someone has to write the code • Responsibility usually falls to RAs • The cost of hardware is falling • The cost of software support is rising • Provision complete platforms to control cost
IOOS: System of Systems (of Systems …) http://www.ocean.us/ [Diagram labels: National Service Nodes; RAs; local providers (Local Prov., Univ.); links labeled DMAC standards and Ad hoc protocols] Value-add services: • Discovery • Brokerage • Aggregation • Fusion • Applications
System of Systems (of Systems …) How can we “DMAC-enable” the Local Data Providers, quickly and inexpensively? Ad Hoc Protocols • FTP • screen scraping • ASCII • netCDF [Diagram labels: RAs; local providers (Local Prov., Univ.)]
The Ocean Appliance Software • Linux Fedora Core 6 • web server (Apache) • database (PostgreSQL) • ingest/QC system (Python) • telemetry system (Python) • web-based visualization (Drupal, Python) Hardware • 2.6GHz Dual • 2GB RAM • 250 GB SATA • 4 serial ports • ~$500 • ~1’x1’x1.5’
Deployed on Multi-ship Coordinated Cruise • SWAP Network; collaboration of OSU, OHSU, UNOLS • Ships: Forerunner, Barnes, Wecoma
Data Standards • What counts as data? • What are the standard procedures for collecting data during cruises? • How are new data sources added? • What external data archives will we use? • What are our QA/QC procedures for each data source? • How is instrument calibration information handled? • How will data processing levels and data release versioning be handled? Charles Seaton: cseaton@stccmop.org