Explore PrIMe's cloud infrastructure, data flow, interfaces, big data, and new tools. Share data and apps while controlling privacy. Discover remote servers, species ID, UQ, and Bayesian simulations. Collaborate on Bayesian knowledge projects.
MACCCR 5th Fuels Research Review September 17, 2012 PrIMe Next Frontier: Large, Multi-dimensional Data Sets Michael Frenklach Supported by AFOSR
OUTLINE • PrIMe Cloud Infrastructure: • Data Flow Network • Remote Server: PrIMe-RMG • Interfaces • Big Data • Other new developments: • Species identification app • UQ: Statistical sampling of the feasible set • . . . • PrIMe with Humanities
PrIMe http://primekinetics.org Infrastructure for UQ-predictive modeling Process Informatics Model • Data sharing • App sharing • Automation
Present-day Science Sharing: via web-page access Internet domain 1 web page database domain 2 web page database apps apps
PrIMe Science Sharing: via web-service data/app access Internet database database science domain 1 science domain 2 apps apps
PrIMe Science Sharing: via web-service data/app access client web service data flow network Internet database database science domain 1 science domain 2 client workflow app apps apps
PrIMe Data Model • Initial Model: • “Upload your data to PrIMe Warehouse” (“give me your data”) • New, Distributed Model: • “You may, if you choose, connect your data to the communal system” • with a switch in the OFF position: “you can use the communal data and tools but your own data is private to you only” • “but please flip the switch to the ON position when you are ready to share your own data”
same for apps • “Connect your code to the communal system” • - you control your own code: • release version • user access, licenses • collect fees, if desired
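The ON/OFF switch described above can be pictured with a minimal sketch. Everything in it is hypothetical (the class `ConnectedSource`, its fields, the user names, and the sample record); it is not PrIMe's actual interface, only an illustration of "connected but private until the owner chooses to share".

```python
from dataclasses import dataclass, field


@dataclass
class ConnectedSource:
    """Hypothetical stand-in for a locally hosted resource connected to a communal system."""
    owner: str
    records: dict = field(default_factory=dict)
    shared: bool = False  # the "switch": OFF by default

    def read(self, user: str, key: str):
        # The owner can always read; other users only when the switch is ON.
        if user == self.owner or self.shared:
            return self.records[key]
        raise PermissionError(f"{self.owner}'s data is private")


# Hypothetical owner and record, for illustration only.
source = ConnectedSource(owner="alice", records={"flame_speed": 38.2})
print(source.read("alice", "flame_speed"))  # owner access works with the switch OFF
source.shared = True                        # flip the switch to ON
print(source.read("bob", "flame_speed"))    # now readable by other users
```

The same pattern would apply to connected code: the owner keeps the resource on their own machine and decides who may call it.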
Technology: How • Remote server app—PrIMe Web Services (PWS) • no restrictions on platform • no restrictions on data formats • no restrictions on local programming language(s) • PrIMe Workflow Interface (PWI) is the only “standard” • developed, maintained, and controlled by the community
PrIMe Dispatcher PrIMe Data Flow Network client machine PrIMe Interface PrIMe web services client data
Big Data • excessively large data sets • do not move the data • but use “smart agents” (e.g., HTML5 walkers) web services with user-reloaded tasks: fetch data features for user-requested analysis
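A minimal sketch of the "do not move the data" idea: the task travels to the data, and only a small set of features comes back. The names `DataSite`, `run_task`, and `summary_features` are hypothetical, and the in-process call stands in for what would be a web-service request in practice.

```python
import statistics


class DataSite:
    """Hypothetical remote site holding a large data set that never leaves it."""

    def __init__(self, values):
        self._values = values  # stays local to the site

    def run_task(self, task):
        # The site executes the submitted task next to the data and
        # returns only the (small) result.
        return task(self._values)


def summary_features(values):
    """User-defined task: reduce the raw data to a few features."""
    return {
        "n": len(values),
        "mean": statistics.fmean(values),
        "max": max(values),
    }


# Synthetic data standing in for a large remote data set.
site = DataSite([float(i % 100) for i in range(1_000_000)])
features = site.run_task(summary_features)
print(features)
```

Only the three-entry dictionary crosses the boundary between site and user; the million raw values stay where they are.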
PrIMe Remote-Server Webservices • Created ~2 years ago • installed by professional programmers • implemented on Reaction Design site • Modified June 2012 • can be installed by users • implemented with RMG at MIT site • installed by first-year grad students!
PrIMe – RMG • User creates a PrIMe Workflow (PWA) project • User submits a request: “create a reaction model for …” • The request activates RMG code at MIT server • User receives email when the model is generated • User retrieves the model or it “moves” along the PWA project to the next component
PrIMe Interfaces binary XML – HDF5 e.g., reaction model: GRI-Mech 3.0 client machine PrIMe Interface PrIMe web services client data
New Developments • input data for UQ bypassing Warehouse • species identification via crowd-sourcing • UQ: sampling within the feasible region • comparison between interval-to-interval UQ and rigorous Bayesian • parallelization of Chemkin II
DataCollaboration: bounds-to-bounds predictions constrained to the feasible set
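One way to write the bounds-to-bounds statement, reusing the symbols M and F from the figure. The index e, the bounds L_e and U_e, the prior region H, the count N, and the prediction model M_p are notation assumed here, not taken from the slides.

```latex
\mathcal{F} = \{\, x \in \mathcal{H} : L_e \le M_e(x) \le U_e,\ e = 1, \dots, N \,\}
\qquad
\Big[ \min_{x \in \mathcal{F}} M_p(x),\ \max_{x \in \mathcal{F}} M_p(x) \Big]
```

The first expression is the feasible set: the parameter values inside the prior region for which every model output falls within its experimental bounds. The second is the predicted interval for a quantity of interest, taken over that set only.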
experiment/theory constrain feasible set M(x1,x2) F experimental uncertainty feasible set prior knowledge
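Sampling within a feasible set can be sketched with plain rejection sampling. This is an illustrative technique chosen here, not necessarily the method used in PrIMe; the toy surrogate `model`, the prior box [-1, 1]^2, and the measurement bounds are all hypothetical.

```python
import random


def model(x1, x2):
    """Toy surrogate M(x1, x2); a hypothetical stand-in for a real model response."""
    return 1.0 + 0.5 * x1 - 0.3 * x2 + 0.2 * x1 * x2


def sample_feasible(n, lower, upper, seed=0):
    """Rejection-sample n points from the prior box [-1, 1]^2 that satisfy
    lower <= M(x1, x2) <= upper."""
    rng = random.Random(seed)
    points = []
    while len(points) < n:
        # Draw from the prior box, keep the point only if it is feasible.
        x1, x2 = rng.uniform(-1, 1), rng.uniform(-1, 1)
        if lower <= model(x1, x2) <= upper:
            points.append((x1, x2))
    return points


# Hypothetical measurement of 1.0 with an uncertainty of +/- 0.1.
feasible = sample_feasible(500, lower=0.9, upper=1.1)
values = [model(x1, x2) for x1, x2 in feasible]
print(len(feasible), min(values), max(values))
```

Rejection sampling becomes inefficient when the feasible set is a small fraction of the prior region, which is typical in higher dimensions; the sketch is meant only to make the picture of "prior knowledge cut down by experimental uncertainty" concrete.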
Comparison between Bounds-to-Bounds UQ (DataCollaboration) and rigorous Bayesian An ongoing collaborative study with Jerome Sacks, National Institute of Statistical Sciences; Rui Paulo, ISEG Technical University of Lisbon; Gonzalo Garcia-Donato, Universidad de Castilla-La Mancha • Bayesian simulations: • no simplifying assumptions, • but utilize the Solution Mapping strategy for numerical efficiency
Parallelization: Chemkin II Execution time of flame simulations with a large acetylene model
Parallelization: Chemkin II Execution time of flame simulations with a hydrogen model
Knowledge UNIX • A collaborative project of PrIMe with Humanities: • Berkeley Electronic Cultural Atlas Initiative
“Study of Buddhist Texts” PrIMe is used to predict the past The abstracted dots represent 166000 “panes”
Knowledge UNIX • A collaborative project of PrIMe with Humanities: • Berkeley Electronic Cultural Atlas Initiative • Berkeley Institute of Information: “Editors Notes”
Current and Next • Remote-server app and new apps • RMG: interface (with MIT, Bill Green) • Communal/User tools: Cantera (with NCSU, Phil Westmoreland) • Big Data: feature collection for UQ (with Utah, Phil Smith) • Enabling new science infrastructure • ALS-data analysis (with NCSU; Phil Westmoreland) • Species IDs (with KAUST; Mani Sarathy) • H2-O2: automation/addition of flame targets (with Tsinghua, Xiaoqing You) • Submission of Chemkin mechanisms (with KAUST and Tsinghua)