An overview of the CLAM composition language and its primitives for arbitrary result extraction when composing processing tools and repositories. CLAM is aimed at domain experts who compose existing services, and it treats data extraction as a runtime issue rather than a language feature. The slides cover how data is used for synchronization, how integration is simplified, and how services, including legacy ones, can be scheduled better.
A Comprehensive Model for Arbitrary Result Extraction
Neal Sample, Gio Wiederhold (Stanford University)
Dorothea Beringer (Hewlett-Packard)
Shift in Programming Tasks
[Figure: programming tasks split into Coding and Integration/Composition; time axis 1970, 1990, 2010]
Sample Composition Tasks
• Logistics
  • Reservation and distribution systems, “find the best transportation route from A to B”
• Genomics
  • Framework for composing various processing tools and repositories
• Modeling
  • Weather prediction, complex chemical systems, basin modeling
• Composition of processes (vs. components, data)
CLAM Composition Language
• Purely compositional
  • no primitives for arithmetic
  • no primitives for I/O, etc.
• Splitting up the CALL statement
  • parallelism by asynchrony in a sequential program
  • novel possibilities for optimizations
  • reduction in complexity of invocation statements
• Higher-level language
  • assembly → HLLs
  • HLLs → compositional paradigm
• Intent: enable domain experts
CLAM Primitives
Pre-invocation:
• SETUP: set up the connection to a service
• SETPARAM, GETPARAM: set and get parameters in a service
• ESTIMATE: for optimization
Invocation and result gathering:
• INVOKE: begin execution
• EXAMINE: test progress of an invoked method
• EXTRACT: extract results from an invoked method
Termination:
• TERMINATE: terminate a method invocation/connection to a service
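To make the order of the primitives concrete, here is a minimal sketch of one invocation lifecycle in Python. Everything in it is an assumption made for illustration: `ToyService`, its `work` method, the `steps` parameter, and the meaning of the ESTIMATE return value are hypothetical and are not CLAM's actual API or syntax. The service is simulated in memory, so "progress" advances only when the client calls `work`.

```python
from enum import Enum


class Status(Enum):
    NOT_DONE = "NOT_DONE"
    PARTIAL = "PARTIAL"
    DONE = "DONE"
    ERROR = "ERROR"


class ToyService:
    """Hypothetical in-memory stand-in for a remote service (not CLAM's API)."""

    def __init__(self):
        self.params = {}
        self.connected = False
        self.steps_done = 0
        self.total_steps = 1

    def setup(self):                     # SETUP: open the connection
        self.connected = True

    def setparam(self, name, value):     # SETPARAM
        self.params[name] = value

    def getparam(self, name):            # GETPARAM
        return self.params[name]

    def estimate(self):                  # ESTIMATE: assumed to mean cost in steps
        return max(1, self.params.get("steps", 1))

    def invoke(self):                    # INVOKE: start the work, do not wait
        self.total_steps = max(1, self.params.get("steps", 1))
        self.steps_done = 0

    def work(self):
        """Simulate the service advancing by one step between EXAMINE calls."""
        if self.steps_done < self.total_steps:
            self.steps_done += 1

    def examine(self):                   # EXAMINE: (status, percent done)
        if self.steps_done == 0:
            return Status.NOT_DONE, 0
        if self.steps_done < self.total_steps:
            return Status.PARTIAL, 100 * self.steps_done // self.total_steps
        return Status.DONE, 100

    def extract(self):                   # EXTRACT: whatever has been computed
        return list(range(self.steps_done))

    def terminate(self):                 # TERMINATE: close the connection
        self.connected = False


def run(steps):
    """Drive one invocation through all phases; return the EXAMINE log and result."""
    svc = ToyService()
    svc.setup()
    svc.setparam("steps", steps)
    svc.invoke()
    log = []
    while True:
        status, progress = svc.examine()
        log.append((status.name, progress))
        if status is Status.DONE:
            break
        svc.work()
    result = svc.extract()
    svc.terminate()
    return log, result
```

Because INVOKE returns immediately and the results are fetched by a separate EXTRACT, the client is free to start other services between the two calls, which is where the parallelism in a sequential program comes from.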
Data Dependencies & Scheduling
[Figure: dependency graph START → service1, service2 → service3 → service4, service5 → END]
// begin program
A = service1();
B = service2();
C = service3(A,B);
D = service4(C);
E = service5(C);
// end of program
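The data dependencies in the five-line program are enough to derive a schedule. The sketch below uses Python's standard `graphlib.TopologicalSorter` (Python 3.9+) to group the services into stages whose members could run concurrently. The `DEPENDS_ON` table and the `parallel_stages` helper are illustrative names, and this is one possible scheduler, not the one CLAM uses.

```python
from graphlib import TopologicalSorter

# service -> services whose results it consumes, read off the program above
DEPENDS_ON = {
    "service1": set(),
    "service2": set(),
    "service3": {"service1", "service2"},
    "service4": {"service3"},
    "service5": {"service3"},
}


def parallel_stages(depends_on):
    """Group services into stages; members of one stage have no mutual dependency."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()
    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything whose inputs are available
        stages.append(ready)
        sorter.done(*ready)                 # mark the whole stage as finished
    return stages


print(parallel_stages(DEPENDS_ON))
```

For the program above this yields three stages: service1 and service2 together, then service3, then service4 and service5 together.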
Runtime: data extraction is hard
• Data extraction with native modules worked
• No language-level specifications in CLAM
  • E.g., polling, threading, exception handling…
• Multiple middleware for transport → difficult mapping
  • CORBA-RMI, RMI-COM, COM-CPAM, etc.
• Crisis of legacy services
  • To generalize or restrict?
• Refine the strategy…
Strategy: hide it & depend on it
• Have to respect service capabilities
  • Or suffer the LCD (lowest common denominator)… (more in a bit)
• Simple and flexible programming
  • Data extraction is a runtime issue; it is not central to the composition task
• Simplified integration
  • Legacy ambivalence
  • Simple bridging for middleware
  • Increase audience for services
• Better scheduling
  • Declarative language, data dependencies
Where are we?
• Declarative language for composition
  • Data is used for synchronization
  • No primitives to support synchronization
• Apparent “mismatch” in data extraction methods & capabilities among various actors
  • What does the data look like?
  • How can data be extracted?
Data View: Services
[Figure: a service's RESULTS, made up of Result A, Result B, Result C]
Extraction Techniques
• Asynchrony
  • Explicitly controlled: spin-locks, polling, interrupt handling, etc.
  • Can use with any DAG schedule
• Partial extraction
  • web browsing: HTML text as a schema
  • SQL cursors (thanks to the reviewer)
• Progressive extraction (exceptional)
  • Adaptive mesh refinements, JPEG interleaving
Current Focus
Pre-invocation:
• SETUP: set up the connection to a service
• SETPARAM, GETPARAM: set and get parameters in a service
• ESTIMATE: for optimization
Invocation and result gathering:
• INVOKE: begin execution
• EXAMINE: test progress of an invoked method
• EXTRACT: extract results from an invoked method
Termination:
• TERMINATE: terminate a method invocation/connection to a service
EXAMINE Primitive in CLAM
• Returns “status” and “progress”
• Status – 2 bits of state
  • status = {DONE, NOT_DONE, PARTIAL, ERROR}
• Progress – open descriptor
  • Indicates progress in an application-specific way
  • Could be variance, mean, amplitude, etc.
  • Default assumption: integer 0-100 = % done
• Resolution of EXAMINE
  • Can apply per service (black box)
  • Can apply per result (white box)
• Not complete for many legacy systems: only “status”, no “progress”
EXAMINE
Three example states of a service with results A, B, C:

| Call               | State 1       | State 2     | State 3       |
|--------------------|---------------|-------------|---------------|
| Service.EXAMINE()  | {PARTIAL, 40} | {DONE, 100} | {NOT_DONE, 0} |
| Service.EXAMINE(A) | {DONE, 100}   | {DONE, 100} | {NOT_DONE, 0} |
| Service.EXAMINE(B) | {NOT_DONE, 0} | {DONE, 100} | {NOT_DONE, 0} |
| Service.EXAMINE(C) | {PARTIAL, 20} | {DONE, 100} | {NOT_DONE, 0} |
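The slides do not say how the per-service answer is derived from the per-result answers. The numbers in the examples are consistent with one simple rule, sketched here as an assumption: progress is the mean of the per-result progress values ((100 + 0 + 20) / 3 = 40), and the status is PARTIAL unless every result agrees. The `examine_service` function and its ERROR handling are hypothetical, and the function assumes at least one result.

```python
from enum import Enum


class Status(Enum):
    DONE = "DONE"
    NOT_DONE = "NOT_DONE"
    PARTIAL = "PARTIAL"
    ERROR = "ERROR"


def examine_service(per_result):
    """Combine per-result (status, progress) pairs into one per-service pair.

    Assumed rule (not stated in the slides): ERROR dominates, all DONE gives
    DONE, all NOT_DONE gives NOT_DONE, any other mix gives PARTIAL; progress
    is the integer mean of the per-result progress values.
    """
    statuses = [status for status, _ in per_result.values()]
    progress = sum(p for _, p in per_result.values()) // len(per_result)
    if Status.ERROR in statuses:
        return Status.ERROR, progress
    if all(s is Status.DONE for s in statuses):
        return Status.DONE, progress
    if all(s is Status.NOT_DONE for s in statuses):
        return Status.NOT_DONE, progress
    return Status.PARTIAL, progress
```

A legacy service that reports only status could still fit this shape by returning a fixed progress of 0 or 100, at the cost of losing the information needed for progressive extraction.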
EXTRACT Primitive
• Extracts data from a service
  • Per service (black box)
    • (var) = Service.EXTRACT();
  • Per result (white box)
    • (varA = A, varC = C) = Service.EXTRACT();
• Allows partial data extraction
  • saves volume: abandon uninteresting elements
  • saves time: termination of useless invocation
• Allows progressive data extraction with 2-value EXAMINE (status+progress)
  • Steering, time saving
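Partial and progressive extraction can be combined in a few lines. The sketch below simulates a service whose results A, B, and C fill in over time; the client asks only for A and C (partial), stops as soon as both are "good enough" according to EXAMINE (progressive), and then terminates the invocation so the remaining work is abandoned. `ToyService`, `step`, and `extract_when_good_enough` are hypothetical names for illustration, not CLAM syntax, and the code assumes non-empty results and a threshold between 0 and 100.

```python
class ToyService:
    """Hypothetical service whose results fill in over time (not CLAM's API)."""

    def __init__(self, targets):
        self.targets = targets                       # result name -> final list
        self.filled = {name: 0 for name in targets}  # items produced so far
        self.running = True

    def step(self):
        """Simulate one unit of work on every unfinished result."""
        if not self.running:
            return
        for name, data in self.targets.items():
            if self.filled[name] < len(data):
                self.filled[name] += 1

    def examine(self, name):
        """Per-result EXAMINE: (status, percent done)."""
        done, total = self.filled[name], len(self.targets[name])
        percent = 100 * done // total
        status = "DONE" if done == total else "PARTIAL" if done else "NOT_DONE"
        return status, percent

    def extract(self, *names):
        """Per-result EXTRACT: only the requested results, as far as computed."""
        return {n: self.targets[n][: self.filled[n]] for n in names}

    def terminate(self):
        """TERMINATE: stop the invocation; later step() calls do nothing."""
        self.running = False


def extract_when_good_enough(service, names, threshold):
    """Poll until every requested result reaches `threshold` percent, then stop."""
    while any(service.examine(n)[1] < threshold for n in names):
        service.step()
    partial = service.extract(*names)
    service.terminate()  # the rest of the computation is abandoned
    return partial
```

Result B is never transferred (the volume saving), and the service is stopped before A is complete (the time saving).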
Examine-Extract Relationship

| EXAMINE \ EXTRACT            | per service                                       | per result                                                |
|------------------------------|---------------------------------------------------|-----------------------------------------------------------|
| per service, status only     | asynchronous procedure call, Java RMI             | limited partial extraction, (binary) thumbnails           |
| per service, status+progress | partitioned progressive extract (full result set) | ?                                                         |
| per result, status only      | semantic partial extraction (full result set)     | partial extraction: browsing, SQL cursor (no progressive) |
| per result, status+progress  | progressive extraction (full result set)          | progressive and partial extraction: CLAM                  |
Conclusions
• Data extraction hiding is bueno!
  • User is not responsible for data management
  • Synchronizing extractions not in the language → simplicity
• Enables effective service scheduling
• Simplified integration
• Blueprint for proactive design pattern for future services