
Middleware Diarrhea and Other Ailments

Michael Stonebraker

Adjunct Professor

Massachusetts Institute of Technology

([email protected])


Outline

  • Too much middleware

  • XML ailments

  • Web services ills

  • Our professional sickness

Client-Server Got Replaced by N-Tier Computing

  • The Web

  • Gizmos

  • Scalability and management problems with client-server

Humility Lesson

  • We all sold client-server hard

    • during the 80’s

    • and even into the 90’s

  • Less than 10 years later

    • it is the worst idea on the planet

  • We should feel really dumb!

    N-Tier Computing Produced Lots of Middleware

    • App servers

    • EAI/messaging

    • ETL

    • Federators

    • Workflow

    • CMS

    • Portals

    • DBMS

    Middleware Diarrhea

    • Average enterprise has

      • one (or more) app servers

      • one (or more) EAI packages

      • one (or more) ETL packages

      • one (or more) portal products

      • one (or more) application packages

      • and maybe someday a federated DBMS

    All of these systems

    • Contain transformation engines

    • And often do function activation (app service)

    • And often have adapters to legacy systems

    Huge overlap in functionality!!

    Less Moving Parts

    • Less systems

    • More uniformity

    • Less duplication

    Less Systems

    • Less system administrators

    • Less training

    • Less manuals

    • Less bugs

    • Less cross system issues

    More Uniformity

    • Every island has

      • memory management

      • security model

      • threading model

    • Less is better

    Less Duplication

    • Most of the islands support transformations

      • reasonable chance you will do each one 6 or more times

      • maintenance headache

    So How To Consolidate……

    • Converge app server into OR DBMS

      • dumbest OR query is execute function

    Remember that everything looks like a nail to the guy with the hammer!
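
As a rough illustration of "dumbest OR query is execute function", the sketch below registers an application function with a database engine and invokes it through plain SQL. SQLite (via Python's standard `sqlite3` module) is only a stand-in here, since it is not an object-relational DBMS, and `credit_check` and the `customer` table are hypothetical names invented for the example.

```python
import sqlite3


def credit_check(customer_id: int) -> str:
    """Hypothetical app-server logic: approve even-numbered customers."""
    return "approved" if customer_id % 2 == 0 else "denied"


conn = sqlite3.connect(":memory:")

# Register the Python function so that SQL can call it by name.
conn.create_function("credit_check", 1, credit_check)

# The "dumbest" query: no table involved, just execute the function.
simple = conn.execute("SELECT credit_check(42)").fetchone()[0]

# The same function then composes with ordinary queries over stored data.
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO customer VALUES (?, ?)", [(1, "Ann"), (2, "Bob")])
rows = conn.execute(
    "SELECT name, credit_check(id) FROM customer ORDER BY id"
).fetchall()

print(simple, rows)
```

Once function activation is just a query, the engine that already handles the data can also play the app-service role, which is the consolidation the slide argues for.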









    This Requires….

    • DBMS to send queries to other DBMSs

    • I.e. be a data federator

    • Load balance also requires a federator

    Best of Breed Federators

    • Support schema heterogeneity

      • by executing OR functions

    • Support materialized views

      • to cache static data

    Less Moving Parts….

    • Federators dominate ETL

      • ETL only supports “push”

      • federators do both “push” and “pull”
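
One way to read the push/pull distinction is sketched below: "push" copies source data ahead of time into a materialized view, as an ETL job would, while "pull" queries the source when the request arrives. The `Federator` class and its method names are hypothetical, and this interpretation of the two terms is an assumption rather than something the slides spell out.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]


class Federator:
    """Toy federator: each source is a callable that returns rows."""

    def __init__(self) -> None:
        self.sources: Dict[str, Callable[[], List[Row]]] = {}
        self.views: Dict[str, List[Row]] = {}

    def register(self, name: str, fetch: Callable[[], List[Row]]) -> None:
        self.sources[name] = fetch

    def pull(self, name: str) -> List[Row]:
        """Pull: go to the source at query time, so results are always live."""
        return self.sources[name]()

    def push(self, name: str) -> None:
        """Push: copy the source into a materialized view ahead of time."""
        self.views[name] = list(self.sources[name]())

    def read_view(self, name: str) -> List[Row]:
        """Serve from the cached copy without touching the source."""
        return self.views[name]


# A hypothetical remote system holding price data.
prices = [{"sku": "A1", "price": 10}]

fed = Federator()
fed.register("prices", lambda: [dict(row) for row in prices])

fed.push("prices")         # snapshot taken while the price is 10
prices[0]["price"] = 12    # the source changes afterwards

live = fed.pull("prices")         # sees the new price
cached = fed.read_view("prices")  # still holds the snapshot
print(live, cached)
```

The cached view suits static data, as the materialized-view bullet suggests, while the pull path covers data that changes too often to copy.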


    Workflow

    • A collection of rules

      • who’s allowed to buy what

      • and who must approve it

    • Best considered as a boxes-and-arrows diagram

    • And compiled into components to run on an app server

    Workflow Framework -- PO’s










    Data Intensive Workflow Should Move Inside an OR DBMS

    • GUI for “boxes and arrows”

    • Compiler for the diagram

      • processing steps become components

      • business rules become triggers

      • all data flow inside the DBMS

    • Worked great in Media/360
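
A minimal sketch of "business rules become triggers", using the purchase-order example from the preceding slides. SQLite again stands in for an OR DBMS, and the table names, the 1000 threshold, and the `manager` approver are all assumptions made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE purchase_order (
        id        INTEGER PRIMARY KEY,
        requester TEXT NOT NULL,
        amount    REAL NOT NULL,
        status    TEXT NOT NULL DEFAULT 'submitted'
    );
    CREATE TABLE approval_queue (po_id INTEGER, approver TEXT);

    -- Rule 1 (hypothetical): small orders are approved automatically.
    CREATE TRIGGER auto_approve AFTER INSERT ON purchase_order
    WHEN NEW.amount < 1000
    BEGIN
        UPDATE purchase_order SET status = 'approved' WHERE id = NEW.id;
    END;

    -- Rule 2 (hypothetical): larger orders are routed to a manager.
    CREATE TRIGGER route_to_manager AFTER INSERT ON purchase_order
    WHEN NEW.amount >= 1000
    BEGIN
        INSERT INTO approval_queue VALUES (NEW.id, 'manager');
    END;
    """
)

# Submitting a PO is a plain insert; the rules fire inside the engine.
conn.execute("INSERT INTO purchase_order (requester, amount) VALUES ('ann', 250)")
conn.execute("INSERT INTO purchase_order (requester, amount) VALUES ('bob', 5000)")

statuses = conn.execute(
    "SELECT requester, status FROM purchase_order ORDER BY id"
).fetchall()
queue = conn.execute("SELECT po_id, approver FROM approval_queue").fetchall()
print(statuses, queue)
```

No outside process polls the table for new orders, and no rows leave the database to be routed.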


    • Big Big Big performance advantage

      • no polling of the DBMS

      • no data movement

      • easy to change!

    Watch for Informix product in this area!


    • One integrated system that does

      • federation

      • EAI

      • app service

    • With a single transformation system

    • Based on DBMS technology (or something else….)

    XML

    • Good for content storage and movement

    • Good as “on the wire” format for data movement

      • as long as you don’t need to send a lot of stuff fast

    • Bad for data storage!

    History Lesson

    • 1960’s

      • IMS and IDMS get traction

      • customers start complaining about rewriting everything when schema changes

    History Lesson

    • 1970

      • Codd writes pioneering paper

      • starts a decade-long argument between IMS/CODASYL advocates and Codd supporters

    Net-Net of Argument

    • Putting semantics into data order is bad

      • restricts storage options

    • Hidden meaning bad

      • no self-defining fields

    Net-Net of Argument

    • Data independence is good

      • schemas change often

      • don’t want to rewrite anything when this happens

    Net-Net of Argument

    • Complexity is bad

      • high level query languages are good

      • KISS arguments

    • Call these three premises “Codd’s laws”

    History Lesson

    • 1981

      • Codd wins Turing award

      • acknowledgement for being right

    XML in This Historical Light

    • Most of the bad features of IMS/Codasyl

      • allows semantics in data order

      • data independence will be a challenge

        • try updates on inverted hierarchies

        • look at IMS LDBs

    • more complex than Codasyl

    Our Field

    • We look a little silly saying

      • an idea renounced in the 1970’s

      • is back

    • Leading our colleagues to ask “What’s different?”

      • if somebody disproved Codd’s laws, they didn’t tell me…..

    How to Win the Turing Award Circa 2020

    • 2000’s

      • XML data storage gets traction

    • 2010

      • dust off Codd’s paper

    • Wait 10 years to be proven right

    In Any Case

    • In-line tags turn 1 Tbyte of EMP data into 10 Tbytes of EMP data

    • Won’t store anything big in native XML

      • will use something else….

      • like what?
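
The tag-overhead point can be checked with a small experiment: serialize the same employee records once as delimited rows and once as XML with in-line tags, then compare sizes. The record layout and field names below are hypothetical. The measured ratio depends heavily on tag names and value widths, so it should not be expected to match the 10x figure on the slide.

```python
import xml.etree.ElementTree as ET

# Hypothetical EMP records; real data would have wider or narrower fields.
employees = [
    {"name": f"emp{i}", "dept": i % 10, "salary": 50000 + i}
    for i in range(1000)
]

# Delimited rows: the schema is stored once, outside the records.
flat = "\n".join(
    f"{e['name']},{e['dept']},{e['salary']}" for e in employees
)
flat_bytes = flat.encode("utf-8")

# XML: every value carries its own opening and closing tag.
root = ET.Element("employees")
for e in employees:
    emp = ET.SubElement(root, "employee")
    for field, value in e.items():
        ET.SubElement(emp, field).text = str(value)
xml_bytes = ET.tostring(root, encoding="utf-8")

ratio = len(xml_bytes) / len(flat_bytes)
print(f"flat={len(flat_bytes)} bytes, xml={len(xml_bytes)} bytes, ratio={ratio:.1f}x")
```

Short values with long tag names inflate the most, which is the typical shape of an EMP-style table.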


    • XML is merely this year’s data type

    • Next year it will be WML or …

      • and there will be a next year….


    • Contains the kitchen sink

    • Complexity run amok

      • diarrhea from the SGML types

    • Includes lots of known hard stuff

      • e.g. union types

    XQuery

    • Mostly syntactic sugar on OR SQL

      • // is a user-defined function in Informix OR engine

    • Try to keep the semantics close to OR SQL
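
To make the "// is a user-defined function" claim concrete, the sketch below exposes a descendant search over an XML column as a SQL function. It uses SQLite and Python's `xml.etree.ElementTree` as stand-ins and does not reproduce the Informix API; the `descendants` function, the `orders` table, and the sample document are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET


def descendants(doc: str, tag: str) -> str:
    """Hypothetical UDF in the role of XPath '//tag': return the text of
    every element named `tag` anywhere in the document, comma-joined."""
    root = ET.fromstring(doc)
    return ",".join(el.text or "" for el in root.iter(tag))


conn = sqlite3.connect(":memory:")
conn.create_function("descendants", 2, descendants)

conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, doc TEXT)")
conn.execute(
    "INSERT INTO orders (doc) VALUES (?)",
    ("<order><line><sku>A1</sku></line><line><sku>B2</sku></line></order>",),
)

# Ordinary SQL with the path operator expressed as a function call.
rows = conn.execute(
    "SELECT id, descendants(doc, 'sku') FROM orders"
).fetchall()
print(rows)
```

Seen this way, the path expression is one more function the SQL engine can call, consistent with the "syntactic sugar" bullet.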

    Another History Lesson

    • Typical enterprise wanted data integration for business analysis badly

      • needed data in a variety of systems

      • in a variety of formats

      • often with no unique ids

      • often with incompatible semantics

        • 2 day delivery means lots of things

      • often dirty

    ETL Warehouse Projects of the 90’s

    • Well into 8 digits

    • Usually a factor of three behind schedule

    • Delivering a factor of 3 less stuff

    • Everybody dented their pick on semantic heterogeneity

      • which is hard, hard, hard

      • and not solved by the blizzard of 3 letter acronyms from Redmond

    Web Services

    • Will be a long time coming outside of simple domains (where there is no data integration to deal with)

    • E.g. catalog management

      • Grainger perspiration….

    The Depressing State of Affairs

    • ~50-75% of IT projects fail

      • if we built bridges, our profession would be fired

      • and the same mistakes are repeated over and over (excessive ambition, rolling specs, bad design, failure to load a large data set early)

    What To Do?

    • We typically don’t teach this stuff (and do a serious disservice to our students)

      • probably because we don’t (can’t) spend any time in industry to figure it out

    Action item: at the very least read a couple of Robert L. Glass’s books

    The Depressing State of Affairs

    • Hardware “half-life” is 18 months

    • Software half-life is 18 years (or more)!

    • In 25 years we moved from

      • C to Java

      • SQL to XQuery

    What To Do?

    • Much higher level design environments

      • vis

      • workflow

      • special purpose languages (report writers,…)

    • And stop turning down papers on this stuff

    Grand Challenge

    • Improve application productivity (probability of success * programmer productivity) by a factor of 2 this decade