Data on the Inside versus Data on the Outside. Developed and Written By: Pat Helland Architect Architecture Strategy Team Developer and Platform Evangelism Microsoft Corporation. Magic Quadrant for Web-Services Enabled Software, 3Q04. Source: Gartner Research (September 2004).
Data on the Inside versus Data on the Outside. Developed and Written By: Pat Helland, Architect, Architecture Strategy Team, Developer and Platform Evangelism, Microsoft Corporation. Session Delivered By: Lars Laakes, Architect Specialist, Financial Services, Microsoft Canada, llaakes@microsoft.com. Magic Quadrant for Web-Services Enabled Software, 3Q04. Source: Gartner Research (September 2004).
What Is Hubble? • Two Names for Architectural Thoughts: • Metropolis • Looking at the History of Urban Development • Explaining Service Oriented Architecture Trends • Predicting Upcoming Changes • Three Talks So Far; More Planned • Hubble • Hard-Core Architectural Discussion of Issues Related to Service Oriented Architectures • Many New Talks to Come… (Services, Data, Pub-Sub, etc) Why “Hubble”? Edwin Hubble (1889-1953) showed the universe is expanding. As CPUs get faster, latency for communications is not dropping much. While a message is moving, more CPU work is lost next year than this. “Computing is like Hubble’s Universe; everything is getting farther away from everything else” -- Pat Helland (circa 1992)
Outline • Introduction: The Shift Towards Services • Behavior: Encapsulation and Trust • Data: Then and Now • Outside Data: Reference Data • Outside Data: Sending Messages • Outside Data: XML and Schema • Outside Data: Commonality of Schema and Understanding • Inside Data • Inside/Outside: Representations of Data • Inside/Outside: Tying It All Together • Conclusion
Service-Oriented Architecture • Service-Orientation • Independent Services • Chunks of Code and Data • Interconnected via Messaging • Four Basic Tenets: • Boundaries Are Explicit • Services Are Autonomous • Services Share Schema and Contract, Not Implementation • Service Compatibility Is Based on Policy [Diagram labels: Service; Policy; Schema and Contract]
Services Communicate With Messages • Services Communicate with Messages • Nothing Else • No Other Knowledge about Partner • May Be Heterogeneous [Diagram labels: Service-A; Service-B]
Data Inside and Outside Services • Data Is Different Inside from Outside • Outside the Service • Passed in Messages • Understood by Sender and Receiver • Independent Schema Definition Important • Extensibility Important • Inside the Service • Private to Service • Encapsulated by Service Code [Diagram labels: Data; MSG; MSG; SQL; Data Outside the Service; Data Inside the Service]
Bounding Trust via Encapsulation • Services Only Do Limited Things for Their Partners • This Is How They Bound Their Trust • Encapsulation Is About Bounding Trust • Business Logic Ensures Only the Desired Operations Happen • No Changes to the Data Occur Except Through Locally Controlled Business Logic! [Diagram: Service, labeled “Things I’ll Do for Outsiders”: Deposit; Withdrawal; Transfer; Account Balance Check]
Encapsulating Both Change and Reads • Encapsulating Change • Ensures Integrity of the Service’s Work • Ensures Integrity of the Service’s Data • Encapsulating Exported Data for Read • Ensures Privacy by Controlling What’s Exported • Allows Planning for Loose Coupling and Expirations • E.g. Wednesday’s Price-List [Diagram labels: Business Request; Private Internal Data; Data; Sanitized Data for Export; Exported Data]
Trust and Transactions • Some Propose Atomic Transactions Across Services • E.g. WS-Transactions • Requires Holding Locks • Lots of Trust in Timely Unlock • Doesn’t Sound Autonomous and Independent to me… • Debate Is the Definition of the Word Service • Requires Autonomy and Independence? • Allows Intimacy across Service Boundaries? • There Will Be Code Connected by 2-Phase Commit • Same Service or in Different Services? • For This Talk, I Presume No Cross-Service Txs • Simply the Definition of the Word “Service”
Interconnecting with Independent Services • Services Are Connected by Messaging • The Only Interaction Between Two Services Is by the Messages that They Exchange • Schema: The Formats of the Individual Messages • Contract: The Allowable Sequences of Messages [Diagram labels: Service Contract; Tentative Place-Order; One Of: Accept-Order, Reject-Order; One Of: Confirm Place-Order, Cancel Place-Order]
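A contract as "allowable sequences of messages" can be sketched as a small state machine. The sketch below assumes one reading of the slide's diagram (a Tentative Place-Order, answered by either Accept-Order or Reject-Order, and an accepted order then either confirmed or cancelled). The state names and the `conforms` helper are hypothetical and are not part of the talk.

```python
# Allowed next messages for each contract state.
# State names ("start", "tentative", ...) are illustrative assumptions.
CONTRACT = {
    "start": {"Tentative-Place-Order": "tentative"},
    "tentative": {"Accept-Order": "accepted", "Reject-Order": "done"},
    "accepted": {"Confirm-Place-Order": "done", "Cancel-Place-Order": "done"},
    "done": {},
}


def conforms(messages):
    """Return True if the sequence is a complete conversation the contract allows."""
    state = "start"
    for name in messages:
        allowed = CONTRACT[state]
        if name not in allowed:
            return False  # message not permitted at this point in the sequence
        state = allowed[name]
    return state == "done"
```

The schema would describe what each individual message looks like; the table above only captures the ordering, which is the part the contract adds.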
Operators and Operands • Messages Contain Operators • Requests a Business Operation • Operators Provide Business Semantics • Part of the Contract between the Two Services • Operator Messages Contain Operands • Details Needed To Do the Business Operation • The Sending Service Must Put Them into the Message [Diagram labels: Service; Deposit; Operator; Operands]
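To make the operator/operand split concrete, here is a minimal sketch of a service that accepts only a fixed set of operators and keeps its data private. `Message`, `BankService`, and the operator names are hypothetical illustrations that borrow the Deposit example from the diagram; this is not code from the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    operator: str   # name of the requested business operation
    operands: dict  # details the sending service must supply


class BankService:
    """Hypothetical service: balances are private; only operators change them."""

    def __init__(self):
        self._balances = {}  # inside data, never handed out directly

    def handle(self, msg):
        handlers = {
            "Deposit": self._deposit,
            "Withdrawal": self._withdraw,
            "AccountBalanceCheck": self._balance,
        }
        if msg.operator not in handlers:
            return {"status": "rejected", "reason": "unknown operator"}
        return handlers[msg.operator](**msg.operands)

    def _deposit(self, account, amount):
        if amount <= 0:
            return {"status": "rejected", "reason": "amount must be positive"}
        self._balances[account] = self._balances.get(account, 0) + amount
        return {"status": "ok", "balance": self._balances[account]}

    def _withdraw(self, account, amount):
        if amount <= 0 or self._balances.get(account, 0) < amount:
            return {"status": "rejected", "reason": "invalid amount or insufficient funds"}
        self._balances[account] -= amount
        return {"status": "ok", "balance": self._balances[account]}

    def _balance(self, account):
        return {"status": "ok", "balance": self._balances.get(account, 0)}
```

A partner can ask for a deposit, but it has no operator for writing a balance directly; the service's own business logic decides what happens to the data.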
Where Do Operands Come From? • Operands Come from Reference Data • New Kind of Data in SOA • Except It’s Not New; We’ve Done Variations of SOA for Decades… • We’re Just Getting Better at It! • Catalogues • Reference Data is Versioned and each Version Is Immutable • Immutable Images Are Shared Across Many Services • We Will Talk About the Creation, Publication, and Management of Reference Data
Transactions and Inside Data • Transactions Make You Feel Alone • No One Else Manipulates the Data When You Are • Transactional Serializability • The Behavior Is As If a Serial Order Exists
Life in the “Now” • Transactions Live in the “Now” Inside Services • Time Marches Forward • Transactions Commit • Advancing Time • Transactions See the Committed Transactions • A Service’s Biz-Logic Lives in the “Now”
Sending Unlocked Data Isn’t “Now” • Messages Contain Unlocked Data • Assume No Shared Transactions • Unlocked Data May Change • Unlocking It Allows Change • Messages Are Not From the “Now” • They Are From the Past • There Is No Simultaneity At a Distance! • Similar to Speed of Light • Knowledge Travels at Speed of Light • By the Time You See a Distant Object It May Have Changed! • By the Time You See a Message, the Data May Have Changed! • Services, Transactions, and Locks Bound Simultaneity! • Inside a Transaction, Things Appear Simultaneous (to Others) • Simultaneity Only Inside a Transaction! • Simultaneity Only Inside a Service!
Outside Data: a Blast from the Past • All Data From Distant Stars Is From the Past • 10 Light Years Away; 10 Year Old Knowledge • The Sun May Have Blown Up 5 Minutes Ago • We Won’t Know for 3 Minutes More… • All Data Seen From a Distant Service Is From the “Past” • By the Time You See It, It Has Been Unlocked and May Change • Each Service Has Its Own Perspective • Inside Data Is “Now”; Outside Data Is “Past” • My Inside Is Not Your Inside; My Outside Is Not Your Outside • Going to SOA Is Like Going From Newtonian to Einsteinian Physics • Newton’s Time Marched Forward Uniformly • Instant Knowledge • Before SOA, Distributed Computing Made Many Systems Look Like One • RPC, 2-Phase Commit, Remote Method Calls… • In Einstein’s World, Everything Is “Relative” To One’s Perspective • SOA Has “Now” Inside and the “Past” Arriving in Messages
Versioned Images of a Single Source • A Sequence of Versions Describing Changes to Data • Updates From One Service • Owner Controlled • Owner Changes the Data • Sends Changes as Messages • Data Is Seen As Advancing Versions
Operators: Hope for the Future • Messages May Contain Operators • Requests for Business Functionality, Part of the Contract • Service-B Sends an Operator to Service-A • If Service-A Accepts the Operator, It Is Part of Its Future • It Changes the State of Service-A • Service-B Is Hopeful • It Wants Service-A To Do the Work • When It Receives a Reply, Its Future Is Changed!
Operands: Past and Future • Operands May Live in the Past • Values Published As Reference Data • Come From Service-A’s Past • Operands May Live in the Future • They May Contain a Proposed Value Submitted to Service-A
Between Services: Life in the “Then” • Everything Between Services Lives in the Past or Future • Operators Live in the Future • Operands Live in the Past or the Future • It’s Not Meaningful to Speak of “Now” Between Services • No Shared Transactions, No Simultaneity • Life in the “Then” • Past or Future • Not Now • Each Service Has a Separate “Now” • Different Temporal Environments!
Services: Dealing with “Now” and “Then” • Services Make the “Now” Meet the “Then” • Each Service Lives in Its Own “Now” • Messages Come and Go Dealing with the “Then” • The Business-Logic of the Service Must Reconcile This!! • Example: Accepting an Order • A Biz Publishes Daily Prices • Probably Want to Accept Yesterday’s Prices for a While • Tolerance for Time Differences Must Be Programmed • Example: “Usually Ships in 24 Hours” • Order Processing Has Old Info • Available Inventory Not Accurate • Deliberately “Fuzzy” • Allows Both Sides to Cope with Difference in Time Domains! • The World Is No Longer Flat! • SOA Is Recognizing That There Is More Than One Computer • Multiple Machines Mean Multiple Time Domains • Multiple Time Domains Mandate We Cope with Ambiguity to Allow Coexistence, Cooperation, and Joint Work
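The "accept yesterday's prices for a while" example can be sketched as a programmed tolerance window. The price lists, the SKU name, and the `accept_order` function below are hypothetical; prices are kept in integer cents to avoid floating-point comparisons.

```python
from datetime import date, timedelta

# Hypothetical daily price lists published by the business (prices in cents).
PRICE_LISTS = {
    date(2004, 9, 1): {"widget": 1000},
    date(2004, 9, 2): {"widget": 1100},
    date(2004, 9, 3): {"widget": 1200},
}


def accept_order(sku, quoted_price, today, tolerance_days=1):
    """Accept the quoted price if it matches any price list still tolerated."""
    for age in range(tolerance_days + 1):
        prices = PRICE_LISTS.get(today - timedelta(days=age))
        if prices is not None and prices.get(sku) == quoted_price:
            return True
    return False
```

With `tolerance_days=1`, an order placed on September 3 that quotes the September 2 price is accepted, while one quoting the September 1 price is not. How wide to make the window is a business decision, which is the slide's point about programming the tolerance.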
What Is Reference Data? • Reference Data Is Published Across Service Boundaries • For Each Collection of Reference Data: • One Service Creates and Publishes the Data • Other Services Receive Periodic Versions of the Data [Diagram: Purposes for Reference Data: Operands; Historic Artifacts; Shared Collections of Data]
Reference Data: Operands for the Operators • As Discussed Above, Messages Across Services Invoke Business Operations… • Each Service-to-Service Message Is an Operator • Each Operator Message Is Filled with Operands • Parameters, Options, Customer-Id, Parts-Being-Ordered, etc. • The Data for These Operands Is Published as Reference Data [Diagram labels: Service; Deposit; Operator; Operands]
Reference Data: Historic Artifacts • Historic Artifacts Report on What Happened in the Past • Sometimes These Snapshots Need to Be Sent to Other Services • Examples: • Sales Quarterly Results • Monthly Bank Statements • Any and All Monthly Bills • Well… • Both Requests for Payment (Operations) and the Historic Artifact of How Much Power You Used… • Inventory Status at End of Quarter [Diagram labels: Service (Bank); Bank Statement Jan-2004]
Reference Data: “Shared Collections of Data” • Many Services May Need Access to the Same Data • The Data Is Changing… • Someone Owns Updating and Distributing the Data… • Examples: • Customer Database • Employee Database • Parts Database and Price-List [Diagram labels: HR Service; Sales Service; Authoritative Customer Data; Authoritative Employee Data, Vers #23 and Vers #24; Ref Vers #23 of Employee Data; Ref Vers #24 of Employee Data; Update Employees; Update!]
Publishing Versioned Reference Data • The Owner of Data Periodically Publishes • Using Whatever Messaging Technique It Wants • Publications Are Always Versioned • The Version Numbers Increase [Diagram labels: Service-A; Service-B; steps 1 to 4; A’s-Data Vers-X, Vers-Y, Vers-Z; Request Uses: Vers-Z]
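A minimal sketch of an owner that publishes versioned reference data, assuming integer version numbers and in-memory storage. `ReferenceDataOwner` is a hypothetical name, and `MappingProxyType` only makes the top-level mapping read-only (nested values would need their own protection).

```python
from types import MappingProxyType


class ReferenceDataOwner:
    """Hypothetical owner that publishes immutable, numbered versions of its data."""

    def __init__(self):
        self._versions = {}  # version number -> read-only snapshot
        self._latest = 0

    def publish(self, data):
        """Store a read-only copy of the data under the next version number."""
        self._latest += 1
        self._versions[self._latest] = MappingProxyType(dict(data))
        return self._latest

    def get(self, version):
        """Return the snapshot for a version; the same version always yields the same data."""
        return self._versions[version]
```

A receiving service can then state which version it used ("Request Uses: Vers-Z" in the diagram) so the owner knows what the request was based on.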
Business Operations May Request Changes • If a Non-Owner Wants a Change It Must Do a Biz-Operation • This Is a Request Sent to the Owning Service • The Owning Service May Agree to the Operation, Causing Change to the Data in Question • If It Changes, This Affects the Next Version [Diagram labels: Owning Service-A; Service-B; steps 1 to 3; A’s-Data Vers-X, Vers-Y; Request Uses: Vers-X; Please Make Data Change]
Optimistic Concurrency Control: Anti-Encapsulation • What Is Optimistic Concurrency Control? • Data Is Read • Changes Are Made and Submitted to the Data’s Owner • If the Original Data Hasn’t Changed, the New Changes Are Applied • This Assumes the Remote System Should Be Able to Write Directly on the Data • This Is a Trusting Relationship… Not Autonomous! • Autonomy and Updates to Data • Autonomy Means Independent Control • My Local Biz-Logic Decides How My Data Changes! • If You Want a Change, Ask Me To Do a Business Op • It’s My Data…I’ll Decide How It Changes!
Example: Updating the Customer’s Address • What About a Salesperson Updating a Customer’s Address? • Shouldn’t That Just Be Optimistic Concurrency Control? • No! It Should Invoke Business Logic with a Request! • Not All Fields of the Customer Record Should Be Updated by Sales People • Requests Across Service Boundaries Invoke Business Logic when the Customer Address Is Changed
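The address example can be sketched as a business operation that the owning service is free to decline. The role names, the `CHANGEABLE_FIELDS` table, and `CustomerService` are assumptions made for illustration; the talk does not prescribe which fields a salesperson may change.

```python
# Which fields each requesting role may ask to change (illustrative assumption).
CHANGEABLE_FIELDS = {
    "sales": {"address"},
    "credit-department": {"address", "credit_rating"},
}


class CustomerService:
    """Hypothetical owner of customer data; every change goes through its logic."""

    def __init__(self):
        self._customers = {"C1": {"address": "1 Main St", "credit_rating": "A"}}
        self.version = 1  # version of the reference data published so far

    def request_change(self, role, customer_id, field, value):
        """Handle a change request; return True only if the owner agrees to it."""
        allowed = CHANGEABLE_FIELDS.get(role, set())
        if customer_id not in self._customers or field not in allowed:
            return False  # the owner declines the business operation
        self._customers[customer_id][field] = value
        self.version += 1  # the change shows up in the next published version
        return True

    def snapshot(self):
        """Return a copy for publication; callers never receive the live records."""
        return self.version, {cid: dict(rec) for cid, rec in self._customers.items()}
```

Unlike optimistic concurrency control, the requester never writes the record. It asks, and the owner's business logic decides.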
Immutable And/Or Versioned Data • Windows NT4, SP1 • The Same Set of Bits Every Time • Data May Be Immutable • Once Written, It Is Unchangeable • Immutable Data Needs an ID • From the ID, Comes the Same Data • No Matter When, No Matter Where • Versions Are Immutable • Each New Version Is Identified • Given the Identifier, the Same Data Comes • Version Independent Identifiers • Let You Ask for a Recent Version • Recent NY Times • Maybe Today’s, Maybe Yesterday’s • New York Times; 7/3/03 • Specific Version of the Paper -- Contents Don’t Change • Latest SP of NT4 • Definitely NT4, Results Vary Over Time [Diagram label: Version Independent]
Immutability of Messages • Retries Are a Fact of Life • Zero-or-More Delivery Semantics • Messages Must Be Immutable • Retries Must Not See Differences… • Once It’s Sent, You Can’t Un-send! [Diagram labels: Service-A; “Once It’s Outside, It’s Immutable!”]
To Cache Or Not To Cache • OK to Cache Immutable Data • It’s Never Wrong • Never Have to Invalidate! • Caching Should Only Be Used for Immutable Data • Caching Data that Changes May Lead to Anomalies • Consider Caching Data Labeled with a Version Dependent ID • Because Versions Are Immutable It Will Work • Store the Mapping from Version Independent to Version Dependent in an Accurate Location
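One way to read the caching advice is sketched below: cache only under version-dependent keys, and resolve a version-independent name through an authoritative lookup on every request. `VersionedCache`, `fetch`, and `resolve_latest` are hypothetical names supplied by the caller and are not part of the talk.

```python
class VersionedCache:
    """Cache keyed by (name, version); the name-to-version mapping is never cached."""

    def __init__(self, fetch, resolve_latest):
        self._fetch = fetch                    # (name, version) -> immutable data
        self._resolve_latest = resolve_latest  # name -> current version (authoritative)
        self._cache = {}
        self.fetches = 0                       # counts real fetches, for illustration

    def get(self, name, version=None):
        if version is None:
            # Version-independent request: always ask the authoritative source.
            version = self._resolve_latest(name)
        key = (name, version)
        if key not in self._cache:
            self._cache[key] = self._fetch(name, version)
            self.fetches += 1
        return self._cache[key]
```

Entries never need invalidating because each key names an immutable version; only the small "which version is current" lookup has to stay accurate.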
Normalization And Immutable Data • Databases Design for Normalized Data • Can Be Changed Without “Funny Behavior” • Each Data Item Lives in One Place • Sometimes Data Should Be De-Normalized • If Data Is Immutable It’s OK • De-normalization Is OK If You Aren’t Going to Update! [Figure: de-normalized employee table. Columns and their values: Emp # (91, 18, 66, 47); Emp Name (Joe, Mary, Pete, Sally); Emp Phone (5-1234, 5-7349, 3-3123, 2-1112); Mgr # (13, 13, 02, 38); Mgr Name (Betty, Harry, Sam, Sam); Mgr Phone (5-6782, 6-9876, 6-9876, 4-0101). Callout: classic problem with de-normalization; can’t update Sam’s phone # since there are many copies.]
Stability Of Data • Immutability Isn’t Enough! • We Need a Common Understanding • President Bush 1990 vs. President Bush 2004 • Stable Data Has a Clearly Understood Meaning • The Schema Must Be Clearly Understood • The Interpretation of Values Must Be Unambiguous • Suggestion • Timestamping or Versioning Makes Stable Data • Observation • A Monthly Bank Statement Is Stable Data • Advice • Don’t Recycle Customer-IDs • Observation • Anything Called “Current” Is Not Stable
A Few Thoughts on Stable Data • Outside Data Must Be Stable • Consistent Interpretation Across Valid Spaces and Times • Inside Data May Be Stable • Notably, When It Is the Same Data as Outside Data… • Sometimes Data Inside Is Not Stable • Classic Normalization for Vibrant Update • Needs to Be Cast Into a Stable Shape to Send Outside
Validity Of Data In Bounded Space And Time • Bounding the Valid Times • It May Have an Expiration • Bounding the Valid Locations • Restrictions on Where the Data Is Valid • When Valid, the Data Should Be: • Immutable (the ID Yields the Same Bits) • Stable (the Meaning Is Clear) [Examples shown: Price-List Valid Until Dec 31st; Data Valid For Service-X Only; “Offer Good Until Next Tuesday”; “Offer Good to Washington State Residents Only”]
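Bounded validity can be sketched as data that carries its own expiration and audience. `BoundedData` and its field names are hypothetical; `None` is used here to mean "no bound", matching the idea that data may also be valid always and everywhere.

```python
from dataclasses import dataclass
from datetime import date
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class BoundedData:
    data_id: str                                # the ID always yields the same payload
    payload: str
    valid_until: Optional[date] = None          # None: never expires
    valid_for: Optional[FrozenSet[str]] = None  # None: valid everywhere

    def is_valid(self, on, where):
        """Check both the time bound and the location bound."""
        if self.valid_until is not None and on > self.valid_until:
            return False
        if self.valid_for is not None and where not in self.valid_for:
            return False
        return True
```

For example, a price list valid until December 31st for Service-X only would be constructed with `valid_until=date(2004, 12, 31)` and `valid_for=frozenset({"Service-X"})`.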
Rules For Sending Data In Messages • Identify the Message: Put Unique ID in All Messages; Part of the Unique ID May Be a Version… • Immutable Data: Don’t Change the Data Associated with the Unique ID; Never Return Different Bits • OK to Cache: The Same Bits Will Always Be Returned • Define Valid Ranges: Valid for a Certain Time Period and Over Some Space; OK to Always Be Valid • Must Be Stable: Must Ensure There Is Never Any Confusion About the Meaning (Within Valid Range)
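The first two rules (unique IDs, never return different bits) are what make retries safe. A minimal sketch, assuming the receiver remembers the reply it produced for each message ID; `Receiver` and its running total are hypothetical stand-ins for real business work.

```python
class Receiver:
    """Hypothetical receiver that tolerates retried messages."""

    def __init__(self):
        self._replies = {}  # message ID -> reply already produced
        self.total = 0      # stand-in for the service's inside data

    def handle(self, message_id, amount):
        if message_id in self._replies:
            # A retry: same ID, so return the same reply and do no new work.
            return self._replies[message_id]
        self.total += amount
        reply = {"message_id": message_id, "total": self.total}
        self._replies[message_id] = reply
        return reply
```

This only works if the sender never reuses an ID for different content, which is the immutability rule seen from the receiving side.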
SQL, DDL, and Serializability • SQL’s DDL (Data Definition Language) is Transactional • Changes Are Made Using Transactions • The Structure of the Data May Be Changed • The Interpretation After the DDL Change Is Different • DDL Lives Within the Time Scope of the Database • The Database’s Shape Evolves Over Time • DDL Is the Change Agent for This Evolution • SQL Lives in the “Now” • Each Transaction’s Execution Is Meaningful Only Within the Schema Definition at the Moment of Its Execution • Serializability Makes This Crisp and Well-Defined
Message Schema and Immutable Messages • When a Message Is Sent, It Must Be Immutable • It Is Crossing Temporal Boundaries • Retries Mustn’t Give Different Results • The Message’s Schema Must Be Immutable • It Makes a Mess If the Interpretation of the Message Changes [Diagram labels: Service-A; Message; Schema; Immutable Message; Immutable Schema for the Message]
Immutable Schema and Its Identifiers • Immutable Schema Needs an Identifier • It Must Be Possible to Unambiguously Identify the Schema • This Must Occur Across the Namespaces of Sender and Receiver • The Schema Definition Must Never Change • Given the Identifier, the Same Schema Is Returned • URIs (Uniform Resource Identifiers) Work Well • Guaranteed to Be Unique • If You Follow the Rules • URLs (Uniform Resource Locators) Are Cool • URLs Are URIs • Also Tell You a Location To Get the Stuff (e.g. the Schema)
Composition of Schema as a DAG • Schema May Contain Sub-Schema • Inside the Message Are Chunks of Data • A Purchase-Order May Contain Customer Information • They Have Their Own Definitions • The Sub-Schema Are Referenced by Identifier • This Leads to a Tree of References to Immutable Schema • It’s Really a DAG (Directed Acyclic Graph) • Sometimes, Different Sub-Schema Reference the Same Stuff [Diagram: Purchase Order (Customer, Delivery Addr, SKUs); Customer (Name, Address, Credit Rating); Address (Number/Street, City/State, Postal Code, Country); SKU (Part, Color, Size)]
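The DAG of schema references can be sketched as a mapping from each schema identifier to the sub-schema it references. The `urn:example:` identifiers are invented for illustration, and the reference structure follows one reading of the diagram, in which both Purchase Order (through its delivery address) and Customer reference Address.

```python
# Hypothetical schema identifiers mapped to the sub-schema they reference.
SCHEMA_REFS = {
    "urn:example:purchase-order:1": [
        "urn:example:customer:1",
        "urn:example:address:1",  # delivery address
        "urn:example:sku:1",
    ],
    "urn:example:customer:1": ["urn:example:address:1"],
    "urn:example:address:1": [],
    "urn:example:sku:1": [],
}


def required_schemas(root, refs=SCHEMA_REFS):
    """Return every schema reachable from root, each listed exactly once."""
    seen = []

    def visit(schema_id):
        if schema_id in seen:
            return  # shared sub-schema: already collected via another path
        seen.append(schema_id)
        for child in refs[schema_id]:
            visit(child)

    visit(root)
    return seen
```

Address is reached by two paths but collected once, which is what distinguishes a DAG from a plain tree of references.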
Versioning and Schema • Frequently, Schema Is Versioned • A New Format of the Schema Is Created • It Is Given a New Identifier • Version Independent Schema Identifiers • Specify a Set of Versions for a Type of Schema • The Set May Evolve Over Time • Version Dependent Schema Identifiers • Specify a Specific Version of a Specific Schema • The Version-Dependent Schema Is Immutable • Messages Always Specify a Version-Dependent Schema • This Ensures No Ambiguity
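A small sketch of the two kinds of schema identifier, assuming a registry that maps a version-independent name to its ordered list of version-dependent identifiers. `SCHEMA_VERSIONS`, `latest_version`, and `make_message` are hypothetical names, and the `purchase-order/v1` style of identifier is only an illustration.

```python
# Version-independent name -> its version-dependent identifiers, oldest first.
SCHEMA_VERSIONS = {
    "purchase-order": ["purchase-order/v1", "purchase-order/v2"],
}


def latest_version(name):
    """Resolve a version-independent name to the newest version-dependent ID."""
    return SCHEMA_VERSIONS[name][-1]


def make_message(name, body):
    """Stamp the message with a version-dependent schema ID at send time."""
    return {"schema": latest_version(name), "body": body}
```

Because the message records the specific version, publishing a later version of the schema does not change how an already-sent message is interpreted.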
Extensibility and Schema • Extensibility Is the Addition of Non-Schema Specified Information Into the Message • The Schema Does Not Specify the Additional Stuff • The Sender Wanted to Add It Anyway • Adding Extensions Is Like Scribbling in the Margins • Sometimes Adding Notes to a Form Helps! • Sometimes It Does No Good at All! [Diagram labels: Service-A; Purchase Order (Customer, Delivery Addr, SKUs); the same Purchase Order with the added note “Don’t Deliver in Morning”]
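A receiver that tolerates extensions might separate what the schema specifies from what was "scribbled in the margins". In this sketch the field names and the `split_extensions` helper are hypothetical; whether the receiver acts on an extension or ignores it is left open, as on the slide.

```python
# Fields the (hypothetical) purchase-order schema specifies.
PURCHASE_ORDER_FIELDS = {"customer", "delivery_addr", "skus"}


def split_extensions(message):
    """Separate schema-specified fields from sender-added extensions."""
    known = {k: v for k, v in message.items() if k in PURCHASE_ORDER_FIELDS}
    extensions = {k: v for k, v in message.items() if k not in PURCHASE_ORDER_FIELDS}
    missing = PURCHASE_ORDER_FIELDS - known.keys()
    if missing:
        raise ValueError(f"missing schema fields: {sorted(missing)}")
    return known, extensions
```

The schema-specified part is still checked strictly; only the extra fields are optional for the receiver to understand.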
Infosets, XML-Schema, And PSVI • XML-Infoset • Semantics of XML, Not Syntax • Tree: Parents, Children, Elements, & Attributes • Allows (Encourages) Schema • Any Representation OK • XML-Schema • Datatype Library and Schema Definition • Composed Schema Uniquely Identified (URI) • PSVI: Post-Schema-Validation Infoset • Infoset After Validation Against Schema • Can Leverage Schema Knowledge