Reliable Distributed Systems

Web Services

The slides are adapted from those of Prof. Ken Birman.

Motivational questions
  • Why do we need web services?
Today
  • Web Services – Introduction
  • “Remote Procedure Call” in WS
    • Binding, Marshalling…
  • Using TCP as the transport for RPCs
    • Connectivity Issues: NAT, Firewall
What are Web Services?
  • Today, we normally use Web browsers to talk to Web sites
    • Browser names document via URL (lots of fun and games can happen here)
    • Request and reply encoded in HTML, using HTTP to issue request to the site
  • Web Services generalize this model so that computers can talk to computers
What are Web Services?

[Figure labels: Client System, SOAP Router, Backend Processes, Web Service]

  • SOAP: Simple Object Access Protocol

What are Web Services?
  • “Web Services are software components described via WSDL which are capable of being accessed via standard network protocols such as SOAP over HTTP.”

  • WSDL: Web Service Description Language

  • Today, SOAP is the primary standard. SOAP provides rules for encoding the request and its arguments.

  • Similarly, the architecture doesn’t assume that all access will employ HTTP over TCP. In fact, .NET uses Web Services “internally” even on a single machine, but in that case communication is over COM.

  • WSDL documents are used to drive object assembly, code generation, and other development tools.

Web Services are often Front Ends

[Figure: a Client Platform (Web Service invoker; COM App, C# App, CORBA App) connected by SOAP messaging to a Server Platform (WSDL-described Web Service; Web Server, e.g., IBM WebSphere or BEA WebLogic; Web App Server; SAP; DB2 server)]
The Web Services “stack”
  • Business Processes: BPEL4WS (IBM only, for now)
  • Quality of Service: Reliable Messaging, Security, Transactions, Coordination
  • Description: WSDL, UDDI, Inspection
  • Messaging: SOAP, Other Protocols; XML, Encoding
  • Transport: TCP/IP or other network transport protocols

Terminology (http://en.wikipedia.org/wiki/XML)
  • XML: The Extensible Markup Language (XML) is a general-purpose markup language. It is classified as an extensible language because it allows its users to define their own tags. Its primary purpose is to facilitate the sharing of structured data across different information systems, particularly via the Internet.
Terminology (http://www.webopedia.com/TERM/S/servlet.html)
  • Servlet: A small program that runs on a server. The term usually refers to a Java applet that runs within a Web server environment. This is analogous to a Java applet that runs within a Web browser environment. Java servlets are becoming increasingly popular as an alternative to CGI programs.
  • The biggest difference between the two is that a Java servlet is persistent. This means that once it is started, it stays in memory and can fulfill multiple requests. In contrast, a CGI program disappears once it has fulfilled a request. The persistence of Java servlets makes them faster because there's no wasted time in setting up and tearing down the process.
What are Web Services?
  • Amazon would hand out “serverlets” for 3rd party developers to use
  • This connects their applications directly to Amazon’s system

[Figure labels: SOAP Router, Backend Processes, serverlet, Web Service]

Advantages of web services?*
  • Web services provide interoperability between various software applications running on various platforms.
    • “vendor, platform, and language agnostic”
  • Web services leverage open standards and protocols. Protocols and data formats are text based where possible
    • Easy for developers to understand what is going on.
  • By piggybacking on HTTP, web services can work through many common firewall security measures without requiring changes to their filtering rules.

*: From Wikipedia

How Web Services work
  • First the client discovers the service.
    • More in next lecture!
  • Typically, client then binds to the server.
    • By setting up a TCP connection to the discovered address.
    • But binding is not always needed.
How it works…
  • Next, build the SOAP request (marshalling):
    • Fill in what service is needed, and the arguments. Send it to the server side.
    • XML is the standard for encoding the data (but is very verbose and results in HUGE overheads)
  • The SOAP router routes the request to the appropriate server (assuming more than one available server)
    • Can do load balancing here.
How it works…
  • Server unpacks the request (demarshalling), handles it, and computes the result.
  • Result sent back in the reverse direction: from the server to the SOAP router back to the client.
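The marshalling and demarshalling steps can be sketched roughly as follows. This is only an illustration using Python's standard library: the service namespace `urn:example:temperature`, the operation `GetTemperature`, and the helper names `marshal_request` and `demarshal_request` are all hypothetical, and a real toolkit would generate this code from the WSDL rather than have it written by hand.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SVC_NS = "urn:example:temperature"  # hypothetical service namespace


def marshal_request(operation, args):
    """Client side: encode the operation name and arguments as SOAP XML bytes."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{SVC_NS}}}{operation}")
    for name, value in args.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="utf-8")


def demarshal_request(data):
    """Server side: recover (operation, args) from the bytes on the wire."""
    envelope = ET.fromstring(data)
    body = envelope.find(f"{{{SOAP_NS}}}Body")
    call = body[0]  # first element inside the Body is the call
    operation = call.tag.split("}", 1)[1]  # strip the "{namespace}" prefix
    args = {child.tag: child.text for child in call}
    return operation, args


request = marshal_request("GetTemperature", {"city": "Ithaca", "unit": "C"})
operation, args = demarshal_request(request)
print(len(request), "bytes on the wire for", operation, args)
```

Note that every argument comes back as a string, and that the envelope is many times larger than the few bytes of actual payload, which is the overhead the slide complains about.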
Marshalling Issues
  • Data exchanged between client and server needs to be in a platform independent format.
    • Endianness differs between machines.
    • Data alignment issue (16/32/64 bits)
    • Multiple floating point representations.
    • Pointers
    • (Have to support legacy systems too)
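To make the byte-order and alignment issues concrete, here is a small sketch using Python's `struct` module. The value and the choice of network byte order as the agreed wire format are illustrative assumptions, not something the slides prescribe.

```python
import struct

value = 0x01020304

little = struct.pack("<I", value)  # bytes as a little-endian machine lays them out
big = struct.pack(">I", value)     # bytes as a big-endian machine lays them out

# A platform-independent format fixes the byte order and the padding.
# "!" means network (big-endian) order with no alignment padding.
wire = struct.pack("!Id", value, 1.5)
decoded_int, decoded_float = struct.unpack("!Id", wire)

# What goes wrong without an agreed format: big-endian bytes read as little-endian.
misread = struct.unpack("<I", big)[0]

print(little.hex(), big.hex(), len(wire), hex(misread))
```

Pointers have no counterpart here: an address is meaningless on another machine, so the data it points to has to be copied into the message instead.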
Discovery
  • This is the problem of finding the “right” service
    • In our example, we saw one way to do it – with a URL
    • Web Services community favors what they call a URN: Uniform Resource Name
  • But the more general approach is to use an intermediary: a discovery service
Repository summary
  • A database listing servers
  • Each is described using the UDDI (Universal Description, Discovery and Integration) language, which is defined over XML
    • Hence can be searched with XML queries
  • An extensible standard
    • Defines some required information about interfaces available and argument types, etc
    • But services can provide extra information too.
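As a rough picture of "a database listing servers that can be searched with XML queries", the sketch below searches a tiny XML repository. The element and attribute names are made up for illustration and are not the real UDDI schema; the URLs and the `find_endpoints` helper are likewise hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical repository contents; NOT the actual UDDI data model.
REPOSITORY = """
<repository>
  <service name="temperature" category="weather">
    <endpoint>http://ws.example.org/temperature</endpoint>
    <wsdl>http://ws.example.org/temperature?wsdl</wsdl>
  </service>
  <service name="maps" category="geo">
    <endpoint>http://ws.example.org/maps</endpoint>
    <wsdl>http://ws.example.org/maps?wsdl</wsdl>
    <note>extra, service-specific information</note>
  </service>
</repository>
"""


def find_endpoints(repository_xml, category):
    """Return the endpoints of all services registered under a category."""
    root = ET.fromstring(repository_xml)
    matches = root.findall(f"./service[@category='{category}']")
    return [svc.findtext("endpoint") for svc in matches]


weather_endpoints = find_endpoints(REPOSITORY, "weather")
print(weather_endpoints)
```

The second entry carries an extra `<note>` element that the query simply ignores, which is the sense in which the format is extensible.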
Roles?
  • UDDI is used to write down the information that becomes a “row” in the repository (“I have a temperature service…”)
  • WSDL documents the interfaces and data types used by the service
  • But this isn’t the whole story…
Discovery and naming
  • The topic raises some tough questions
    • Many settings, like the big data centers run by large corporations, have rather standard structure. Can we automate discovery?
    • How to debug if applications might sometimes bind to the wrong service?
    • Delegation and migration are very tricky
    • Should a system automatically launch services on demand?
Example: Why discovery is tricky
  • Client has opinions
    • “I want current map data for Disneyland showing line-lengths for the rides right now”
  • Service has opinions
    • Amazon.com would like requests from Ithaca to go to the NJ-3 datacenter, and if possible, to the same server instance within each clustered service
  • DNS has opinions
    • Many systems play with name -> IP bindings
  • Internet has opinions (routing)
So, what’s tricky?
  • Web Services doesn’t standardize these four steps; it just assumes that people will hack solutions
  • Hence some are hard to implement, we lack standards, and in some cases, solutions are poor ones
  • UDDI and WSDL are just a corner of the overall picture!
Network address translation…
  • Another issue: Often, the internal address is not addressable from outside!
    • A tiny bit of security.
    • But if RPC server is behind a NAT, trouble!
      • NAT needs the host behind it to start the connection process.
      • Need to configure NAT to let specified traffic through.
      • Generally, HTTP (and hence WS traffic) is let through.
    • Tough to set up a connection between two hosts that are both behind NATs.
      • There are some tricks to bypass this though.
Firewalls
  • These allow/disallow traffic, depending on source, destination, protocol used, etc.
    • Often only allow connection from the inside to the outside!
  • Stateful: remember active flows, and disallow unexpected packets (NAT)
    • Again, need to configure to ensure server traffic gets through. (General RPC)
    • Again, HTTP (and hence WS traffic) does not face as much of a restriction.
  • Get traffic statistics.
  • Spam/virus checking, etc.
  • NAT and firewall typically in the same box.
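The "stateful" behavior can be sketched as a table of active flows. This is a toy model under simplifying assumptions: the class name, the addresses, and the rule "anything outbound is allowed" are illustrative, and a real firewall or NAT also tracks protocol, timeouts, and connection state.

```python
class StatefulFirewall:
    """Toy model: remember flows opened from the inside, drop unexpected packets."""

    def __init__(self):
        self.active_flows = set()  # (inside endpoint, outside endpoint) pairs

    def outbound(self, inside, outside):
        """An inside host opens a connection: record the flow and let it through."""
        self.active_flows.add((inside, outside))
        return True

    def inbound(self, outside, inside):
        """Let a packet in only if it belongs to a flow started from the inside."""
        return (inside, outside) in self.active_flows


fw = StatefulFirewall()
client = ("10.0.0.5", 40000)     # host behind the firewall
web = ("203.0.113.7", 80)        # outside web server
stranger = ("198.51.100.9", 5555)

fw.outbound(client, web)
reply_ok = fw.inbound(web, client)             # reply on a known flow
unsolicited_ok = fw.inbound(stranger, client)  # nobody inside asked for this
print(reply_ok, unsolicited_ok)
```

This is also why an RPC server behind such a box is in trouble: its clients' first packets look exactly like the unsolicited case unless the box is explicitly configured to let them in.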
Demilitarized Zone (DMZ)
  • DMZ: used to host publicly accessible services like company webpages, FTP, DNS.
  • Good place to host the Web Service!
  • DMZ situated outside the private network.
  • No outgoing connections from DMZ.
  • If DMZ attacked, damage limited to DMZ.
Client talks to eStuff.com
  • Moving on… let’s oversimplify and just assume the client manages to find the data center
  • We think of remote method invocation and Web Services as a simple chain:

[Figure: Client system → SOAP router → Web Services, linked by SOAP RPC]

So… suppose we get in
  • Assuming we can connect to the data center (to its Web Services router), then what?
  • If you just use Visual Studio out of the box, you end up with a single-machine Web Server
  • But massive datacenters are common!
A glimpse inside eStuff.com

[Figure: “front-end applications” and six service blocks, each labeled with a load balancer (LB)]

  • Pub-sub combined with point-to-point communication technologies like TCP

Clusters and load balancing
  • Idea here is that some form of load balancer spreads work over a cluster
  • And cluster replicates data for availability and load management
  • How it does this is a topic we need to discuss in more detail (not today)
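Just to fix the idea of "spreading work over a cluster", here is the simplest possible policy, round-robin. This is an assumed stand-in, not the mechanism real data centers use (which the slides defer to a later lecture); the class and server names are hypothetical.

```python
import itertools
from collections import Counter


class RoundRobinBalancer:
    """Hand each incoming request to the next server in a fixed rotation."""

    def __init__(self, servers):
        self._rotation = itertools.cycle(servers)

    def route(self, request):
        # The request content is ignored: round-robin knows nothing about
        # load, affinity, or which server has cached data for this client.
        return next(self._rotation)


lb = RoundRobinBalancer(["svc-a", "svc-b", "svc-c"])
assignments = [lb.route(f"req-{i}") for i in range(6)]
load = Counter(assignments)
print(assignments, dict(load))
```

The comment in `route` points at the limitation: a policy this simple cannot send a returning client back to the server instance it used before.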
What about “legacy” applications?
  • Some of these Web services are really just front-ends to older legacy applications
    • So to talk to an old IBM database, we might
      • Run the database on some sort of machine, or virtual machine
      • Build one of these translator front-ends
      • And then register it with the Web Services router
  • This may sound expensive (it is) but it works!
  • Obviously, our fancy clustering and load-balancing won’t apply to a legacy application, so those fancy tricks are only for “new” code
Discovery in eStuff.com
  • Data centers are increasingly common
  • And they raise hard questions!
    • How can a data center in California control decisions a client is making in Ithaca?
    • Services are clustered. How should a client request be “routed” to the right member?
    • Once you start talking to a server it may cache data for you. How can you be sure to get the right one next time?
These are modern challenges
  • Web Services can be seen as evolving from prior work
  • Most often cited: CORBA, which also was used in many big data centers
  • But CORBA didn’t assume that clients came in over the public Internet
    • More often, CORBA was used between a hand-built client and the service it talks to
CORBA approach
  • CORBA had the following mechanisms:
    • Ways to export specialized client stubs
      • The client stub could include server provided decision logic, like “which data center to connect with”
      • Gives data center a form of remote control
    • Factory services: manufacture certain kinds of objects as needed
      • Effect was that “discovery” can also be a “service creation” activity
CORBA is object oriented
  • Seems obvious… and it is. CORBA is centered around the notion of an object
    • Objects can be passive (data)
    • … active (programs)
    • … persistent (data that gets saved)
    • … volatile (state only while running)
  • In CORBA the application that manages the object is inseparable from the object
    • And the stub on the client side is part of the application
    • The request per se is an action by the object on itself and could even exploit various special protocols
    • We can’t do this in Web Services
Web Services are document-centric
  • That is, communication is by sending documents (like pages) from client to server and back
  • And most guarantees or properties are associated with the document itself, not the service
    • For example, WS_RELIABILITY isn’t about making services reliable, it defines rules for writing reliability requests down and attaching them to documents
    • In contrast, CORBA fault-tolerance standard tells how to make a CORBA service into a highly available clustered service
Will Web Services “help” with naming and discovery?
  • Web Services tells us how
    • One client can…
    • … find one server and
    • … bind to that server and
    • … send a request that will make sense
    • … and make sense of the response
  • So sure, WS will help
But Web Services won’t…
  • Allow the data center to control decisions the client makes
  • Assist us in implementing naming and discovery in scalable cluster-style services
    • How to load balance? How to replicate data? What precisely happens if a node crashes or one is launched while the service is up?
  • Help with dynamics. For example, the best server for a given client can be a function of load but also affinity, recent tasks, etc.
How we do it now
  • Client queries directory to find the service
  • Server has several options:
    • Web pages with dynamically created URLs
      • Server can point to different places, by changing host names
      • Content hosting companies remap URLs on the fly. E.g. http://www.akamai.com/www.cs.cornell.edu (reroutes requests for www.cs.cornell.edu to Akamai)
    • Server can control mapping from host to IP address.
      • Must use short-lived DNS records; overheads are very high!
      • Can also intercept incoming requests and redirect on the fly
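The "redirect on the fly" option can be sketched as a function that picks a data center for the client and answers with an HTTP 302. The region names, host names, and the `redirect_for` helper are hypothetical; how the server learns the client's region is left out entirely.

```python
# Hypothetical mapping from client region to data center host name.
DATACENTERS = {
    "us-east": "nj-3.estuff.example",
    "us-west": "ca-1.estuff.example",
}
DEFAULT_REGION = "us-east"


def redirect_for(client_region, path):
    """Build a 302 response sending the client to the data center chosen for it."""
    host = DATACENTERS.get(client_region, DATACENTERS[DEFAULT_REGION])
    return 302, {"Location": f"http://{host}{path}"}


status, headers = redirect_for("us-west", "/catalog")
print(status, headers["Location"])
```

The cost is visible even in the sketch: every redirected request makes an extra round trip to the front end before reaching the server that actually handles it.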
Why this isn’t good enough
  • The mechanisms aren’t standard and are hard to implement
    • Akamai, for example, does content hosting using all sorts of proprietary tricks
  • And they are costly
    • The DNS control mechanisms force DNS cache misses and hence many requests do RPC to the data center
  • We lack a standard, well supported, solution!
Coming up?
  • How content is managed in even larger systems, that have multiple data centers
  • The main example is Akamai…
Summary
  • Why do we need web services?
  • What are web services? Is it a mature technology or still in its infancy?
  • How do web services work?
  • What are the advantages of web services?