Anca-Andreea Ivan Advisor: Vijay Karamcheti Computer Science Department New York University

Partitionable Services Framework: Seamless access to distributed applications deployed in heterogeneous environments

Motivation • Trends: • Designing component-based applications (CORBA, .NET, Globus) • Deploying applications in heterogeneous environments • Heterogeneous environments: • Environment composed of nodes and links organized in multiple administrative domains • Nodes and links exhibit diverse properties: • Links – variable bandwidth, latency • Nodes – variable CPU, memory, operating system • Nodes, links, and their properties change in time and space: e.g. mobility

Problem: Seamless access to distributed applications • It is hard to write and deploy distributed applications such that users seamlessly access services • Seamless access means that the quality of service (QoS) requirements specified by users are satisfied • Current systems (e.g. CORBA, DCOM) are very complex and provide only partial solutions • Ideally, one would like to: • Design flexible applications • Automatically choose the appropriate application deployment that ensures seamless access to services

Our solution: Partitionable Services Framework • PSF uses the first trend (i.e. designing applications as sets of components) as a solution to these problems. • PSF facilitates seamless access to services by: • Deploying services into the network • Creating appropriate connections • Given: • Current state of the network • Application description • User’s QoS requirements

Roadmap • Motivation • Problem: Seamless access to distributed applications • Our solution: Partitionable Services Framework (PSF) • Architecture • Modules • Evaluation • Related work • Contributions and future plans

Example: Web-based mail application • Components: Mail Client, Weak Mail Client, Mail Server, Cache Mail Server, Cipher • QoS requirements: • Qualitative: security, access to various services • Quantitative: bandwidth, CPU, minimum operation time (send/receive), number of messages/time

Deploying the mail application into the network • [Figure: the components Weak Mail Client, Mail Client, Mail Server, Cache Mail Server, and Cipher placed across domains A, B, and C, which are connected by a secure slow link, an insecure slow link, and a secure fast link]

Partitionable Services Framework (PSF): Architecture overview • 1. Register application (service description) • 2. Make request • 3. Find plan • 4. Deploy components • 5. Connect • [Figure: PSF Manager and per-node Wrappers; the network includes an insecure slow link and a secure fast link]

Physiology of PSF • Basic functionality: • Modeling application/network properties and behaviors (quantitative and qualitative) • Planning, i.e. finding valid application deployments • Securely deploying applications into the network • Extensions: Views = component customizations • Increase the chances of successfully finding a plan • Permit efficient maintenance of consistency between components • Provide flexible, fine-grain access control

Problem 1: Describing the application and the network • Challenges: • Describing application properties and behaviors • Capturing the effects of the network on the application • Extending CPU-like QoS requirements (e.g. security) • Extracting information from the network • Our solution: PSF application and network models (HPDC 2002)

PSF application and network specification models • PSF models use general, quantitative and qualitative properties, belonging to different namespaces. • The network contains nodes and links, each associated with properties: • Nodes – e.g. OS, software version, CPU • Links – e.g. security, bandwidth, latency • Applications are defined as sets of components • Linkage information • Deployment requirements and effects • Effects of the network on the application and vice versa

PSF application model: Component linkages • Component linkages are defined as: • Implemented interfaces (i.e. component functionality) • Required interfaces (i.e. services required for correct execution) • [Figure: mail application components connected through the interfaces MSI, MSIe, and MCI]

How to specify deployment requirements and effects? • Deployment requirements: Node.CPU > 50 units; Node.OS = Win XP • Deployment effects: Node.CPU = 50 units; MSI.Secure = true; MSI.NumReq = 200; MSI.ReqSize = 1 Kb • [Figure: nodes labeled Linux, CPU = 60; Win XP, CPU = 50; Win XP, CPU = 100; a link labeled 56Kbps, Secure = false; an MSI interface labeled Secure = true, NumReq = 200]
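The requirement check on this slide can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the class and method names are hypothetical and not part of PSF, nodes are modeled as plain property maps, and the effect "Node.CPU = 50 units" is read here as "the component consumes 50 CPU units", which is an interpretation rather than something the slide states.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; names are illustrative, not PSF's API.
final class DeploymentRequirements {

    // A node is modeled as a small bag of named properties.
    static Map<String, Object> node(String os, int cpuUnits) {
        Map<String, Object> props = new LinkedHashMap<>();
        props.put("OS", os);
        props.put("CPU", cpuUnits);
        return props;
    }

    // Requirements from the slide: Node.CPU > 50 units and Node.OS = Win XP.
    static boolean satisfies(Map<String, Object> node) {
        int cpu = (Integer) node.get("CPU");
        return cpu > 50 && "Win XP".equals(node.get("OS"));
    }

    // Assumed reading of the effect "Node.CPU = 50 units": the component
    // consumes 50 CPU units on the node it is deployed on.
    static Map<String, Object> applyEffects(Map<String, Object> node) {
        if (!satisfies(node)) {
            throw new IllegalArgumentException("deployment requirements not met");
        }
        Map<String, Object> after = new LinkedHashMap<>(node);
        after.put("CPU", (Integer) node.get("CPU") - 50);
        return after;
    }

    // Keeps only the nodes on which the component may be deployed.
    static List<Map<String, Object>> eligible(List<Map<String, Object>> nodes) {
        List<Map<String, Object>> result = new ArrayList<>();
        for (Map<String, Object> n : nodes) {
            if (satisfies(n)) {
                result.add(n);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> nodes = new ArrayList<>();
        nodes.add(node("Linux", 60));
        nodes.add(node("Win XP", 50));
        nodes.add(node("Win XP", 100));
        for (Map<String, Object> n : eligible(nodes)) {
            System.out.println(n + " -> " + applyEffects(n));
        }
    }
}
```

With the three nodes from the figure, only the Win XP node with CPU = 100 passes both conditions: the Linux node fails the OS test, and the Win XP node with CPU = 50 fails the strict inequality.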

How to specify link crossing requirements and effects? • Deployment requirements: Node.CPU > 50 units; Node.OS = Win XP • Link crossing requirements: Link.BW > 2Mb/s • Deployment effects: Node.CPU = 50 units; MSI.Secure = true; MSI.NumReq = 200; MSI.ReqSize = 1 Kb • Link crossing effects: Link.BW = MSI.NumReq * MSI.ReqSize; MSI.NumReq = min( MSI.NumReq, Link.BW / MSI.ReqSize ); MSI.Secure = MSI.Secure & Link.Secure; MSI.ReqSize = MSI.ReqSize • [Figure: MSI labeled Secure = true, NumReq = 200 on one side of a link and Secure = false, NumReq = 56 on the other; nodes labeled Linux, CPU = 60 and Win XP, CPU = 50; link labels 56Kbps, Secure = false and 0Kbps, Secure = false]
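As a worked example, the link crossing effects can be applied to the values on this slide (MSI.NumReq = 200, MSI.ReqSize = 1 Kb, a 56Kbps link with Secure = false). The Java sketch below is hypothetical: the class names are invented for illustration, units are treated the way the slide's formula treats them (Kbps divided by Kb), and the remaining bandwidth is computed by subtraction, which is an assumption that appears consistent with the 0Kbps label in the figure.

```java
// Hypothetical sketch of the slide's link crossing effects; not PSF's API.
final class LinkCrossing {

    // Properties of an interface such as MSI.
    static final class Iface {
        final boolean secure;
        final int numReq;
        final int reqSizeKb;

        Iface(boolean secure, int numReq, int reqSizeKb) {
            this.secure = secure;
            this.numReq = numReq;
            this.reqSizeKb = reqSizeKb;
        }
    }

    // Properties of a network link.
    static final class Link {
        final int bwKbps;
        final boolean secure;

        Link(int bwKbps, boolean secure) {
            this.bwKbps = bwKbps;
            this.secure = secure;
        }
    }

    // MSI.NumReq = min(MSI.NumReq, Link.BW / MSI.ReqSize)
    // MSI.Secure = MSI.Secure & Link.Secure
    // MSI.ReqSize is unchanged.
    static Iface cross(Iface in, Link link) {
        int numReq = Math.min(in.numReq, link.bwKbps / in.reqSizeKb);
        boolean secure = in.secure && link.secure;
        return new Iface(secure, numReq, in.reqSizeKb);
    }

    // Assumption: the traffic that crosses the link is subtracted from its bandwidth.
    static int remainingBandwidth(Iface afterCrossing, Link link) {
        return link.bwKbps - afterCrossing.numReq * afterCrossing.reqSizeKb;
    }

    public static void main(String[] args) {
        Iface msi = new Iface(true, 200, 1);
        Link slowInsecure = new Link(56, false);
        Iface seenByClient = cross(msi, slowInsecure);
        System.out.println("NumReq = " + seenByClient.numReq
                + ", Secure = " + seenByClient.secure
                + ", link left = " + remainingBandwidth(seenByClient, slowInsecure) + "Kbps");
    }
}
```

Crossing the 56Kbps insecure link caps the request rate at min(200, 56 / 1) = 56 and turns Secure to false, which matches the MSI labels on the far side of the link in the figure.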

Problem 2: Finding a valid deployment plan • Challenges: • Searching for valid component compositions • Satisfying network and application resource constraints • Scaling with size of network and size of application • Expressing deployment requirements and effects • Our solution: AI-based planning algorithm (IPDPS 2003) (joint work with Tatiana Kichkaylo)

Contributions of the planning algorithm • Deploys dynamically created DAGs of components • Satisfies application requirements given the network state • Guarantees minimum planning cost • Allows monotonic functions to express requirements and effects • Scales with: • Network size (number of links and nodes) • Size of the application (number of components) • Number of relevant vs. irrelevant components • Reuses existing deployments

Problem 3: Securely deploying applications • Challenges: • Different domains → administrators → namespaces • No centralized certification authority • No total knowledge about all namespaces • Our solution: PSF deployment infrastructure (HPDC 2003)

Contributions of PSF deployment infrastructure • Authorizes entities across multiple domains • Distributed role based access control system (dRBAC) • Securely downloads and connects components • Communication abstraction to create secure channels: Switchboard • Monitors trust relationships between domains • Switchboard • Translates properties between namespaces • Distributed trust management system (dRBAC)

Translating properties between namespaces • Delegations: [Intel.OS → NYU.system]NYU; [NYU.system → PDSG.software]PDSG • [Figure: namespaces Intel (OS), NYU (system), and PDSG (software), with the property labels trusted and secure]
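The chaining of the two delegations on this slide can be illustrated with a small Java sketch. It is a deliberate simplification and not dRBAC itself: there are no credentials or signatures, the class and method names are hypothetical, and a delegation is accepted only when its issuer owns the namespace of the target property, an assumption drawn from the pattern of the two delegations shown.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of chained property translation; not the dRBAC API.
final class NamespaceTranslation {

    // One delegation written on the slide as [subject -> object]issuer.
    static final class Delegation {
        final String subject;
        final String object;
        final String issuer;

        Delegation(String subject, String object, String issuer) {
            this.subject = subject;
            this.object = object;
            this.issuer = issuer;
        }

        // Assumed validity rule: the issuer must own the object's namespace.
        boolean issuedByOwner() {
            return object.startsWith(issuer + ".");
        }
    }

    // Follows delegations from `property` until a property in `targetNamespace`
    // is reached. Returns the chain of properties, or an empty list if none exists.
    static List<String> translate(String property, String targetNamespace,
                                  List<Delegation> delegations) {
        List<String> chain = new ArrayList<>();
        chain.add(property);
        Set<String> seen = new HashSet<>();
        String current = property;
        while (!current.startsWith(targetNamespace + ".")) {
            if (!seen.add(current)) {
                return Collections.emptyList(); // cycle, give up
            }
            String next = null;
            for (Delegation d : delegations) {
                if (d.subject.equals(current) && d.issuedByOwner()) {
                    next = d.object;
                    break;
                }
            }
            if (next == null) {
                return Collections.emptyList(); // no applicable delegation
            }
            chain.add(next);
            current = next;
        }
        return chain;
    }

    public static void main(String[] args) {
        List<Delegation> delegations = new ArrayList<>();
        delegations.add(new Delegation("Intel.OS", "NYU.system", "NYU"));
        delegations.add(new Delegation("NYU.system", "PDSG.software", "PDSG"));
        System.out.println(translate("Intel.OS", "PDSG", delegations));
    }
}
```

The design point the sketch tries to convey is that no party needs global knowledge: each domain only states how a foreign property maps into its own namespace, and the translation emerges from following the chain.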

PSF: From basic to extended functionality • So far: • Models application and network specification formats • Finds valid deployments • Securely deploys applications into the network • Problems: • Chances of successful planning depend on the component set • Multiple instances of the same component might require that data be kept consistent • Access control to services should be fine-grained • Our solution: Views

View Definition • Component c is defined as a tuple (FC , VC), where FC is the set of implemented functions and VC is the set of declared variables. • Def: A component v is a view of a component c if: or • Examples: CacheMailServer is a data view of MailServer WeakMailClient is an object view of MailClient

View Generation • VIG is an automatic view generator • Input: original component and view definition rules • Output: new component (i.e. view) • Based on bytecode modifier (Javassist) • Interactive tool to verify correctness of views • Operations allowed when defining a view: • Add new fields; copy fields from the original component • Add new methods; copy or customize methods from the original component • Restrict interfaces; add new interfaces • Extend class or view

View Description • view WeakMailClient represents MailClient • Original component:
class MailClient implements MessageInterface, AddressInterface, MeetingInterface {
    MailServer server;
    Vector recvMessage() { return server.getMessages(); }
    void sendMessage( Message _mes ) { server.sendMessages( _mes ); }
    ContactInfo getInfo( String _name ) {
        String phoneNumber = server.getPhoneNumber();
        String email = server.getEmailAddress();
        return new ContactInfo( phoneNumber, email );
    }
    void addMeeting( Time _t, Person _p ) {
        Calendar cal = server.getCalendar( _p );
        cal.requestMeeting( _t, this.name );
    }
}
• Customized getInfo body shown for the view:
    String email = server.getEmailAddress();
    return new ContactInfo( email );

Benefits of using views • Increase chances to find a valid deployment plan: • Different properties of new components • Provide customized, single sign-on access control: • Customizing / removing / adding operations • Distributing minimum necessary code to users • No need to access sources • Permit flexible consistency between multiple replicas • Ease the programming effort: • Defining simple rules instead of duplicating code

Flexible cache coherence protocol • Challenges: • PSF deploys general distributed applications • There might be consistency requirements for multiple instances of the same component • No assumptions can be made about the application structure or data access patterns • Our solution: Flexible cache coherence protocol (submitted to IPDPS 2004)

Consistency requirements for the mail application • [Figure: the components Weak Mail Client, Mail Client, Mail Server, Cache Mail Server, and Cipher deployed across domains A, B, and C, which are connected by a secure slow link, an insecure slow link, and a secure fast link]

Contributions of the cache coherence protocol • PSF employs a flexible and neutral cache coherence protocol that uses application-specific information. • With whom to synchronize? • Data properties are used to characterize the shared data • Two views share data if their property sets intersect • When to synchronize? • Quality triggers indicate when updates should be pushed or pulled between replicas • What to synchronize? • Merge/extract methods are used to merge/extract data and resolve conflicts between updates.
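The rule "two views share data if their property sets intersect" can be made concrete with a short Java sketch. The representation is an assumption made for illustration: each view carries a map from property name to a set of values, in the spirit of pairs such as (user, {alice, bob}), and the class and method names are hypothetical rather than PSF's API.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the "with whom to synchronize?" test; not PSF's API.
final class ViewSharing {

    // Builds a single-property description such as (user, {alice, bob}).
    static Map<String, Set<String>> props(String name, String... values) {
        Map<String, Set<String>> result = new HashMap<>();
        result.put(name, new HashSet<>(Arrays.asList(values)));
        return result;
    }

    // Two views share data if, for some property name, their value sets intersect.
    static boolean shareData(Map<String, Set<String>> a, Map<String, Set<String>> b) {
        for (Map.Entry<String, Set<String>> entry : a.entrySet()) {
            Set<String> other = b.get(entry.getKey());
            if (other != null && !Collections.disjoint(entry.getValue(), other)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> first = props("user", "alice", "bob");
        Map<String, Set<String>> second = props("user", "alice", "david");
        Map<String, Set<String>> third = props("user", "charlie");
        System.out.println(shareData(first, second));
        System.out.println(shareData(first, third));
    }
}
```

Under this reading, a replica serving alice and bob must synchronize with one serving alice and david (both hold alice's data), but not with one serving only charlie.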

Description of the cache coherence protocol • Messages: 1. acqIm • 2. getIm • 3. relIm • 4. imRel • 5. sendIm • [Figure: a directory manager and five cache managers with data properties (user, {alice, bob}), (user, {alice, david}), (user, {charlie}), (user, {alice}), (user, {alice})]

Roadmap • Motivation • Problem: Seamless access to distributed applications • Our solution: Partitionable Services Framework (PSF) • Architecture • Modules • Evaluation • Related work • Contributions and future plans

Evaluation of Partitionable Services Framework Would like to verify that: • User input is minimal • Describing application, network, QoS • Adding PSF-specific code • Flexible application deployments might improve the application performance when executing in resource constrained networks. • Overhead of automatic deployments is not significant • Results of dynamically deployed applications and manually deployed applications are comparable

User input is minimal • Natural expression of required additional information: • Description of application properties and the network status • Description of views • Description of application specific information for the consistency protocol • Minimal PSF-specific code: • Component needs to extend PSFObject • Small number of cache coherence APIs

Example of a PSF-aware component
public class CacheMailServer extends PSFObject implements MailServerInterface {
    ViewPropertyList vpl = new ViewPropertyList( "user", "alice" );
    CacheManager cm = new CacheManager( vpl );
    public CacheMailServer() {
        super();
        cm.initImage();
    }
    public void sendMessage( Message _message ) {
        cm.startUseImage();
        accounts.addMessage( _message );
        cm.endUseImage();
    }
    public void setInterface( String _interface ) { ... }
}

Testbed used to evaluate PSF • 3 domains, 10 nodes/domain • Nodes have 100 units CPU • Links have 100Mbps/0ms • Links connecting domains: • 56Kbps/75ms • 2Mbps/50ms • 50Mbps/20ms • Testbed of 32 nodes • Click router to modulate bandwidth and latency • [Figure: domains A, B, and C connected by the 56Kbps/75ms, 2Mbps/50ms, and 50Mbps/20ms links]

Scenarios • Mail server is running in domain A • Clients connect from domains A, B, C • Configurations: C-S, C-$-S, C-$-$-S • [Figure: component placement across domains A, B, and C for each scenario]

Flexible application deployments - Switchboard • No single configuration ensures the best performance in all scenarios → need for automatic deployments • [Plot: time (s) per client domain (A, B, C) for the configurations C-S, C-$-S 20sec, C-$-S 90sec, C-$-$-S 20sec, C-$-$-S 90sec]

Dynamic vs. manual deployments • Performance of dynamically deployed applications is comparable with the performance of applications manually deployed in similar configurations. • [Plot: time (s) per client domain (A, B, C), manual deployment vs. dynamic deployment]

Overhead of automatic application deployments • Overhead of automatic application deployments is negligible. • [Plot: time (s) per client domain (A, B, C), broken down into contact wrapper, generate plan, install client, and install cache]

Properties of the planning algorithm • Scales with the size of the application • Scales with the size of the network • Reuses existing deployments • Finds deployment plans fast • [Plot: planning time (s), axis from 0.0 to 2.5, vs. number of nodes in the network, from 20 to 93]

Related work

Summary • Hypothesis: By exposing qualitative and quantitative properties of component-based applications and the relationships between the applications and the environment, automatic deployment is feasible. • Partitionable Services Framework • Application and network description models • AI-based planning algorithm • Secure deployment process • Extensions based on views • Flexible cache coherence protocol

Future work • Refine the Partitionable Services Framework: • Extend the application model • Design a network monitoring system that extracts the information required by PSF • Decentralize PSF • Support run-time adaptation through re-planning • Use the ideas developed while working on PSF to address problems in other areas, e.g. web caching, sensor networks: • Placement of web caches • Automatic deployment of senselets

Thank you!

General stuff

Hypothesis: Automatic application deployment • Hypothesis: By exposing qualitative and quantitative properties of component-based applications, and the relationships between the applications and the environment, automatic deployment is feasible. • Feasibility: • User input is minimal • i.e. additional programming effort, application specification • If a valid deployment exists, it will be found • Dynamic deployment does not incur significant overhead • Performance of dynamically deployed application is comparable with manual deployments

Partitionable Services Framework (PSF): Architecture overview • 1. Register application (service description) • 2. Make request • 3. Plan • 4. Deploy • [Figure: PSF Runtime composed of Registrar, Network Monitor, Planner, and Deployer, interacting with per-node Wrappers]

Anatomy of PSF • PSF runtime infrastructure • Registrar: registers applications with PSF • Monitor: extracts the current state of the environment • Planner: finds valid application deployments • Deployer: deploys applications • Wrappers • Forward client requests to the PSF runtime infrastructure • Download, connect, and start applications • Provide additional services: e.g. cache coherence

Application and network model

Example – Node and link specifications
<Node>
  <Address>node1.nyu.edu</Address>
  <Property> <Name>NodeCPU</Name> <Value>100</Value> </Property>
  <Property> <Name>Secure</Name> <Value>false</Value> </Property>
  <Property> <Name>OS</Name> <Value>Windows XP</Value> </Property>
</Node>
<Link>
  <Start>node1.nyu.edu</Start>
  <End>node2.nyu.edu</End>
  <Property> <Name>LinkBandwidth</Name> <Value>500</Value> </Property>
  <Property> <Name>Secure</Name> <Value>true</Value> </Property>
</Link>
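A specification in this format can be read with the standard JDK DOM parser. The Java sketch below is only an illustration of how the Name/Value pairs might be collected into a map: the class `NodeSpecReader` is hypothetical and says nothing about how PSF itself parses its models, and the only element names used are the ones that appear in the example.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical reader for a single <Node> or <Link> element; not part of PSF.
final class NodeSpecReader {

    // Collects every <Property><Name>..</Name><Value>..</Value></Property>
    // pair found in the given XML fragment, in document order.
    static Map<String, String> readProperties(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Map<String, String> props = new LinkedHashMap<>();
            NodeList properties = doc.getElementsByTagName("Property");
            for (int i = 0; i < properties.getLength(); i++) {
                Element property = (Element) properties.item(i);
                String name = property.getElementsByTagName("Name")
                        .item(0).getTextContent().trim();
                String value = property.getElementsByTagName("Value")
                        .item(0).getTextContent().trim();
                props.put(name, value);
            }
            return props;
        } catch (Exception e) {
            throw new IllegalArgumentException("malformed specification", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<Node><Address>node1.nyu.edu</Address>"
                + "<Property><Name>NodeCPU</Name><Value>100</Value></Property>"
                + "<Property><Name>Secure</Name><Value>false</Value></Property>"
                + "<Property><Name>OS</Name><Value>Windows XP</Value></Property>"
                + "</Node>";
        System.out.println(readProperties(xml));
    }
}
```

Values are kept as strings on purpose: whether a property such as NodeCPU is quantitative or qualitative is a modeling decision that belongs to the layer interpreting the specification, not to the parser.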

What information should be exposed by the application? • Design applications as sets of components (e.g. Mail Client, Weak Mail Client, Mail Server, Cache Mail Server, Cipher) • A component specifies: • Linkage information • Deployment requirements and effects • Quantitative: number of requests, available CPU • Qualitative: privacy, availability of certain software • Link crossing requirements and effects • QoS requirements

Example – Mail Client
<Component>
  Name: MailClient
  <Linkages>
    <Implements> Name: MCI </Implements>
    <Requires> Name: MSI </Requires>
  </Linkages>
  <Conditions>
    Node.OS = Linux
    Node.CPU > 50%
    MSI.Secure = true
    MSI.NoReq > 120
  </Conditions>
  <Effects>
    MCI.Secure = true
    Node.CPU = Node.CPU - 50%
    Link.Bandwidth = Link.Bandwidth - ( MSI.NoReq * MSI.ReqSize )
  </Effects>
</Component>
