A Secure, Publisher-Centric Web Caching Infrastructure

April 19th, 2001

Selcuk Uluagac

Aravind Pavuluri


Outline

  • Dynamic Caching

  • Motivation & Gemini

  • Security Issues

  • Incremental Deployment

  • Design & Implementation

  • Performance

  • Conclusions & Discussion

18845-01


Outline

  • Not finished yet!

  • Active Cache: Caching Dynamic Contents on the Web (Pei Cao et al.)

  • A Publishing System for Efficiently Creating Dynamic Data (Arun Iyengar et al.)

Dynamic Web Caching?

  • Content generated on every request

  • Scripting Languages (Perl, CGI, Java, VBScript, etc.)

  • Personalization and E-commerce transactions

  • Presently not cached

Gemini & Motivation

  • Drawbacks of Current Cache Infrastructure

    • Incapable of reporting access statistics

    • Not able to handle dynamic content

    • Loss of publisher control over the content

      • Not publisher centric

  • The solution is Gemini

Key Elements of Gemini Architecture

  • Node (Cache)

  • Security Architecture

  • Incremental Deployment Strategy

  • Gemini

    • Control Plane

      • Consistency control

      • Logging & Reporting

      • QoS

      • Access Control

    • Data Plane

      • Filtering

      • Versioning

      • Sandboxed VM

Security Issues

  • Why is a new security approach needed?

    • Caches are active participants, so end-to-end security alone is not enough

    • Cache is responsible for reporting logs

  • Design Goals

    • Protect the publisher as well as the cache

    • Publisher decides who to trust

    • Publishers/clients find out about attacks eventually

    • The system should be incrementally deployable

Security Background

  • RSA (Rivest, Shamir, Adleman)

    • Encryption

    • Public key / private key pair

  • Public Key Infrastructure (X.509)

  • Digital Signature

    • Verification

  • Certificate

  • Certificate Authority

A New Trust Model

  • Cache Authorization

    • Publishers explicitly specify which content a cache can generate

  • Cache Verification

    • Publishers and clients verify that authorized caches are performing correctly

Authorization & Content Generation Steps…

  • PKI distributes keys to clients, caches, and publishers

  • Publisher’s certificate identifies its web site and public key

    • Certificate: {P, K_P, Valid, Expires, CA} signed with K_CA^-1 (K^-1 denotes the private key matching public key K)

  • Publisher lists authorized caches for an object ??

    • ACL: {URL, K_1, K_2, ..., K_n, Valid, Expires, P} signed with K_P^-1

  • Publisher gives the cache: ACL, {Headers, Body} signed with K_P^-1

  • Uses the Pragma header field so as not to confuse legacy caches

  • Cache generates the content using the Body

  • Cache sends the client

    • ACL, {URL, Cache, Client, H(Request), CurrDate, Body} signed with K_cache^-1

Authorization & Content Generation Steps…

  • Client can check the signature on the ACL and verify that the cache is authorized

    • Client verifies

      • Cache is in ACL & Cache Signature is valid

  • Cache signature’s purpose

    • Tamper detection by client

    • ID of cache generating the content

    • Non-repudiation

  • Cache can perform access control on the content as the publisher requires (cookies, etc.)
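
The authorization and verification steps above can be sketched with Java's standard java.security API. This is a minimal sketch under stated assumptions, not Gemini's implementation: the class and method names (GeminiAuthSketch, scenario, encodeAcl, and so on), the string encoding of the ACL and reply, the example URL, and the choice of SHA256withRSA are all hypothetical. The reply here also omits the Client, H(Request), and CurrDate fields that the slides list.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;
import java.util.Arrays;
import java.util.Base64;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of the ACL and cache-signature checks; not Gemini's actual code.
final class GeminiAuthSketch {

    static KeyPair newKeyPair() throws GeneralSecurityException {
        KeyPairGenerator gen = KeyPairGenerator.getInstance("RSA");
        gen.initialize(2048);
        return gen.generateKeyPair();
    }

    static byte[] sign(PrivateKey key, String message) throws GeneralSecurityException {
        Signature s = Signature.getInstance("SHA256withRSA");
        s.initSign(key);
        s.update(message.getBytes(StandardCharsets.UTF_8));
        return s.sign();
    }

    static boolean verify(PublicKey key, String message, byte[] sig) throws GeneralSecurityException {
        Signature s = Signature.getInstance("SHA256withRSA");
        s.initVerify(key);
        s.update(message.getBytes(StandardCharsets.UTF_8));
        return s.verify(sig);
    }

    // Assumed encoding of the ACL body: the URL followed by each authorized cache key.
    static String encodeAcl(String url, List<PublicKey> cacheKeys) {
        StringBuilder sb = new StringBuilder(url);
        for (PublicKey k : cacheKeys) {
            sb.append('|').append(Base64.getEncoder().encodeToString(k.getEncoded()));
        }
        return sb.toString();
    }

    static boolean isListed(List<PublicKey> aclKeys, PublicKey cacheKey) {
        for (PublicKey k : aclKeys) {
            if (Arrays.equals(k.getEncoded(), cacheKey.getEncoded())) return true;
        }
        return false;
    }

    // Runs publisher -> cache -> client once and reports whether the client accepts the reply.
    static boolean scenario(boolean cacheAuthorized, boolean tamperWithBody) {
        try {
            KeyPair publisher = newKeyPair();
            KeyPair authorizedCache = newKeyPair();
            KeyPair servingCache = cacheAuthorized ? authorizedCache : newKeyPair();
            String url = "http://www.example.com/page";

            // Publisher: sign the ACL with its private key.
            List<PublicKey> aclKeys = Collections.singletonList(authorizedCache.getPublic());
            byte[] aclSig = sign(publisher.getPrivate(), encodeAcl(url, aclKeys));

            // Cache: sign the generated reply with its own private key.
            String reply = url + "|generated body";
            byte[] replySig = sign(servingCache.getPrivate(), reply);
            String received = tamperWithBody ? reply + " (modified in transit)" : reply;

            // Client: ACL signature valid, cache listed in the ACL, cache signature valid.
            return verify(publisher.getPublic(), encodeAcl(url, aclKeys), aclSig)
                    && isListed(aclKeys, servingCache.getPublic())
                    && verify(servingCache.getPublic(), received, replySig);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(scenario(true, false));   // authorized cache, untouched reply
        System.out.println(scenario(false, false));  // cache missing from the ACL
        System.out.println(scenario(true, true));    // reply altered after signing
    }
}
```

In this sketch the client accepts only the first scenario: a cache that is absent from the ACL fails the listing check, and a reply changed after signing fails the cache-signature check.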

Verification

  • Client sends feedback to the publisher about a misbehaving cache

  • Similarly, inconsistencies in cache log reporting can be detected

  • Publisher removes the cache from the ACL ???

  • When to question cache responses?

    • Publisher initiated (fake clients..)

    • Client initiated

Protecting the Cache

  • Publishers may send malicious code to caches

  • To prevent that:

    • Publisher’s code runs inside a sandboxed JVM

    • Limited API exposed to publisher’s code

  • Resource restrictions using OS-level controls counter denial-of-service attacks

Incremental Deployment Strategy…

Principles

  • Cache and document heterogeneity

  • Transparency to clients

  • Transparency to legacy caches

  • Proximity

Leaf Cache

Discovering Gemini Documents…

  • Publishers explicitly notify Gemini caches about documents that have associated Gemini documents.

  • Notification contains

    • Server name

    • Pattern to match

    • Transformation

  • Notifications are piggybacked on HTTP responses

  • Caches store notifications as soft state
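
The notification fields above suggest a small lookup table at each Gemini cache. The following is a minimal sketch, assuming that the "pattern" is a regular expression over the request path, that the "transformation" is a regex replacement string, and that soft state is modeled as a time-to-live per entry. None of these details, nor the names NotificationTableSketch, store, and lookup, come from the slides.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of a Gemini cache's notification table; not the real data structure.
final class NotificationTableSketch {

    // One notification: server name, pattern to match, transformation, and an assumed expiry.
    static final class Notification {
        final String server;
        final Pattern pattern;
        final String transformation;
        final long expiresAtMillis;

        Notification(String server, String regex, String transformation, long expiresAtMillis) {
            this.server = server;
            this.pattern = Pattern.compile(regex);
            this.transformation = transformation;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final List<Notification> entries = new ArrayList<>();

    void store(String server, String regex, String transformation, long nowMillis, long ttlMillis) {
        entries.add(new Notification(server, regex, transformation, nowMillis + ttlMillis));
    }

    // Returns the Gemini document path for a regular request, or null if no live entry matches.
    String lookup(String server, String path, long nowMillis) {
        Iterator<Notification> it = entries.iterator();
        while (it.hasNext()) {
            Notification n = it.next();
            if (n.expiresAtMillis <= nowMillis) {
                it.remove(); // soft state: expired entries are dropped, not repaired
                continue;
            }
            if (!n.server.equals(server)) {
                continue;
            }
            Matcher m = n.pattern.matcher(path);
            if (m.matches()) {
                return m.replaceFirst(n.transformation);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        NotificationTableSketch table = new NotificationTableSketch();
        table.store("www.example.com", "/news/(.*)\\.html", "/gemini/news/$1.gem", 0L, 1000L);
        System.out.println(table.lookup("www.example.com", "/news/today.html", 500L));
        System.out.println(table.lookup("www.example.com", "/news/today.html", 2000L));
    }
}
```

Because the entry is soft state, a cache that loses or expires it simply stops translating requests until a later HTTP response carries the notification again.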

Leaf Discovery

  • Leaf Cache Gemini cache which translates a request for a regular document into a request for a Gemini document.

  • With security the leaf cache becomes the first cache that both has the proper lookup table entry and is authorized by the publisher

Scalability

  • Leverages thousands of legacy caches to help deliver Gemini documents

  • Computational burden is pushed as close to the edge of the network as possible.

Node Design & Implementation (cont.)

  • Platform => On top of Squid

  • Runtime Language => Java

    • Platform independent

    • Allows sandboxing

  • Partitioning of functionality

    • Squid Process

      • Lookup table

      • Fetching Gemini documents

      • Forwarding Gemini requests

    • Gemini Process

      • JVM

      • Security

Node Operation

  • Squid front end receives the request from the client

  • Hands the request to the Gemini process via IPC

  • Gemini threads (Dispatcher, Checker, Worker) begin processing

  • The output is signed by the worker thread and sent to the client

  • Request is logged

Performance Evaluation

  • Response time degrades by a factor of 5 to 15 for non-active Gemini documents

  • Signing the reply accounts for 90% of processing time
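
To put the 90% figure in perspective, an Amdahl's-law style bound is a useful back-of-the-envelope check. It assumes, which the slide does not state, that the 90% is measured against total per-request processing time at the node; s is a hypothetical speedup of the signing step alone.

```latex
\[
\text{speedup}(s) = \frac{1}{(1 - 0.9) + \dfrac{0.9}{s}},
\qquad
\text{speedup}(2) = \frac{1}{0.1 + 0.45} \approx 1.8,
\qquad
\lim_{s \to \infty} \text{speedup}(s) = \frac{1}{0.1} = 10 .
\]
```

Under that assumption, doubling signing speed would make node processing about 1.8 times faster, and even free signing would cap the gain at a factor of 10.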

Conclusions & Discussion

  • Gemini addresses the Security issues in Dynamic Web Caching

  • Provides a node implementation

  • Provides a publisher centric architecture

  • End user performance ???

A Publishing System For Efficiently Creating Dynamic Data

Arun Iyengar et al.

IBM Research

T.J. Watson Research Center

Problems with Dynamic Caching at First Glance

  • Several Problems With Dynamic Data Generation

    • Expensive to create

    • Overhead

    • Consistent update (we already know this!)

    • More ???

Little Fragments…

  • Fragments

  • Objects

  • Atomic vs. Complex Object

  • Object Dependence Graph (ODG)

  • Dynamic Pages…

    • Embedded fragments automatically updated

  • Atomic vs. Incremental Publication

    • Problems ??

    • 3 proposed algorithms

Publishing Process

  • Immediate fragments

  • Quality-controlled fragments

  • Trigger monitor is notified

  • Fetches new copies from source

  • The ODG is updated

  • Graph traversal algorithms are applied

  • Bundles of web pages are written to sink
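
The graph traversal step can be illustrated with a plain reachability search over an object dependence graph. This is a minimal sketch, not the IBM system's algorithm: the slides do not say which traversal is used, and the names OdgSketch, addDependency, and affectedBy, as well as the example fragment and page names, are hypothetical. An edge from a to b is assumed to mean that b embeds or depends on a.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of an object dependence graph (ODG); not the publishing system's code.
final class OdgSketch {

    // source -> objects that embed or depend on it
    private final Map<String, List<String>> dependents = new HashMap<>();

    void addDependency(String source, String dependent) {
        dependents.computeIfAbsent(source, k -> new ArrayList<>()).add(dependent);
    }

    // Breadth-first traversal: every object reachable from a changed object must be rebuilt.
    Set<String> affectedBy(String... changed) {
        Set<String> affected = new LinkedHashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        for (String c : changed) {
            queue.add(c);
        }
        while (!queue.isEmpty()) {
            String current = queue.remove();
            List<String> next = dependents.getOrDefault(current, Collections.emptyList());
            for (String d : next) {
                if (affected.add(d)) {
                    queue.add(d); // first visit: follow its dependents too
                }
            }
        }
        return affected;
    }

    public static void main(String[] args) {
        OdgSketch odg = new OdgSketch();
        odg.addDependency("medal-count", "results-fragment");
        odg.addDependency("results-fragment", "home.html");
        odg.addDependency("results-fragment", "swimming.html");
        odg.addDependency("weather", "home.html");
        System.out.println(odg.affectedBy("medal-count"));
    }
}
```

A real publisher would also need to order the rebuilds so that fragments are regenerated before the pages that embed them; this sketch only finds the affected set.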

Sample screen

Performance

  • Deployed for the 2000 Olympic Games web site

Performance

  • Easier to design web sites

    • Users specify and modify relationships among web pages and fragments

  • Performance improvement

  • Incremental publication

    • Faster with 3 algorithms

Active Cache: Caching Dynamic Contents on the Web

April 19th, 2001

Selcuk Uluagac

Aravind Pavuluri


Motivation and Active Cache

  • Dynamic documents constitute an increasing percentage of contents on the web

  • Affects the scalability of the web

  • No existing approach caches dynamic content

  • Solution: Active Cache

Brief Overview

  • Migrates parts of server processing on each user request to the caching proxy via “cache applets”

  • A cache applet is server-supplied code attached to a URL

  • On a user request the proxy invokes the cache applet

  • Cache applets allow servers to obtain the benefit of proxy caching without losing the capability to track user accesses and tailor the content presentation

The Active Cache Protocol

  • Web server specifies the association between a cache applet and a URL-named document by sending a new entity header, “CacheApplet”, with the document

    • CacheApplet: code="code.class", archive="code.jar", codebase="codebase.url"

    • For security reasons, the codebase of the applet must have the same server URL as the document.
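
A proxy that honors this header has to parse it and enforce the codebase restriction. The sketch below shows one way to do so, assuming that "same server URL" means the same scheme, host, and port; that reading, the class name CacheAppletHeaderSketch, and the example URLs are assumptions rather than details from the slides. The comma split is deliberately naive and would break on values that themselves contain commas.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: parse a CacheApplet header value and check the codebase restriction.
final class CacheAppletHeaderSketch {

    // Parses a value such as: code="code.class", archive="code.jar", codebase="http://host/dir/"
    static Map<String, String> parse(String headerValue) {
        Map<String, String> fields = new HashMap<>();
        for (String part : headerValue.split(",")) {
            int eq = part.indexOf('=');
            if (eq < 0) {
                continue; // skip malformed pieces instead of failing the whole header
            }
            String name = part.substring(0, eq).trim();
            String value = part.substring(eq + 1).trim();
            if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                value = value.substring(1, value.length() - 1);
            }
            fields.put(name, value);
        }
        return fields;
    }

    // Assumption: "same server" is interpreted as identical scheme, host, and port.
    static boolean sameServer(String documentUrl, String codebaseUrl) {
        URI doc = URI.create(documentUrl);
        URI code = URI.create(codebaseUrl);
        return doc.getScheme() != null && doc.getScheme().equalsIgnoreCase(code.getScheme())
                && doc.getHost() != null && doc.getHost().equalsIgnoreCase(code.getHost())
                && doc.getPort() == code.getPort();
    }

    public static void main(String[] args) {
        String header = "code=\"code.class\", archive=\"code.jar\", "
                + "codebase=\"http://www.example.com/applets/\"";
        Map<String, String> fields = parse(header);
        System.out.println(fields.get("code"));
        System.out.println(sameServer("http://www.example.com/index.html", fields.get("codebase")));
    }
}
```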

The Active Cache Protocol (cont…)

  • Active Cache Obligations

    • If a document is cached, the proxy will either invoke the cache applet or send the request directly to the server.

    • If an applet’s execution fails for any reason, the request is sent to the server

    • If the applet’s execution succeeds, the proxy will take the appropriate action based on the return value of the FromCache method

    • Each applet can deposit information in a log object and the proxy will send the log object back to the server.
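
The obligations above amount to a small decision procedure at the proxy, sketched here. Only the method name FromCache comes from the slides; the single-argument signature, the return codes, and the names ActiveCacheObligationsSketch and handle are hypothetical stand-ins, since the slides do not list the real return values.

```java
// Hypothetical sketch of the proxy-side obligations; names and return codes are assumptions.
final class ActiveCacheObligationsSketch {

    // Stand-in for a cache applet with a simplified FromCache signature.
    interface CacheApplet {
        int FromCache(String userHttpRequest) throws Exception;
    }

    static final int USE_CACHED_FILE = 0; // assumed code: serve the cached document
    static final int USE_NEW_FILE = 1;    // assumed code: serve the applet's output

    static String handle(CacheApplet applet, String request) {
        int result;
        try {
            result = applet.FromCache(request);
        } catch (Exception e) {
            // Obligation: a failed applet execution sends the request to the server.
            return "forward to server";
        }
        switch (result) {
            case USE_CACHED_FILE:
                return "serve cached file";
            case USE_NEW_FILE:
                return "serve new file";
            default:
                // Unknown code: fall back to the server rather than guess.
                return "forward to server";
        }
    }

    public static void main(String[] args) {
        System.out.println(handle(req -> USE_NEW_FILE, "GET /index.html"));
        System.out.println(handle(req -> {
            throw new IllegalStateException("applet crashed");
        }, "GET /index.html"));
    }
}
```

Falling back to the server on any failure keeps a buggy or resource-starved applet from turning into a user-visible error.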

Proxy Decides….

  • Whether to cache a document

  • Whether to invoke the applet

    • Cache applet may not process every request for the document

    • Some requests may go to the original server

  • What document or applet to evict from the cache at any time

Active Cache Interface

  • Cache applet must implement the “ActiveCacheInterface”

  • FromCache( user_http_request, client_ip, client_name, cache_file, new_file)

  • Cache Applet can only call the ActiveProxy class to perform its functions

  • ActiveProxy provides methods for file access, cache query, locking and unlocking as well as sending requests to the server

Active Cache Interface …

Methods in ActiveProxy

  • boolean is_in_cache(String url)

  • public int open(String url, int mode)

  • public int close(int fd)

  • public int create(String url, int mode)

  • public int read(int fd, byte[] buf, int size)

  • public int lock(int fd)

  • public String curtime()

Cache Applet Examples

  • Logging User Requests

    • Logs eventually sent to the server

  • Advertising Banner Rotation

    • Decides which banner to display according to the specifications

  • Access Permission Checking

    • Applet verifies whether the server signed the document

  • Client-Specific Information Distribution

    • www.my.yahoo.com
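
As an illustration of the banner rotation example, the logic a cache applet might run at the proxy can be as small as the sketch below. It assumes a round-robin policy and a placeholder token in the cached page; the token, the class name BannerRotationSketch, and the banner file names are hypothetical. It does not use the ActiveProxy API.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical banner-rotation logic; not taken from the Active Cache implementation.
final class BannerRotationSketch {

    private final List<String> banners;
    private final AtomicLong served = new AtomicLong();

    BannerRotationSketch(String... banners) {
        this.banners = Arrays.asList(banners);
    }

    // Replaces the assumed placeholder token with the next banner in round-robin order.
    String render(String cachedPage) {
        int index = (int) (served.getAndIncrement() % banners.size());
        return cachedPage.replace("<!--BANNER-->", banners.get(index));
    }

    public static void main(String[] args) {
        BannerRotationSketch rotation = new BannerRotationSketch("ad-a.gif", "ad-b.gif");
        System.out.println(rotation.render("<p><!--BANNER--></p>"));
        System.out.println(rotation.render("<p><!--BANNER--></p>"));
    }
}
```

The page body is served from the cache on every request, while the per-request choice of banner runs locally at the proxy.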

Security Mechanisms

  • Language-based Protection

    • ActiveProxy class implements the constraints

    • Java’s built-in security measures

    • Prevents illegal access to information belonging to other web servers

  • Resource Accounting

    • Proxy keeps track of an applet’s resource consumption in terms of storage size, disk bandwidth, network bandwidth, CPU usage, and virtual memory size

    • Sets upper limits on resources using setrlimit

    • Prevents Denial of Service attacks

Implementation

  • Extended the CERN httpd proxy

  • Handles each request in a separate process

  • Makes it easy to set limits on the resources

  • Implements the Active Cache Protocol and the security mechanisms

Performance

  • Degrades performance by at least 50-75%

  • Increases client latency by a factor of 1.5 to 4

  • CPU becomes the bottleneck

Conclusions

  • Active Cache trades local CPU resources for network bandwidth savings

    • $6K-$10K/month for a T1 line vs.

    • $2K for a high-end computer with sufficient CPU

  • Improves object hit ratio and byte hit ratio from 35% and 30% to 55% and 41%, respectively
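
As a rough reading of these figures, the share of traffic that still has to be fetched from origin servers can be worked out from the miss ratios. This assumes the percentages are fractions of all requests and of all bytes, and that every hit avoids a full fetch, neither of which the slide states.

```latex
\[
\text{request miss ratio: } 1 - 0.35 = 0.65 \;\to\; 1 - 0.55 = 0.45,
\qquad \frac{0.65 - 0.45}{0.65} \approx 31\% \text{ fewer requests reach the server}
\]
\[
\text{byte miss ratio: } 1 - 0.30 = 0.70 \;\to\; 1 - 0.41 = 0.59,
\qquad \frac{0.70 - 0.59}{0.70} \approx 16\% \text{ fewer bytes cross the link}
\]
```

The byte figure is the one that bears on the T1 cost comparison above, since link charges follow bandwidth rather than request counts.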
