Mashups and Language-Based Isolation

Winter 2009 CS 142 Mashups and Language-Based Isolation John Mitchell

Mashups

Advertisements

Social Networking Sites

Third-party content: Ads • Customer accounts • Advertising network

Third-party content: Apps • User data • User-supplied application

Why Use Frames • Isolation • Different frames can represent different principals • Same-origin policy: a frame can only read or modify frames from the same scheme/host/port • Delegation • Frame can draw only on its own rectangle • Modularity • Reuse the same content in multiple places • Failure containment • Parent may work even if frame is slow to load or broken • Example frames (from the figure): src = google.com/… name = awglogin; src = 7.gmodules.com/... name = remote_iframe_7
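The scheme/host/port comparison behind the same-origin policy can be sketched as follows. This is a minimal illustration, not browser code: sameOrigin is a hypothetical helper and the example.com URLs are placeholders. It relies on the standard URL class, which reports an empty port when the default port for the scheme is used.

```javascript
// Hypothetical helper: true when two URLs share scheme, host and port.
function sameOrigin(a, b) {
  const ua = new URL(a);
  const ub = new URL(b);
  return (
    ua.protocol === ub.protocol && // scheme, e.g. "https:"
    ua.hostname === ub.hostname && // host
    ua.port === ub.port            // port ("" means the scheme's default)
  );
}

const samePage = sameOrigin("https://example.com/a", "https://example.com/b");
const otherScheme = sameOrigin("http://example.com/", "https://example.com/");
const otherPort = sameOrigin("https://example.com/", "https://example.com:8443/");
```

Under this check only the first pair counts as same-origin; a frame loaded from either of the other URLs would be treated as a different principal.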

Why Not To Use Frames • Inconvenient • Container does not fit content • Quirky browser behavior (history, sound) • Performance impact • Security Concerns • Frame hijacking • Browser exploits • Inability to Communicate • Cannot send messages to cross-domain frames • Alternatives: • Flash • Rewriting: FBJS, ADsafe, Caja

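postMessage
// Sender: "*" places no restriction on the receiving frame's origin
frames[0].postMessage("Hello world.", "*");
// Receiver: reply only to messages sent from the expected origin
window.addEventListener("message", receiver);
function receiver(e) {
  if (e.origin === "https://example.com") {
    if (e.data === "Hello world.")
      e.source.postMessage("Hello", e.origin);
  }
}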

Remember this from Lecture 12 by Collin? Referer Suppression Experiment • Measure how often Referer suppressed • Placed a JavaScript advertisement for $200 • 283,945 impressions

How does this work? [Diagram: Advertiser, Ad Network, Publisher, Browser, Ad Server; content and ads are delivered to the browser] • Can retrieve “image” that is part of ad

“Zero-click attacks” • Clients vulnerable • Malware can attack browser implementation errors • Browser-resident malware can use intended functionality to carry out malicious attacks • Easy to place • $30 in advertisements reach 50,000 browsers • Brian Krebs on Computer Security, “Hackers Exploit Adobe Reader Flaw”: “Security Fix has learned that … security hole in … Adobe Reader … is actively being exploited to break into Microsoft Windows computers. According to information released Friday by iDefense, … Web site administrators … spotted hackers taking advantage of the flaw on Jan. 20, 2008, when tainted banner ads were identified that served specially crafted Acrobat PDF files designed to exploit the hole and install malicious software.” • Ad serves PDF file that installs Zonebac, modifies search engine results

Problems with advertisements • Ad network, publisher have incentives to show ads • Could place ads in iframe • Rules out more profitable floating ads, etc. • Ad network and publisher can try to screen ads • Yahoo! AdSafe • Google Caja • Some limitations in current web • Ads may contain links to “images” that are part of ad • Important to remember • This is a very effective way to reach victims: $30-50 per 1000 • User does not have to click on anything to run malicious code

Sandbox A safe place for kids to play without hurting each other or anyone else

Possible approach • Goal • Write a static analyzer to check untrusted JavaScript and determine whether it is malicious • Solvable? • Very difficult because of functions that can convert strings to code and vice versa, e.g., eval • More likely to have a solution • Find a well-defined and meaningful subset of JavaScript for which this is solvable • Prohibit problematic functions like eval

Some JavaScript examples • Use of this inside functions • Implicit conversions
var b = 10;
var f = function(){
  var b = 5;
  function g(){ var b = 8; return this.b; };
  return g();
}
var result = f(); // has as value 10 in non-strict global code: g is called as a plain function, so this is the global object

var y = "a";
var x = {toString : function(){ return y; }};
x = x + 10; // "a10": + implicitly calls x.toString()

Sometimes tricky • Which declaration of g is used? • String computation of property names • for (p in o){....}, eval(...), o[s] allow strings to be used as code and vice versa
var f = function(){
  var a = g();
  function g() { return 1; };
  function g() { return 2; };
  var g = function() { return 3; };
  return a;
}
var result = f(); // has as value 2: both declarations are hoisted, the later one wins, and the var assignment has not yet run

var m = "toS"; var n = "tring";
Object.prototype[m + n] = function(){ return undefined };

Facebook FBJS • Subset of JavaScript for Facebook applications • Application code is fetched from the publisher's (untrusted) server and embedded as a subtree of the page. • Not placed in an iframe. • Application code is statically checked to see if it is valid FBJS • FBJS code is rewritten and certain run-time checks are added

FBJS restrictions • Security Goal • Restrict access: Document Object Model (DOM), global object • Prevent clashes with other applications • Method 1: Filtering • Forbid eval, with • Disallow explicit access (via the dot notation o.p) to the properties valueOf, __parent__, constructor • Method 2: Rewriting • Add an application-specific prefix to all top-level identifiers • Example: o.p is renamed to a1234_o.p • Separates the effective namespace of an application from others
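The renaming in Method 2 can be illustrated with a deliberately naive sketch. It is not Facebook's rewriter: prefixIdentifiers is a hypothetical function, the caller supplies the list of top-level identifiers, and a regular expression stands in for the parser a real rewriter would need (strings, comments and keywords are ignored here).

```javascript
// Naive sketch: prefix the listed identifiers wherever they start a name
// that is not a property access (not preceded by "." or an identifier character).
function prefixIdentifiers(source, topLevelNames, prefix) {
  const names = new Set(topLevelNames);
  return source.replace(/(?<![.\w$])[A-Za-z_$][\w$]*/g, (id) =>
    names.has(id) ? prefix + id : id
  );
}

// The identifiers o and f are renamed; the property names p and q are not.
const rewritten = prefixIdentifiers("o.p = f(o.q);", ["o", "f"], "a1234_");
// rewritten is "a1234_o.p = a1234_f(a1234_o.q);"
```

Because only identifiers are renamed, two applications that both declare o end up with distinct names, while property names stay shared and must be controlled by filtering instead.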

More about FBJS08 • Some details of rewriting: • this is rewritten to ref(this) • ref is a function defined by the host (Facebook) in the global object • ref(x) = x if x ≠ window, else ref(x) = null • Prevents application code from accessing the global object • o[p] gets rewritten to o[idx(p)] • idx returns an error if p is a blacklisted property, such as "__x__" • Facebook also provides libraries • Accessible within the application namespace, they allow applications to safely access certain parts of the global object
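A minimal sketch of the two run-time helpers described above, under stated assumptions: globalThis stands in for window so the code also runs outside a browser, the blacklist contains only the "__x__" example from the slide, and property names are treated as strings. Facebook's actual definitions are not shown in the slides and may differ.

```javascript
// Assumed blacklist; the slide names only "__x__" as an example.
const BLACKLIST = new Set(["__x__"]);

// ref(x) = x unless x is the global object, in which case null.
function ref(x) {
  return x === globalThis ? null : x;
}

// idx(p) passes a property name through unless it is blacklisted.
function idx(p) {
  const name = String(p); // convert once, then check and use the same string
  if (BLACKLIST.has(name)) {
    throw new Error("forbidden property: " + name);
  }
  return name;
}

// What rewritten application code would look like: o[p] becomes o[idx(p)].
const o = { a: 1 };
const value = o[idx("a")];
```

With these definitions, ref(globalThis) yields null while ref(o) yields o unchanged, and idx("__x__") throws instead of returning a usable key.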

Problem with FBJS08 • Attack: • Get a handle to the global object in the application code • Almost works • var getthis = function() {return this;}; • Except that • this gets rewritten to ref(this) and the code returns null • But we can redefine ref itself • ref is defined in the global object, and application code is not allowed to have a handle to the global object • But the application can define a ref in a local scope and defeat FBJS08: try {throw (function() {return this;});} catch (f) {curr_scp = f();}
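Why a nearer ref defeats the check can be sketched as a plain name-lookup effect. This is a model, not the exploit: FBJS renaming would not let an application declare ref directly, which is why the attack instead writes ref as a property of a scope object. Here an ordinary local variable plays that role, and globalThis stands in for window.

```javascript
// Host-defined ref: hides the global object from application code.
function ref(x) {
  return x === globalThis ? null : x;
}

// Stand-in for rewritten application code, where `this` became ref(this)
// and `this` happens to be the global object.
function rewrittenApp() {
  return ref(globalThis); // name lookup finds the host's ref
}

// The same rewritten expression with a binding named ref nearer in scope.
function attackedApp() {
  var ref = function (x) { return x; }; // attacker's identity function
  return ref(globalThis); // name lookup finds the local ref first
}
```

rewrittenApp() returns null, while attackedApp() returns the global object, because the rewritten code refers to ref by name and the nearest binding wins.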

Exploit code (now fixed!)
<a href="#" onclick="b()">Test B (Safari, Opera and Chrome)</a>
<script>
function b(){
  try {throw (function(){return this});}
  catch (get_scope){get_scope().ref=function(x){return x}; this.alert("Hacked!")}
}
</script>
<a href="#" onclick="a()">Test A (Firefox and Safari)</a>
<script>
var get_win = function get_scope(x){
  if (x==0) {return this}
  else {get_scope(0).ref=function(x){return x}; return get_win(0)}
};
function a(){get_win(1).alert("Hacked!")}
</script>

Attack 1
try {throw (function(){return this});}
catch (get_scope){get_scope().ref=function(x){return x};}
• ECMA-262 (3rd edition) semantics for try{...} catch(f){...} say that whenever an exception is thrown: • A new object o is created with property f pointing to the exception object • o is placed on top of the scope chain (o does not have activation-object status) • The this of a function not defined in an activation object is the object containing it. In the code above, this for get_scope resolves to o. • Shadow the original ref by redefining it in o.

Attack 2
var get_window = function f(x){ if (x===0) {return this} else {return f(x-1)} };
• ECMA-262 says that whenever a named recursive function f is created, the internal scope chain (f_scp) of the function (the environment pointer of the closure) is set to the current lexical scope with a dummy object (o_f) placed on top • When the function f is called, the current scope chain is replaced with f_scp and an activation object for f is placed on top of it • Every recursive call to f will resolve to property f of the dummy object o_f (which is not an activation object) • Accessing this inside f will resolve to o_f • Shadow the original ref by redefining it in o_f

What is possible? • Filtering principle • Subset of JavaScript: if a program accesses property p, either p appears textually in the program, or p is from a list of “implicit” properties • Isolation principle 1 • Subset of JavaScript: semantics-preserving capture-avoiding renaming of identifiers (except names of predefined properties) • Isolation principle 2 • Subset of JavaScript: no program can access any scope object • Isolation principle 3 • Given lists of forbidden properties PnoW and PnoRW, a program cannot write properties in PnoW and cannot read or write properties in PnoRW • Rewriting principles • Achieve some forms of isolation by restricting semantics

Isolation of property names (Jt) • Goal • All property names that get accessed must appear textually in the code • If the program does not contain • eval, Function, o[..] etc which convert string to code • Then any property accessed is either in code or • an implicit property access: toString, toNumber, valueOf, length, prototype, constructor, message, arguments, Object, Array • Application • If we want to prevent access to certain properties, restrict to this sublanguage Jt and inspect code
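The need to exclude o[..] can be seen in a small sketch: a purely textual check finds no mention of toString, yet the code still reaches that property through a computed name. The helper mentionsName and the variable names are hypothetical; the Function constructor is used only to run the sample body as code.

```javascript
// Hypothetical textual filter: does the source mention the property name?
function mentionsName(source, name) {
  return source.includes(name);
}

// Code outside the subset: the property name is assembled at run time.
const body = 'var m = "toS"; var n = "tring"; return o[m + n];';
const flagged = mentionsName(body, "toString"); // false: the name never appears

// Running the body shows that toString is accessed anyway.
const run = new Function("o", body);
const accessed = run({});
```

Here accessed is the inherited Object.prototype.toString, so inspecting the program text is sound only once constructs such as o[..], eval and Function are ruled out.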

Isolating scope objects (Js) • How can code in subset Jt access scope objects? • Identifier this • Object.prototype.valueOf, Array.prototype.sort/concat/reverse can implicitly access this • Define subset Js of Jt • Prohibit this, valueOf, sort, concat and reverse • Properties of Js • Programs cannot access scope objects • Can rename variables; variable names can never be accessed (explicitly) as properties • Exception: variables with the same name as native properties cannot be renamed

Example: ADsafe • Security Goal • Restrict access: Document Object Model (DOM), global object • Method 1: Filtering • Forbid eval, with, ... • Method 2: Require special program idioms • Access property p of object o by calling ADSAFE.get(o, p)
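The idiom of routing every property access through a checking function can be sketched as follows. This is not the ADsafe source: guardedGet is a hypothetical stand-in for ADSAFE.get, and the banned list and the underscore rule are assumptions chosen for illustration.

```javascript
// Assumed list of names the untrusted code may never read.
const BANNED = new Set(["constructor", "prototype", "__proto__", "eval"]);

// Hypothetical stand-in for ADSAFE.get(o, p).
function guardedGet(o, p) {
  if (typeof p !== "string") {
    throw new TypeError("property name must be a string");
  }
  // Assumption: names starting with "_" are treated as internal.
  if (BANNED.has(p) || p.startsWith("_")) {
    throw new Error("access denied: " + p);
  }
  return o[p];
}

const point = { x: 3, _secret: 42 };
const xValue = guardedGet(point, "x"); // allowed
```

Because the check runs at the time of access, it also covers names computed from strings, which a purely static filter cannot see.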

Subtlety: • ADsafe restriction: "All interaction with the trusted code must happen only using the methods in the ADSafe object." • This may not be possible!
// Somewhere in trusted code
Object.prototype.toString = function() { ... };
...
// Untrusted code
var o = {};
o = o + " "; // converts o to String, implicitly calling the trusted toString
• Bottom line: need to restrict definitions that occur in trusted code
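The implicit call can be observed directly. The sketch below temporarily replaces Object.prototype.toString with a counting function (the replacement body is an assumption, since the slide elides it), runs the untrusted snippet, and restores the original afterwards.

```javascript
// Returns how often the "trusted" toString ran and what the untrusted code got.
function demoImplicitCall() {
  const original = Object.prototype.toString;
  let calls = 0;
  let result;
  try {
    // Somewhere in trusted code (body is illustrative).
    Object.prototype.toString = function () {
      calls += 1;
      return "from trusted code";
    };
    // Untrusted code: no ADSAFE method is called here.
    const o = {};
    result = o + " "; // string conversion invokes the trusted toString
  } finally {
    Object.prototype.toString = original; // always restore the built-in
  }
  return { calls, result };
}

const outcome = demoImplicitCall();
```

The untrusted code never names toString, yet outcome.calls is 1: trusted code was entered through an ordinary operator rather than through the ADSAFE object.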

Possible approach • Analyze the library of the host page • Compute a blacklist PnoRW of security-critical properties that could lead to security breach (How?) • Use subset Js + Filter for PnoRW

Conclusion • Modern sites incorporate third-party content • Advertisements • Applications • Third-party content must be isolated • Or expose everyone to easy malicious attacks • Two basic approaches • Use browser mechanism, such as iframes • Filter, rewrite, and restrict execution of untrusted content • Language-based sandboxing is tricky • Subtle problems with recent methods • Progress on reliable foundations is possible

Web Advertising • Deliver advertisements to viewers via Web • More effective and more profitable if user profile is known Source: U Texas iSchool student study, www.ischool.utexas.edu/~i385e/studentsPPT/fogle_IA&WebAdv.ppt

Web ad placement and type • Ad positions • Dark orange (strong), light yellow (weak) • Ads near rich content and navigation, and at the top-left do better • Ad types • Banner • Sidebar • Pop-ups, pop-unders • Floating • Unicast

Banner • HTML code loads a specific website • Varies in content and shape • Horizontal • 50 cents/ 1000

Sidebar • Skyscraper • Vertical • 2-3 times larger than banner • Harder to scroll it off page • $1.00 - $1.50/ 1000

Pop-ups • Open in their own window • Obscure the page you're viewing • User is forced to close or move them

Pop-unders • Open under the content you're viewing • Less intrusive than pop-ups • Both are more effective than banners • Banners: 2-5 clicks/1000 • Pop-ups: 30 clicks/1000 • Can cost 4-10 times more than a banner

Floating • Float or fly over page 5-30s • Obscure view; block mouse input • Gets attention: animation & sound • Powerful branding tool - hard to ignore • 30 clicks/1000 • $3 - $30/ 1000

Unicast • TV commercials that run in pop-up • 10-30s • Same branding power as TV commercial + being able to go to website • 50 clicks/1000 • $30/1000 From AOL.com

Web Publishing and Advertising [Diagram: Advertiser (attacker), Ad Network (intermediary), Publisher (intermediary), Browser (victim); content and ads are delivered to the browser]
