
Interactive deobfuscation



Presentation Transcript


  1. Interactive deobfuscation • A thrift shop for static deobfuscation

  2. whoami • Security researcher • Break things, reverse them, make them better and break them again • Part of the nullsec non-profit group

  3. How it all started (blame this person =>) • Presumably a simple crackme • Eventually turned out to be whitebox (wb) AES • I wanted to solve it statically, since running things is cheating • Goal was to solve it in less than a month • A race I didn’t manage to win when working statically

  4. Name is MD5-hashed • Serial is transformed / permuted using an unknown function

  5. Challenge archeology • Overall the crackme was divided into 2 main parts • Deobfuscation: opaque predicates, lookup tables, value tables and “spaghetti” code • Cryptanalysis: the original cipher was whitebox’ed

  6. Deobfuscation

  7. Deobfuscation – Layer 0 • Found some jmps, decided to map them all • find_lookuptables(“mov <register>, dword ptr [addr*4]”) • Add xrefs, define locs • IDA can’t map them all into graph views (due to size, more RAM == bigger graph) • After looking a bit there seems to be some logic and different operations inside them • However they all lead to the same path eventually
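
The pattern search on this slide can be sketched outside IDA as a plain regex pass over a text listing. This is only an illustration: the operand syntax, the `(address, text)` listing format and the sample addresses are assumptions, and the real find_lookuptables presumably worked on the IDA database instead.

```python
import re

# Hypothetical pattern for "mov <register>, dword ptr [<index>*4 + <table>]".
# The exact operand syntax depends on the disassembler; this form is an assumption.
TABLE_READ = re.compile(
    r"^mov\s+(?P<dst>e[a-z]{2}),\s*dword ptr\s*"
    r"\[(?P<idx>e[a-z]{2})\*4\s*\+\s*(?P<table>0x[0-9a-fA-F]+)\]$",
    re.IGNORECASE,
)

def find_lookuptables(listing):
    """Return (instruction address, table address) for every matching line.

    `listing` is an iterable of (address, disassembly text) pairs.
    """
    hits = []
    for addr, text in listing:
        match = TABLE_READ.match(text.strip())
        if match:
            hits.append((addr, int(match.group("table"), 16)))
    return hits

# Made-up sample listing, for illustration only.
sample = [
    (0x401000, "mov eax, dword ptr [ecx*4 + 0x405000]"),
    (0x401007, "add eax, 0x10"),
    (0x40100C, "jmp eax"),
    (0x40100E, "mov edx, dword ptr [ebx*4 + 0x406400]"),
]
table_hits = find_lookuptables(sample)
```

Each hit gives the place to add an xref and the table address to define, which is the "add xrefs, define locs" step.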

  8. Deobfuscation – Layer 1 • Removal of jmps and basic block identification • All the obfuscation was done in a manner that affects only the bb itself; after a jmp to another table occurred, everything was restored • Follow_jmps_by_addr(addr) to find bb boundaries • Follow jcc until a jmp / push + ret sequence is found • Compress it, remove jccs and make one BB • Where there are xrefs, patch them together
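
A toy version of the boundary search might look like the sketch below. The instruction model (a dict of address to mnemonic, operand and next address) is hypothetical, and so is the assumption that every jcc is always taken; the slide does not say how the branch direction was decided.

```python
def follow_jmps_by_addr(program, addr):
    """Collect the real instructions of one logical basic block.

    `program` maps address -> (mnemonic, operand, next_address).
    Toy assumptions: every jcc is opaque and always taken, and the block ends
    at the first `jmp`, or at a `push` immediately followed by `ret`.
    """
    body, seen = [], set()
    while addr in program and addr not in seen:
        seen.add(addr)
        mnemonic, operand, next_addr = program[addr]
        if mnemonic == "jmp":
            break                      # jump to the next table: block boundary
        if mnemonic == "push" and program.get(next_addr, ("",))[0] == "ret":
            break                      # push + ret is a disguised jmp
        if mnemonic.startswith("j"):   # jcc: follow the target, drop the branch
            addr = operand
            continue
        body.append((mnemonic, operand))
        addr = next_addr
    return body

# Made-up fragment: three real instructions scattered between jccs.
program = {
    0x10: ("mov", "eax, 1", 0x12),
    0x12: ("jz", 0x30, 0x14),
    0x14: ("nop", "", 0x15),           # dead fall-through, never reached
    0x30: ("add", "eax, 2", 0x32),
    0x32: ("jnz", 0x50, 0x34),
    0x50: ("xor", "eax, 7", 0x52),
    0x52: ("push", "0x405000", 0x57),
    0x57: ("ret", "", None),
}
block = follow_jmps_by_addr(program, 0x10)
```

The returned list is the compressed block: the jccs are gone and only the instructions that do work remain.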

  9. Deobfuscation – Layer 2 • Opaque predicates • Ops which are used to make the bb bigger • Simple rule – operations are per bb and do not exceed it • Wrote a simple emulator to emulate each bb and reduce it to simple instructions • 1 exception – do not touch lookup table values • More on this later
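
The emulate-and-simplify idea can be shown on a toy instruction set. Everything here is an assumption made for illustration (the tuple format, the three arithmetic ops, and `load` as the stand-in for a lookup-table read); it is not the emulator from the talk. Constants are folded per basic block, and table reads are passed through untouched.

```python
MASK = 0xFFFFFFFF
OPS = {
    "add": lambda a, b: (a + b) & MASK,
    "sub": lambda a, b: (a - b) & MASK,
    "xor": lambda a, b: a ^ b,
}

def simplify_bb(block):
    """Emulate a toy basic block and emit an equivalent, shorter one.

    Each instruction is (op, dst, src); `src` is an int or a register name.
    `load dst, idx` stands for a lookup-table read and is never folded.
    """
    known, out = {}, []            # known: register -> constant not yet emitted

    def flush(reg):
        # Write a pending constant back before something needs the register.
        if reg in known:
            out.append(("mov", reg, known.pop(reg)))

    for op, dst, src in block:
        value = src if isinstance(src, int) else known.get(src)
        if op == "load":           # table read: keep it exactly as it is
            flush(src)
            known.pop(dst, None)
            out.append((op, dst, src))
        elif op == "mov" and value is not None:
            known[dst] = value
        elif op in OPS and dst in known and value is not None:
            known[dst] = OPS[op](known[dst], value)
        else:                      # depends on an unknown value: emit it
            if op != "mov":
                flush(dst)
            known.pop(dst, None)
            out.append((op, dst, src if value is None else value))
    out.extend(("mov", reg, val) for reg, val in known.items())
    return out

bb = [
    ("mov", "eax", 0x1234),
    ("add", "eax", 0x1111),        # junk arithmetic
    ("xor", "eax", 0xFFFF),        # junk arithmetic
    ("sub", "eax", 0x10),          # junk arithmetic
    ("load", "ebx", "ecx"),        # ebx = table[ecx], left alone
    ("xor", "ebx", "eax"),
]
simplified = simplify_bb(bb)
```

Six instructions collapse to three: the table read, one xor with a constant, and a single mov that sets eax to its final value.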

  10. Deobfuscation – Layer 3 • Tables, and lots of them • Apart from the jmptables which lead the way, tables are used as part of the cipher itself • The key is dismantled inside them (more on this later) • Each table has a different role and some are doubled for obfuscation • FindTables to the rescue

  11. Deobfuscation – Layer 3 • FindTables basically taints memory and looks for reads of 16b tables • Once it finds one it defines an array of 0xFF at that addr • All value tables are mapped this way; their usage, however, varies

  12. Deobfuscation – Layer 4 • Once we have all the code cleaned we get several consecutive lookup tables • Loops are unrolled and become normal repetitive ops (per round and state) • All deobfuscated code was written into a new section called “deobf” to make code reading easier • It is now time to move on to the cryptanalysis stage

  13. Cryptanal

  14. Cryptanal • Automating every step is infeasible and far too time-consuming • I decided to split the work into two main stages: • Operation identification • Key extraction • Both are used interactively • Thus the name interactive deobfuscation

  15. Cryptanal archeology • Discovered the BGE attack from academia • Chow, Xiao • sysk’s phrack article • Eventually said FUCK YOU ALL gonna do it myself w/o cryptic math • Lack of algebra lessons and focus

  16. Cryptanal – Layer 0 • Actual wb code to encrypt a text • Loops 9 times, which made me quite frustrated before discovering it was wb’ed • After counting the loops by hand I thought it might be AES • But where’s the key? • LOLWTF? md5(user) == wbaes.dec(serial,user_as_key) • No, key must be *embedded* • LOLWUT? md5(user) == wbaes.d/enc(serial,key)?? • Output isn’t ascii so it could be either enc or dec

  17. Cryptanal – Rijndael on a toe • Several simple operations • AddRoundKey, SubBytes, ShiftRows, MixColumns • Some operations are linear and can be swapped with the op before them • The key to understanding the attack is to sniff the first round and extract the key • Later I found that Eloi had made my life harder

  18. rijndael

  19. whitebox(rijndael) => evolves into =>

  20. whitebox(rijndael) • 1st transformation: • ShiftRows is linear, and thus can swap position with AddRoundKey (with the round key shifted the same way) • SubBytes and ShiftRows can swap position too, as SubBytes applies the same op to every byte • Let “linear” (aka lin) mean • lin(x) ^ lin(y) == lin(x ^ y)
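
Both swaps can be checked numerically. In this sketch the state is a 16-byte list in column-major order, and the S-box is a random byte permutation standing in for the real one (an assumption that keeps the example short; the commutation holds for any byte-wise substitution).

```python
import random

def shift_rows(state):
    """AES ShiftRows on a 16-byte column-major state: row r rotates left by r."""
    return [state[4 * ((c + r) % 4) + r] for c in range(4) for r in range(4)]

def xor(a, b):
    """Byte-wise XOR of two states, which is what AddRoundKey does."""
    return [x ^ y for x, y in zip(a, b)]

rng = random.Random(7)
sbox = list(range(256))
rng.shuffle(sbox)                  # stand-in S-box, NOT the real AES one

def sub_bytes(state):
    return [sbox[b] for b in state]

state = [rng.randrange(256) for _ in range(16)]
key = [rng.randrange(256) for _ in range(16)]

# lin(x) ^ lin(y) == lin(x ^ y): AddRoundKey can move past ShiftRows
# as long as the round key goes through ShiftRows as well.
shift_is_linear = shift_rows(xor(state, key)) == xor(shift_rows(state), shift_rows(key))

# SubBytes treats every byte alike, so it commutes with the byte shuffle.
sub_commutes = shift_rows(sub_bytes(state)) == sub_bytes(shift_rows(state))
```

Both flags come out True for any state and key, which is what allows the ops to be reordered before they are merged into tables.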

  21. 2nd transformation • It is possible to transform and “compress” several ops into one • By using XOR tables and T/Tyi boxes • T/Tyi box • Combine AddRoundKey and SubBytes into one operation (lookup table) to emit 1 byte • SubBytes(x ^ k[i]) • XOR table • Transform MixColumns into a series of lookup tables; in particular, these tables are built by running one input byte at a time through the MixColumns vector, and the results are XORed together
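
A minimal sketch of the table construction, without any encodings and for a single column. The S-box is computed from its definition; the key bytes and the input column are arbitrary example values, and the function name make_tyi is hypothetical. ShiftRows, which decides which state bytes feed a column, is left out.

```python
def gf_mul(a, b):
    """Multiply in GF(2^8) modulo the AES polynomial x^8 + x^4 + x^3 + x + 1."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result

def build_sbox():
    """AES S-box: multiplicative inverse followed by the affine transform."""
    sbox = []
    for x in range(256):
        inv = 0 if x == 0 else next(y for y in range(1, 256) if gf_mul(x, y) == 1)
        s = inv
        for shift in range(1, 5):
            s ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
        sbox.append(s ^ 0x63)
    return sbox

SBOX = build_sbox()
MIX = [[2, 3, 1, 1], [1, 2, 3, 1], [1, 1, 2, 3], [3, 1, 1, 2]]  # MixColumns matrix

def mix_column(col):
    """Plain MixColumns on one 4-byte column."""
    out = []
    for row in MIX:
        acc = 0
        for coeff, byte in zip(row, col):
            acc ^= gf_mul(coeff, byte)
        out.append(acc)
    return out

def make_tyi(key_byte, j):
    """Table for input byte j: x -> MixColumns contribution of SBOX[x ^ key_byte]."""
    return [[gf_mul(MIX[r][j], SBOX[x ^ key_byte]) for r in range(4)]
            for x in range(256)]

key = [0x2B, 0x7E, 0x15, 0x16]     # arbitrary example key bytes
col = [0x32, 0x43, 0xF6, 0xA8]     # arbitrary example input column
tables = [make_tyi(key[j], j) for j in range(4)]

# Four table lookups XORed together replace AddRoundKey + SubBytes + MixColumns.
via_tables = [0, 0, 0, 0]
for j in range(4):
    via_tables = [a ^ b for a, b in zip(via_tables, tables[j][col[j]])]

direct = mix_column([SBOX[col[j] ^ key[j]] for j in range(4)])
```

The key bytes no longer appear anywhere as data; they only exist as the ordering of the table entries.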

  22. 3rd transformation • Apply external encodings to the keys and lookup tables • Replace table values with random ones at each stage • 41 => 32, 21 => 56, 12 => 4 • Let G & F be the encodings • G() o AES() o F() • Such that G & F cancel each other out eventually • The external encoding is what makes the whitebox variant “attack resistant”
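
How the encodings cancel can be seen on two stand-in tables. In this sketch t1 and t2 are random byte bijections playing the role of two consecutive cipher tables (an assumption, they are not real AES tables), f and h are the external input and output encodings, and g is the internal encoding between the two tables.

```python
import random

rng = random.Random(2024)

def random_bijection():
    """A random permutation of 0..255, used as an encoding or a stand-in table."""
    perm = list(range(256))
    rng.shuffle(perm)
    return perm

def invert(perm):
    inverse = [0] * 256
    for x, y in enumerate(perm):
        inverse[y] = x
    return inverse

t1, t2 = random_bijection(), random_bijection()   # stand-ins for cipher tables
f, g, h = random_bijection(), random_bijection(), random_bijection()
f_inv, g_inv, h_inv = invert(f), invert(g), invert(h)

# Each shipped table undoes the previous encoding on its input
# and applies a fresh one on its output.
enc_t1 = [g[t1[f_inv[x]]] for x in range(256)]
enc_t2 = [h[t2[g_inv[x]]] for x in range(256)]

# g and g_inv meet between the tables and vanish; only f and h are left outside.
cancels = all(
    h_inv[enc_t2[enc_t1[f[x]]]] == t2[t1[x]]
    for x in range(256)
)
```

An attacker looking at enc_t1 or enc_t2 alone sees scrambled values, yet the chain still computes t2 after t1 once the external f and h are known.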

  23. Attaq

  24. Attaq 101 • Chow stated that his implementation doesn’t leak any information • In reality the XOR tables and T/Tyi tables still leak one nibble each time • Not very helpful but still something • Since the external encodings cancel each other out, it might be worth understanding them • Hint hint

  25. Attaq! • If we look at the input encoding and the output encoding we know that they cancel each other out • Thus if we manage to find the values of the encodings we’d be left with just a “naked” implementation of wbaes • And then just sniff the first round and extract the key
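
Why a “naked” table gives up the key can be shown in a few lines. The sketch assumes a first-round T-box of the form T[x] = SubBytes(x ^ k) with every encoding already stripped; the secret byte 0x3C is made up for the example.

```python
def gf_mul(a, b):
    """Multiply in GF(2^8) modulo the AES polynomial x^8 + x^4 + x^3 + x + 1."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result

def build_sbox():
    """AES S-box: multiplicative inverse followed by the affine transform."""
    sbox = []
    for x in range(256):
        inv = 0 if x == 0 else next(y for y in range(1, 256) if gf_mul(x, y) == 1)
        s = inv
        for shift in range(1, 5):
            s ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
        sbox.append(s ^ 0x63)
    return sbox

SBOX = build_sbox()
INV_SBOX = [0] * 256
for value, substituted in enumerate(SBOX):
    INV_SBOX[substituted] = value

def recover_key_byte(tbox):
    """Return k if tbox[x] == SBOX[x ^ k] for every x, else None."""
    candidates = {x ^ INV_SBOX[tbox[x]] for x in range(256)}
    return candidates.pop() if len(candidates) == 1 else None

secret = 0x3C                                   # made-up key byte
naked_tbox = [SBOX[x ^ secret] for x in range(256)]
recovered = recover_key_byte(naked_tbox)
```

One lookup is enough in principle (k = INV_SBOX[T[0]]); checking all 256 entries only confirms that the table really is unencoded.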

  26. Cryptbox • Let’s try to look at MixColumns in the Tyi table transformations • In general it transforms 32b values to 32b values • Let P be the input encoding and Q the output encoding

  27. Now let’s try to approximate the encoding values • Billet suggests zeroing out two bits out of the 4, building up a new lookup table and performing the transformation • Once we have that, we construct a new lookup table for the reverse operation

  28. whitebox^whitebox • We get 256 possible bijections which can be used to build up output encoding approximations • The same operation is done to the input encoding using the approximation we acquired for Q • Once we have the external encoding values we can just sniff the first round and extract the keys

  29. FIN • @shiftreduce • shiftreduce@gmail.com • Thanks to Eloi for making this challenge • greetz @ #ecl,#nullsec,inbarr,nirizr,skier_,emdel,over, Mikae, l_inc,
