Attack of the Clones: Detecting Cloned Applications on Android Markets

Attack of the Clones: Detecting Cloned Applications on Android Markets. Jonathan Crussell (1, 2), Clint Gibler (1), and Hao Chen (1). (1) University of California, Davis; (2) Sandia National Labs. Source: ESORICS 2012.


Attack of the Clones: Detecting Cloned Applications on Android Markets

Jonathan Crussell¹,², Clint Gibler¹, and Hao Chen¹

¹ University of California, Davis

² Sandia National Labs

Source: ESORICS 2012

Outline
  • Introduction
  • Background
  • Threat Model
  • Clone Detection Approaches and Related Work
  • Methodology
  • Evaluation
  • Case Studies
  • Discussion
  • Conclusion
Introduction
  • Much of the user experience of Android relies on third-party apps.
  • Android has numerous marketplaces.
  • Protect users from malicious apps.
  • Protect developers from plagiarists.
Introduction
  • Developers can charge directly for their apps.
  • Offer free apps that are ad-supported or contain in-game billing.
  • Some apps have two versions.
  • Paid app -> cracked & released for free
  • Free app -> cloned & ad libraries changed
Background
  • Android Markets
  • Android Application Structure
Threat Model-Definition of “Clone”.
  • Clones occur when two applications have similar code but have different ownership.
  • Ignore third-party libraries.
  • Ignore multiple versions of the same application if they have the same ownership.
Resistance to Evasion Techniques.
  • High level modifications
  • Method Restructurings
  • Control Flow Alterations
  • Addition/Deletion
  • Reordering
Non Goals
  • Find cloning in native code.
  • Determine which applications are the victims and which are clones.
Clone Detection Approaches-Feature Based
  • Feature based approaches analyze a program and extract a set of features.
  • Features range from the number or size of classes, methods, loops, or variables to the included libraries.
  • Drawback: a low detection rate or a high false positive rate.
Clone Detection Approaches-Structure Based
  • Structure based systems convert programs into a stream of tokens and then compare the streams between two programs.
  • More robust than feature-based systems.
  • Examples: JPlag, Winnowing, and MOSS.
  • Comparing DEX bytecode streams could be a quick and scalable way to find exactly or near-exactly copied code.
  • But byte code streams contain no higher level semantic knowledge about the code.
Clone Detection Approaches-PDG Based
  • Program Dependence Graph (PDG): each node is a statement and each edge shows a dependency between statements.
  • There are two types of dependencies: data and control.
  • A data dependency edge between statements $s_1$ and $s_2$ exists if there is a variable in $s_2$ whose value depends on $s_1$.
  • A control dependency between two statements exists if the truth value of the first statement controls whether the second statement executes.
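As a rough illustration of data dependency edges, the sketch below builds them for straight-line code only. The statement encoding (a defined variable plus the variables it reads) and the function name are assumptions made for this example; they are not how WALA or DNADroid represent programs, and control dependencies are ignored.

```python
def data_dependency_edges(statements):
    """Return data dependency edges for straight-line code.

    statements: list of (defined_var_or_None, [used_vars]) in program order.
    An edge (i, j) means statement j reads a variable last written by i.
    Hypothetical encoding, for illustration only.
    """
    last_def = {}   # variable name -> index of the statement that last defined it
    edges = set()
    for idx, (defined, used) in enumerate(statements):
        for var in used:
            if var in last_def:
                edges.add((last_def[var], idx))
        if defined is not None:
            last_def[defined] = idx
    return edges


program = [
    ("a", []),           # 0: a = read()
    ("b", []),           # 1: b = read()
    ("c", ["a", "b"]),   # 2: c = a + b
    (None, ["c"]),       # 3: print(c)
]
edges = data_dependency_edges(program)
```

Swapping statements 0 and 1 changes the statement order but leaves the shape of the dependency graph unchanged, which is the intuition behind comparing graphs instead of token streams.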
Related Work
  • Androguard, DEXCD and DroidMOSS.
  • All these approaches are structure based or structure based approximations.
  • None of these tools use any semantic information to aid in detecting plagiarism.
Selecting Potentially Cloned Applications
  • The goal of an application plagiarist is to entice unwary users to choose her cloned application instead of the original.
  • To do so, she gives the clone a name and description similar to the original's.
Determining Application Similarity Based on Attributes
  • We use Solr to mimic the search engines on Android markets.
  • Attributes of the apps: name, package, market, owner, and description
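To make the idea of attribute-based similarity concrete, here is a small stand-in that scores two apps by token overlap (Jaccard similarity) of their name and description. This is not Solr and not the ranking Solr uses; the function names, the dictionary layout, and the sample apps are all hypothetical.

```python
import re


def tokens(text):
    """Lowercase alphanumeric tokens of a string."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def attribute_similarity(app_a, app_b, fields=("name", "description")):
    """Jaccard overlap of the tokens found in the chosen attributes.

    A simple stand-in for a search-engine relevance score (assumption).
    """
    a = set().union(*(tokens(app_a.get(f, "")) for f in fields))
    b = set().union(*(tokens(app_b.get(f, "")) for f in fields))
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical apps, not taken from the study.
original = {"name": "Sky Runner", "description": "Run across the sky"}
suspect = {"name": "Sky Runner Free", "description": "Run across the sky"}
score = attribute_similarity(original, suspect)
```

Pairs whose score exceeds some chosen threshold would then be passed on to the more expensive code comparison.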
Constructing PDGs
  • dex2jar: Convert both apps’ code from the DEX format to a JAR.
  • WALA: Construct PDGs for each method in every class of the applications.
  • Only data dependency edges: More robust against statement reordering, insertion and deletion.
Comparing PDGs-Excluding Common Libraries
  • Examples: the AdMob ad library, the Facebook API, etc.
  • We dumped both the package name and the SHA-1 hash of known library files and recorded the most frequent SHA-1 hashes for each library.
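One way such a library filter could look is sketched below: files are dropped when their SHA-1 hash appears in a set of known library hashes, or when their path falls under a known library package. The function names, the path-to-bytes input format, and the package prefix are assumptions for illustration, not DNADroid's actual code.

```python
import hashlib


def sha1_hex(data: bytes) -> str:
    """Hex-encoded SHA-1 digest of a file's contents."""
    return hashlib.sha1(data).hexdigest()


def strip_known_libraries(class_files, known_hashes, known_packages=()):
    """Drop files that look like known third-party library code.

    class_files: dict mapping a file path to its bytes (hypothetical format).
    known_hashes: set of SHA-1 hex digests recorded for known libraries.
    known_packages: path prefixes of known library packages.
    """
    kept = {}
    for path, data in class_files.items():
        if sha1_hex(data) in known_hashes:
            continue
        if any(path.startswith(prefix) for prefix in known_packages):
            continue
        kept[path] = data
    return kept


# Hypothetical contents; real inputs would be class files from the JAR.
files = {
    "com/example/game/Main.class": b"app code",
    "com/adlib/AdView.class": b"ad library code",
    "com/example/game/util/Copy.class": b"bundled library code",
}
known = {sha1_hex(b"bundled library code")}
app_only = strip_known_libraries(files, known, known_packages=("com/adlib/",))
```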
Lossless and Lossy Filters
  • Lossless filter: Removes PDGs from consideration that are smaller than a specified size (< 10 nodes).
  • Lossy filter: Calculate a frequency vector for each of the methods in the pair.
  • This vector counts how many times a specific node type occurs in the PDG.
  • Compare these two vectors using hypothesis testing (G-test).
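The two filters can be sketched as follows. The lossless filter is a plain size check; the lossy filter here computes a G statistic over the two node-type frequency vectors arranged as a 2×k contingency table. The exact test setup and cut-off used by DNADroid are not given on the slide, so the table layout, the function names, and the node-type labels are assumptions.

```python
import math
from collections import Counter


def passes_lossless_filter(node_types, min_nodes=10):
    """Keep only PDGs with at least min_nodes nodes."""
    return len(node_types) >= min_nodes


def frequency_vector(node_types):
    """Count how many times each node type occurs in a PDG."""
    return Counter(node_types)


def g_statistic(freq_a, freq_b):
    """G statistic for a 2 x k table whose rows are the two vectors.

    G = 2 * sum(O * ln(O / E)) with E = row_total * column_total / total.
    A value near 0 means the two type distributions look alike.
    """
    row_a, row_b = sum(freq_a.values()), sum(freq_b.values())
    total = row_a + row_b
    g = 0.0
    for node_type in set(freq_a) | set(freq_b):
        column = freq_a.get(node_type, 0) + freq_b.get(node_type, 0)
        for observed, row in ((freq_a.get(node_type, 0), row_a),
                              (freq_b.get(node_type, 0), row_b)):
            if observed > 0:  # a zero cell contributes nothing
                expected = row * column / total
                g += observed * math.log(observed / expected)
    return 2.0 * g


# Hypothetical node-type labels.
same_a = frequency_vector(["add"] * 5 + ["call"] * 5)
same_b = frequency_vector(["add"] * 5 + ["call"] * 5)
only_add = frequency_vector(["add"] * 10)
only_call = frequency_vector(["call"] * 10)
```

A pair would be discarded when its G statistic exceeds a chosen critical value; because this filter is lossy, a strict cut-off may also discard true clones.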
Subgraph Isomorphism
  • Find a mapping between the nodes of one PDG and the nodes of the other.
  • Subgraph isomorphism is NP-complete.
  • We use the VF2 algorithm.
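To show what such a mapping looks like, the sketch below uses a naive backtracking search, not VF2: it tries to map every node of a small graph to a distinct node of the same type in a larger graph so that every edge of the small graph is preserved. VF2 explores the same kind of search space with much stronger pruning. The graph encoding and names are assumptions for illustration.

```python
def find_subgraph_mapping(small_types, small_edges, large_types, large_edges):
    """Naive backtracking search for an edge-preserving node mapping.

    *_types: dict node -> node type; *_edges: set of directed (src, dst) pairs.
    Returns a dict small_node -> large_node, or None if no mapping exists.
    """
    nodes = list(small_types)
    mapping, used = {}, set()

    def consistent(node, candidate):
        # Every small edge touching already-mapped nodes must exist in large.
        for src, dst in small_edges:
            if src == node and dst == node:
                if (candidate, candidate) not in large_edges:
                    return False
            elif src == node and dst in mapping:
                if (candidate, mapping[dst]) not in large_edges:
                    return False
            elif dst == node and src in mapping:
                if (mapping[src], candidate) not in large_edges:
                    return False
        return True

    def extend(index):
        if index == len(nodes):
            return True
        node = nodes[index]
        for candidate, node_type in large_types.items():
            if candidate in used or node_type != small_types[node]:
                continue
            if consistent(node, candidate):
                mapping[node] = candidate
                used.add(candidate)
                if extend(index + 1):
                    return True
                del mapping[node]
                used.discard(candidate)
        return False

    return dict(mapping) if extend(0) else None


# Hypothetical PDG fragments with made-up node types.
small_t = {"a": "const", "b": "add"}
large_t = {"x": "const", "y": "add", "z": "call"}
large_e = {("x", "y"), ("y", "z")}
match = find_subgraph_mapping(small_t, {("a", "b")}, large_t, large_e)
no_match = find_subgraph_mapping(small_t, {("b", "a")}, large_t, large_e)
```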
Computing Similarity Scores
  • For each method $f$ (excluding the methods in known libraries) in application $A$, let $|f|$ be the number of nodes in this method’s PDG. Find the best match of this PDG in $B$’s PDGs and denote it as $m(f)$.
  • Similarity score: $\mathrm{sim}_A(B) = \dfrac{\sum_{f \in A} |m(f)|}{\sum_{f \in A} |f|}$
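A minimal sketch of how such a score could be computed, assuming it is the fraction of PDG nodes in application A's non-library methods that are covered by their best matches in B. The input format (method name to node count) and the function name are hypothetical.

```python
def similarity_score(method_sizes, best_match_sizes):
    """Fraction of A's PDG nodes covered by best matches in B.

    method_sizes: dict method -> number of nodes in its PDG (application A).
    best_match_sizes: dict method -> number of nodes in its best match in B;
    methods with no match are simply absent.
    """
    total = sum(method_sizes.values())
    if total == 0:
        return 0.0
    matched = sum(
        min(best_match_sizes.get(method, 0), size)  # a match cannot exceed the method
        for method, size in method_sizes.items()
    )
    return matched / total


# Hypothetical numbers: two of three methods have (partial) matches.
sizes_a = {"f": 20, "g": 30, "h": 50}
matches_in_b = {"f": 20, "g": 15}
score_a_in_b = similarity_score(sizes_a, matches_in_b)
```

Under this assumption the score is not symmetric: computing it from B's side uses B's method sizes as the denominator, so a pair of apps can yield two different percentages.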
Evaluation
  • 75,000 free apps from 13 Android markets.
  • Randomly selected 9,400 pairs from the potential clones.
  • Hadoop: parallelize DNADroid.
  • HDFS: share data across a small cluster.
  • The average throughput of DNADroid on this small cluster is 0.71 application pairs per minute.
“Benign” Cloning
  • DNADroid found 30 pairs in which both applications have a 100% similarity score.
  • Translation.
Changes to Advertising Libraries
  • We can see when an application has most likely been cloned for monetary gain.
  • Ex: XWind Downloader
  • For the 141 apps, we found that 91 (65%) of these pairs had different libraries, all of which included changes to advertising libraries.
Malware Added to an Application
  • “HippoSMS” is a malicious application that requires 10 permissions.
  • It shares the same package name as a Chinese video player, which requires 11 permissions.
  • HippoSMS requests 6 permissions that the video player doesn’t use.
Two Variants of the Same Malware
  • Two malicious apps that are identified by VirusTotal as being variants of the “BaseBridge” malware family.
  • Both applications have been stripped of meaningful class and method names.
  • DNADroid found coverages of 35% and 28% between the two variants.
Use of Freeware Cracking Tool in the Wild
  • AntiLVL decompiles an app with baksmali, inserts a new file (SmaliHook.class), and hides its modifications from the app itself by returning the original file size, MD5, and signatures.
  • It targets the Android License Verification Library (LVL), Amazon Appstore DRM, and Verizon DRM.
  • 189 of 310 applications contained SmaliHook.class.
  • 235 of 310 contained references to AntiLVL in their signature files.
  • Although only 8% of our total apps were acquired from Chinese markets, 88% of the apps including AntiLVL traces were from Chinese markets.
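A scan for these two kinds of traces could be sketched as below: treat the package as a ZIP archive, look for an entry whose file name starts with "SmaliHook", and look for the string "AntiLVL" inside files under META-INF/. Where exactly the traces sit inside a given package is an assumption here, as are the function name and the sample archive.

```python
import io
import zipfile


def antilvl_traces(package_bytes):
    """Report which AntiLVL traces appear in a ZIP-format package.

    Assumes (for illustration) that the hook shows up as an archive entry
    named SmaliHook* and that signature files live under META-INF/.
    """
    found = {"smalihook": False, "signature_reference": False}
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        for name in package.namelist():
            if name.rsplit("/", 1)[-1].startswith("SmaliHook"):
                found["smalihook"] = True
            if name.startswith("META-INF/"):
                if b"antilvl" in package.read(name).lower():
                    found["signature_reference"] = True
    return found


# Build a tiny hypothetical archive in memory to exercise the scan.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("hypothetical/SmaliHook.class", b"hook")
    archive.writestr("META-INF/MANIFEST.MF", b"Created-By: AntiLVL\n")
traces = antilvl_traces(buffer.getvalue())
```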
False Positive
  • Since it is a serious allegation to claim an application is a clone, we design DNADroid to have a very low false positive rate.
False Negative
  • Candidate selection assumes that cloned applications have attributes similar to the original, so clones with dissimilar attributes may be missed.
  • There exist advanced program transformations that can evade PDG-based clone detection.
Comparison to Other Approaches
  • Androguard: missed 18%.
  • DEXCD had problems running on the pairs DNADroid identified.
  • DroidMOSS is not currently publicly available.
Performance
  • DNADroid is more expensive but results in fewer false positives and false negatives.
Conclusion
  • DNADroid is a tool for finding clones on a large scale.
  • We evaluated DNADroid on applications crawled from 13 Android markets.
  • It identified at least 141 apps that have been cloned and an additional 310 apps that were cracked with AntiLVL.
  • We describe five case studies.
  • DNADroid has a very low false positive rate.
  • DNADroid is an effective tool.