
Crawling Chinese Android Markets


Presentation Transcript


  1. Crawling Chinese Android Markets. Xiang Pan, Biyan Zhou, George Liu

  2. Overview
     • Background & Purpose
     • Goals
     • Work Division
     • Proposed Algorithm & Examples
     • Problem & Solution
     • Precaution Measures
     • Results & Future Improvements

  3. NOT IN CHINA

  4. It’s FREEEEE BUT DANGEROUS

  5. Background & Purpose
     • Existing malicious activities
       • Example: NickiBot (spyware)
       • Runs in the background indefinitely and is difficult to detect
       • Can record phone calls, monitor phone logs and SMS, detect location, and send the information to a remote server
     • Purpose of our project
       • Collect a sizable number of Android applications from less popular Chinese markets for analysis

  6. Goals
     • Create a robust crawler that can be tailored to different markets with minimal effort
     • Analyze at least 5 markets to collect suspicious applications
     • Examine the precaution measures of these markets

  7. Work Division

  8. Proposed Algorithm
     • Manually inspect each market for its overall data structure
       • Metadata HTML
       • Download URL (redirection via JScript)
     • Select an appropriate unique application attribute (ID, name, etc.)
     • Parse the metadata correctly using regular expressions
     • Store the metadata and the application in a user-specified location
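
The parse-and-store steps of the proposed algorithm can be sketched roughly as follows. This is a minimal illustration, not the crawler from the presentation: the HTML attributes, the regular expressions, and the names `PATTERNS`, `parse_metadata`, and `store` are all assumptions made for the example. In practice each market would need its own patterns, found by manual inspection.

```python
import json
import re
from pathlib import Path

# Hypothetical patterns for one imaginary market; a real market needs its
# own set, derived from manually inspecting its pages.
PATTERNS = {
    "app_id": re.compile(r'data-appid="(\d+)"'),
    "name": re.compile(r'<h1 class="app-name">(.*?)</h1>', re.S),
    "version": re.compile(r'<span class="version">(.*?)</span>', re.S),
    # Download link assigned in a script-based redirect.
    "download_url": re.compile(r'window\.location\.href\s*=\s*"([^"]+\.apk)"'),
}


def parse_metadata(html: str) -> dict:
    """Extract each field with its regex; fields that do not match are None."""
    meta = {}
    for field, pattern in PATTERNS.items():
        match = pattern.search(html)
        meta[field] = match.group(1).strip() if match else None
    return meta


def store(meta: dict, apk_bytes: bytes, out_dir: str) -> Path:
    """Write metadata and the application under out_dir/<app_id>/."""
    if not meta.get("app_id"):
        raise ValueError("metadata has no unique application ID")
    app_dir = Path(out_dir) / meta["app_id"]
    app_dir.mkdir(parents=True, exist_ok=True)
    (app_dir / "meta.json").write_text(
        json.dumps(meta, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    (app_dir / "app.apk").write_bytes(apk_bytes)
    return app_dir
```

Keying the output directory on the unique application attribute keeps repeated crawls from producing duplicate copies of the same application.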

  9. Example #1

  10. Example #2

  11. Problem & Solution
     • Problem: different HTML structures for application metadata within the same market
       • Solution: capture only one set of data (the most frequently used structure)
     • Problem: slow download speed
       • Solution: use multithreaded downloading, splitting a single application into multiple parts
     • Problem: a wrong application ID terminates the download run
       • Solution: use a try/catch structure to handle the case where a specified file doesn't exist
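
The two download-related fixes above might look roughly like the sketch below. It assumes the server honors HTTP `Range` requests and that the file size is already known (for example from a `Content-Length` header). The function names and the injectable `fetch` parameter are hypothetical, chosen for illustration rather than taken from the presentation.

```python
import urllib.error
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def split_ranges(size: int, parts: int) -> list[tuple[int, int]]:
    """Split `size` bytes into up to `parts` contiguous inclusive ranges."""
    if size <= 0:
        return []
    parts = max(1, min(parts, size))
    step, extra = divmod(size, parts)
    ranges, start = [], 0
    for i in range(parts):
        length = step + (1 if i < extra else 0)
        ranges.append((start, start + length - 1))
        start += length
    return ranges


def fetch_range(url: str, start: int, end: int) -> bytes:
    """Fetch one byte range; assumes the server supports Range requests."""
    req = urllib.request.Request(url, headers={"Range": f"bytes={start}-{end}"})
    with urllib.request.urlopen(req, timeout=30) as resp:
        return resp.read()


def download_in_parts(url: str, size: int, parts: int = 4, fetch=fetch_range) -> bytes:
    """Download the ranges concurrently and join them in order."""
    ranges = split_ranges(size, parts)
    with ThreadPoolExecutor(max_workers=max(1, len(ranges))) as pool:
        # pool.map yields results in the order of `ranges`.
        chunks = pool.map(lambda r: fetch(url, r[0], r[1]), ranges)
        return b"".join(chunks)


def crawl_ids(app_ids, download_one):
    """Try every ID; a missing application is recorded instead of aborting."""
    results, failed = {}, []
    for app_id in app_ids:
        try:
            results[app_id] = download_one(app_id)
        except (urllib.error.URLError, FileNotFoundError):
            failed.append(app_id)
    return results, failed
```

Collecting the failed IDs instead of stopping lets a crawl continue across gaps in a market's ID space, and the list can be retried or inspected afterwards.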

  12. Precaution Measures

  13. Results & Future Improvements
     • Results
       • Created a robust and easy-to-use crawler
       • Collected over 70 GB (~30,000) of suspicious applications
       • Examined 10 different markets for precaution measures
     • Future improvements
       • Create a simple GUI to improve usability
       • Automatic authentication
       • Circumvent the markets' caps on daily traffic from a given IP
       • Maintain a database for these applications

  14. Q&A
