
Investigating Privacy Complaints

Jennifer Felder (1), Jennifer King (2), Nick Doty (2), Prof. Deirdre Mulligan (2). (1) North Carolina State University, (2) University of California Berkeley School of Information.



  1. Investigating Privacy Complaints
Jennifer Felder (1), Jennifer King (2), Nick Doty (2), Prof. Deirdre Mulligan (2)
(1) North Carolina State University, (2) University of California Berkeley School of Information

• Visualization Analysis

• Conclusions and Next Steps
Both the quantitative analysis and the visualization analysis show which keywords retrieve the largest numbers of questions from Yahoo! Answers. The more qualitative Many Eyes visualization shows not only the most common words in the questions, but also how the searched keyword relates to the other words in the analyzed text.
The next steps for this research include additional natural language processing and further visualizations, like those provided on the Many Eyes web site. This research also contributes to the preliminary data collection stage of a larger project at the School of Information at UC Berkeley. The final goal of that project is to produce a taxonomy of privacy terms.

• Acknowledgments
I would like to thank the team with which I worked to produce the command-line tool discussed in this research: Christopher Castillo, German Gomez, Rafael Negron, and Anand Sonkar. In addition, I would like to thank my graduate student mentors, Nick Doty, MS and Jen King, and my faculty mentor, Professor Deirdre Mulligan. Finally, I would like to thank Dr. Kristen Gates, TRUST (The Team for Research in Ubiquitous Secure Technology), the NSF and UC Berkeley for the opportunity to conduct this research.

• Introduction
Recent advances in technology have increased the quantity of information available in the public domain, which raises concerns about individuals' right to privacy. Our team is interested in understanding the public's concerns about information privacy in general.
To study this issue, we looked for publicly available data. After exploring several sources, we chose Yahoo! Answers as an initial source of privacy complaint data because it offered a useful, free API and a vast amount of publicly available data, so collecting it avoided the violations of personal privacy that other sources could raise. To collect this data, we wrote a Python script, run as a command-line tool, that queries Yahoo! Answers for specified keywords and stores selected attributes of the returned questions in a MySQL database. My focus within the team was on adding command-line flags, including additional parameters in the Yahoo! Answers URL, and creating a cronjob to run the script automatically.

• Methods
The flowchart below illustrates the design of the overall script. The process consists of connecting to Yahoo! Answers, querying it for a specified keyword, and storing the results in a database. My focus is highlighted in purple.

• Process Overview

• Script Refinement
Refinements of the script, which increased its flexibility, its autonomy and the quantity of data collected:
  • Command-line flags
  • URL parameters: start, sort
  • Cronjob
  • While loop (illustrated below)

• While Loop Flowchart
The flowchart above illustrates the while loop refinement: Yahoo! Answers is queried and the 'start' parameter is incremented until Yahoo! returns an error message.

• Results
Running the script automatically every two hours for three days added over seven thousand questions to the database.

• Quantitative Analysis
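
As a rough illustration of the collection script described under Methods and Script Refinement, the sketch below combines command-line flags, the 'start' and 'sort' parameters, the while loop, and database storage. It is not the team's actual code. Several things are assumptions made for the example: the names `parse_args`, `collect`, `store` and `fake_fetch`, the page size of 50, the `date_desc` sort value, and the table layout are all hypothetical; the query function is a stand-in that returns canned data instead of calling Yahoo! Answers; and SQLite is swapped in for MySQL so the example runs without a database server.

```python
import argparse
import sqlite3

PAGE_SIZE = 50  # assumed number of questions per request (not from the poster)


def parse_args(argv=None):
    """Command-line flags; the flag names here are hypothetical."""
    parser = argparse.ArgumentParser(description="Collect questions for a keyword.")
    parser.add_argument("--keyword", required=True, help="search term to query for")
    parser.add_argument("--sort", default="date_desc", help="value of the 'sort' URL parameter")
    parser.add_argument("--db", default=":memory:", help="SQLite database path")
    return parser.parse_args(argv)


def collect(keyword, fetch_page, sort="date_desc"):
    """Query repeatedly, incrementing 'start', until the service stops returning results."""
    questions = []
    start = 0
    while True:
        page = fetch_page(keyword, start, sort)
        if not page:  # None stands in for the service's error message
            break
        questions.extend(page)
        start += PAGE_SIZE
    return questions


def store(conn, keyword, questions):
    """Store selected attributes; returns the total number of rows in the table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS questions "
        "(id TEXT PRIMARY KEY, keyword TEXT, subject TEXT)"
    )
    # INSERT OR IGNORE keeps repeated runs from duplicating questions.
    conn.executemany(
        "INSERT OR IGNORE INTO questions VALUES (?, ?, ?)",
        [(q["id"], keyword, q["subject"]) for q in questions],
    )
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM questions").fetchone()[0]


def fake_fetch(keyword, start, sort):
    """Stand-in for the real HTTP query: serves 120 canned questions, then None."""
    total = 120
    if start >= total:
        return None
    stop = min(start + PAGE_SIZE, total)
    return [{"id": f"q{i}", "subject": f"{keyword} question {i}"} for i in range(start, stop)]


# Usage: flags are passed explicitly here instead of being read from sys.argv.
args = parse_args(["--keyword", "privacy"])
conn = sqlite3.connect(args.db)
found = collect(args.keyword, fake_fetch, args.sort)
print(store(conn, args.keyword, found))
```

Passing the query function into `collect` keeps the paging loop testable without network access. For the two-hour schedule mentioned under Results, a crontab entry of the form `0 */2 * * * python collect.py --keyword privacy` would run such a script at the top of every second hour; the script name and flag are again hypothetical.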
