ChatCoder: Toward the Tracking and Categorization of Internet Predators


April Kontostathis

Lynne Edwards

Amanda Leatherman

Ursinus College

Where are we coming from?
  • Spring/Summer 2008
    • Amanda Leatherman, Ursinus class of 2009, approaches Lynne Edwards, Associate Professor of Media and Communication Studies, about a new project.
Summer 2009
  • Amanda and Lynne research related work
  • Olson, L. N., Daggs, J. L., Ellevold, B. L., & Rogers, T. K. (2007). The communication of deviance: Toward a theory of child sexual predators' luring communication. Communication Theory, 17, 231-251.
  • Lynne and Amanda channel this project in two directions
    • Modify the theory for the online environment
    • Operationalize the theory
Original LCT Model (Olson et al.)
  • Gaining Access
    • Characteristics of the perpetrator
    • Characteristics of the victim
    • Strategic placement
  • Deceptive Trust Development
  • Grooming
    • Communicative desensitization
    • Reframing
  • Isolation
  • Approach
Process
  • Read many transcripts from Perverted-justice.com
    • … not an appealing job
Meanwhile …
  • I am planning a Fall 2008 Software Engineering course – looking for projects to assign to students
  • Lynne asks if my students can build a system to find phrases in the perverted-justice transcripts
  • … a collaboration is born!
Where are we now?

Revised LCT Model

  • Gaining Access
    • Strategic Placement
  • Deceptive Trust Development
    • Activities
    • Compliments
    • Personal Information Exchange
    • Relationship Exchange
  • Grooming
    • Communicative Desensitization
    • Reframing
  • Isolation
  • Approach
Categorization Experiments
  • First Experiment
    • Class: {Predator, Victim}
      • 32 instances, 16 in each class (talking to each other)
    • Eight numeric attributes: the count of tagged phrases in each category
      • Activities
      • Personal Information
      • Compliments
      • Relationship
      • Reframing
      • Desensitization
      • Isolation
      • Approach
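
The eight attributes can be pictured as one count vector per speaker. The sketch below is illustrative only: the phrase lists, the `CATEGORY_PHRASES` dictionary, and the `count_attributes` helper are hypothetical placeholders, not ChatCoder's actual dictionary, and case-insensitive substring matching is an assumed tagging rule.

```python
from collections import OrderedDict

# Category names come from the slide. The phrases are hypothetical
# placeholders, not ChatCoder's real phrase dictionary.
CATEGORY_PHRASES = OrderedDict([
    ("Activities", ["what do you like to do"]),
    ("Personal Information", ["how old are you", "where do you live"]),
    ("Compliments", ["you seem nice"]),
    ("Relationship", ["do you have a boyfriend"]),
    ("Reframing", []),        # left empty in this sketch
    ("Desensitization", []),  # left empty in this sketch
    ("Isolation", ["are you home alone"]),
    ("Approach", ["can we meet"]),
])


def count_attributes(lines):
    """Return one count per category for a single speaker's chat lines."""
    counts = OrderedDict((name, 0) for name in CATEGORY_PHRASES)
    for line in lines:
        text = line.lower()
        for name, phrases in CATEGORY_PHRASES.items():
            # Assumed matching rule: case-insensitive substring count.
            counts[name] += sum(text.count(p) for p in phrases)
    return counts


speaker_lines = ["Hi, how old are you?", "Where do you live? Can we meet?"]
vector = list(count_attributes(speaker_lines).values())
print(vector)  # [0, 2, 0, 0, 0, 0, 0, 1]
```

Each of the 32 instances in the first experiment would then be one such eight-element vector plus its class label.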
Results
  • Classifier: C4.5 (J48 in Weka)
  • 3-fold cross validation
  • Success Rate: 59%
    • baseline 50%
  • Confusion matrix
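
For context on the 50% baseline and the 3-fold protocol, here is a minimal sketch. It assumes the baseline is the majority-class rate and that the folds are stratified by class; the slides do not state the exact evaluation settings, and `majority_baseline` and `stratified_folds` are hypothetical helper names.

```python
from collections import Counter


def majority_baseline(labels):
    """Accuracy obtained by always predicting the most frequent class."""
    return Counter(labels).most_common(1)[0][1] / len(labels)


def stratified_folds(labels, k=3):
    """Deal instance indices into k folds, round-robin within each class."""
    folds = [[] for _ in range(k)]
    seen = Counter()
    for index, label in enumerate(labels):
        folds[seen[label] % k].append(index)
        seen[label] += 1
    return folds


# First experiment: 32 instances, 16 per class.
labels = ["Predator", "Victim"] * 16
folds = stratified_folds(labels)
print(majority_baseline(labels))   # 0.5
print([len(f) for f in folds])     # [12, 10, 10]
```

In each round, one fold is held out for testing and the classifier is trained on the other two, so every instance is tested exactly once.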
Decision Tree

DesensitizationCount <= 35
|   RelationshipCount <= 0
|   |   ActivitiesCount <= 1
|   |   |   IsolationCount <= 5: Predator (5.0/1.0)
|   |   |   IsolationCount > 5: Victim (4.0)
|   |   ActivitiesCount > 1: Predator (2.0)
|   RelationshipCount > 0: Victim (10.0)
DesensitizationCount > 35: Predator (11.0/1.0)
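
The tree above can be read as nested threshold tests. In Weka's J48 output, a leaf annotated `(5.0/1.0)` means five training instances reached that leaf and one of them was misclassified; the leaves here account for all 32 instances. The function below is a direct transcription of the printed tree; the `classify` name and the dictionary-based input are assumptions made for illustration.

```python
def classify(counts):
    """Apply the decision tree from the slide to a dict of phrase counts."""
    if counts["DesensitizationCount"] > 35:
        return "Predator"
    if counts["RelationshipCount"] > 0:
        return "Victim"
    if counts["ActivitiesCount"] > 1:
        return "Predator"
    # Last split: few isolation phrases -> Predator, many -> Victim.
    return "Predator" if counts["IsolationCount"] <= 5 else "Victim"


# Hypothetical count vector, not taken from the study's data.
example = {
    "DesensitizationCount": 12,
    "RelationshipCount": 0,
    "ActivitiesCount": 1,
    "IsolationCount": 7,
}
print(classify(example))  # Victim
```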

Categorization Experiments
  • Second Experiment
    • Class: {PJ, Non-PJ} (PJ = Perverted-Justice)
      • 31 instances, 14 PJ Transcripts, 15 Non-PJ
      • Non-PJ obtained from Dr. Susan Gauch – collected during her ChatTrack project
      • In the PJ transcripts, both the victim's and the predator's lines were coded
    • Same eight attributes
Results
  • Classifier: C4.5 (J48 in Weka)
  • 3-fold cross validation
  • Success Rate: 93%
    • baseline 48%
  • Confusion matrix
Clustering Experiments
  • All 288 PJ Transcripts
  • K Means Clustering
  • Same eight attributes
    • column normalized
  • Four Clusters found
    • minimum intra-cluster variation
    • multiple runs to avoid local minima
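
The clustering procedure above can be sketched in plain Python. Several choices here are assumptions, since the slide does not specify them: "column normalized" is taken to mean min-max scaling of each attribute, "intra-cluster variation" is taken to mean the within-cluster sum of squared distances, and "multiple runs" is taken to mean random restarts. The function names and the toy data are hypothetical.

```python
import random


def normalize_columns(rows):
    """Min-max scale each column to [0, 1] (assumed normalization)."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [(max(c) - min(c)) or 1.0 for c in cols]
    return [[(v - lo) / s for v, lo, s in zip(row, lows, spans)]
            for row in rows]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(rows, k, rng, iterations=100):
    """One k-means run; returns (within-cluster sum of squares, labels)."""
    centers = rng.sample(rows, k)
    labels = None
    for _ in range(iterations):
        new_labels = [min(range(k), key=lambda c: sq_dist(r, centers[c]))
                      for r in rows]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:  # keep the old center if a cluster empties
                centers[c] = [sum(col) / len(members)
                              for col in zip(*members)]
    wcss = sum(sq_dist(r, centers[lab]) for r, lab in zip(rows, labels))
    return wcss, labels


def best_of(rows, k, restarts=20, seed=0):
    """Repeat k-means and keep the run with the lowest variation."""
    rng = random.Random(seed)
    return min((kmeans(rows, k, rng) for _ in range(restarts)),
               key=lambda run: run[0])


# Toy data (not from the study): two obvious groups, k=2 for brevity.
toy = [[0, 1], [1, 0], [1, 1], [9, 10], [10, 9], [10, 10]]
wcss, labels = best_of(normalize_columns(toy), k=2)
print(labels)
```

For the experiment described on the slide, `rows` would hold the 288 eight-attribute vectors and `k` would be 4.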
Labeling the Clusters
  • 60 Transcripts Analyzed Closely
  • Age Deception Data Categorized
    • Four distinct ways that deception can be achieved when communicating with others
        • Quantity
        • Quality
        • Relation
        • Manner

McCornack, S.A., Levine, T.R., Solowczuk, K.A., Torres, H.I., & Campbell, D.M. (1992). When the alteration of information is viewed as deception: An empirical test of information manipulation theory. Communication Monographs, 59, 17-29.

  • Age data captured for all 288 transcripts
Type of Deception
  • Quantity manipulation findings
      • Honest predators' average real age was 31 yrs old
      • Deceptive predators' average real age was 38 yrs old
  • Quality manipulation findings
      • Average age given by deceptive predators was 27 yrs old
  • Relation and Manner manipulation findings
      • Rarely used by online sexual predators
Synergistic Activities
  • Content Analysis for the Web 2.0
    • Misbehavior Detection Task
  • Pendar, Nick (2007). "Toward Spotting the Pedophile: Telling victim from predator in text chats." In Proceedings of the First IEEE International Conference on Semantic Computing: 235-241. Irvine, California.
    • Study for the Termination of Online Predators (STOP)
  • Hughes, D., P. Rayson, J. Walkerdine, K. Lee, P. Greenwood, A. Rashid, C. May-Chahal, and M. Brennan. 2008. Supporting Law Enforcement in Digital Communities through Natural Language Analysis. In Proceedings of the 2nd International Workshop on Computational Forensics (IWCF’08). Washington D.C., USA, August 2008.
    • Isis – Protecting Children in Online Social Networks
Where are we going?
  • Data remains a big problem
    • PJ data is problematic
    • Access to large chat or “chat-like” collections is hard to get
  • Labeling is a bigger problem
    • Finding predatory chat is a “needle in haystack” problem
  • Applications are nice, but applications need to be grounded in text mining and communicative theory research.
Acknowledgements
  • Amanda Leatherman
  • Lynne Edwards
  • Kristina Moore
  • Brian D. Davison and students at Lehigh Univ.
  • Ursinus College
    • Media and Communication Studies faculty and students
    • Mathematics and Computer Science faculty and students
  • Text Mining Workshop organizers and reviewers
Contact Information

April Kontostathis

Ursinus College

akontostathis@ursinus.edu

http://webpages.ursinus.edu/akontostathis

610-409-3000 x2650