This study explores automatic gender identification using cell phone calling behavior data. Evaluating popular approaches, the research aims to understand the association between phone usage and gender and to propose efficient classification methods. The dataset consists of 2 million calls from 10,000 phone numbers, including encrypted caller and callee numbers, call timestamps, durations, and locations. Various behavioral, social, and mobility variables are analyzed to create ranked distribution charts contrasting behavior with gender. Classification techniques such as SVM, Random Forests, and a semi-supervised approach combining K-means, labeling, and KNN are investigated.
Automatic Gender Identification using Cell Phone Calling Behavior Presented by David
Motivation • Existing gender classification • Based on voice • Based on image • Violate user privacy • Purpose of this work • Understanding how phone usage is associated with gender • Evaluating common approaches to gender identification
Call Data Records • Dataset • 2 million calls • 10,000 phone numbers • Features • Encrypted cell phone numbers of caller and callee • Date and time of the call • Duration of the call • Initial and final location of the caller during the call
Variables • Behavioral Variables • Number of Calls • Average Duration of Calls • Expenses • Social Variables • In Degree • Out Degree • Degree • Mobility Variables • Talk Distance • Route Distance
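As an illustration of how the behavioral and social variables above might be derived from call records, here is a minimal Python sketch. The record layout `(caller, callee, duration)`, the function name `call_variables`, and the reading of degree as the number of distinct contacts are assumptions, not definitions taken from the slides. Expenses and the mobility variables are left out because the slides do not define them.

```python
from collections import defaultdict

# Synthetic records: (caller, callee, duration_in_seconds).
# The field layout is an assumption; real numbers would be encrypted IDs.
calls = [
    ("A", "B", 60),
    ("A", "C", 120),
    ("B", "A", 30),
    ("C", "A", 90),
    ("A", "B", 180),
]

def call_variables(records):
    """Aggregate per-number behavioral and social variables."""
    out_durations = defaultdict(list)  # number -> durations of calls it made
    in_durations = defaultdict(list)   # number -> durations of calls it received
    out_contacts = defaultdict(set)    # number -> distinct numbers it called
    in_contacts = defaultdict(set)     # number -> distinct numbers that called it
    for caller, callee, duration in records:
        out_durations[caller].append(duration)
        in_durations[callee].append(duration)
        out_contacts[caller].add(callee)
        in_contacts[callee].add(caller)

    def mean(xs):
        return sum(xs) / len(xs) if xs else 0.0

    features = {}
    for number in set(out_durations) | set(in_durations):
        outs = out_durations.get(number, [])
        ins = in_durations.get(number, [])
        callees = out_contacts.get(number, set())
        callers = in_contacts.get(number, set())
        features[number] = {
            "n_outgoing": len(outs),
            "n_incoming": len(ins),
            "avg_out_duration": mean(outs),
            "avg_in_duration": mean(ins),
            "out_degree": len(callees),
            "in_degree": len(callers),
            # Assumed definition: distinct contacts in either direction.
            "degree": len(callees | callers),
        }
    return features

features = call_variables(calls)
```

Each entry of `features` is a per-number dictionary that could serve as one row of a feature matrix for the classifiers discussed later.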
Behavior vs. Gender • Ranked distribution charts
Behavior vs. Gender (Cont.) • Ranked distribution charts • Selected features • Number of incoming calls • Number of outgoing calls • Average duration of incoming calls • Average duration of outgoing calls • Expenses • Degree
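The slides do not say how the ranked distribution charts were built. One plausible reading, sketched below in Python, is to sort each group's values from largest to smallest and pair each value with its normalized rank, so that groups of different sizes can be drawn on the same axes. The function name `ranked_distribution`, the group names, and the numbers are hypothetical.

```python
def ranked_distribution(values):
    """Return (normalized_rank, value) pairs, largest value first."""
    ordered = sorted(values, reverse=True)
    n = len(ordered)
    return [((i + 1) / n, v) for i, v in enumerate(ordered)]

# Synthetic outgoing-call counts per subscriber, grouped by a known label.
calls_by_group = {
    "group_a": [12, 40, 7, 25],
    "group_b": [30, 5, 18],
}

# One curve per group; each curve is a list of points ready for plotting.
curves = {g: ranked_distribution(v) for g, v in calls_by_group.items()}
```

Plotting each curve with rank on the x-axis and value on the y-axis would give one line per group for a given variable.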
Gender Classification • SVM • Random Forests
Gender Classification (Cont.) • Semi-supervised (K-means + Labeling + KNN)
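The slide names the three stages of the semi-supervised approach but not how they connect. The Python sketch below shows one possible pipeline, under these assumptions: K-means clusters all points, each cluster takes the majority label of the few labeled points it contains, and KNN then classifies new points against the resulting pseudo-labeled set. The helper names, the toy 2-D features, and the labels are hypothetical, and `math.dist` requires Python 3.8 or later.

```python
import math
from collections import Counter

def nearest(p, centroids):
    """Index of the centroid closest to point p."""
    return min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))

def kmeans(points, centroids, iters=20):
    """Plain K-means from given initial centroids; returns cluster indices."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(iters):
        assign = [nearest(p, centroids) for p in points]
        updated = []
        for j, old in enumerate(centroids):
            members = [p for p, a in zip(points, assign) if a == j]
            # Keep the old centroid if a cluster happens to be empty.
            updated.append(
                tuple(sum(x) / len(members) for x in zip(*members)) if members else old
            )
        if updated == centroids:
            break
        centroids = updated
    return [nearest(p, centroids) for p in points]

def label_clusters(assign, known):
    """Give each cluster the majority label of its labeled members."""
    votes = {}
    for i, lab in known.items():
        votes.setdefault(assign[i], Counter())[lab] += 1
    return {c: v.most_common(1)[0][0] for c, v in votes.items()}

def knn_predict(x, train_points, train_labels, k=3):
    """Majority vote among the k nearest pseudo-labeled points."""
    ranked = sorted(zip(train_points, train_labels), key=lambda t: math.dist(x, t[0]))
    return Counter(lab for _, lab in ranked[:k]).most_common(1)[0][0]

# Toy 2-D feature vectors (synthetic); only two points carry a true label.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (1.1, 1.2),
          (5.0, 5.0), (5.2, 4.8), (4.9, 5.1), (5.1, 5.3)]
known = {0: "F", 4: "M"}  # point index -> label

assign = kmeans(points, centroids=[points[0], points[4]])
cluster_label = label_clusters(assign, known)

# Propagate cluster labels; points in unlabeled clusters are skipped.
train = [(p, cluster_label[a]) for p, a in zip(points, assign) if a in cluster_label]
train_points = [p for p, _ in train]
train_labels = [lab for _, lab in train]

pred_low = knn_predict((1.05, 0.95), train_points, train_labels)
pred_high = knn_predict((5.0, 5.2), train_points, train_labels)
```

The appeal of this design is that only a handful of labeled numbers are needed: clustering spreads those labels across the unlabeled data before KNN is applied.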