User Authentication Using Keystroke Dynamics Jeff Hieb & Kunal Pharas ECE 614 Spring 2005 University of Louisville
Three types of authentication • Something you know. • A password • Something you have. • An ID card or badge • Something you are. • Biometrics
Biometrics • Biometrics measure physical or behavioral characteristics of an individual. • Physical (do not change over time): • Fingerprint, iris pattern, hand geometry • Behavioral (may change over time): • Signature, speech pattern, keystroke pattern
Keystroke biometrics • Keystroke dynamics is based on the assumption that each person has a unique keystroke rhythm. • Keystroke features are: • Latency between keystrokes. • Duration of key presses. • 4 possible authentication outcomes: • (i) Genuine individual is accepted. • (ii) Genuine individual is rejected. • (iii) Imposter is accepted. • (iv) Imposter is rejected. • Biometric classification accuracy measures • FRR – false rejection rate, the fraction of genuine attempts that are rejected (outcome ii) • FAR – false acceptance rate, the fraction of imposter attempts that are accepted (outcome iii) • EER – equal error rate, the error rate at the operating point where FRR = FAR
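As a minimal sketch of how the two error rates follow from the four outcome counts, the Python function below divides false rejections by all genuine attempts and false acceptances by all imposter attempts. The function name and the counts in the usage lines are hypothetical and are not taken from the project's data.

```python
def error_rates(genuine_accepted, genuine_rejected,
                imposter_accepted, imposter_rejected):
    """Return (FRR, FAR) as fractions computed from the four outcome counts."""
    genuine_total = genuine_accepted + genuine_rejected
    imposter_total = imposter_accepted + imposter_rejected
    # FRR: share of genuine attempts that were wrongly rejected.
    frr = genuine_rejected / genuine_total if genuine_total else 0.0
    # FAR: share of imposter attempts that were wrongly accepted.
    far = imposter_accepted / imposter_total if imposter_total else 0.0
    return frr, far


# Hypothetical counts, for illustration only.
frr, far = error_rates(genuine_accepted=18, genuine_rejected=2,
                       imposter_accepted=5, imposter_rejected=15)
print(f"FRR = {frr:.0%}, FAR = {far:.0%}")
```

The EER is then found by varying the decision threshold until the two rates coincide.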
Methods for classifying keystroke rhythms • Statistical / probabilistic approaches • Data Mining Techniques • Neural Networks • EBP networks • CPNN (based on SOM) • ART2 networks (unsupervised learning) • LVQ networks • RBFN
Project Description • Authenticate users based on the keystroke times captured while typing their name. • Use EBP to train a neural network to generate a user identification that can be compared to a known user identification. • Result of the system will be either authentication failed or authentication successful.
Implementation • Capturing keystrokes: GUI in C# • Requirements • Near microsecond accuracy (HiPerfTimer) • Enrollment times and labels • Authentication using captured times. • Remote call to Matlab to process times. • Processing data: Matlab • Subroutines needed • Error back propagation • Evaluate a vector of authentication times using trained network • Normalization of training times • Normalization of authentication times
Capturing Training Times • Time the interval between successive key_up and key_down events (keystroke latency). • A maximum of 50 time intervals can be captured and stored. • Unused elements are set to 0. • The user must type the name correctly or the trial is thrown out. • Training times are stored in a text file. • Additional training times are appended to this file. • An enrollment consists of 7 successful (correct name typed) captures. • After enrollment the neural network is retrained.
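The capture rules above can be sketched in Python as follows. This is an illustration under assumptions, not the project's C# code: the function names are hypothetical, event timestamps are assumed to arrive as an ordered list of milliseconds, and each row is assumed to be written as space-separated integers on one line.

```python
MAX_INTERVALS = 50  # from the slide: at most 50 intervals are stored


def capture_to_row(event_times_ms, typed_text, expected_name):
    """Turn one capture into a fixed-length row, or None if the name was mistyped."""
    if typed_text != expected_name:
        return None  # mistyped name: the trial is thrown out
    # Interval between each pair of successive key events.
    intervals = [b - a for a, b in zip(event_times_ms, event_times_ms[1:])]
    intervals = intervals[:MAX_INTERVALS]
    # Unused elements are set to 0.
    return intervals + [0] * (MAX_INTERVALS - len(intervals))


def append_row(path, row):
    """Append one capture to the training times file (assumed format)."""
    with open(path, "a") as f:
        f.write(" ".join(str(v) for v in row) + "\n")


# Hypothetical event times for a two-character name.
row = capture_to_row([0, 150, 181, 233], "ab", "ab")
print(row[:5])
```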
Labeling training times • Each user is represented by a binary string • Example: • User Jeff Hieb: 1 0 0 • User Kunal Pharas: 0 1 0 • User Suman: 0 0 1 • Training labels are stored in a text file: • Each line in the file is the user label for the same line in the training file. • Additional training labels are appended to this file. • When a new user enrolls, a 0 is appended to all existing user labels in the file.
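The label update on enrollment can be illustrated with a short Python sketch. The function name is hypothetical, and labels are held in memory as lists of 0/1 integers rather than read from the label file.

```python
def enroll_new_user(existing_labels, captures=7):
    """Widen every existing label by one 0, then add labels for the new user.

    existing_labels: list of label rows, one per training capture.
    captures: number of successful captures in one enrollment (7 on the slide).
    """
    widened = [label + [0] for label in existing_labels]
    n_users = len(widened[0]) if widened else 1
    # The new user gets a 1 in the last position and 0 elsewhere.
    new_label = [0] * (n_users - 1) + [1]
    return widened + [list(new_label) for _ in range(captures)]


# Two enrolled users (one row each, for brevity), then a third enrolls.
labels = enroll_new_user([[1, 0], [0, 1]], captures=1)
print(labels)
```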
Training Data Files
• Sample of training times file:
. . .
150 31 52 43 125 9 83 14 90 86 69 261 50 213 129 41 166 80 65 253 68 27 67 5 77 10 62 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
165 83 31 195 105 6 78 11 155 1 61 220 70 192 140 52 93 129 57 272 70 24 69 7 86 5 67 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
190 62 52 115 92 21 73 13 111 32 72 223 77 152 129 52 114 131 56 275 69 39 64 1 82 9 74 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
173 62 42 103 105 31 41 38 97 51 63 235 56 187 125 51 125 109 57 269 73 16 67 13 81 1 61 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
199 62 21 126 103 10 53 30 93 170 59 175 63 145 135 41 114 130 56 293 70 21 61 14 80 1 63 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
208 62 52 117 112 1 82 6 98 208 62 168 81 168 123 53 103 163 66 348 77 33 61 10 83 1 71 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
162 73 62 111 97 20 52 36 109 36 78 216 64 155 136 52 125 126 71 308 76 30 63 4 79 10 62 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
. . .
• Sample of training labels file:
. . . 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 . . . 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 . . .
Training the Neural Network • The GUI calls the Matlab function EBP(filename), where filename denotes the training times and training labels. • EBP normalizes the data and stores the normalization parameters in a file. • The number of output neurons is determined by the training labels (e.g. 5 users give 5 output neurons). • The output layer uses a unipolar activation function. • Trained weights are stored in a file.
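The training step can be sketched in pure Python as below. This is not the project's Matlab EBP routine. Several choices are assumptions: min-max scaling stands in for the unspecified normalization, a single hidden layer with logistic (unipolar) units is used, c is read as the learning constant, and e_max as the cumulative squared-error target for one pass over the training set. All function names and the sample data are hypothetical.

```python
import math
import random


def sigmoid(x):
    """Unipolar (logistic) activation."""
    return 1.0 / (1.0 + math.exp(-x))


def fit_minmax(rows):
    """Per-column minimum and maximum: the stored normalization parameters."""
    cols = list(zip(*rows))
    return [min(c) for c in cols], [max(c) for c in cols]


def apply_minmax(row, lo, hi):
    """Scale one row to [0, 1] on the training range; constant columns map to 0."""
    return [(v - l) / (h - l) if h > l else 0.0 for v, l, h in zip(row, lo, hi)]


def forward(x, w_hidden, w_out):
    """Return (hidden activations, output activations); last weight is the bias."""
    h = [sigmoid(sum(w * v for w, v in zip(ws, x + [1.0]))) for ws in w_hidden]
    o = [sigmoid(sum(w * v for w, v in zip(ws, h + [1.0]))) for ws in w_out]
    return h, o


def train_ebp(inputs, labels, n_hidden=4, c=0.2, e_max=0.0005,
              max_epochs=5000, seed=0):
    """Online error back propagation; stops when the epoch error drops below e_max."""
    rng = random.Random(seed)
    n_in, n_out = len(inputs[0]), len(labels[0])
    w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                for _ in range(n_hidden)]
    w_out = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
             for _ in range(n_out)]
    for _ in range(max_epochs):
        error = 0.0
        for x, d in zip(inputs, labels):
            h, o = forward(x, w_hidden, w_out)
            error += 0.5 * sum((dk - ok) ** 2 for dk, ok in zip(d, o))
            # Error signals, computed before any weight is changed.
            delta_o = [(dk - ok) * ok * (1 - ok) for dk, ok in zip(d, o)]
            delta_h = [hj * (1 - hj) * sum(delta_o[k] * w_out[k][j]
                                           for k in range(n_out))
                       for j, hj in enumerate(h)]
            for k in range(n_out):
                for j, v in enumerate(h + [1.0]):
                    w_out[k][j] += c * delta_o[k] * v
            for j in range(n_hidden):
                for i, v in enumerate(x + [1.0]):
                    w_hidden[j][i] += c * delta_h[j] * v
        if error < e_max:
            break
    return w_hidden, w_out


# Hypothetical data: two users, two captures each, three intervals per capture.
rows = [[150, 31, 52], [165, 83, 31], [60, 200, 90], [70, 210, 80]]
labels = [[1, 0], [1, 0], [0, 1], [0, 1]]
lo, hi = fit_minmax(rows)
xs = [apply_minmax(r, lo, hi) for r in rows]
w_hidden, w_out = train_ebp(xs, labels, n_hidden=3, max_epochs=500)
print(forward(xs[0], w_hidden, w_out)[1])
```

The number of output units is taken from the label width, so enrolling another user only changes the label rows passed in.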
Authentication • Capture keystrokes using the same procedure as before. • If the user mistypes the name, authentication fails, but the user is informed why and the trial is discarded. • The GUI calls the Matlab function evaluate(filename), where filename is a file containing the captured times. • Evaluate normalizes the data using the parameters stored during training. • Evaluate then uses the stored weights to produce the network output, which is returned. • The GUI maps the network output to a string of 0s and 1s. • If f(net) is greater than alpha (e.g. 0.95), the value is 1; otherwise the value is 0. • This string is then compared to the desired user string. • If there is a match, authentication is successful; otherwise authentication fails.
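The final decision step, thresholding the network output at alpha and comparing with the claimed user's label, might look like the following Python sketch. The function names and the output values are hypothetical.

```python
def to_bit_string(outputs, alpha=0.95):
    """Map each network output to 1 if it exceeds alpha, otherwise 0."""
    return [1 if o > alpha else 0 for o in outputs]


def authenticate(outputs, claimed_label, alpha=0.95):
    """Succeed only if the thresholded outputs equal the claimed user's label."""
    return to_bit_string(outputs, alpha) == list(claimed_label)


# Hypothetical network outputs for a three-user system.
print(authenticate([0.97, 0.02, 0.10], [1, 0, 0]))              # strong match
print(authenticate([0.80, 0.02, 0.10], [1, 0, 0]))              # below alpha
print(authenticate([0.80, 0.02, 0.10], [1, 0, 0], alpha=0.75))  # looser alpha
```

Because the whole string must match, an output vector with two values above alpha is rejected just like one with none.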
Testing and Results • Enrolled 7 users (49 training pairs). • Each user had at least 3 authentication attempts (total of 45 authentication trials). • 42 imposter trials. • The majority of imposter authentication attempts were made by us. • Many authentication trials are for one user.
Effect of hidden layers on accuracy • Alpha = 0.95 • C = 0.2 • Emax = 0.0005
Effect of training error on accuracy • Alpha = 0.95 • C = 0.2 • Hidden Neurons = 24
Overall Classifier Accuracy • Max error = 0.0005 • C = 0.2 • Hidden Neurons = 24 • Best performance: Alpha = 0.75, FRR = 7%, FAR = 30%
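One way to search for the best alpha is to recompute FRR and FAR at several thresholds over recorded trials. The Python sketch below assumes each trial is stored as a pair of network outputs and the label of the user being claimed. The function names and all trial values are hypothetical, not the project's results.

```python
def accepts(outputs, claimed_label, alpha):
    """Threshold the outputs at alpha and compare with the claimed label."""
    return [1 if o > alpha else 0 for o in outputs] == list(claimed_label)


def sweep_alpha(genuine_trials, imposter_trials, alphas):
    """Return a list of (alpha, FRR, FAR) tuples, one per threshold."""
    results = []
    for alpha in alphas:
        rejected = sum(not accepts(o, lab, alpha) for o, lab in genuine_trials)
        accepted = sum(accepts(o, lab, alpha) for o, lab in imposter_trials)
        results.append((alpha,
                        rejected / len(genuine_trials),
                        accepted / len(imposter_trials)))
    return results


# Hypothetical trials: (network outputs, label of the claimed user).
genuine = [([0.97, 0.05], [1, 0]), ([0.80, 0.10], [1, 0])]
imposter = [([0.78, 0.02], [1, 0]), ([0.30, 0.10], [1, 0])]
for alpha, frr, far in sweep_alpha(genuine, imposter, [0.75, 0.95]):
    print(f"alpha = {alpha}: FRR = {frr:.0%}, FAR = {far:.0%}")
```

Lowering alpha trades false rejections for false acceptances, which is the trade-off the reported operating point reflects.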
Conclusions • For users with short names (fewer than 8 characters) or long latencies (not proficient typists), circumvention was high. • Creating an interface that is acceptable and easy to use for a wide variety of users is not trivial. • Not allowing for typographical errors is irritating to users and may affect acceptance. • Imposter training samples are not required.
Future Research Directions • Ways of handling typographical errors. • Ways to scale keystroke biometrics to large numbers of users. • Explore other methods of evaluations, particularly unsupervised learning. • Explore extraction of more sophisticated keystroke features.