
Long Text Keystroke Biometrics Study

Gary Bartolacci, Mary Curtin, Marc Katzenberg, Ngozi Nwana, Sung-Hyuk Cha, Charles Tappert (Software Engineering Project Team + DPS Student)


Presentation Transcript


  1. Long Text Keystroke Biometrics Study • Gary Bartolacci, Mary Curtin, Marc Katzenberg, Ngozi Nwana, Sung-Hyuk Cha, Charles Tappert (Software Engineering Project Team + DPS Student)

  2. Keystroke Biometric • Biometrics important for security apps • Advantage - inexpensive and easy to implement, the only hardware needed is a keyboard • Disadvantage - behavioral rather than physiological biometric, easy to disguise • One of the least studied biometrics, thus good for dissertation studies

  3. Focus of Study • Previous studies mostly concerned with short character string input • Password hardening • Short name strings • We focus on large text input • 200 or more characters per sample

  4. Focus of Study (cont) • Applications of interest • Identification • 1-of-n classification problem • e.g., sender of inappropriate e-mail in a business environment with a limited number of employees • Verification • Binary classification problem, yes/no • e.g., student taking online exam

  5. Software Components • Raw Keystroke Data Capture over the Internet (Java applet) • Feature Extraction (SAS software) • Classification (SAS software) • Training • Testing

  6. Keystroke Data Capture (Java Applet) • Raw data recorded for each entry • Key’s character • Key’s code text equivalent • Key’s location on keyboard • 1 = standard, 2 = left, 3 = right • Time key was pressed (msec) • Time key was released (msec) • Number of left, right, double mouse clicks
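The raw record listed on this slide could be modeled roughly as follows. This is only a sketch in Python, not the applet's actual Java code; the class name `KeyEvent` and all field names are hypothetical, chosen to mirror the bullet items.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KeyEvent:
    """One captured keystroke (hypothetical field names)."""
    char: str        # key's character
    code_text: str   # key's code text equivalent, e.g. "Shift"
    location: int    # 1 = standard, 2 = left, 3 = right
    press_ms: int    # time key was pressed (msec)
    release_ms: int  # time key was released (msec)

    @property
    def duration_ms(self) -> int:
        # Key press duration: how long the key was held down.
        return self.release_ms - self.press_ms


# Example: a left Shift key held for 140 msec.
shift = KeyEvent(char="", code_text="Shift", location=2,
                 press_ms=1000, release_ms=1140)
```

Storing press and release times separately keeps both the hold duration and the gap between consecutive keys recoverable later.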

  7. Keystroke Data Capture (Java Applet)

  8. Aligned Raw Data File (Hello World!)

  9. Feature Extraction • 10 Mean and 10 Std of key press durations • 8 most frequent alphabet letters (e, a, r, i, o, t, n, s) • Space & shift keys • 10 Mean and 10 Std of key transitions • 8 most common digrams (in, th, ti, on, an, he, al, er) • Space-to-any-letter & any-letter-to-space • 18 Total number of keypresses for • Space, backspace, delete, insert, home, end, enter, ctrl, 4 arrow keys, shift (left), shift (right), total entry time, left, right, & double mouse clicks
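A minimal sketch of the duration and transition features, written in Python rather than the SAS used in the study. Several details are assumptions not stated on the slide: a transition is measured from the release of the first key to the press of the second, the population standard deviation is used, and a key or digram with no occurrences yields (0.0, 0.0). The function names are hypothetical. Events are plain `(char, press_ms, release_ms)` tuples.

```python
from collections import defaultdict
from statistics import mean, pstdev

LETTERS = ["e", "a", "r", "i", "o", "t", "n", "s"]
DIGRAMS = ["in", "th", "ti", "on", "an", "he", "al", "er"]


def mean_std(values):
    # Assumption: features with no observations default to (0.0, 0.0).
    if not values:
        return (0.0, 0.0)
    return (mean(values), pstdev(values))


def duration_features(events, keys=LETTERS):
    """Mean and std of key press durations for each key of interest."""
    by_key = defaultdict(list)
    for char, press_ms, release_ms in events:
        by_key[char.lower()].append(release_ms - press_ms)
    return {k: mean_std(by_key[k]) for k in keys}


def transition_features(events, digrams=DIGRAMS):
    """Mean and std of transition times for each digram of interest.

    Assumption: transition = next key's press time minus this key's
    release time (can be negative when keystrokes overlap).
    """
    by_digram = defaultdict(list)
    for (c1, _, rel1), (c2, press2, _) in zip(events, events[1:]):
        by_digram[(c1 + c2).lower()].append(press2 - rel1)
    return {d: mean_std(by_digram[d]) for d in digrams}


# Typing "theth": (char, press_ms, release_ms)
events = [("t", 0, 80), ("h", 120, 200), ("e", 250, 350),
          ("t", 400, 500), ("h", 560, 640)]
durations = duration_features(events)
transitions = transition_features(events)
```

The space, shift, space-to-letter, and letter-to-space features from the slide would follow the same pattern with different grouping keys.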

  10. Feature Extraction Preprocessing • Outlier removal • Remove samples more than 2 standard deviations from the mean • Prevents skewing of feature measurements caused by pausing of the keystroker • Standardization • x’ = (x - x_min) / (x_max - x_min) • Scales each feature to the range 0-1 to give roughly equal weight to each feature
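The two preprocessing steps can be sketched in Python as follows. The assumptions here are that outlier removal is a single pass over one list of timing values using the population standard deviation, and that a constant feature column scales to 0.0; the slide specifies neither. The function names are hypothetical.

```python
from statistics import mean, pstdev


def remove_outliers(values, max_std=2.0):
    """Drop values more than max_std standard deviations from the mean."""
    if len(values) < 2:
        return list(values)
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return list(values)
    return [v for v in values if abs(v - mu) <= max_std * sigma]


def min_max_scale(column):
    """Apply x' = (x - x_min) / (x_max - x_min) to one feature column."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # Assumption: a constant feature carries no information; map to 0.0.
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


# A 2000 msec pause among ~100 msec key presses is discarded.
cleaned = remove_outliers([100, 110, 90, 105, 95, 100, 2000])
scaled = min_max_scale([10, 20, 30])
```

Because x_min and x_max are taken per feature, each feature ends up on the same 0-1 scale regardless of its original units (msec versus keypress counts).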

  11. Sample Datasets • Prior to Standardization • After Standardization

  12. Classification • Identification • Nearest neighbor classifier using Euclidean distance • Input sample compared to every training sample
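A nearest neighbor classifier of the kind described can be sketched in a few lines of Python. The study's classifier was written in SAS; the function names, the `(label, feature_vector)` layout of the training data, and the two-feature example vectors below are all illustrative assumptions.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_neighbor(sample, training):
    """Return the label of the training sample closest to `sample`.

    `training` is a list of (label, feature_vector) pairs; the input
    sample is compared to every training sample.
    """
    best_label, _ = min(training, key=lambda item: euclidean(sample, item[1]))
    return best_label


# Toy standardized feature vectors for two hypothetical subjects.
training = [
    ("subject_1", [0.10, 0.20]),
    ("subject_1", [0.15, 0.25]),
    ("subject_2", [0.90, 0.80]),
]
predicted = nearest_neighbor([0.20, 0.10], training)
```

Euclidean distance treats every feature equally, which is why the features need to be on a common scale before classification.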

  13. Experimental Design: Identification Experiment • 8 subjects who know the purpose of the experiment • Training – 10 reps of text a (approx. 600 char) • Testing • 10 reps of text a • 10 reps of text b (same length as text a) • 10 reps of text c (half length of text a)

  14. Experimental Design: Instructions for Subjects • Subjects were told to input the data using their normal keystroke dynamics • Subjects were asked to leave at least a day between entering samples

  15. Experimental Design: Text a – about 600 characters • This is an Aesop fable about the bat and the weasels. A bat who fell upon the ground and was caught by a weasel pleaded to be spared his life. The weasel refused, saying that he was by nature the enemy of all birds. The bat assured him that he was not a bird, but a mouse, and thus was set free. Shortly afterwards the bat again fell to the ground and was caught by another weasel, whom he likewise entreated not to eat him. The weasel said that he had a special hostility to mice. The bat assured him that he was not a mouse, but a bat, and thus a second time escaped. The moral of the story: it is wise to turn circumstances to good account.

  16. Expected Outcomes: Recognition Accuracy • Accuracy on text a > that on text b • text a is the training text • Accuracy on text b > that on text c • text b is longer than text c • Accuracy on texts a, b, c > arbitrary text • texts a, b, & c are similar, all Aesop fables

  17. Preliminary Results – Reduced Experiment • Reduced identification experiment • Smaller text input • “The quick brown fox jumps over the lazy dog.” • Fewer subjects • Three project team members • Fewer feature measurements • Mean and std for “e” and “o” key press durations • Accuracy of 80%, which is promising

  18. Results – Comparison to Same Text • Without standardization, accuracy was only 59% • 100% accuracy with standardization (76 out of 76) • Confusion matrix of results after standardization (axes: predicted, actual)

  19. Results – Comparison to Different Text of ~Equal Length • Without standardization, accuracy was only 38% • 98.5% accuracy with standardization (65 out of 66) • Confusion matrix of results after standardization (axes: predicted, actual)

  20. Results – Comparison to Different Text of Shorter Length • Without standardization, accuracy was only 14% • 97% accuracy with standardization (74 out of 76) • Confusion matrix of results after standardization (axes: predicted, actual)
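The standardized accuracies on slides 18 to 20 follow directly from the reported counts, as this short Python check shows. The function name and dictionary labels are hypothetical; only the counts come from the slides.

```python
def accuracy_pct(correct, total):
    """Percentage of test samples classified correctly."""
    return 100.0 * correct / total


# (correct, total) counts reported with standardization.
results = {
    "same text": (76, 76),
    "different text, equal length": (65, 66),
    "different text, shorter length": (74, 76),
}

for name, (correct, total) in results.items():
    print(f"{name}: {accuracy_pct(correct, total):.1f}%")
```

To one decimal place the shorter-text result is 97.4%, which the slide rounds to 97%.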

  21. Conclusions • System is a viable means of differentiating between individuals based on typing patterns • Standardization is crucial to the accuracy of the system • It is likely that the shorter the text used for verification, the lower the accuracy • Decreasing the number of feature measurements used also decreases accuracy

  22. Questions/Comments? • Focus or applications? • Software implementation? • Experimental design? • Expected experimental outcomes?
