"Using encryption on the Internet is the equivalent of arranging an armored car to deliver credit-card information from someone living in a cardboard box to someone living on a park bench" - Gene Spafford, CERIAS @ Purdue
SSH Timing Attacks Elliot Metsger Christopher Soghoian
The Dark Ages • Before SSH, before the age of enlightenment, the world was shrouded in darkness. • Telnet and ftp were used everywhere, and thus passwords were sent over the wire in cleartext.
History of SSH (ssh.com) • Created in July 1995 by Tatu Ylönen, a student at Helsinki University of Technology. • Spun off into SSH Communications Security Ltd. (Finland); software distributed for free (as in beer). • Prior to v. 1.2.12, the license was relatively free, except for a requirement that there be no Windows/DOS port.
History of SSH (ssh.com) • Post v. 1.2.12, the license restricted the use of ssh in a commercial environment, instead requiring companies to buy an expensive version from Data Fellows.
History of SSH (openssh.org) • OpenBSD forked the commercial ssh version at v. 1.2.12. • All components of a restrictive nature (i.e. patents) directly removed from the source code • any licensed or patented components are chosen from external libraries (e.g. OpenSSL).
History of SSH (openssh.org) • OpenSSH code including full SSH 1.3 and SSH 1.5 protocol support shipped on December 1, 1999 • Around May 4, 2000, the SSH 2 protocol support was implemented sufficiently to be useable. • Rapidly became de facto standard ssh application.
History of SSH (lsh) • A GNU-blessed group implements the SSH v.2 protocol and ships it under the GNU GPL license. • Shares no common code with openssh/commercial SSH, and does not use the openssl library. • Security advantage for system administrators due to heterogeneous SSH servers. • Yet still susceptible to buffer overflows.
IETF Standard • IETF Secure Shell (secsh) working group has submitted an internet-draft on the SSH-2.0 protocol.
Reasons to use SSH. • Designed to be a secure replacement for rsh, rlogin, rcp, rdist, and telnet. • Strong authentication. Closes several security holes (e.g., IP, routing, and DNS spoofing). • Improved privacy. All communications are automatically and transparently encrypted. • Secure X11 sessions. The program automatically sets DISPLAY on the server machine, and forwards any X11 connections over the secure channel.
Reasons to use SSH • No retraining needed for normal users. • Never trusts the network. Minimal trust on the remote side of the connection. Minimal trust on domain name servers. Pure RSA authentication never trusts anything but the private key. • Client RSA-authenticates the server machine in the beginning of every connection to prevent trojan horses (by routing or DNS spoofing) and man-in-the-middle attacks, and the server RSA-authenticates the client machine before accepting .rhosts or /etc/hosts.equiv authentication (to prevent DNS, routing, or IP-spoofing).
Reasons to use SSH • Host authentication key distribution can be handled centrally by the administration, or automatically when the first connection is made to a machine. • Any user can create any number of user authentication RSA keys for his/her own use. • The server program has its own server RSA key which is automatically regenerated every hour. • An authentication agent, running in the user's laptop or local workstation, can be used to hold the user's RSA authentication keys.
Reasons to use SSH • Arbitrary TCP/IP ports can be redirected through the encrypted channel in both directions • The software can be installed and used (with restricted functionality) even without root privileges. • Optional compression of all data with gzip (including forwarded X11 and TCP/IP port data), which may result in significant speedups on slow connections.
Keystroke Timing Theory • Keystroke timing is a biometric signature, measurable much like iris patterns or finger/hand prints. • Researchers are able to identify users based on their inter-keystroke timing patterns. • Authentication and recognition systems can be developed or augmented using keystroke timing patterns.
Keystroke Timing Theory • Identification of users is based on the statistical comparison of known keystroke latencies to unknown keystroke latencies. • If there is no statistical difference between the known latency and the unknown latency, then you cannot say that the keystroke pairs were typed by different individuals.
Keystroke Timing Theory • There are two types of statistical error possible in timing research. • In the context of authentication systems the errors are: • Type I Error: A valid user is rejected access to the system • Type II Error: An impostor is allowed access to the system • Goal is to have no Type II error while minimizing Type I error.
Keystroke Timing Theory • This research attempts to develop a model by which a user's keystrokes can be recovered by observing the timing between keystrokes. • Instead of identifying the user based on keystroke characteristics, the authors attempt to recover the actual keystrokes the user typed.
Keystroke Timing • Measures the latency between individual key presses, and the amount of time between a key's press and its release. • This research focuses on key press events only, not key press duration or key release. • Data gathering: e.g. the key pair (also called a digraph) v,o for user X has a mean latency of 50ms; the digraph t,h for user X has a mean latency of 120ms.
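As a toy illustration of this kind of data gathering (not the authors' actual collection code; the key names and timestamps below are hypothetical, chosen to echo the slide's example values), digraph latencies can be computed from a stream of key-press events:

```python
from collections import defaultdict

def digraph_latencies(events):
    """events: list of (key, press_time_ms) tuples in typing order.
    Returns the mean press-to-press latency for each observed digraph."""
    latencies = defaultdict(list)
    # Pair each key press with the next one to form digraphs
    for (k1, t1), (k2, t2) in zip(events, events[1:]):
        latencies[(k1, k2)].append(t2 - t1)
    return {pair: sum(ts) / len(ts) for pair, ts in latencies.items()}

# Hypothetical key-press timestamps (ms)
events = [("v", 0), ("o", 50), ("t", 200), ("h", 320), ("v", 500), ("o", 552)]
profile = digraph_latencies(events)
print(profile[("v", "o")], profile[("t", "h")])  # 51.0 120.0
```

A per-user table of such mean digraph latencies is the raw material the rest of the timing research builds on.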
Timing Research: Gaines/Shapiro • Research in keystroke and keyboard dynamics dates back to 1924. • Sporadic bursts of research through the 1970's. • 1980 Rand research directed by S. Gaines and N. Shapiro attempted to establish whether a user could be identified by the statistical characteristics of their typing behaviour. • Goal was to provide a basis for a computer authentication system.
Timing Research: Gaines/Shapiro • Gaines and Shapiro established that individual users appear to have typing “signatures”, and thus be identified. • Established that the typing signatures appear to be stable over time. • Established that certain digraphs (keystroke pairs) can be used to distinguish the typists.
Timing Research: Gaines/Shapiro • Problems with Gaines and Shapiro: • Small typist sample size (only 6 subjects) • Typists were expert touch typists (not a random sampling from the population). • Keystroke latencies were measured to a precision of 1ms. • Chose to treat their data as normally distributed even though tests for normality were not decisive. • Assumed the standard deviation of all measurements was zero (i.e. they ignored variance in latency measurements).
Timing Research: Leggett et al. • Late 1980s work validated much of the work by Gaines and Shapiro. • Also had some issues: • Sample size was larger (17 computer programmers), but still too small. • Samples were not from the general population. • Type II error occurrence was too high (~5.0%). • Required too much training data (about 1000 words per user); not practical.
Timing Research: Monrose/Rubin • Improved methodology • Timing analysis program was designed to reduce user error and make raw data collection more efficient. • User ran a binary on their machine at their convenience. • Screen layout was designed such that the participant's attention was focused on the screen, so as not to introduce outlying data points. • Raw data were emailed to the investigators • Were able to collect timing data from 47 participants.
Timing Research: Monrose/Rubin • Graphical front-end was used to analyze raw data and display plots. • Easily explain outlying data points • Efficiently analyze large amounts of data • Presents different perspectives on the data quickly • This analytical toolkit may be used by future research in keystroke timing.
Timing Research: Monrose/Rubin • Assayed multiple features of user keystroke behaviour: • Inter-keystroke latencies (digraphs and trigraphs) • Keypress duration • Other features • Used different types of text for comparison: • Structured text compared to Structured text • Structured text compared to free-form text • Free-form text compared to free-form text
Timing Research: Monrose/Rubin • Used multiple methods to compare keystroke data (which did not ignore variance in the data): • Euclidean Distance Algorithm • Non-weighted Probability Measure • Weighted Probability Measure • Tested three thresholds (number of standard deviations from the mean), which enabled them to minimize their Type II error (the acceptance of impostors).
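A minimal sketch of the simplest of these ideas, a Euclidean-distance comparison between a known latency profile and an unknown one (the digraphs and millisecond values are invented for illustration):

```python
import math

def euclidean_distance(known, unknown):
    """Distance between two typists' mean digraph latencies (ms),
    computed over the digraphs the two profiles share."""
    shared = known.keys() & unknown.keys()
    return math.sqrt(sum((known[d] - unknown[d]) ** 2 for d in shared))

# Hypothetical mean latency profiles: digraph -> mean latency in ms
known = {("t", "h"): 120.0, ("v", "o"): 50.0}
unknown = {("t", "h"): 125.0, ("v", "o"): 47.0}
print(euclidean_distance(known, unknown))  # ~5.83
```

A small distance suggests the unknown sample came from the known typist; a threshold (e.g. some number of standard deviations from the mean) then decides whether to accept the match.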
Timing Analysis of Keystrokes and Timing Attacks on SSH by Dawn Xiaodong Song David Wagner Xuqing Tian of the University of California, Berkeley. The paper documents work produced with DARPA and NSF funding.
SSH Timing: Review • What are they measuring? • Inter keystroke latencies. • Key press events only. • How are they measuring? • Sniffing packet deltas on the wire • One keystroke equals one packet (in interactive mode).
SSH Timing: Review • Where else could they measure? • On the wire (including but not limited to hosts connected by hubs or via a wireless network). • The client host. • The remote host. • An intermediate host (perhaps via a MITM attack).
SSH Timing: Collection of known training data • Participants repeatedly (30 – 40 times) entered keypairs to train their model. • The users did not: • Enter whole passwords to train the model • Enter freeform text to train the model • Investigators measured 142 keypairs (enough?) • These keypair latencies are the known latencies that the investigators will use later in their Hidden Markov Model and Viterbi algorithms.
SSH Timing: The Training Data • The 142 training data keypairs were classified into 5 groups: • Two letters, Two hands • Two letters, Same hand, Different fingers • Two letters, Same hand, Same finger • Letter and Number, Two hands • Letter and Number, Same hand • Attacker may learn one bit of information from the keystroke latency.
SSH Timing: Gaussian Modelling • The data look normal (Gaussian). • The investigators derive and plot Gaussian graphs for each keystroke pair (142 total graphs) per user. • A lot of overlap between mean digraph latencies. • So how does one tell the difference between a digraph peak at 75 ms and 80 ms?
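One hedged answer to that question: compare the likelihood of the observed latency under each digraph's fitted Gaussian, rather than looking at the peaks alone. A toy sketch (the per-digraph means and standard deviations below are invented for illustration, not taken from the paper):

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Density of the normal distribution N(mu, sigma^2) at x."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

# Hypothetical Gaussians fit to training data: digraph -> (mean ms, std ms)
models = {("v", "o"): (75.0, 12.0), ("a", "s"): (80.0, 12.0), ("t", "h"): (120.0, 15.0)}

def rank_digraphs(latency_ms):
    """Order candidate digraphs by how likely each is to produce the latency."""
    return sorted(models, key=lambda d: gaussian_pdf(latency_ms, *models[d]), reverse=True)

print(rank_digraphs(78.0))  # [('a', 's'), ('v', 'o'), ('t', 'h')]
```

With 75 ms and 80 ms peaks overlapping this heavily, a single latency only weakly prefers one digraph over another, which is why the authors turn to sequence-level inference.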
SSH Timing: Entropy & Info Gain • Investigators calculate the information gain from keystroke latency to be a maximum of 1.2 bits of information per character pair, assuming that the character pair is selected from the keyboard uniformly at random. • Investigators postulate that the information gain will be greater for English text, since the entropy of English text is lower than that of the random passwords chosen in this research.
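A back-of-the-envelope sketch of what "bits of information" means here: if an observed latency could pin down exactly which of the five hand/finger groups a key pair belongs to, the attacker would gain the entropy of the group distribution. The per-group counts below are invented (they only need to sum to the 142 trained pairs); in practice the latency distributions overlap, so the realized gain is lower than this ceiling.

```python
import math

def entropy(probs):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Hypothetical split of the 142 trained key pairs across the five groups
group_counts = [60, 40, 20, 12, 10]  # sums to 142; illustrative only
total = sum(group_counts)
gain_ceiling = entropy([c / total for c in group_counts])
print(round(gain_ceiling, 2))  # ~2.01 bits if the group were revealed exactly
```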
SSH Timing: Markov Model • Now what? We have a mess of overlapping keystroke latency data. • The investigators use a “Markov Model” combined with keystroke latencies to predict the typed keystroke pairs. • Generally, a Markov Model says that the probability of moving to the next state depends only on the current state (not any of the previous states in the chain).
SSH Timing: Markov Model • In the context of this research, the “state” of the Markov model is the keypair that was entered. • So Markov rephrased in context: the probability that keypair qnext is going to be entered is based solely on the fact that keypair q was entered. • Keypresses that occurred up to keypair q will not influence the probability that keypair qnext will be entered.
SSH Timing: Hidden Markov Model • Investigators cannot see the keypairs: this is SSH. • All the investigators can measure is keystroke latency (indirectly, via packet deltas). • A Hidden Markov Model applies when the state (keystroke pair) cannot be measured directly. • A property of the state (keystroke latency) is observed instead, from which you can probabilistically determine the hidden state (keystroke pair).
SSH Timing: Hidden Markov Model • The Hidden Markov Model makes an assumption: • The probability of transitioning to the next state is only determined by the current state and is not dependent on previous states that have occurred. • In context: • The probability of the latency distribution for the next character pair is only dependent on the current character pair and not based on any previous character sequences.
SSH Timing: Viterbi algorithm • Authors decide to use the Viterbi algorithm to analyze the keystroke latency data. • The Viterbi algorithm is regularly used to analyze HMM problems. • It is more efficient than exhaustively computing the probability of every possible keystroke-pair sequence. • Given a latency y, list in order of decreasing probability the character pairs q that could be responsible for producing the observed latency.
SSH Timing: n-Viterbi algorithm • Remember the mess? • Based on the fact that digraph latencies have severe overlap, the authors don't think that the Viterbi algorithm will produce the correct keystroke pair. • They modified Viterbi to output the n most likely keystroke pairs: n-Viterbi • Authors hope that the correct keystroke sequence is within the first n possibilities.
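The following is a simplified sketch in the spirit of n-Viterbi, not the authors' implementation: hidden states are digraphs, emissions are Gaussian latencies, and at each step only the n highest-scoring partial paths are kept (a beam approximation to true n-best decoding). All transition and emission parameters below are hypothetical.

```python
import math

def gaussian_logpdf(x, mu, sigma):
    """Log density of N(mu, sigma^2) at x."""
    return -((x - mu) ** 2) / (2 * sigma ** 2) - math.log(sigma * math.sqrt(2 * math.pi))

def n_best_paths(observations, states, trans, emit, n=3):
    """observations: latencies in ms; trans[(s1, s2)]: log transition prob
    (uniform if missing); emit[s]: (mean, std) of latency for digraph s.
    Returns the n highest-scoring (state sequence, log score) pairs."""
    uniform = math.log(1.0 / len(states))
    paths = [((), 0.0)]
    for y in observations:
        candidates = []
        for path, score in paths:
            for s in states:
                t = trans.get((path[-1], s), uniform) if path else uniform
                candidates.append((path + (s,), score + t + gaussian_logpdf(y, *emit[s])))
        # Keep only the n most probable partial paths (the "n" in n-Viterbi)
        paths = sorted(candidates, key=lambda c: c[1], reverse=True)[:n]
    return paths

# Toy decode: two candidate digraphs, two observed latencies
emit = {("t", "h"): (120.0, 15.0), ("v", "o"): (75.0, 12.0)}
best = n_best_paths([118.0, 77.0], list(emit), {}, emit, n=2)
print(best[0][0])  # (('t', 'h'), ('v', 'o'))
```

The attacker then checks whether the true keystroke sequence appears among the n candidates, rather than betting everything on the single most likely path.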
SSH Timing: Herbivore • Herbivore is the authors' attacking engine. • It sniffs the network for su or login packets and measures packet arrival times. • Packet arrival times are compared to the known digraph latencies obtained during the password keypair HMM training. • The output from Herbivore is a candidate password list; somewhere on that list is the correct password.
SSH Timing: Herbivore • Attacker must execute an attack using the output from Herbivore. • Dictionary attack (assuming local access to /etc/shadow) • Brute force (assuming the unix host is not configured to lock the account after x failed attempts)
SSH Timing: Herbivore • Authors state that Herbivore reduces the brute-force work by a factor of 50. • Herbivore gleaned 0.8 bits of information per character pair vs. a theoretical maximum of 1.2 bits. • They attribute the difference to differences in distributions between training data and observed data. • Authors state that one user's training data can be used against another user's unknown timing data.
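A quick sanity check on the factor-of-50 figure, under our assumption that it applies to an 8-character password (which has 7 adjacent character pairs) at the measured 0.8 bits per pair:

```python
# 0.8 bits gained per pair * 7 pairs in an 8-character password
# shrinks the brute-force search space by roughly 2**5.6
factor = 2 ** (0.8 * 7)
print(round(factor, 1))  # 48.5, i.e. about 50x less work
```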
SSH Timing: Herbivore • Herbivore's effectiveness
Responses to the paper • Multiple parties responded to the data presented in the paper.
Latency • 4 UVA students, Mike Hogye, Thad Hughes, Josh Sarfaty and Joe Wolf, responded to this paper in Fall 2001. • They raise many issues, the most important of which is network latency and jitter, on which the paper's timing measurements depend.