"Using encryption on the Internet is the equivalent of arranging an armored car to deliver credit-card information from someone living in a cardboard box to someone living on a park bench" - Gene Spafford, Purdue

SSH Timing Attacks
Elliot Metsger (emetsger@jhu.edu)
Christopher Soghoian (csoghoian@jhu.edu)

The Dark Ages
Before SSH, before the age of enlightenment, the world was shrouded in darkness. Telnet and ftp were used everywhere, and thus passwords were sent over the wire in cleartext.

History of SSH (ssh.com)
Created in July 1995 by Tatu Ylönen, a student of Helsinki University of Technology. Spun off into SSH Communications Security Ltd. (Finland); software distributed for free (think beer). Prior to v , the licence was relatively free, except for a requirement that there be no Windows/DOS port.

History of SSH (ssh.com)
Post v , the license restricted the use of ssh in commercial environments, requiring companies to buy an expensive version from Datafellows instead.

History of SSH (openssh.org)
OpenBSD forked the commercial ssh version at v . All components of a restrictive nature (i.e. patents) were removed directly from the source code; any licensed or patented components are taken from external libraries (e.g. OpenSSL).

History of SSH (openssh.org)
OpenSSH code including full SSH 1.3 and SSH 1.5 protocol support shipped on December 1, 1999. Around May 4, 2000, SSH 2 protocol support was implemented sufficiently to be usable. OpenSSH rapidly became the de facto standard ssh application.

History of SSH (lsh)
A GNU-blessed group implemented the SSH v.2 protocol and ships it under the GNU GPL license. It shares no common code with openssh/commercial SSH, and does not use the openssl library. Running heterogeneous SSH servers is a security advantage for system administrators. Yet lsh is still susceptible to buffer overflows.

IETF Standard
The IETF Secure Shell (secsh) working group has submitted an internet-draft on the SSH-2.0 protocol.

Reasons to use SSH
Designed to be a secure replacement for rsh, rlogin, rcp, rdist, and telnet. Strong authentication: closes several security holes (e.g., IP, routing, and DNS spoofing). Improved privacy: all communications are automatically and transparently encrypted. Secure X11 sessions: the program automatically sets DISPLAY on the server machine, and forwards any X11 connections over the secure channel.

Reasons to use SSH
No retraining needed for normal users.
Never trusts the network. Minimal trust on the remote side of the connection. Minimal trust on domain name servers. Pure RSA authentication never trusts anything but the private key. Client RSA-authenticates the server machine in the beginning of every connection to prevent trojan horses (by routing or DNS spoofing) and man-in-the-middle attacks, and the server RSA-authenticates the client machine before accepting .rhosts or /etc/hosts.equiv authentication (to prevent DNS, routing, or IP-spoofing).

Reasons to use SSH
Host authentication key distribution can be done centrally by the administration, or automatically when the first connection is made to a machine. Any user can create any number of user authentication RSA keys for his/her own use. The server program has its own server RSA key, which is automatically regenerated every hour. An authentication agent, running on the user's laptop or local workstation, can be used to hold the user's RSA authentication keys.

Reasons to use SSH
Arbitrary TCP/IP ports can be redirected through the encrypted channel in both directions. The software can be installed and used (with restricted functionality) even without root privileges. Optional compression of all data with gzip (including forwarded X11 and TCP/IP port data) may result in significant speedups on slow connections.

Keystroke Timing Theory
Keystroke timing is a biometric signature, measurable in much the same way as iris patterns or finger/hand prints. Researchers are able to identify users based on their inter-keystroke timing patterns. Authentication and recognition systems can be developed or augmented using keystroke timing patterns.

Explain latency on x and frequency on y. Point out mean, variance. Explain how you can tell the difference in the digraphs from the keystroke latency.

Identification of users is based on the statistical comparison of known keystroke latencies to unknown keystroke latencies. If there is no statistical difference between the known latencies and the unknown latencies, then you cannot say that the keystroke pairs were typed by different individuals. So to identify the user, we keep a keystroke timing profile on file and compare the unknown timings against it.
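The comparison described above can be sketched with a two-sample test. This is a minimal illustration, not the method of any study cited here: it uses Welch's t statistic, and the function names, the sample latencies, and the cut-off of 2.0 are all assumptions chosen for the example.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(known, unknown):
    """Welch's t statistic for two lists of latencies (ms)."""
    standard_error = sqrt(variance(known) / len(known)
                          + variance(unknown) / len(unknown))
    return (mean(known) - mean(unknown)) / standard_error


def could_be_same_typist(known, unknown, cutoff=2.0):
    """True if the two samples are not clearly different.

    cutoff is a hypothetical threshold on |t|, picked for illustration.
    """
    return abs(welch_t(known, unknown)) < cutoff


profile = [118, 120, 122, 121, 119]   # known latencies for one digraph
session = [119, 121, 120, 122, 118]   # unknown sample, same digraph
impostor = [60, 62, 58, 61, 59]       # unknown sample from a faster typist

print(could_be_same_typist(profile, session))
print(could_be_same_typist(profile, impostor))
```

Note the asymmetry: a small |t| only means the samples cannot be told apart, which is weaker than proving they came from the same person.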

There are two types of statistical error possible in timing research. In the context of authentication systems the errors are: Type I error: a valid user is denied access to the system. Type II error: an impostor is allowed access to the system. The goal is to have no Type II error while minimizing Type I error. So we are “matching” the known latencies to the unknown latencies, and we use statistics to do this. In the real world, any Type II event is unacceptable.

This research attempts to develop a model by which user's keystrokes can be recovered by observing the timing between keystrokes. Instead of identifying the user based on keystroke characteristics, the authors attempt to recover the actual keystrokes the user typed. So past research has done a lot of work showing that keystroke characteristics have recognizable signatures: you can tie a signature back to an individual. Here however, the authors are trying to recover keystrokes, not identify the user.

Keystroke Timing
Keystroke timing measures the latency between individual key presses and the amount of time each key is held down before it is released. This research focuses on key press events only, not key press duration or key release, because when you are using SSH it is the key press events that generate packets. Packet latency is how the authors measure keystroke latency. Data gathering: e.g. the key pair (also called a digraph) v,o for user X has a mean latency of 50 ms; the digraph t,h for user X has a mean latency of 120 ms. When gathering data, you collect many latencies per digraph and obtain the mean and standard deviation. The authors assume Gaussian data.
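As a rough sketch of the data gathering step, the snippet below turns a list of key press events into per-digraph latency statistics. The event format, the function names, and the sample timestamps are assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean, stdev


def digraph_latencies(events):
    """Group key-press-to-key-press latencies by digraph.

    events: list of (timestamp_ms, key) tuples in typing order.
    """
    samples = defaultdict(list)
    for (t0, k0), (t1, k1) in zip(events, events[1:]):
        samples[(k0, k1)].append(t1 - t0)
    return samples


def summarize(samples):
    """Mean and sample standard deviation per digraph (0.0 if one sample)."""
    return {pair: (mean(v), stdev(v) if len(v) > 1 else 0.0)
            for pair, v in samples.items()}


# Hypothetical key press events: the user types "vo" three times.
events = [(0, "v"), (50, "o"), (400, "v"), (452, "o"), (900, "v"), (948, "o")]
stats = summarize(digraph_latencies(events))
print(stats[("v", "o")])
```

Only press times are used, which matches the restriction to key press events.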

Timing Research: Gaines/Shapiro
Research in keystroke and keyboard dynamics dates back to 1924. Sporadic bursts of research through the 1970's. 1980 Rand research directed by S. Gaines and N. Shapiro attempted to establish whether a user could be identified by the statistical characteristics of their typing behaviour. Goal was to provide a basis for a computer authentication system.

Gaines and Shapiro established that individual users appear to have typing “signatures”, and thus can be identified. They established that the typing signatures appear to be stable over time, and that certain digraphs (keystroke pairs) can be used to distinguish the typists. Digraphs were used because they seemed to be the most elemental typing units. Other research has measured other features.

Problems with Gaines and Shapiro: Small typist sample size (only 6 subjects). Typists were expert touch typists (not a random sampling from the population). Keystroke latencies were measured to a precision of 1 ms. They chose to treat their data as normally distributed even though tests for normality were not decisive. They assumed the standard deviation of all measurements was zero (i.e. they ignored variance in latency measurements). On the ignored variance: tests for equality of means, under normality, are fairly insensitive to violations of the assumption of equal variances. This is a robustness property of Student t-tests.

Timing Research: Leggett et al.
Late 1980's work validated much of the work by Gaines and Shapiro. It also had some issues: Sample size was larger (17 computer programmers), but still too small. Samples were not from the general population. Type II error occurrence was too high (~5.0%). Required too much training data (about 1000 words per user); not practical.

Timing Research: Monrose/Rubin
Improved methodology. The timing analysis program was designed to reduce user error and make raw data collection more efficient. Users ran a binary on their own machine at their convenience. The screen layout was designed so that the participant's attention was focused on the screen, so as not to introduce outlying data points. Raw data were e-mailed to the investigators. They were able to collect timing data from 47 participants. These improvements in data collection helped to reduce errors and the number of data points thrown out.

A graphical front-end was used to analyze raw data and display plots. It makes it easy to explain outlying data points, efficiently analyzes large amounts of data, and presents different perspectives on the data quickly. This analytical toolkit may be used by future research in keystroke timing.

Assayed multiple features of user keystroke behaviour: inter-keystroke latencies (digraphs and trigraphs), keypress duration, and other features. Used different types of text for comparison: structured text compared to structured text, structured text compared to free-form text, and free-form text compared to free-form text. Remember that the SSH paper focuses on key presses only; this is an example of a study that examined more features of a user's typing. Using different types of text may reveal the best type of text to use when matching known and unknown keystroke characteristics. Structured text may yield better results than free-form text (remember, the goal is to minimize Type II error).

Used multiple methods to compare keystroke data (which did not ignore variance in the data): the Euclidean distance measure, a non-weighted probability measure, and a weighted probability measure. Tested three thresholds (number of standard deviations from the mean), which enabled them to minimize their Type II error (the acceptance of impostors). The weighted probability measure with a threshold of 1 SD from the mean provided the best results. If a data point was beyond the threshold it was dropped.
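A minimal sketch of two of the ideas above: dropping data points beyond a threshold number of standard deviations, and comparing two profiles by Euclidean distance. The probability measures are not reproduced here, and the function names, profile layout, and sample values are assumptions for illustration.

```python
from math import sqrt
from statistics import mean, stdev


def drop_outliers(samples, k=1.0):
    """Keep only samples within k standard deviations of the mean."""
    m, s = mean(samples), stdev(samples)
    return [x for x in samples if abs(x - m) <= k * s]


def euclidean_distance(profile_a, profile_b):
    """Distance between two {digraph: mean_latency_ms} profiles.

    Only digraphs present in both profiles are compared.
    """
    shared = sorted(set(profile_a) & set(profile_b))
    return sqrt(sum((profile_a[d] - profile_b[d]) ** 2 for d in shared))


raw = [100, 102, 98, 101, 99, 300]        # one hesitation at 300 ms
print(drop_outliers(raw, k=1.0))

known = {"th": 120.0, "vo": 50.0}         # hypothetical stored profile
unknown = {"th": 123.0, "vo": 54.0}       # hypothetical new sample
print(euclidean_distance(known, unknown))
```

A smaller distance means a closer match; where to place the accept/reject cut-off is exactly the Type I versus Type II trade-off.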

Timing Analysis of Keystrokes and Timing Attacks on SSH
by Dawn Xiaodong Song, David Wagner, and Xuqing Tian of the University of California, Berkeley. The paper documents work produced with DARPA and NSF funding.

SSH Timing: Review
What are they measuring? Inter-keystroke latencies. Key press events only.
How are they measuring? Sniffing packet deltas on the wire. One keystroke equals one packet (in interactive mode).
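Since one keystroke equals one packet, the latencies fall out of the packet arrival times. A small sketch, assuming the sniffer has already produced (arrival time, direction) records; the record format and the trace values are hypothetical.

```python
def keystroke_latencies_ms(packets):
    """Inter-arrival times of client-to-server packets, in milliseconds.

    packets: list of (arrival_time_seconds, direction) tuples, where
    direction is "C" for client-to-server and "S" for server-to-client.
    """
    times = [t for t, direction in packets if direction == "C"]
    return [round((b - a) * 1000.0, 3) for a, b in zip(times, times[1:])]


# Hypothetical trace: three keystrokes, each echoed by the server.
trace = [(10.000, "C"), (10.030, "S"), (10.120, "C"),
         (10.150, "S"), (10.170, "C")]
print(keystroke_latencies_ms(trace))
```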

SSH Timing: Review Where else could they measure?
On the wire (including but not limited to hosts connected by hubs or via a wireless network). The client host. The remote host. An intermediate host (perhaps via a MITM attack).

SSH Timing: Collection of known training data
Participants repeatedly (30 to 40 times) entered keypairs to train the model. The users did not enter whole passwords or freeform text to train the model. Investigators measured 142 keypairs (enough?). These keypair latencies are the known latencies that the investigators will use later in their Hidden Markov Model and Viterbi algorithms. So what is training data? We need a set of known digraphs and their latencies so that later we can compare packet latencies (of ssh packets) to the known digraphs and latencies.

SSH Timing: The Training Data
The 142 training data keypairs were classified into 5 groups: (1) two letters, two hands; (2) two letters, same hand, different fingers; (3) two letters, same hand, same finger; (4) letter and number, two hands; (5) letter and number, same hand. An attacker may learn about one bit of information from the keystroke latency.
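The five groups can be sketched as a small classifier. The finger assignment below is an assumed touch-typing layout for a QWERTY keyboard and is not taken from the paper; the function name is also hypothetical.

```python
# Assumed touch-typing finger assignment on a QWERTY keyboard.
FINGERS = {
    "L-pinky": "1qaz", "L-ring": "2wsx", "L-middle": "3edc",
    "L-index": "45rtfgvb", "R-index": "67yuhjnm",
    "R-middle": "8ik", "R-ring": "9ol", "R-pinky": "0p",
}
KEY_TO_FINGER = {key: finger
                 for finger, keys in FINGERS.items() for key in keys}


def classify(first, second):
    """Place a keypair in one of the five training-data groups."""
    f1, f2 = KEY_TO_FINGER[first], KEY_TO_FINGER[second]
    same_hand = f1[0] == f2[0]          # "L" or "R" prefix
    digits = first.isdigit() + second.isdigit()
    if digits == 0:
        if not same_hand:
            return "two letters, two hands"
        if f1 != f2:
            return "two letters, same hand, different fingers"
        return "two letters, same hand, same finger"
    if digits == 1:
        if same_hand:
            return "letter and number, same hand"
        return "letter and number, two hands"
    raise ValueError("two digits fall outside the five groups")


for pair in [("v", "o"), ("e", "r"), ("d", "e"), ("a", "9"), ("a", "1")]:
    print(pair, "->", classify(*pair))
```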

SSH Timing: The Training Data
Note the < 100ms and > 200 ms and what you can infer.

SSH Timing: Gaussian Modelling
The data look normal (Gaussian). The investigators derive and plot a Gaussian curve for each keystroke pair (142 graphs in total) per user. There is a lot of overlap between mean digraph latencies, so how does one tell the difference between a digraph peaking at 75 ms and one peaking at 80 ms? So, where are we? We have collected training data from each user: repetitions of each keypair. Let's graph the latencies of each digraph.
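To see why a 75 ms peak and an 80 ms peak are hard to tell apart, here is a small sketch that computes the posterior probability of each digraph given one observed latency. The two means come from the question above; the standard deviation of 30 ms, the uniform prior, and the labels "A" and "B" are assumptions for illustration.

```python
from statistics import NormalDist


def posterior(latency_ms, models):
    """P(digraph | latency) under a uniform prior.

    models: {digraph: (mean_ms, sd_ms)} Gaussian latency models.
    """
    likelihoods = {q: NormalDist(mu, sd).pdf(latency_ms)
                   for q, (mu, sd) in models.items()}
    total = sum(likelihoods.values())
    return {q: like / total for q, like in likelihoods.items()}


models = {"A": (75.0, 30.0), "B": (80.0, 30.0)}
at_70 = posterior(70.0, models)
print(at_70)
```

Even at 70 ms, below both peaks, the split stays close to even, so a single latency barely separates the two digraphs.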

SSH Timing: Gaussian Modelling
Messy. Note high disorder on left side, low disorder right side.

SSH Timing: Entropy & Info Gain
Note the high entropy in the last slide. High entropy = low info gain.

SSH Timing: Entropy & Info Gain
Investigators calculate the information gain from keystroke latency to be a maximum of 1.2 bits of information per character pair, assuming that the character pair is selected from the keyboard uniformly at random. Investigators postulate that the information gain will be greater for English text, since the entropy of English text is lower than that of the random passwords chosen in this research. Remember: y is latency, q is the keystroke pair.
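Information gain here is the drop in uncertainty about q once y is observed, that is H(q) minus H(q | y). The sketch below estimates it numerically for toy Gaussian models with a uniform prior over q; the binning scheme, range, function name, and model parameters are assumptions for illustration and do not reproduce the authors' figure.

```python
from math import log2
from statistics import NormalDist


def information_gain(models, lo=0.0, hi=500.0, step=1.0):
    """Estimate H(q) - H(q | y) in bits by binning the latency y.

    models: list of (mean_ms, sd_ms), one per keystroke pair q,
    with q assumed uniformly distributed.
    """
    n = len(models)
    dists = [NormalDist(mu, sd) for mu, sd in models]
    conditional_entropy = 0.0
    y = lo
    while y < hi:
        # Probability that each q produces a latency in [y, y + step).
        masses = [d.cdf(y + step) - d.cdf(y) for d in dists]
        p_y = sum(masses) / n
        if p_y > 0:
            h = 0.0
            for m in masses:
                post = (m / n) / p_y     # P(q | y in this bin)
                if post > 0:
                    h -= post * log2(post)
            conditional_entropy += p_y * h
        y += step
    return log2(n) - conditional_entropy


overlapping = information_gain([(100.0, 10.0), (100.0, 10.0)])
separated = information_gain([(100.0, 10.0), (300.0, 10.0)])
print(overlapping, separated)
```

Identical distributions yield essentially no gain, while well separated ones yield close to the full one bit needed to choose between two pairs.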

SSH Timing: Markov Model
Now what? We have a mess of overlapping keystroke latency data. How, from the messy graph of 142 keypairs, can we tell one latency from another? The investigators use a “Markov Model” combined with keystroke latencies to predict the typed keystroke pairs. Generally, a Markov Model says that the probability of moving to the next state depends only on the current state (not on any of the previous states in the chain). Here, state == keypair.

SSH Timing: Markov Model
In the context of this research, the “state” of the Markov model is the keypair that was entered. So Markov rephrased in context: the probability that keypair qnext is going to be entered is based solely on the fact that keypair q was entered. Keypresses that occurred up to keypair q will not influence the probability that keypair qnext will be entered.

SSH Timing: Hidden Markov Model
So hidden keystroke event and its measurable output.

Investigators cannot see the keypairs: this is SSH. All the investigators can measure is keystroke latency (which is indirectly measured by packet deltas). A Hidden Markov Model is one where the state (keystroke pair) cannot be measured directly. A property of the state (keystroke latency) is observed instead, from which you can probabilistically determine the hidden state (keystroke pair). So the observed latency can be used to estimate which keypair produced it.

The Hidden Markov Model makes an assumption: the probability of transitioning to the next state is determined only by the current state and is not dependent on previous states that have occurred. In context: the latency distribution for the next character pair depends only on the current character pair and not on any previous character sequences. These assumptions: a.) gave good results, b.) work well with random characters, c.) perhaps work not so well with text.

SSH Timing: Viterbi algorithm
The authors use the Viterbi algorithm to analyze the keystroke latency data. The Viterbi algorithm is regularly used to solve HMM problems. It is more efficient than computing the probability of every possible sequence of keystroke pairs. Given a latency y, list in order of decreasing probability the character pairs q that could be responsible for producing the observed latency.
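A minimal Viterbi sketch for this setting, under stated assumptions: states are character pairs, each pair has a Gaussian latency model, and every valid transition (the next pair starts with the character the current pair ends with) is equally likely, so transition terms are omitted. The two-letter alphabet and the model parameters are invented for illustration; this is not the authors' implementation.

```python
from math import log, pi


def log_gauss(y, mu, sd):
    """Log of the Gaussian density, computed directly to avoid log(0)."""
    return -0.5 * ((y - mu) / sd) ** 2 - log(sd * (2 * pi) ** 0.5)


def viterbi(latencies, models):
    """Most likely typed string for a sequence of observed latencies.

    models: {(a, b): (mean_ms, sd_ms)} for each character pair.
    """
    # best[q] = (log probability, text) of the best path ending in pair q
    best = {q: (log_gauss(latencies[0], *models[q]), q[0] + q[1])
            for q in models}
    for y in latencies[1:]:
        nxt = {}
        for q, (mu, sd) in models.items():
            # A predecessor pair must end with the character q starts with.
            cands = [entry for p, entry in best.items() if p[1] == q[0]]
            if cands:
                lp, text = max(cands)
                nxt[q] = (lp + log_gauss(y, mu, sd), text + q[1])
        best = nxt
    return max(best.values())[1]


# Hypothetical latency models over the alphabet {a, b}.
models = {("a", "a"): (200.0, 10.0), ("a", "b"): (100.0, 10.0),
          ("b", "a"): (150.0, 10.0), ("b", "b"): (50.0, 10.0)}
print(viterbi([100.0, 150.0, 200.0], models))
```

The work per observation grows with the number of pair-to-pair transitions rather than with the number of possible strings, which is where the efficiency comes from.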

SSH Timing: n-Viterbi algorithm
Remember the mess? Because digraph latencies overlap severely, the authors don't expect the Viterbi algorithm to produce the correct keystroke pair. They modified Viterbi to output the n most likely keystroke sequences: n-Viterbi. The authors hope that the correct keystroke sequence is within the first n possibilities. N best guesses: a shotgun approach.
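One way to get n best guesses is to keep up to n partial paths per state instead of one. The sketch below does that under the same assumptions as a plain Viterbi over character pairs (Gaussian latency models, uniform transitions between compatible pairs); the models and names are hypothetical and the authors' n-Viterbi may differ in detail.

```python
import heapq
from math import log, pi


def log_gauss(y, mu, sd):
    """Log of the Gaussian density."""
    return -0.5 * ((y - mu) / sd) ** 2 - log(sd * (2 * pi) ** 0.5)


def n_viterbi(latencies, models, n):
    """The n most likely typed strings, best first.

    models: {(a, b): (mean_ms, sd_ms)} for each character pair.
    """
    # beams[q] = up to n (log probability, text) paths ending in pair q
    beams = {q: [(log_gauss(latencies[0], *models[q]), q[0] + q[1])]
             for q in models}
    for y in latencies[1:]:
        nxt = {}
        for q, (mu, sd) in models.items():
            e = log_gauss(y, mu, sd)
            cands = [(lp + e, text + q[1])
                     for p, entries in beams.items() if p[1] == q[0]
                     for lp, text in entries]
            if cands:
                nxt[q] = heapq.nlargest(n, cands)
        beams = nxt
    final = [entry for entries in beams.values() for entry in entries]
    return [text for _, text in heapq.nlargest(n, final)]


# Hypothetical latency models over the alphabet {a, b}.
models = {("a", "a"): (200.0, 10.0), ("a", "b"): (100.0, 10.0),
          ("b", "a"): (150.0, 10.0), ("b", "b"): (50.0, 10.0)}
guesses = n_viterbi([100.0, 150.0, 200.0], models, 3)
print(guesses)
```

With n set to 1 this reduces to ordinary Viterbi; larger n trades a longer candidate list for a better chance that the true sequence is on it.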

SSH Timing: Herbivore
Herbivore is the authors' attacking engine.
It sniffs the network for su or login packets and measures packet arrival times. Packet arrival times are compared to the known digraph latencies obtained during the password keypair HMM training. The output from Herbivore is a candidate password list. Somewhere on that list is the correct password.

SSH Timing: Herbivore
The attacker must still execute an attack using the output from Herbivore: a dictionary attack (assuming local access to /etc/shadow) or brute force (assuming the unix host is not configured to lock the account after x failures).

SSH Timing: Herbivore
Here's where the rubber meets the road: the results. The authors state that Herbivore reduces the brute-force work by a factor of 50. Herbivore gleaned 0.8 bits of information per character pair vs. a theoretical maximum of 1.2 bits; the authors attribute the difference to differences in distribution between the training data and the observed data. The authors also state that one user's training data can be used against another user's unknown timing data. The effectiveness comparison covers both Herbivore itself and the use of one user's training data against another user's data.
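As a rough consistency check on these numbers, assume the per-pair gain simply adds up across an 8-character password (7 character pairs). That additivity and the password length are assumptions for illustration, not the authors' calculation.

```python
def work_factor(bits_per_pair, password_length):
    """Search-space reduction if each character pair leaks bits_per_pair."""
    pairs = password_length - 1
    return 2 ** (bits_per_pair * pairs)


observed = work_factor(0.8, 8)      # bits per pair reported for Herbivore
theoretical = work_factor(1.2, 8)   # theoretical maximum per pair
print(round(observed, 1), round(theoretical, 1))
```

Under these assumptions the 0.8 bits per pair works out to a reduction of a little under 50, in line with the factor quoted above.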

SSH Timing: Herbivore Herbivore's effectiveness

Responses to the paper
Multiple parties responded to the data presented in the paper.

Latency
Four UVA students, Mike Hogye, Thad Hughes, Josh Sarfaty and Joe Wolf, responded to this paper in Fall 2001. They raise many issues, the most important of which is the issue of latency, which the paper depends upon.

Latency
The paper depends upon statistics from a 10-year-old study, which state that internet latency is roughly a static 10 ms. (We’ve been unable to find this quote in the cited paper.) The students claim that modern latency can reach up to 170 ms. The paper depends upon static latency to give accurate times between key presses, so as latency starts to fluctuate, the data returned becomes a lot less useful.

Response from SSH.com “Determining the length of the packet as described in the article is not possible, because SSH Secure Shell and later pad the password packet so that its length cannot be determined”

Response from SSH.com “Even if someone was able to successfully perform the attack, it would only reduce the work factor for trying all possible passwords by a factor of 50, which corresponds to shortening the password by approximately one character (i.e., an 8-character password would become effectively a 7-character password which would still have to be guessed correctly).”
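The "approximately one character" figure can be checked with a line of arithmetic. The alphabet size of 95 (printable ASCII characters) is an assumption made here; the response itself only says "unrestricted character set".

```python
from math import log2


def characters_saved(work_factor_reduction, alphabet_size=95):
    """How many password characters a given reduction is worth."""
    return log2(work_factor_reduction) / log2(alphabet_size)


saved = characters_saved(50)
print(round(saved, 2))
```

A factor of 50 is worth a bit less than one character from a 95-symbol alphabet, which agrees with the 8-to-7 character comparison in the quote.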

Response from SSH.com “According to our analysis, performing the attack on a realistic 8-character password with unrestricted character set would require approximately 120 terabytes ( gigabytes) of memory, which is not feasible with the technology available in the next several years.”

SSH Password length Dug Song and Solar Designer explain in an advisory: When encapsulating plaintext data in a SSH protocol packet, the data is padded to the next 8-byte boundary (or whatever the cipher's block size is, with SSH-2), encrypted, and sent along with the plaintext length field. SSH-1 sends this field in the clear.

SSH password length An attacker passively monitoring a SSH session is able to detect the amount of plaintext sent in each packet -- exact for SSH-1, or a range of possible lengths for SSH-2. Since the login password is sent in one SSH-1 protocol packet without any special precautions, an attacker can determine the exact password length. With SSH-2, other information (including the username) is transmitted in the same packet and the plaintext length is encrypted, so only a range of possible password lengths can be determined.
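The "range of possible lengths" follows directly from block padding. Below is a deliberately simplified model: it only rounds the plaintext up to a multiple of the block size, and ignores the other packet fields and the protocol-specific padding rules that real SSH packets have. The function names are hypothetical.

```python
def padded_length(plaintext_len, block=8):
    """Round up to a multiple of the block size (simplified model)."""
    return -(-plaintext_len // block) * block


def plaintext_range(observed_len, block=8):
    """(min, max) plaintext lengths that pad to observed_len in this model."""
    return (observed_len - block + 1, observed_len)


print(padded_length(5), padded_length(9))
print(plaintext_range(16))
```

When the plaintext length field travels in the clear, as the advisory describes for SSH-1, the observer reads the exact value and the range collapses to a single length.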

Differences between the paper vs. advisory
The paper claims that the length of SSH v.1 passwords can be partially guessed (whether shorter or longer than 7 characters). The advisory claims that the full length of the password can be sniffed with SSH v.1. The advisory includes code to verify this claim.

Interactive Session Weakness – command length guessing
When typing commands over SSH, each character generates a tiny echo packet from the server. However, once the entire command is entered, a larger packet -- containing the shell prompt and possibly the command's output -- is sent by the server. By counting the tiny packets (or the plaintext lengths in packets sent to the server, in the case of SSH-1), the attacker can infer the length of each shell command.
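The counting step can be sketched as follows, assuming the observer already has the sizes of the server-to-client packets in order. The size threshold separating "tiny" echo packets from larger output packets is a hypothetical value, as are the sample sizes.

```python
def command_lengths(server_packet_sizes, tiny_max=48):
    """Infer command lengths from runs of tiny echo packets.

    tiny_max is an assumed cut-off: packets at or below it are treated
    as single-character echoes, larger ones as prompt/output packets.
    """
    lengths, run = [], 0
    for size in server_packet_sizes:
        if size <= tiny_max:
            run += 1
        else:
            if run:
                lengths.append(run)
            run = 0
    return lengths


# Hypothetical trace: "ls" then output, "whoami" then output.
sizes = [40, 40, 400, 40, 40, 40, 40, 40, 40, 900]
print(command_lengths(sizes))
```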

Demo 1
Sshow (part of the dsniff suite).
SSH1 still partially vulnerable.
SSH2 not vulnerable at all.

Interactive Session Weakness
With interactive shell sessions, input characters are normally echoed by the remote end, which usually results in an echo packet from the server for each input character. However, if an application turns input echoing off, such as for entering a password, the packets start to go in one direction only -- to the server.

Interactive Session Weakness – ‘su’ password length guessing
Once an attacker knows that the victim is entering a password, all they need to do is count the packets that didn't generate a reply packet from the server. In the case of SSH-1, the sum of plaintext sizes gives the exact password length, save any backspace characters. With SSH-2, the attacker has to assume that each packet contains only one password character, which is typically the case.
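A sketch of the counting described above, assuming one character per packet and a trace that starts right after the password prompt. The direction labels and the trace are hypothetical, and backspaces are not handled.

```python
def password_length(directions):
    """Count client packets sent before the server next replies.

    directions: "C" (client to server) or "S" (server to client) for each
    SSH data packet seen after the password prompt. The last client
    packet before the reply is assumed to carry the Return key.
    """
    count = 0
    for d in directions:
        if d == "S":
            break
        count += 1
    return max(count - 1, 0)


# Hypothetical trace: "a", "b", "c", "d", Return, then the server responds.
print(password_length(["C", "C", "C", "C", "C", "S"]))
```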

‘su’ in action (SSH1)

‘su’ in action (SSH2)
[Figure: packet exchange between client and server while the user types “s”, “u”, Return, then “a”, “b”, “c”, “d”, Return, followed by the server response. Labels on the diagram: 40, ack, 48, 56, 64.]
Note that with SSH2, the server now sends back dummy packets of the right length, even if the remote end is not echoing the text you type.

Fixed ssh versions Fixes have been initially applied to OpenSSH starting with version OpenSSH contains the more complete versions of the fixes and solves certain interoperability issues associated with the earlier versions. Commercial SSH as of version 3.0 fixes these issues.

SSH1
SSH1 is still vulnerable to several of these attacks, even with the latest version of commercial ssh/openssh. The SSH1 protocol is just not safe, yet it still has widespread adoption. For example: the Solaris machines in the CS Grad/Undergrad labs only support SSH1.

SSH – trust in the server
X forwarding used to be turned on by default in the client (fixed now in openssh). When used, it places complete trust in the server. An evil/r00ted server can sniff a client’s local X events (keyboard and mouse data).

And now for the fun part :)
In addition to the attacks against SSH listed previously, both SSH protocol versions are vulnerable to a few attacks that depend on social engineering: MITM (Monkey/Man in the Middle); upgrade/downgrade attacks; SSH2: forced use of RSA over DSA.

MITM Attacks Hijacking of connections, snooping, data insertion
These require that you trick the client into thinking that you (the bad guy) are the server they wish to connect to. Commonly done via arp spoofing. Super low tech way (for wireless routers): advertise a new wireless router with a slightly different name than the real one (e.g. micr0soft, _public, etc).

MITM Attacks Server Attacker Client

Demo 2
Arp spoof. Valid ssh connection to unknown host. Valid ssh connection to known host. MITM to known host. MITM to unknown host.

Downgrade Attacks SSH v2-> v1
Parameters exchanged by server and client at the beginning of a connection can be substituted. The attacker can force the client to initialize an SSH1 connection instead of SSH2. The server announces its protocol version when the connection opens: 1.99 means the server supports ssh1 and ssh2; 1.51 means the server supports ONLY ssh1. The attacker makes a filter to replace “1.99” with “1.51”. Possibility to circumvent known_hosts.
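The effect of the substitution can be seen by parsing the server's identification string, which has the form SSH-protoversion-softwareversion. The sketch below is a hypothetical helper, and the example banner strings are made up for illustration.

```python
def supported_protocols(banner):
    """Protocol versions implied by a server identification string."""
    parts = banner.strip().split("-", 2)
    if len(parts) < 2 or parts[0] != "SSH":
        raise ValueError("not an SSH identification string")
    version = parts[1]
    if version == "1.99":           # compatibility marker: ssh1 and ssh2
        return {1, 2}
    if version.startswith("2."):
        return {2}
    if version.startswith("1."):
        return {1}
    raise ValueError("unknown protocol version: " + version)


original = "SSH-1.99-OpenSSH_3.4"                  # hypothetical banner
tampered = original.replace("1.99", "1.51", 1)     # what the filter does
print(supported_protocols(original), supported_protocols(tampered))
```

A client that sees only the tampered banner has no way to know that SSH2 was ever on offer, which is why refusing SSH1 on the client side is the practical defence.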

<BLINK> Audience Applause </BLINK>
