“A Low-cost Attack on a Microsoft CAPTCHA”
The annual ACM Computer and Communications Security Conference (2008)
By Jeff Yan and Ahmad Salah El Ahmad
Presentation by Kathleen Stoeckle
Outline: Overview on CAPTCHA, Related Work, The MSN CAPTCHA, Microsoft CAPTCHA Segmentation Attack
characters under different distortions
Yan and El Ahmad’s paper examines the security of the Microsoft CAPTCHA.
Broken by Mori and Malik:
Broken by Moy et al.:
On Microsoft CAPTCHA
Segmentation method – Divide challenge vertically into chunks.
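One plausible way to divide a challenge vertically is a column histogram: count the foreground pixels in each column and cut wherever a column is empty. The sketch below assumes a binary image stored as a list of rows (1 = foreground, 0 = background). The function name `vertical_chunks` and the empty-column cut rule are illustrative assumptions, not details taken from the paper.

```python
def vertical_chunks(image):
    """Split a binary image into chunks separated by empty columns.

    image: list of rows, each a list of 0 (background) or 1 (foreground).
    Returns (start_col, end_col) pairs, with end_col exclusive.
    """
    if not image:
        return []
    width = len(image[0])
    # Column histogram: number of foreground pixels in each column.
    histogram = [sum(row[x] for row in image) for x in range(width)]

    chunks = []
    start = None  # first column of the chunk currently being scanned
    for x, count in enumerate(histogram):
        if count > 0 and start is None:
            start = x                      # a new chunk begins
        elif count == 0 and start is not None:
            chunks.append((start, x))      # an empty column ends the chunk
            start = None
    if start is not None:
        chunks.append((start, width))      # chunk touching the right edge
    return chunks


image = [
    [1, 1, 0, 0, 1, 0],
    [1, 0, 0, 0, 1, 1],
    [0, 1, 0, 0, 1, 0],
]
print(vertical_chunks(image))
```

Each returned pair gives the column range of one chunk, which can then be processed on its own.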
Divide and Conquer
8-connectivity – each pixel has 8 neighbors.
A color fill is applied to each chunk, regardless of number of objects in the chunk.
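The color fill can be pictured as a flood fill that gives every 8-connected group of foreground pixels its own label ("color"). The sketch below is a minimal illustration under that assumption. The names `label_objects` and `NEIGHBOURS` are hypothetical, and the breadth-first fill is one common way to do this, not necessarily the authors' implementation.

```python
from collections import deque

# Offsets to the 8 neighbours of a pixel (8-connectivity).
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1),
              (0, -1),           (0, 1),
              (1, -1),  (1, 0),  (1, 1)]


def label_objects(chunk):
    """Flood-fill each 8-connected object in a binary chunk.

    chunk: list of rows of 0 (background) or 1 (foreground).
    Returns (labels, sizes): a grid of object labels (0 = background)
    and a dict mapping each label to its pixel count.
    """
    height = len(chunk)
    width = len(chunk[0]) if height else 0
    labels = [[0] * width for _ in range(height)]
    sizes = {}
    next_label = 0

    for y in range(height):
        for x in range(width):
            if chunk[y][x] != 1 or labels[y][x] != 0:
                continue  # background, or already filled
            next_label += 1
            labels[y][x] = next_label
            queue = deque([(y, x)])
            count = 0
            while queue:
                cy, cx = queue.popleft()
                count += 1
                for dy, dx in NEIGHBOURS:
                    ny, nx = cy + dy, cx + dx
                    inside = 0 <= ny < height and 0 <= nx < width
                    if inside and chunk[ny][nx] == 1 and labels[ny][nx] == 0:
                        labels[ny][nx] = next_label
                        queue.append((ny, nx))
            sizes[next_label] = count
    return labels, sizes


chunk = [
    [1, 0, 0, 0],
    [0, 1, 0, 1],
    [0, 0, 0, 1],
]
labels, sizes = label_objects(chunk)
print(len(sizes), sizes)
```

In the example, the two diagonal pixels on the left count as one object because diagonal neighbours are connected under 8-connectivity.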
Thick Arc Characteristics:
Thick Arc Removal Algorithm
This step is applied to all chunks with more than one object.
Premise: The positions of objects determine the difference between arcs and characters. Characters are always closer to the baseline. Characters are horizontally juxtaposed, but never vertically.
1) If one object contains a circle, the other is removed.
2) If neither object contains a circle, the one with fewer pixels is removed.
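The two rules above can be sketched as a small decision function for a chunk with exactly two objects. Circle detection is out of scope here, so the `has_circle` flag is assumed to be computed elsewhere. The `ChunkObject` type and the function name are hypothetical. The behaviour when both objects contain a circle, or when pixel counts tie, is not stated on the slide and is an assumption in this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ChunkObject:
    name: str          # label used only for illustration
    pixel_count: int   # number of foreground pixels in the object
    has_circle: bool   # assumed to come from a separate circle detector


def object_to_remove(a: ChunkObject, b: ChunkObject) -> Optional[ChunkObject]:
    """Pick the object to discard as a thick arc, or None if undecided."""
    # Rule 1: exactly one object contains a circle, so the other is removed.
    if a.has_circle != b.has_circle:
        return b if a.has_circle else a
    # Rule 2: neither contains a circle, so the smaller object is removed.
    if not a.has_circle:
        # Assumption: on a tie in pixel count, the second object is removed.
        return a if a.pixel_count < b.pixel_count else b
    # Assumption: both contain a circle; the slide gives no rule, keep both.
    return None


letter = ChunkObject("letter", pixel_count=120, has_circle=True)
arc = ChunkObject("arc", pixel_count=200, has_circle=False)
print(object_to_remove(letter, arc).name)
```

Note that rule 1 takes priority over size: in the example the arc is larger than the letter but is still the one removed.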
n = number of objects in an image
If n < 8, at least one object has two or more connected characters.
1. 8 characters in an image
2. Connected characters are connected horizontally, not vertically, and thus are wider.
3. A segmented chunk contains more than one character if the chunk is wider than 35 pixels.
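The checks above can be sketched as two small helpers: one tests whether the object count implies connected characters, the other flags chunks wider than the 35-pixel threshold. The constant and function names are hypothetical. The sketch only flags candidate chunks and does not reproduce how the paper decides how many characters a wide chunk contains.

```python
EXPECTED_CHARACTERS = 8      # each challenge is assumed to hold 8 characters
MAX_SINGLE_CHAR_WIDTH = 35   # pixels; wider chunks hold more than one character


def has_connected_characters(object_count):
    """True if fewer objects than characters were found (n < 8)."""
    return object_count < EXPECTED_CHARACTERS


def wide_chunks(chunk_widths):
    """Return the indices of chunks wider than the single-character limit."""
    return [i for i, width in enumerate(chunk_widths)
            if width > MAX_SINGLE_CHAR_WIDTH]


widths = [20, 48, 22, 36, 35]
print(has_connected_characters(len(widths)))
print(wide_chunks(widths))
```

A chunk exactly 35 pixels wide is not flagged, since the rule applies only to chunks wider than 35 pixels.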
The number of chunks, width of chunks, and number of objects in a chunk are used to guess which chunks contain connected characters.
where c = number of characters.
Success Rate: 91% (91 out of 100 challenges)
92% of 500 random challenges
Implemented in Java
1.86 GHz Intel Core CPU and 2 GB RAM
Implications: A “state of the art” machine can achieve at least a 95% success rate for recognizing individual characters in the MSN scheme. This is a conservative estimate.
Overall success rate for breaking the MSN CAPTCHA: 61% (≈ 0.92 × 0.95^8).
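The 61% figure is the segmentation success rate multiplied by the chance of recognizing all 8 characters correctly, treating each recognition as independent. A quick check of the arithmetic (the variable names are illustrative):

```python
segmentation_rate = 0.92    # share of challenges segmented correctly
per_character_rate = 0.95   # recognition rate for one segmented character
characters = 8              # characters per challenge

# All 8 characters must be recognized after a successful segmentation.
overall = segmentation_rate * per_character_rate ** characters
print(f"{overall:.1%}")  # prints 61.0%
```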
1. Failure of Arc Removal
2. Failure of Approximation
3. Failure of Segmentation of Connected Characters
Arc Removal and “Approximation” = 72.8%/82.5%
Good usability – characters generally recognizable
Security – vulnerable to a simple segmentation attack.
Usability – many characters are still unrecognizable.
Both character size and string size. The longer the string, the more security it provides.
Aids segmentation attack, but can improve usability.