Image Understanding & Web Security
Henry Baird
Joint work with: Richard Fateman, Allison Coates, Kris Popat, Monica Chew, Tom Breuel, & Mark Luk
A fast-emerging research topic
Human Interactive Proofs (HIPs; definition later): first instance in 1999
H. Baird & K. Popat, “Web Security & Document Image Analysis,” in J. Hu & A. Antonacopoulos (Eds.), Web Document Analysis, World Scientific, 2003 (in press).
“baird AT parc DOT com”
becomes abusive when repeated many times
An image of text, not ASCII
M. D. Lillibridge, M. Abadi, K. Bharat, & A. Z. Broder, “Method for Selectively Restricting Access to Computer Systems,” U.S. Patent No. 6,195,698, Filed April 13, 1998, Issued February 27, 2001.
Udi Manber asked Prof. Manuel Blum’s group at CMU:
then hand out ads – ugh!
without inconveniencing any human users?
I.e., how to distinguish between machines and people on-line
… a kind of ‘Turing test’ !
1936 a universal model of computation
1940s helped break Enigma (U-boat) cipher
1949 first serious uses of a working computer
including plans to read printed text
(he expected it would be easy)
1950 proposed a test for machine intelligence
How to judge that a machine can ‘think’:
wishes, the judge decides which is human
evidence of machine intelligence (Turing asserted)
Modern GUIs invite richer challenges than teletypes….
A. Turing, “Computing Machinery & Intelligence,” Mind, Vol. 59(236), 1950.
(i.e. the judge is a machine)
(even assuming that its algorithms are known?)
NOTE: the machine administers, but cannot pass the test!
(M. Blum, L. A. von Ahn, J. Langford, et al., CMU-SCS)
L. von Ahn, M. Blum, N.J. Hopper, J. Langford, “CAPTCHA: Using Hard AI Problems For Security,” Proc., EuroCrypt 2003, Warsaw, Poland, May 4-8, 2003 [to appear].
English words, deformations, occlusions, backgrounds, etc
L. von Ahn, M. Blum, N. J. Hopper, J. Langford, The CAPTCHA Web Page, http://www.captcha.net.
one English word, deformations, degradations, occlusions,
colored backgrounds, etc
letters, overlain pattern
no image degradations, spaced apart
CD-rebate, TicketMaster, MailFrontier, Qurb, Madonnarama, …
…have you seen others?
Similar problems w/ scrapers; also, likely on Intranets.
D. P. Baron, “eBay and Database Protection,” Case No. P-33, Case Writing Office, Stanford Graduate School of Business, Stanford Univ., 2001.
There remains a large gap in ability
between human and machine vision systems,
even when reading printed text
Performance of OCR machines has been systematically studied:
7 year olds can consistently do better!
This ability gap has been mapped quantitatively
S. Rice, G. Nagy, T. Nartker, OCR: An Illustrated Guide to the Frontier, Kluwer Academic Publishers: 1999.
thrs x blur
Effects of printing & imaging:
H. Baird, “Document Image Defect Models,” in H. Baird, H. Bunke, & K. Yamamoto (Eds.), Structured Document Image Analysis, Springer-Verlag: New York, 1992.
T. K. Ho & H. S. Baird, “Large Scale Simulation Studies in Image Pattern Recognition,” IEEE Trans. on PAMI, Vol. 19, No. 10, p. 1067-1079, October 1997.
Of course you can …. but OCR machines cannot!
blur, threshold, x-scale -- within certain ranges
Times Roman (TR), Times Italic (TI),
Palatino Roman (PR), Palatino Italic (PI),
Courier Roman (CR), Courier Oblique (CO), etc
Expervision TR (E), ABBYY FineReader (A), IRIS Reader (I)
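The blur and threshold parameters named above can be illustrated with a minimal sketch. It assumes a grayscale image stored as rows of ink intensities (0.0 = white paper, 1.0 = full ink), a Gaussian point-spread function whose standard deviation is `blur` in pixel units, and a rule that a pixel prints black when its blurred intensity reaches `threshold`. These conventions and all function names are illustrative assumptions, not the defect model used in the cited studies; x-scale is omitted.

```python
import math


def gaussian_kernel(sigma):
    """Return a normalized 1-D Gaussian kernel; sigma <= 0 means no blur."""
    if sigma <= 0:
        return [1.0]
    radius = max(1, math.ceil(3 * sigma))
    weights = [math.exp(-(x * x) / (2 * sigma * sigma))
               for x in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def convolve_rows(image, kernel):
    """Convolve every row with the kernel; pixels outside the image are white."""
    radius = len(kernel) // 2
    width = len(image[0])
    out = []
    for row in image:
        new_row = []
        for x in range(width):
            acc = 0.0
            for k, w in enumerate(kernel):
                xx = x + k - radius
                if 0 <= xx < width:
                    acc += w * row[xx]
            new_row.append(acc)
        out.append(new_row)
    return out


def transpose(image):
    return [list(col) for col in zip(*image)]


def degrade(image, blur, threshold):
    """Blur with a separable Gaussian, then binarize (1 = black, 0 = white)."""
    kernel = gaussian_kernel(blur)
    horizontal = convolve_rows(image, kernel)
    blurred = transpose(convolve_rows(transpose(horizontal), kernel))
    return [[1 if v >= threshold else 0 for v in row] for row in blurred]


if __name__ == "__main__":
    # A single ink dot on a 7x7 page, degraded at one parameter setting.
    page = [[0.0] * 7 for _ in range(7)]
    page[3][3] = 1.0
    for row in degrade(page, blur=1.0, threshold=0.02):
        print("".join("#" if v else "." for v in row))
```

Under these assumed conventions, a low threshold such as 0.02 turns even faint blur halos black, so strokes thicken and neighboring characters tend to merge.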
Each machine has its peculiar blind spots
The machines share some blind spots
blur = 0.0
& threshold 0.02 - 0.08
threshold = 0.02
& any value of blur
PessimalPrint: exploiting image degradations
… but people find all these easy to read
A. Coates, H. Baird, R. Fateman, “Pessimal Print: A Reverse Turing Test,” Proc. 6th IAPR Int’l Conf. On Doc. Anal. & Recogn. (ICDAR’01), Seattle, WA, Sep 10-13, 2001.
Manuel Blum proposes it, rounds up some key speakers
Henry Baird offers PARC as venue; Kris Popat helps run it
Invite all known principals: theory, systems, engineers, users
Describe the state of the art
Plan next steps for the field
IBM T.J. Watson
InterTrust Star Labs
City Univ. of Hong Kong
RSA Security Laboratories
Document Recognition Techs, Inc
CMU - SCS, Aladdin Center
Manuel Blum, Lenore Blum, Luis von Ahn, John Langford, Guy Blelloch, Nick Hopper, Ke Yang, Brighten Godfrey, Bartosz Przydatek, Rachel Rue
PARC - SPIA/Security/Theory
Henry Baird, Kris Popat, Tom Breuel, Prateek Sarkar, Tom Berson, Dirk Balfanz, David Goldberg
UCB - CS & SIMS
Richard Fateman, Allison Coates, Jitendra Malik, Doug Tygar, Alma Whitten, Rachna Dhamija, Monica Chew, Adrian Perrig, Dawn Song
Completely Automatic Public Turing test to tell Computers and Humans Apart
Text-based dialogue which an individual can use to authenticate that he/she is himself/herself (‘naked in a glass bubble’)
Individual authentication using spoken language
Human Interactive Proof (HIP)
An automatically administered challenge/response protocol
allowing a person to authenticate him/herself as belonging to a certain group over a network without the burden of passwords,
biometrics, mechanical aids, or special training.
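The challenge/response shape of a HIP can be sketched as follows. This is a minimal illustration under stated assumptions, not a protocol taken from any of the systems discussed: the class `HipServer`, its methods, and the `make_challenge` callback (which would render a hard-to-read image and return it with its answer) are all hypothetical names, and each challenge is assumed to allow a single attempt.

```python
import hmac
import secrets


class HipServer:
    """Issues challenges automatically and checks responses (hypothetical sketch)."""

    def __init__(self, make_challenge):
        # make_challenge() is assumed to return (presentation, answer),
        # e.g. a degraded word image and the word it shows.
        self._make_challenge = make_challenge
        self._pending = {}

    def issue(self):
        """Create a fresh challenge; only the presentation leaves the server."""
        presentation, answer = self._make_challenge()
        token = secrets.token_hex(16)
        self._pending[token] = answer
        return token, presentation

    def verify(self, token, response):
        """Accept a response once; unknown or reused tokens always fail."""
        answer = self._pending.pop(token, None)
        if answer is None:
            return False
        given = response.strip().lower().encode()
        return hmac.compare_digest(answer.encode(), given)


if __name__ == "__main__":
    # Placeholder challenge: a real system would return an image here.
    server = HipServer(lambda: ("<image of the word 'quasis'>", "quasis"))
    token, picture = server.issue()
    print(picture)
    print(server.verify(token, "Quasis"))
```

Keeping the answer on the server and discarding it after one attempt means a program cannot retry the same challenge until it guesses correctly.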
to well-studied image restoration attacks, e.g.
Literature on the psychophysics of reading is relevant:
is known (0.3-2 degrees)
to achieve and sustain “critical reading speed”
BUT gives no answer to:
where’s the optimal comfort zone?
G. E. Legge, D. G. Pelli, G. S. Rubin, & M. M. Schleske, “Psychophysics of Reading: I. normal vision,” Vision Research 25(2), 1985.
A. J. Grainger & J. Segui, “Neighborhood Frequency Effects in Visual Word Recognition,” Perception & Psychophysics 47, 1990.
using a variable-length character n-gram Markov model
ablithan wouquire quasis
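How a character n-gram Markov model can produce pronounceable nonsense words like those above is sketched below. This is a fixed-order simplification of the variable-length model named on the slide, not the BaffleText generator itself; the function names, the `^` and `$` boundary markers, the toy lexicon, and the rule of rejecting words already in the lexicon are illustrative assumptions.

```python
import random
from collections import defaultdict

START, END = "^", "$"


def train(words, order=2):
    """Count which character follows each context of `order` characters."""
    counts = defaultdict(lambda: defaultdict(int))
    for word in words:
        padded = START * order + word.lower() + END
        for i in range(order, len(padded)):
            counts[padded[i - order:i]][padded[i]] += 1
    return counts


def generate(counts, order, rng, max_len=10):
    """Sample one string character by character from the trained model."""
    context, out = START * order, []
    while len(out) < max_len:
        successors = counts[context]
        chars = list(successors)
        weights = [successors[c] for c in chars]
        nxt = rng.choices(chars, weights=weights)[0]
        if nxt == END:
            break
        out.append(nxt)
        context = (context + nxt)[-order:]
    return "".join(out)


def pseudo_word(counts, lexicon, order, rng, min_len=5, tries=1000):
    """Return a sampled string that is long enough and not a lexicon word."""
    for _ in range(tries):
        word = generate(counts, order, rng)
        if len(word) >= min_len and word not in lexicon:
            return word
    return None  # the model never left the lexicon within `tries` samples


if __name__ == "__main__":
    # Toy lexicon (an assumption); a real model needs a large word list.
    lexicon = {"ability", "require", "quality", "within", "basis", "would"}
    model = train(lexicon, order=2)
    print(pseudo_word(model, lexicon, 2, random.Random()))
```

Because every sampled transition was seen in real words, the output tends to look pronounceable, while the lexicon check keeps it out of any dictionary the attacker might use.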
from fragmentary or occluded characters, e.g.
M. Chew & H. S. Baird, “BaffleText: A Human Interactive Proof,” Proc., SPIE/IS&T Conf. on Document Recognition & Retrieval X, Santa Clara, CA, January 23-24, 2003.
Parameters of pseudorandom mask generator:
human reading on this family of images
% Subjects willing to solve a BaffleText…
17% every time they send email
39%…if it cut spam by 10x
89% every time they register for an e-commerce site
94%…if it led to more trustworthy recommendations
100% every time they register for an email account
Out of 18 responses to the exit survey.
should fall in the range 50-100; e.g.
G. Mori & J. Malik, “Recognizing Objects in Adversarial Clutter,” submitted to CVPR’03, Madison, WI, June 16-22, 2003.
The latest serious (known or published) attack…
Greg Mori & Jitendra Malik (UCB-CS)
Results of Mori-Malik attacks (Dec 2002) given
perfect foreknowledge of both lexicon and font:
P. Y. Simard, R. Szeliski, J. Benaloh, J. Couvreur, I. Calinov, “Using Character Recognition and Segmentation to Tell Computer from Humans,” Proc., Int’l Conf. on Document Analysis & Recognition, Edinburgh, Scotland, August, 2003 [to appear].
A. L. Coates, H. S. Baird, R. Fateman, “Pessimal Print: a Reverse Turing Test,” Proc., 6th IAPR Int’l Conf. On Document Analysis & Recognition, Seattle, WA, Sept. 10-13, 2001.
segmentation, occlusion, degradations, …?
linguistic & semantic context, Gestalt, style consistency…?
quantitatively, not just qualitatively
an indefinitely long sequence of distinct challenges
A technical problem – machine reading –
which he thought would be easy,
has resisted attack for 50 years, and
now allows the first widespread
practical use of variants of
his test for artificial intelligence.
Henry S. Baird