1. Understanding Sketches and Diagrams on the Tablet PC
Balaji Krishnapuram
In collaboration with:
Tablet PC Group (Redmond), and
Collaborative Handwritten Ink Recognition Project group:
Martin Szummer, Chris Bishop,
Michel Gangnet, Markus Svensen
2. Background
Extensive work already exists on recognizing handwritten text
Some problems remain, but it works reasonably well for the most part
There is much more to a user interface than simply text!
3. Project Objective:
Assume the text has been separated from the figures in an earlier pre-processing step
Ongoing Research: Markus Svensen
I focus on sketch and diagram understanding
4. Practical Applications
5. Understanding Figures: Subtasks
Fitting: Identify the best affine transformation of a model for a sample of ink
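As a rough illustration of the fitting subtask, the sketch below computes a least-squares affine map from template points to ink points. It assumes known one-to-one point correspondences, which these slides do not specify. `fit_affine` and `solve3` are hypothetical names, and the summed squared error is only one possible fit score, not necessarily the one used in this project.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def solve3(m: List[List[float]], b: List[float]) -> List[float]:
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    a = [row[:] + [rhs] for row, rhs in zip(m, b)]  # augmented matrix
    n = 3
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("degenerate configuration (e.g. collinear points)")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= factor * a[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        rest = sum(a[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (a[r][n] - rest) / a[r][r]
    return x


def fit_affine(template: Sequence[Point], ink: Sequence[Point]):
    """Least-squares affine map sending template[i] towards ink[i].

    Assumption (not from the slides): correspondences are known.
    Returns (params, sse) where params = ((a, b, tx), (c, d, ty)) means
    u = a*x + b*y + tx and v = c*x + d*y + ty.
    """
    if len(template) != len(ink) or len(template) < 3:
        raise ValueError("need at least 3 matched point pairs")
    rows = [(x, y, 1.0) for x, y in template]
    # Normal equations: (R^T R) w = R^T q, solved once per output coordinate.
    m = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    params = []
    for coord in range(2):
        rhs = [sum(r[i] * q[coord] for r, q in zip(rows, ink)) for i in range(3)]
        params.append(tuple(solve3(m, rhs)))
    sse = 0.0
    for (x, y), (u, v) in zip(template, ink):
        pu = params[0][0] * x + params[0][1] * y + params[0][2]
        pv = params[1][0] * x + params[1][1] * y + params[1][2]
        sse += (pu - u) ** 2 + (pv - v) ** 2
    return tuple(params), sse
```

A low residual for one template relative to the others would suggest that template as the interpretation of the ink; how residuals are turned into scores is left open here.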
6. Model for generating ink from templates
7. Model for generating ink from templates
8. Fitting algorithm
9. Noise Immunity
10. Fitting/Recognizing Segments:
11. Segmentation: Wrapper Approach
Stroke: from pen down to pen up
Assume figures are drawn in a continuous sequence of strokes
Assume existence of temporal ordering information
i.e. S1, S2, S3, ..., ST
Further assume that max. number of strokes used to draw a template, NS, is reasonably small (e.g. 10 or less)
12. Segmentation: Divide & Conquer
[score, partition] = f(S1, S2, S3, ..., ST, NS)
Base case:
if T <= NS, consider fitting/recognising the entire set of strokes as a single figure
Recursive case:
for all k = 1 to T-1: how good is it to divide the sequence after stroke k?
[score1, partition1] = f(S1, S2, S3, ..., Sk, NS);
[score2, partition2] = f(Sk+1, Sk+2, ..., ST, NS);
Total_score(k) = score1 + score2;
Total_partition(k) = [partition1; partition2];
Return the best score/partition out of all the possibilities considered.
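The recursion above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the project's implementation: `segment` and `fit_score` are hypothetical names, higher scores are assumed to be better, the whole-run fit is treated as one candidate alongside every split, and results are memoized on index ranges with `functools.lru_cache` (an addition not mentioned in the slides) so that each run of consecutive strokes is scored only once.

```python
from functools import lru_cache
from typing import Callable, List, Sequence, Tuple

Range = Tuple[int, int]  # half-open (start, end) indices into the stroke list


def segment(
    strokes: Sequence,
    max_strokes: int,
    fit_score: Callable[[Sequence], float],
) -> Tuple[float, List[Range]]:
    """Split a temporally ordered stroke sequence into figures.

    fit_score is a hypothetical callback: it scores how well a run of
    consecutive strokes is explained by the best-fitting template
    (assumed here: higher is better, scores add across figures).
    """
    if not strokes or max_strokes < 1:
        raise ValueError("need at least one stroke and max_strokes >= 1")

    @lru_cache(maxsize=None)
    def best(i: int, j: int) -> Tuple[float, Tuple[Range, ...]]:
        candidates = []
        # Candidate 1: treat strokes[i:j] as a single figure, if short enough.
        if j - i <= max_strokes:
            candidates.append((fit_score(strokes[i:j]), ((i, j),)))
        # Candidate 2: divide between strokes k-1 and k and solve each side.
        for k in range(i + 1, j):
            score1, part1 = best(i, k)
            score2, part2 = best(k, j)
            candidates.append((score1 + score2, part1 + part2))
        return max(candidates, key=lambda c: c[0])

    score, partition = best(0, len(strokes))
    return score, list(partition)


def toy_score(group: Sequence[str]) -> float:
    """Toy stand-in for template fitting: reward runs with a single label."""
    if len(set(group)) == 1:
        return float(len(group) ** 2)
    return -float(len(group))
```

With the toy scorer, `segment(["sq", "sq", "sq", "sq", "el"], 10, toy_score)` groups the first four strokes into one figure and leaves the last stroke on its own. A real scorer would have to balance over-explaining against under-explaining, since additive scores otherwise reward either extreme.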
13. Square or 4 Lines?
14. Over-explaining / Under-explaining
15. Gets it right most of the time…
16. … but some mistakes too
17. Current limitations/problems
Works fine most of the time! Mistakes occur when figures are confusingly close together or very small
Slow:
Approx. 5 seconds for each of the previous figures.
Each fitting takes about 0.1 seconds; the cost comes from the combinatorial explosion in partitioning the image into segments
We rely on information about the temporal sequence of strokes!
Temporal information lost during cut + paste operations
Users do go back and add things to figures later
Only affine-transform-based fitting is considered.
Arrows and other complicated templates may need other (non-affine) fitting
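To give a feel for the combinatorial explosion behind the slowness noted above, the following sketch counts the ways to cut T ordered strokes into consecutive groups, and the number of distinct runs of at most NS consecutive strokes that would need fitting if each run were fitted once and cached. The function names are hypothetical, the caching is an assumption, and 0.1 seconds per fitting is the approximate figure quoted above.

```python
def count_partitions(t: int) -> int:
    """Ways to cut t ordered strokes into consecutive groups.

    Each of the t - 1 gaps between neighbouring strokes is either cut or not.
    This ignores the limit on strokes per figure, so it is an upper bound.
    """
    return 2 ** (t - 1)


def count_candidate_runs(t: int, ns: int) -> int:
    """Distinct runs of consecutive strokes with length between 1 and ns."""
    return sum(t - n + 1 for n in range(1, min(ns, t) + 1))


def estimated_fit_seconds(t: int, ns: int, seconds_per_fit: float = 0.1) -> float:
    """Rough fitting time if every candidate run is fitted exactly once."""
    return count_candidate_runs(t, ns) * seconds_per_fit
```

The number of partitions doubles with every added stroke, while the number of distinct runs grows only roughly linearly in T once T exceeds NS, which is why the order of the search through partitions matters so much for speed.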
18. Further work:
Scoring seems to be perfectly fine
Main focus on partitioning the image:
how to order the search through the set of all partitions,
while still guaranteeing that the best interpretation is reached eventually.
Speed gains in fitting/recognizing individual figures
Line based (instead of point based)
Randomized algorithms like RANSAC (Phil, Antonio)
Discriminative approach (feature extraction, learning classifiers for parallelograms, ellipses, etc.)
19. Acknowledgements
Martin Szummer, Chris Bishop, Michel Gangnet, Markus Svensen, Hannah Pepper
Antonio Criminisi, Mike Tipping, Phil Torr
The whole MLP group
All those who provided us ink samples from real, human users!
20. Questions / Suggestions !?!