CMo: When Less Is More

Presentation Transcript


  1. CMo: When Less Is More • Context-Directed Browsing for Mobiles • Yevgen Borodin, Jalal Mahmud, I.V. Ramakrishnan

  2. Miniaturization and Mobility

  3. Mobile Web

  4. Regular Web Sites

  5. Happy Scrolling

  6. Browsing Example

  7. Mobile Browsing Problems • Data Transfer Cost is High • Connection is Slow • Small Screens • Lots of Scrolling • Time-Consuming • Strenuous • Tiring

  8. Browsing With CMo

  9. Architecture (diagram labels: CMo Proxy Server, Interface Manager, Context Analyzer, Browser Object, Geometric Analyzer)

  10. First Problem: identifying significant frames • CMo HTTP proxy • Utilizes Mozilla to parse DOM • Get a tree of “frames” • Tag these by content: “link”, “text”, “image link” … • Identify “maximal semantic blocks” • Discard leaves • Look for all X- or Y-aligned blocks

  11. Context Collection • The Page is Segmented into 5 Blocks

  12. Next Problem: identifying context of links • User has clicked somewhere • What is the context? • Possible ideas • The text of the link itself • The surrounding text (in the HTML stream) • The surrounding text (on the page) • CMo looks at the nearby text • … only if it has something to do with the link text

  13. Next Problem: identifying context of links • Link text parsed into 1, 2, 3-grams • “Rice not ruling out talks with Iranians” -> • Rice, not, ruling, out, with, Iranians • Rice ruling, ruling out, … • Rice ruling out, ruling out talks, …
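
The n-gram step on this slide can be sketched as follows. This is a minimal illustration, not CMo's actual code: the function name `ngrams` and the whitespace tokenization are assumptions. The slide's own lists appear to skip some words (for example "Rice ruling"), which may indicate extra filtering that this sketch does not attempt.

```python
def ngrams(text, max_n=3):
    """Return the set of word n-grams (n = 1..max_n) as space-joined strings.

    Hypothetical helper: plain whitespace tokenization, no stop-word removal.
    """
    words = text.split()
    grams = set()
    for n in range(1, max_n + 1):
        # Slide a window of n words across the link text.
        for i in range(len(words) - n + 1):
            grams.add(" ".join(words[i:i + n]))
    return grams


link_text = "Rice not ruling out talks with Iranians"
grams = ngrams(link_text)
```

With seven distinct words, the set holds 7 unigrams, 6 bigrams and 5 trigrams, including "ruling out" and "ruling out talks" from the slide.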

  14. Next Problem: identifying context of links • Perform similar analysis on sibling blocks • Calculate cosine similarity between m-sets • Cardinality of intersecting members • Divided by the product of the square roots of the two sets’ cardinalities • USA, news, sports | USA, world -> .4 • USA, news | USA, world -> .5 • USA, news | USA, news -> 1
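
The set-based cosine measure described on this slide can be checked with a short sketch. The function names `set_cosine` and `is_context` are assumptions made for illustration; the threshold T is not given in the transcript, so it is left as a required parameter.

```python
import math


def set_cosine(m1, m2):
    """Cosine similarity of two sets: |M1 ∩ M2| / (sqrt(|M1|) * sqrt(|M2|))."""
    m1, m2 = set(m1), set(m2)
    if not m1 or not m2:
        return 0.0  # assumption: an empty set is similar to nothing
    return len(m1 & m2) / math.sqrt(len(m1) * len(m2))


def is_context(m1, m2, threshold):
    """Hypothetical decision rule: treat the blocks as related when Cos(M1, M2) > T."""
    return set_cosine(m1, m2) > threshold


a = set_cosine({"USA", "news", "sports"}, {"USA", "world"})
b = set_cosine({"USA", "news"}, {"USA", "world"})
c = set_cosine({"USA", "news"}, {"USA", "news"})
```

The three calls reproduce the slide's examples: roughly 0.41 (shown as .4), 0.5, and 1.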

  15. Context Collection (figure: blocks M1 and M2 compared against threshold T; cases Cos(M1, M2) > T and Cos(M1, M2) < T)

  16. Last Problem: where to zoom at target • Break target page into frames • Compare each frame with context • Metrics used: • Words, 2-, 3-grams matched exactly • Words, 2-, 3-grams that stem match

  17. Last Problem: where to zoom at target • End up with a 6-tuple for each target block • How to rank… Machine Learning! • Supervised learning using SVM • Linear classifier • Maximizes distance from hyperplane (QP) • 900 labeled examples, 100 unlabeled.
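
One way to read the 6-tuple is as six match counts between the link context and a target block: exact words, bigrams and trigrams, then the same three after stemming. The sketch below follows that reading. It is an assumption, not the authors' feature extractor: `crude_stem` is a deliberately rough stand-in for a real stemmer, and `block_features` is a hypothetical name.

```python
def crude_stem(word):
    """Rough stand-in for a real stemmer: lowercase and strip one common suffix."""
    word = word.lower()
    for suffix in ("ing", "ed", "es", "s"):
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[:-len(suffix)]
    return word


def ngram_set(words, n):
    """Set of n-grams (as tuples) over a list of words."""
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def block_features(context_text, block_text):
    """Return (exact 1/2/3-gram matches, stemmed 1/2/3-gram matches) as a 6-tuple."""
    ctx, blk = context_text.split(), block_text.split()
    ctx_stem = [crude_stem(w) for w in ctx]
    blk_stem = [crude_stem(w) for w in blk]
    exact = [len(ngram_set(ctx, n) & ngram_set(blk, n)) for n in (1, 2, 3)]
    stemmed = [len(ngram_set(ctx_stem, n) & ngram_set(blk_stem, n)) for n in (1, 2, 3)]
    return tuple(exact + stemmed)


features = block_features("Rice not ruling out talks", "Rice ruled out talks today")
```

Here "ruling" and "ruled" only match after stemming, so the stemmed counts exceed the exact ones.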

  18. Features -> SVM -> Rank • The Page is Segmented into 3 Blocks • Block ranks: 0.1, 0.4, 0.8

  19. The Highest Ranking Block is Most Relevant! 0.8
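
Picking the highest-ranking block can be sketched with a linear decision function of the form w·x + b, which is what a trained linear SVM evaluates. The weights, bias and block names below are invented for illustration only; the transcript does not give the trained model, nor how raw scores map to ranks such as 0.8.

```python
def linear_score(features, weights, bias):
    """Linear decision value w·x + b for one block's feature tuple."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def most_relevant(blocks, weights, bias):
    """Return the name of the block with the highest decision value.

    blocks: dict mapping a block name to its 6-tuple of match counts.
    """
    return max(blocks, key=lambda name: linear_score(blocks[name], weights, bias))


# Hypothetical model parameters and feature tuples, not taken from the slides.
weights = (1, 2, 3, 1, 2, 3)
bias = -2
blocks = {
    "header": (0, 0, 0, 1, 0, 0),
    "story": (3, 1, 0, 4, 2, 1),
    "footer": (1, 0, 0, 1, 0, 0),
}
best = most_relevant(blocks, weights, bias)
```

With these made-up numbers the "story" block scores highest and would be the one to zoom to.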

  20. Exact Match of Context Words: Rice • Exact Bigram Match: ruling talks • Exact Trigram Match: Secretary State Condoleezza • Match of Word Stems: rule • Match of Stemmed Bigrams: talk Iranian • Match of Stemmed Trigrams: Iranian offici confer

  21. Experimental Setup • Web Site Domains (5 Websites in Each) • News, Books, Consumer Electronics • Office Supplies, Informational • 30 Graduate Students

  22. Training SVM for Block Relevance • Data Collection • Collected 1000+ Pairs of Pages from 25 Web Sites • Labeled Data with Link, Context, Relevant Block • Training SVM • Computed Features for 900 Pairs of Pages • Trained SVM Model with Feature Vectors • Used 100 Pages for Cross-Validation

  23. Somewhat complicated procedure for training • Classification of blocks on link targets • Feeds back into the link context threshold

  24. Evaluation • Accuracy of Context Identification • Accuracy of Relevant Block Identification • Browsing Time with CMo vs. Regular Browser • Number of Pen Taps with CMo

  25. Evaluation: Context Collection • Using 500 Web Pages from 25 Websites

  26. Evaluation: Relevancy Detection • SVM Model Trained Using 900 Page Pairs • Testing Done with Remaining 100 page pairs

  27. Evaluation • Users perform news tasks such as (T1) • In Google News, find a given story • Click link to New York Times • Provide a specific piece of information contained in that story • Other tasks were shopping-like (T8) • Go to Amazon • Click on “Pink ipod” • Determine its sales rank

  28. Evaluation: Stylus Taps

  29. Evaluation: Time

  30. Future Work • Porting CMo to Client Side • Expand SVM Features • Use Partitioning to Improve Segmentation • Explore Navigation Options

  31. Contributions • Using Context to Find Relevant Information • Saving Users Browsing time • Reducing the Number of Stylus Taps • Conveying the Richness of Web Pages

  32. Questions?
