Combinatorial Optimization for Text Layout

Richard Anderson, University of Washington. Microsoft Research, Beijing, September 6, 2000. http://www.cs.washington.edu/homes/anderson/msrcn.ppt

Presentation Transcript


  1. Combinatorial Optimization for Text Layout Richard Anderson University of Washington Microsoft Research, Beijing, September 6, 2000 http://www.cs.washington.edu/homes/anderson/msrcn.ppt

  2. Biography • Background • Education • PhD Stanford (1985), Post Doc MSRI, Berkeley • Experience • University of Washington, since 1986. Associate Chair for outreach. Visiting prof. IISc, Bangalore, 1993-1994 • Professional Interests • Algorithms • Parallel algorithms, N-Body Simulation, Model Checking for Software, Text Layout • Distance Learning • Tutored Video Instruction, Professional Master’s Program

  3. Optimization for Text Layout • Express text placement as a geometric optimization problem. • Why??? • Generate best layouts • Body of algorithmic research to build on, as well as high performance hardware • Problem specification and formalization • Flexibility via parameterization

  4. TeX [Knuth] • Typography as optimization • Optimal paragraphing via dynamic programming algorithm • Flexibility • Tradeoff between uneven lines and hyphenation frequency • Penalty: weighted sum of whitespace and hyphenation penalties
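The dynamic programming idea on this slide can be sketched as follows. This is a simplified illustration, not TeX's actual algorithm: it assumes fixed-width characters, ignores hyphenation, and uses the squared trailing whitespace of every line except the last as the penalty. The function name `break_paragraph` is hypothetical.

```python
def break_paragraph(words, width):
    """Choose line breaks minimising the sum of squared trailing whitespace.

    Assumptions (not from the slides): monospaced text, one space between
    words, no hyphenation, and the last line carries no penalty.
    """
    n = len(words)
    inf = float("inf")
    best = [inf] * (n + 1)   # best[i]: minimum penalty for setting words[i:]
    nxt = [n] * (n + 1)      # nxt[i]: index of the first word on the following line
    best[n] = 0
    for i in range(n - 1, -1, -1):
        length = -1          # length of words[i:j] joined by single spaces
        for j in range(i + 1, n + 1):
            length += len(words[j - 1]) + 1
            if length > width:
                break
            penalty = 0 if j == n else (width - length) ** 2
            if penalty + best[j] < best[i]:
                best[i] = penalty + best[j]
                nxt[i] = j
    if best[0] == inf:
        raise ValueError("some word is wider than the line")
    lines, i = [], 0
    while i < n:
        lines.append(" ".join(words[i:nxt[i]]))
        i = nxt[i]
    return lines


print(break_paragraph("aaa bb cc ddddd".split(), 6))
```

On this input a greedy fill would put "aaa bb" on the first line and leave "cc" alone on the second; the dynamic program prefers the more even split.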

  5. Outline • Survey of problems studied • 1) Generating all paragraphs of text • 2) Picture layout with anchors to text • 3) Optimal table layout • 4) Customized content compression

  6. Paragraphing problem • Given geometric constraints, find line breaks • Fixed width, find minimum height • Greedy algorithm • Fixed height, find minimum width • Only need to consider n^2 widths: O(n^3) algorithm • Most practical approach: binary search on width, an O(n log W) algorithm • Theoretical O(n) algorithm
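A minimal sketch of the two practical approaches named on this slide: greedy filling for a fixed width, and binary search on the width for a fixed height. It assumes integer word widths and a fixed inter-word space; the function names are hypothetical.

```python
def greedy_line_count(word_widths, width, space=1):
    """Fewest lines needed at the given width, filling each line greedily.

    Assumes every word fits on a line by itself (width >= max word width).
    """
    lines, used = 1, 0
    for w in word_widths:
        if used == 0:
            used = w
        elif used + space + w <= width:
            used += space + w
        else:
            lines += 1
            used = w
    return lines


def min_width_for_height(word_widths, max_lines, space=1):
    """Smallest width at which the text fits in max_lines lines.

    Binary search over integer widths; each probe is one greedy pass,
    giving O(n log W) time overall.
    """
    lo = max(word_widths)                                      # narrowest feasible width
    hi = sum(word_widths) + space * (len(word_widths) - 1)     # everything on one line
    while lo < hi:
        mid = (lo + hi) // 2
        if greedy_line_count(word_widths, mid, space) <= max_lines:
            hi = mid
        else:
            lo = mid + 1
    return lo


print(min_width_for_height([3, 2, 2, 5], max_lines=2))
```

The search is valid because the greedy line count never increases as the width grows.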

  7. All minimal paragraph sizes • Find minimum width paragraph for a given height. • Solve for each height: best known: O(n^{3/2}) • [Figure: the sample paragraph below set at three different width/height combinations] Malfoy couldn’t believe his eyes when he saw that Harry and Ron were still at Hogwarts the next day, looking tired but perfectly cheerful.

  8. All minimal paragraph sizes • Motivation • Placement of floating text • Formatting tables with text entries • Basic approach • Break into segments of roughly n^{1/2} words each • Compute possibilities for these, and then combine • Much work still to do on this problem

  9. Placement of text and pictures • Given text with embedded pictures and tables • Place pictures close to their references (anchors) • This is a major headache when using LaTeX! • Further complications • Multi-column layouts • Partial column width pictures • Typographic considerations for text and headings • Other graphical layout considerations

  10. Placement of text and pictures • Given text and pictures, where each picture has a location in the text, find a layout which minimizes the sum of the text-anchor distances • Single page and multi page problems • Horizontal placement of pictures fixed wrt column boundaries • May require that picture order is consistent with text order

  11. Results • 2-d bin packing problem – do the pictures fit on the page. • May not be the problem of interest – simpler cases – pictures fit in columns, align with text rows, fixed horizontal position in columns. • Easy for one column. • NP-complete for three or more columns. • NP-complete even if picture area is very small.

  12. Fixed horizontal bin packing • Two-d bin packing, except that rectangles have fixed horizontal positions • Motivated by picture placement • Best known result: 3-approximation algorithm • Problem arises in memory allocation

  13. Practical results • The number of pictures and columns is small. (columns <= 5, pictures <= 10). • Enumeration works well for pictures <= 3. • Branch and bound works well for pictures <=6. • Heuristics + B&B work well for given range. • Prototypes developed, including typography and aesthetic considerations. • Very interesting layouts generated

  14. Tables • General Problem • Given a set of configurations for each cell, find the maximum value table that satisfies size constraints • Special Cases • Layout Problem • No values, minimize table height for fixed width • Compression Problem • Configurations for a cell satisfy nesting property • Value decreases with size

  15. Layout Problem (with S. Sobti) • NP complete • Restricted instances: {(1,2), (2,1)}, {(1,1)} • [Figure: sample table, shown in two layouts] Potions. Severus Snape; Care of magical creatures. Rubeus Hagrid; Divination. Sybill Trelawney; Defense against dark arts. R. J. Lupin

  16. Layout Problem: results • Fixed W, minimize H, NP complete • Minimize aW+bH solvable with mincut algorithm • Compute convex hull of feasible table configurations • Heuristic algorithm
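To make the layout problem concrete, here is a brute-force sketch of "fixed W, minimize H". It is not the mincut or heuristic algorithm mentioned on the slide: it enumerates every combination of cell configurations, which is exponential and only usable on tiny tables. The modeling assumptions (a column is as wide as its widest cell, a row as tall as its tallest cell) and the name `min_height_layout` are illustrative.

```python
from itertools import product


def min_height_layout(cells, max_width):
    """Exhaustively pick one (width, height) configuration per cell.

    cells[r][c] is a list of (width, height) options for that cell.
    Returns (total height, chosen configurations in row-major order),
    or None if no combination fits within max_width.
    """
    rows, cols = len(cells), len(cells[0])
    flat = [cells[r][c] for r in range(rows) for c in range(cols)]
    best = None
    for choice in product(*flat):
        # Column width = widest cell in the column; row height = tallest cell in the row.
        col_w = [max(choice[r * cols + c][0] for r in range(rows)) for c in range(cols)]
        row_h = [max(choice[r * cols + c][1] for c in range(cols)) for r in range(rows)]
        if sum(col_w) <= max_width:
            height = sum(row_h)
            if best is None or height < best[0]:
                best = (height, choice)
    return best


# A 2 x 2 table built from the restricted instance on the previous slide:
# cells are either flexible {(1,2), (2,1)} or fixed {(1,1)}.
flexible, fixed = [(1, 2), (2, 1)], [(1, 1)]
table = [[flexible, fixed], [fixed, flexible]]
for w in (2, 3, 4):
    print(w, min_height_layout(table, w)[0])
```

Even this toy table shows the width/height tradeoff: each extra unit of width allowed removes one unit of height.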

  17. Table compression problem • Display a table in less than the required area, with a penalty for shrinking cells • [Figure: the sample table at six successive levels of compression] • (1) Potions. Severus Snape; Care of magical creatures. Rubeus Hagrid; Divination. Sybill Trelawney; Defense against dark arts. R. J. Lupin • (2) Potions. Severus Snape; Care of magical creatures. Hagrid; Divin. Sybill T.; Defense against dark arts. Lupin • (3) Potions. Severus Snape; Care of magical critters. Hagrid; Divin. Sybill T.; Def. dark arts. Lupin • (4) Potions. S. Snape; Care of creatures. Hagrid; Divin. Sybill T.; Def. dark arts. Lupin • (5) Potions. S. Snape; Critr care. Hagrid; Divin. Sybill T.; Dark arts. Lupin • (6) Pot; Critters. Hagrid; Div; D. arts. Lupin

  18. Compression Problem • NP complete for simple case • Choice cells: 1 x 1 (value 1), 0 x 0 (value 0) • Dummy cells: 0 x 0 (value 0) • Maximize number of full size choice cells when an n x n table is compressed to n/2 x n/2 • Reduction from clique problem • Incidence matrix reduction

  19. Attacking the 0-1 problem • Equivalent problem: maximum density (n/2,n/2)-subgraph of an (n,n)-bipartite graph • Choose n/2 vertices from each side to maximize the number of edges between chosen vertices • [Figure: bipartite graph with vertices 1 to 4 on each side]

  20. Greedy Algorithm • Find the maximum density subgraph (MDS) of G=(X,Y,E) • Choose X’, the set of n/2 vertices of highest degree w.r.t. Y • Choose Y’, the set of n/2 vertices of highest degree w.r.t. X’ • Claim: (X’,Y’) is a 1/2 approximation of the MDS • Proof: (X’,Y) has at least as many edges as the MDS. (X’,Y’) has at least half as many edges as (X’,Y)
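The two greedy steps on this slide translate almost directly into code. The sketch below assumes both sides of the bipartite graph are numbered 0 to n-1 and that edges are given as (x, y) pairs; the function name and the tie-breaking rule (lowest index first) are assumptions.

```python
def greedy_dense_subgraph(n, edges):
    """Greedy 1/2-approximation sketch for the (n/2, n/2) densest bipartite subgraph.

    edges: iterable of (x, y) with x in X = range(n) and y in Y = range(n).
    Returns (X', Y', number of edges between X' and Y').
    """
    edges = list(edges)
    k = n // 2
    # Step 1: X' = the k vertices of X with highest degree into all of Y.
    deg_x = [0] * n
    for x, _ in edges:
        deg_x[x] += 1
    x_sel = set(sorted(range(n), key=lambda x: -deg_x[x])[:k])
    # Step 2: Y' = the k vertices of Y with highest degree into X' only.
    deg_y = [0] * n
    for x, y in edges:
        if x in x_sel:
            deg_y[y] += 1
    y_sel = set(sorted(range(n), key=lambda y: -deg_y[y])[:k])
    count = sum(1 for x, y in edges if x in x_sel and y in y_sel)
    return x_sel, y_sel, count


edges = [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (2, 3), (3, 3)]
print(greedy_dense_subgraph(4, edges))
```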

  21. Greedy Algorithms • Non-bipartite graphs • Add vertices of maximum degree starting with empty graph • Remove vertices of minimum degree, starting with full graph • 4/9 approximation algorithm (Asahiro et al.) • Open problem: generalize and analyze greedy algorithms for tables
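The "remove vertices of minimum degree" strategy can be sketched as a simple peeling loop. This is an illustration of the idea only; it makes no claim to reproduce the exact procedure or the approximation bound analyzed by Asahiro et al. The adjacency-dictionary input format and the name `peel_to_k` are assumptions.

```python
def peel_to_k(adj, k):
    """Repeatedly delete a minimum-degree vertex until k vertices remain.

    adj: dict mapping each vertex to the set of its neighbours (undirected).
    Ties are broken by the smaller vertex label, so labels must be comparable.
    """
    live = {v: set(ns) for v, ns in adj.items()}   # work on a copy
    while len(live) > k:
        v = min(live, key=lambda u: (len(live[u]), u))
        for u in live[v]:
            live[u].discard(v)                     # update neighbours' degrees
        del live[v]
    return set(live)


# Triangle 0-1-2 with a path 0-3-4 hanging off it.
graph = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0, 4}, 4: {3}}
print(peel_to_k(graph, 3))
```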

  22. Semidefinite programming • Maxcut problem: divide vertices of a graph into two sets to maximize number of edges between the sets. • Goemans-Williamson SDP result: • Improved approximation bound from 0.5 to 0.878 • Introduced new technique to the field • Idea: solve the problem on an n-dimensional sphere, use a random projection to divide vertices. • MDS problem can also be attacked with SDP. Technical problems with bipartiteness and equal division lead to a weak result.
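For reference, the "solve on a sphere, then project" idea is usually written as follows. This is the standard textbook statement of the Goemans-Williamson relaxation for an unweighted graph with edge set E, not a formula taken from the slides.

```latex
\max_{v_1,\dots,v_n \in \mathbb{R}^n} \;\; \sum_{(i,j)\in E} \frac{1 - v_i \cdot v_j}{2}
\qquad \text{subject to} \quad \lVert v_i \rVert = 1 \;\; \text{for all } i .

\text{Rounding: pick a uniformly random unit vector } r \text{ and set }
S = \{\, i : v_i \cdot r \ge 0 \,\}.

\Pr\bigl[(i,j) \text{ is cut}\bigr] = \frac{\arccos(v_i \cdot v_j)}{\pi}
```

Comparing the cut probability of each edge with its contribution to the objective is what yields the 0.878 factor.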

  23. Research directions • Can semidefinite programming beat the greedy algorithm on the 0-1 problem? • Develop greedy algorithms for the general case. • Linear programming: fractional solution to table problems has a natural interpretation. • Results on rounding? • Combinatorial algorithms for the fractional problem. • Develop/analyze fast heuristic algorithms

  24. Content Choice • If information does not fit, allow substitutions • The Dark Forces: A Guide to Self-Protection, Quenton Trimble, Hogwarts Academic Press, Hogsmeade, 1999, 2nd Edition, 238 pages, Albus Dumbledore editor. • The Dark Forces: A Guide to Self-Protection, Quenton Trimble, Hogwarts Ac. Press, Hogsmeade, 1999, 2nd Ed., 238 pp, Albus Dumbledore ed. • The Dark Forces: A Guide to Self-Protection, Quenton Trimble, Hogwarts Ac. Press, Hogsmeade, 1999, 2nd Edition, 238 pages • The Dark Forces: A Guide to Self-Protection, Quenton Trimble, Hogwarts, Hogsmeade, 1999, 2nd Ed., 238 pp.

  25. • The Dark Forces: A Guide to Self-Protection, Q. Trimble, HAP, Hogs., `99, 2nd, 238 pp. • Dark Forces, Q. Trimble, HAP, `99, 2nd. • The Dark Forces: Self-Protection, Q. Trimble, HAP, 1999, 2nd, 238 pp. • Dark Forces, Q. Trimble, HAP, 1999. • The Dark Forces, Q. Trimble, HAP, Hogs., 1999, 2nd, 238 pp. • Dk. Forces, Q. Trimble, HAP, 1999. • The Dark Forces Q. Trimble, HAP, `99, 2nd, 238 pp. • Dark Forces, Trimble.

  26. Source representation
     <text>
       <choice>
         <fragment val=90> The Dark Forces: A Guide to Self-Protection </fragment>
         <fragment val=50> The Dark Forces: Self-Protection </fragment>
         <fragment val=30> The Dark Forces</fragment>
         <fragment val=20> Dark Forces</fragment>
         <fragment val=10> Dk. Forces</fragment>
       </choice>
       <choice>
         <fragment val=30> Hogwarts Academic Press </fragment>
         <fragment val=20> Hogwarts Ac. Press </fragment>
         <fragment val=15> Hogwarts </fragment>
         <fragment val=10> HAP </fragment>
         <fragment val=0> </fragment>
       </choice>
       . . .
     </text>

  27. Typography with content choice • Problem 1: • Given a fixed area for the text, find the optimal choice of content • Problem 2: • Find the set of all maximal configurations • Problem 3: • Find a good approximation to the set of all maximal configurations

  28. Content Choice • Algorithmic choice: rectangles with values. Place one rectangle from each set to maximize value. • [Figure: rectangles labeled with values 40, 40, 15, 25, 20]

  29. Warm up problem: Lists • Optimally display the list for a fixed height • Set of configurations for each list item. (height, value) • Solvable with knapsack dynamic programming algorithm
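The knapsack formulation for lists can be sketched as a multiple-choice knapsack: exactly one (height, value) configuration is chosen per item, subject to a total height budget. The sketch assumes integer heights; the function name and the sample data are hypothetical.

```python
def best_list_display(items, max_height):
    """Pick one (height, value) configuration per list item to maximise value.

    items: list of lists of (height, value) pairs, one inner list per item.
    Returns (total value, chosen configuration index per item), or None if
    even the smallest configurations do not fit in max_height.
    """
    neg = float("-inf")
    best = [neg] * (max_height + 1)   # best[h]: max value with total height exactly h
    best[0] = 0
    back = []                         # per item: (config index, previous height) for each h
    for configs in items:
        new = [neg] * (max_height + 1)
        pick = [None] * (max_height + 1)
        for h in range(max_height + 1):
            if best[h] == neg:
                continue
            for idx, (ch, cv) in enumerate(configs):
                t = h + ch
                if t <= max_height and best[h] + cv > new[t]:
                    new[t] = best[h] + cv
                    pick[t] = (idx, h)
        best = new
        back.append(pick)
    top = max(range(max_height + 1), key=lambda h: best[h])
    if best[top] == neg:
        return None
    chosen, h = [], top
    for pick in reversed(back):       # walk the back-pointers from last item to first
        idx, h = pick[h]
        chosen.append(idx)
    chosen.reverse()
    return best[top], chosen


items = [
    [(3, 10), (2, 7), (1, 3)],   # item 1: full, medium, terse
    [(3, 9), (2, 7), (1, 2)],    # item 2
    [(2, 5), (1, 4), (0, 0)],    # item 3 may be dropped entirely
]
print(best_list_display(items, 5))
```

The running time is O(H · total number of configurations) for height budget H, which is what makes recomputation on every resize plausible for short lists.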

  30. List compression • [Figure: the same book list at four levels of compression] • (1) Harry Potter and the Prisoner of Azkaban ~ Usually ships in 24 hours J. K. Rowling / Hardcover / Published 1999 Our Price: $9.98 ~ You Save: $9.97 (50%); Harry Potter and the Sorcerer's Stone ~ Usually ships in 24 hours J. K. Rowling / Hardcover / Published 1998 Our Price: $8.98 ~ You Save: $8.97 (50%); Harry Potter and the Chamber of Secrets J. K. Rowling / Hardcover / Published 1999 Our Price: $8.98 ~ You Save: $8.97 (50%) • (2) Harry Potter and the Prisoner of Azkaban ~ J. K. Rowling / Hardcover / Published 1999 Our Price: $9.98; Harry Potter and the Sorcerer's Stone J. K. Rowling / Hardcover / Published 1998 Our Price: $8.98; Harry Potter and the Chamber of Secrets J. K. Rowling / Hardcover / Published 1999 Our Price: $8.98 • (3) Harry Potter and the Prisoner of Azkaban ~ J. K. Rowling / HC / Publ 1999 Our Price: $9.98; Harry Potter and the Sorcerer's Stone J. K. Rowling / HC / 1998 $8.98; Harry Potter and the Chamber of Secrets J. K. Rowling / HC / 1999 $8.98 • (4) Harry Potter and the Prisoner of Azkaban J. K. Rowling $9.98; Harry Potter and the Sorcerer's Stone Rowling; HP : Chamber of Secrets

  31. Implementation goal • Real time resizing of lists • Maintain optimal display as window size changes. • Recompute at refresh rate • Knapsack/dynamic programming algorithm • http://www.cs.washington.edu/homes/anderson/demo2/Page1.htm

  32. Customization • Choice-content generation • Generate choices for fields • Automatic abbreviations • Dictionary lookup • Assign weights • Based on compression and component • Based on user profile
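A minimal sketch of choice-content generation by dictionary lookup. The abbreviation pairs are taken from the examples on earlier slides; the weighting rule (value proportional to the share of characters retained) is purely an assumption for illustration, since the slides do not specify how weights are derived.

```python
# Abbreviations seen in the slides' examples; a real system would use a larger dictionary.
ABBREVIATIONS = {"Edition": "Ed.", "pages": "pp", "Hardcover": "HC", "Published": "Publ"}


def generate_choices(field, full_value=100):
    """Return (text, value) variants of a field, from fullest to empty.

    Assumed weighting (not from the slides): a variant's value is
    full_value scaled by the fraction of characters it retains.
    """
    if not field:
        return [("", 0)]
    abbreviated = " ".join(ABBREVIATIONS.get(word, word) for word in field.split())
    variants = [field]
    if abbreviated != field:
        variants.append(abbreviated)
    variants.append("")               # dropping the field is always an option
    return [(text, round(full_value * len(text) / len(field))) for text in variants]


print(generate_choices("2nd Edition"))
```

Variants produced this way could feed a per-field list of alternatives like the `<choice>` / `<fragment>` representation shown on slide 26.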

  33. Browsing applications • Browsing book lists • User sets degree of compression • Issues query • Source gives default weights • Value of field • Strength of match • Value of item • Weights modified based on user profile • Optimal list display done for given compression factor

  34. Display of 2-d time tables • Show most likely routes and times at highest precision • Based on user profile and travel data • Memory of user interactions (expanding items)

  35. Summary • Graphical layout as geometric optimization • Theoretical background • Basic algorithms for rectangle placement • Algorithm implementation • Performance requirements are significant • Application • Do these techniques work for universal, customized display?
