This talk explores Dimensions in Synthesis, focusing on ambiguity: synthesis from examples and keywords. It discusses the potential users of synthesis technology (algorithm designers, software developers, end-users, and students and teachers), with a vision of automated personal assistants for end-users and free, high-quality education for every student. It covers intent specification through examples, sketches, and keywords, with applications such as bitvector algorithms and syntactic and semantic string transformations. It also covers synthesis from logical specifications and interactive synthesis using examples. A language for constructing output strings is examined, along with algorithms for synthesizing string expressions and guarded expressions and a strategy for ranking candidate programs.
Dimensions in Synthesis Part 3: Ambiguity (Synthesis from Examples & Keywords) Sumit Gulwani sumitg@microsoft.com Microsoft Research, Redmond May 2012
Potential Users of Synthesis Technology • Algorithm Designers • Software Developers • End-Users (Most Useful Target) • Students and Teachers (Most Transformational Target) • Vision for End-users: Enable people to have (automated) personal assistants. • Vision for Education: Enable every student to have access to free & high-quality education.
Intent Specification • Examples • Bitvector Algorithms (ICSE ‘10) • Spreadsheet Macros (CACM ‘12) • Syntactic String Transformations (POPL ‘11) • Semantic String Transformations (VLDB ‘12) • Number Transformations (CAV ‘12) • Table Transformations (PLDI ‘11) • Sketch • Drawings (CHI 2012) • Keywords • SmartPhone Apps
Intent Specification: Examples • Bitvector Algorithms. ICSE 2010: Susmit Jha, Gulwani, Seshia, Tiwari.
Synthesis from Logical Specification
Turn off rightmost 1-bit
Functional Specification: $\bigwedge_{p=1}^{b} \Big[ \Big( I[p]=1 \wedge \bigwedge_{j=p+1}^{b} I[j]=0 \Big) \Rightarrow \Big( J[p]=0 \wedge \bigwedge_{j \neq p} J[j]=I[j] \Big) \Big]$
Tool Output: J = I & (I-1)
PLDI 2011: Gulwani, Jha, Tiwari, Venkatesan.
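The specification and the tool output above can be compared by brute force on small bit widths. The sketch below is only a check, not the synthesis procedure from the talk; it assumes that index 1 is the leftmost (most significant) bit and index b the rightmost, which is what makes the bits j = p+1..b lie to the right of p. The helper names `bit`, `satisfies_spec`, and `tool_output` are illustrative.

```python
def bit(x, j, b):
    """Bit j of a b-bit vector; index 1 is the leftmost bit (assumed convention)."""
    return (x >> (b - j)) & 1

def satisfies_spec(i, j_out, b):
    """Check the functional specification for one input/output pair."""
    for p in range(1, b + 1):
        # Premise: bit p is 1 and every bit to its right is 0.
        premise = bit(i, p, b) == 1 and all(
            bit(i, j, b) == 0 for j in range(p + 1, b + 1))
        if premise:
            # Conclusion: bit p is cleared and all other bits are unchanged.
            conclusion = bit(j_out, p, b) == 0 and all(
                bit(j_out, j, b) == bit(i, j, b)
                for j in range(1, b + 1) if j != p)
            if not conclusion:
                return False
    return True

def tool_output(i, b):
    """J = I & (I-1), truncated to b bits."""
    return i & (i - 1) & ((1 << b) - 1)

B = 8
all_ok = all(satisfies_spec(i, tool_output(i, B), B) for i in range(1 << B))
print(all_ok)
```

Exhaustive checking is only feasible for small b; it is used here purely to make the meaning of the formula concrete.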
Interactive Synthesis using Examples
Turn off rightmost contiguous string of 1’s
User: I want a program that maps 01011 -> 01000
Tool: There exist at least two programs that match the spec
Program 1: (x+1) & (x-1)
Program 2: (x+1) & x
But they differ on 00000 (Distinguishing Input)
What should 00000 be mapped to?
User: 00000 -> 00000
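The notion of a distinguishing input can be illustrated with an exhaustive search over 5-bit inputs. This is a minimal sketch: the search strategy of the actual tool is not described on this slide, so brute-force enumeration is an assumption made here for clarity, and `distinguishing_inputs` is a hypothetical helper name.

```python
WIDTH = 5
MASK = (1 << WIDTH) - 1  # keep results within 5 bits

def program1(x):
    return ((x + 1) & (x - 1)) & MASK

def program2(x):
    return ((x + 1) & x) & MASK

def distinguishing_inputs(p, q, width=WIDTH):
    """All inputs on which the two candidate programs disagree."""
    return [x for x in range(1 << width) if p(x) != q(x)]

# Both candidates are consistent with the user's example 01011 -> 01000 ...
example_in, example_out = 0b01011, 0b01000
both_match = (program1(example_in) == example_out
              and program2(example_in) == example_out)

# ... yet they disagree elsewhere, so the tool can ask about such an input.
witnesses = distinguishing_inputs(program1, program2)
print(both_match, format(witnesses[0], "05b"))
```

Asking the user about an input where the candidates disagree is what lets each answer rule out at least one candidate.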
Interactive Synthesis using Examples
Turn off rightmost contiguous string of 1’s
User: 01011 -> 01000
Tool: 00000 ?
User: 00000
Tool: 01111 ?
User: 00000
Tool: 00110 ?
User: 00000
Tool: 01100 ?
User: 00000
Tool: 01010 ?
User: 01000
Tool: Your program is x & (1 + ((x-1)|x))
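The final program can be checked against every answer the user gave in the session, and against a direct bit-by-bit implementation of the intended behaviour. The `reference` function below is an illustrative assumption about what "turn off rightmost contiguous string of 1's" means; it is not taken from the talk.

```python
WIDTH = 5
MASK = (1 << WIDTH) - 1

def synthesized(x):
    """The program reported by the tool, truncated to 5 bits."""
    return (x & (1 + ((x - 1) | x))) & MASK

def reference(x):
    """Clear the rightmost run of consecutive 1-bits, one bit at a time."""
    pos = 0
    while pos < WIDTH and not (x >> pos) & 1:  # skip trailing zeros
        pos += 1
    while pos < WIDTH and (x >> pos) & 1:      # clear the run of ones
        x &= ~(1 << pos)
        pos += 1
    return x

# The six input/output pairs from the interactive session.
session = [("01011", "01000"), ("00000", "00000"), ("01111", "00000"),
           ("00110", "00000"), ("01100", "00000"), ("01010", "01000")]

session_ok = all(synthesized(int(i, 2)) == int(o, 2) for i, o in session)
agrees_everywhere = all(synthesized(x) == reference(x) for x in range(1 << WIDTH))
print(session_ok, agrees_everywhere)
```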
Language for Constructing Output Strings
Guarded Expression G := Switch((b1,e1), …, (bn,en))
String Expression e := Concatenate(f1, …, fn)
Base Expression f := s // Constant String
 | SubStr(vi, p1, p2)
Index Expression p := k // Constant Integer
 | Pos(r1, r2, k) // kth position in string whose left/right side matches r1/r2
Notation: SubStr2(vi,r,k) ≡ SubStr(vi, Pos(ε,r,k), Pos(r,ε,k))
• Denotes kth occurrence of regular expression r in vi
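A small interpreter makes the index and substring constructs concrete. This is a sketch under stated assumptions, not the semantics from the paper: r1 and r2 are restricted to a single token or ε, each token matches a maximal run of its character class, positions are 0-based gaps between characters, and the `TOKENS` table is hypothetical.

```python
import re

# Assumed token definitions; each matches a maximal run of its character class.
TOKENS = {"NumTok": r"[0-9]+", "HyphenTok": r"-+", "AlphaTok": r"[A-Za-z]+"}
EPS = "ε"  # the empty regular expression

def _boundaries(s, tok, which):
    """Positions where a match of tok starts or ends (which: 'start' or 'end')."""
    if tok == EPS:
        return set(range(len(s) + 1))  # ε matches at every position
    return {getattr(m, which)() for m in re.finditer(TOKENS[tok], s)}

def pos(s, r1, r2, k):
    """kth position whose left side matches r1 and right side matches r2."""
    hits = sorted(_boundaries(s, r1, "end") & _boundaries(s, r2, "start"))
    return hits[k - 1]

def substr(s, p1, p2):
    return s[p1:p2]

def substr2(s, tok, k):
    """kth occurrence of tok: SubStr(s, Pos(ε,tok,k), Pos(tok,ε,k))."""
    return substr(s, pos(s, EPS, tok, k), pos(s, tok, EPS, k))

def concatenate(*parts):
    return "".join(parts)

v1 = "425-706-7709"
print(substr(v1, pos(v1, "HyphenTok", EPS, 1), pos(v1, EPS, "HyphenTok", 2)))
print(concatenate(substr2(v1, "NumTok", 3), "/", substr2(v1, "NumTok", 1)))
```

Negative k (counting from the right) and token sequences are left out to keep the sketch short.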
Example: Format phone numbers
Switch((b1, e1), (b2, e2)), where
b1 ≡ Match(v1,NumTok,3), b2 ≡ ¬Match(v1,NumTok,3),
e1 ≡ Concatenate(SubStr2(v1,NumTok,1), ConstStr(“-”), SubStr2(v1,NumTok,2), ConstStr(“-”), SubStr2(v1,NumTok,3))
e2 ≡ Concatenate(ConstStr(“425-”), SubStr2(v1,NumTok,1), ConstStr(“-”), SubStr2(v1,NumTok,2))
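The phone-number program can be transcribed almost literally. The sketch assumes that Match(v1,NumTok,3) means "v1 contains at least three NumTok matches" and that NumTok is a maximal run of digits; the sample inputs are made up for illustration and do not come from the slides.

```python
import re

NUM_TOK = re.compile(r"[0-9]+")  # assumed definition of NumTok

def match(v, tok, k):
    """Assumed meaning of Match: v contains at least k matches of tok."""
    return len(tok.findall(v)) >= k

def substr2(v, tok, k):
    """kth occurrence of tok in v (1-based)."""
    return tok.findall(v)[k - 1]

def e1(v1):
    return (substr2(v1, NUM_TOK, 1) + "-" + substr2(v1, NUM_TOK, 2)
            + "-" + substr2(v1, NUM_TOK, 3))

def e2(v1):
    return "425-" + substr2(v1, NUM_TOK, 1) + "-" + substr2(v1, NUM_TOK, 2)

def format_phone(v1):
    """Switch((b1, e1), (b2, e2)): evaluate the first branch whose guard holds."""
    branches = [
        (lambda v: match(v, NUM_TOK, 3), e1),
        (lambda v: not match(v, NUM_TOK, 3), e2),
    ]
    for guard, expr in branches:
        if guard(v1):
            return expr(v1)
    raise ValueError("no guard matched")

# Hypothetical inputs, with and without an area code.
print(format_phone("(425) 706 7709"))
print(format_phone("706.7709"))
```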
Key Synthesis Idea: Divide and Conquer Reduce the problem of synthesizing expressions into sub-problems of synthesizing sub-expressions. • The reduction requires computing all solutions for each of the sub-problems: • This also makes it possible to rank the solutions and select the highest-ranked one at the top level. • A challenge here is to efficiently represent, compute, and manipulate the huge number of such solutions. • I will show three applications of this idea in the talk. • Read the paper for more tricks!
Synthesizing Guarded Expression • Application #1: We reduce the problem of learning a guarded expression P to the problem of learning string expressions for each input-output pair. Goal: Given input-output pairs (i1,o1), (i2,o2), (i3,o3), (i4,o4), find P such that P(i1)=o1, P(i2)=o2, P(i3)=o3, P(i4)=o4. Algorithm: 1. Learn set S1 of string expressions s.t. ∀e ∈ S1, [[e]] i1 = o1. Similarly compute S2, S3, S4. Let S = S1 ∩ S2 ∩ S3 ∩ S4. 2(a). If S ≠ ∅ then result is Switch((true,S)).
Example: Various choices for a String Expression [Figure: an Input string and an Output string, with three parts of the output labeled Constant]
Synthesizing String Expressions Application #2: To represent/learn all string expressions, it suffices to represent/learn all base expressions for each substring of the output. • The number of all possible string expressions (that can construct a given output string o1 from a given input string i1) is exponential in the size of the output string. • The number of substrings is just quadratic in the size of the output string! • We use a DAG-based data structure, and it supports an efficient intersection operation!
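The gap between the quadratic representation and the exponential number of programs can be seen with a toy version of the DAG. In this sketch, nodes are positions 0..n in the output string and each edge (i, j) stores the base expressions that produce out[i:j]. The choice of base expressions (a constant, plus one SubStr per occurrence in the input, identified by raw indices) is a simplifying assumption, and the intersection operation is not shown.

```python
def build_dag(inp, out):
    """Map each edge (i, j) to the base expressions that yield out[i:j]."""
    n = len(out)
    edges = {}
    for i in range(n):
        for j in range(i + 1, n + 1):
            piece = out[i:j]
            choices = [("ConstStr", piece)]
            start = inp.find(piece)
            while start != -1:  # one SubStr choice per occurrence in the input
                choices.append(("SubStr", start, start + len(piece)))
                start = inp.find(piece, start + 1)
            edges[(i, j)] = choices
    return edges

def count_programs(edges, n):
    """Number of Concatenate(...) expressions = weighted count of paths 0 -> n."""
    ways = [0] * (n + 1)
    ways[0] = 1
    for j in range(1, n + 1):
        ways[j] = sum(ways[i] * len(edges[(i, j)]) for i in range(j))
    return ways[n]

inp, out = "425-706-7709", "706-425"
dag = build_dag(inp, out)
print(len(dag), count_programs(dag, len(out)))
```

The dictionary has n(n+1)/2 entries, while the number of represented programs grows at least as 2^(n-1) even when only constants are available.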
Example: Various choices for a SubStr Expression
Various ways to extract “706” from “425-706-7709”:
• Chars after 1st hyphen and before 2nd hyphen. SubStr(v1, Pos(HyphenTok,ε,1), Pos(ε,HyphenTok,2))
• Chars from start of 2nd number and up to end of 2nd number. SubStr(v1, Pos(ε,NumTok,2), Pos(NumTok,ε,2))
• Chars from start of 2nd number and before 2nd hyphen. SubStr(v1, Pos(ε,NumTok,2), Pos(ε,HyphenTok,2))
• Chars after 1st hyphen and up to end of 2nd number. SubStr(v1, Pos(HyphenTok,ε,1), Pos(NumTok,ε,2))
Synthesizing SubStr Expressions Application #3: To represent/learn all SubStr expressions, we can independently represent/learn all choices for each of the two index expressions. The number of SubStr(v,p1,p2) expressions that can extract a given substring w from a given string v can be large! • This allows for representing and computing O(n1*n2) choices for SubStr using size/time O(n1+n2).
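The independence of the two index expressions can be sketched as follows: the choices for the left index and for the right index are learned separately and stored as two lists, and every pairing of a left choice with a right choice is a valid SubStr program. The token table, the single-token restriction on r1 and r2, and the `CPos` name for a constant position are assumptions made for this illustration.

```python
import re
from itertools import product

TOKENS = {"NumTok": r"[0-9]+", "HyphenTok": r"-+"}  # assumed token set
EPS = "ε"

def boundaries(s, tok, which):
    """Positions where a match of tok starts or ends (which: 'start' or 'end')."""
    if tok == EPS:
        return set(range(len(s) + 1))
    return {getattr(m, which)() for m in re.finditer(TOKENS[tok], s)}

def positions(s, r1, r2):
    return sorted(boundaries(s, r1, "end") & boundaries(s, r2, "start"))

def learn_position(s, target):
    """All index expressions (in this toy language) that evaluate to target."""
    choices = [("CPos", target)]
    names = [EPS] + list(TOKENS)
    for r1, r2 in product(names, repeat=2):
        if r1 == EPS and r2 == EPS:
            continue
        hits = positions(s, r1, r2)
        if target in hits:
            choices.append(("Pos", r1, r2, hits.index(target) + 1))
    return choices

def evaluate(s, p):
    return p[1] if p[0] == "CPos" else positions(s, p[1], p[2])[p[3] - 1]

v = "425-706-7709"
starts = learn_position(v, 4)  # left end of "706"
ends = learn_position(v, 7)    # right end of "706"

# n1 + n2 stored choices stand for n1 * n2 SubStr programs.
print(len(starts) + len(ends), len(starts) * len(ends))
all_extract = all(v[evaluate(v, a):evaluate(v, b)] == "706"
                  for a in starts for b in ends)
```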
Back to Synthesizing Guarded Expression
Goal: Given input-output pairs (i1,o1), (i2,o2), (i3,o3), (i4,o4), find P such that P(i1)=o1, P(i2)=o2, P(i3)=o3, P(i4)=o4.
Algorithm:
1. Learn set S1 of string expressions s.t. ∀e ∈ S1, [[e]] i1 = o1. Similarly compute S2, S3, S4. Let S = S1 ∩ S2 ∩ S3 ∩ S4.
2(a). If S ≠ ∅ then result is Switch((true,S)).
2(b). Else find a smallest partition, say {S1,S2}, {S3,S4}, s.t. S1 ∩ S2 ≠ ∅ and S3 ∩ S4 ≠ ∅.
3. Learn boolean formulas b1, b2 s.t. b1 maps i1, i2 to true and i3, i4 to false, and b2 maps i3, i4 to true and i1, i2 to false.
4. Result is: Switch((b1, S1 ∩ S2), (b2, S3 ∩ S4))
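The steps above can be sketched over a tiny, explicitly enumerated search space. Several simplifications are assumed here: the sets Si are computed by filtering a hypothetical pool of two candidate expressions rather than by learning, the partition is built greedily (which need not be the smallest one), and guards are picked from a one-element pool of predicates. The examples are made up for illustration.

```python
import re

NUM = re.compile(r"[0-9]+")

# Hypothetical pool of candidate string expressions (stand-ins for learned sets).
CANDIDATES = {
    "e1": lambda v: "-".join(NUM.findall(v)[:3]),
    "e2": lambda v: "425-" + "-".join(NUM.findall(v)[:2]),
}
# Hypothetical pool of guard predicates.
PREDICATES = {"Match(v1,NumTok,3)": lambda v: len(NUM.findall(v)) >= 3}

def consistent(inp, out):
    """Step 1: the set Si of candidates e with [[e]] inp = out."""
    return {name for name, f in CANDIDATES.items() if f(inp) == out}

def greedy_partition(examples):
    """Step 2: group examples whose candidate sets have a non-empty intersection."""
    groups = []
    for inp, out in examples:
        s = consistent(inp, out)
        if not s:
            raise ValueError("no candidate explains " + repr(inp))
        for group in groups:
            if group["programs"] & s:
                group["programs"] &= s
                group["inputs"].append(inp)
                break
        else:
            groups.append({"programs": s, "inputs": [inp]})
    return groups

def learn_guard(inside, outside):
    """Step 3: a predicate (or its negation) true on inside, false on outside."""
    for name, pred in PREDICATES.items():
        if all(pred(v) for v in inside) and not any(pred(v) for v in outside):
            return name
        if not any(pred(v) for v in inside) and all(pred(v) for v in outside):
            return "¬" + name
    raise ValueError("no guard separates the groups")

def synthesize(examples):
    groups = greedy_partition(examples)
    if len(groups) == 1:  # step 2(a): one expression set covers everything
        return [("true", sorted(groups[0]["programs"]))]
    result = []
    for g in groups:      # step 4: one Switch branch per group
        outside = [v for h in groups if h is not g for v in h["inputs"]]
        result.append((learn_guard(g["inputs"], outside), sorted(g["programs"])))
    return result

examples = [("(425) 706 7709", "425-706-7709"), ("323.708.7700", "323-708-7700"),
            ("706 7709", "425-706-7709"), ("235.7654", "425-235-7654")]
program = synthesize(examples)
print(program)
```

Each returned pair is one branch of the Switch: a guard together with the set of string expressions consistent with every example in that group.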
Ranking Strategy • Prefer shorter programs. • Fewer conditionals. • Shorter string expressions and regular expressions. • Prefer programs with fewer constants.
Intent Specification: Examples • Semantic String Transformations • Number Transformations. VLDB 2012/CAV 2012: Rishabh Singh, Gulwani
Intent Specification: Examples • Table Transformations. PLDI 2011: Bill Harris, Gulwani
Intent Specification: Sketch • Drawings. CHI 2012: Salman Cheema, Gulwani, LaViola
Architecture: (Partial) Sketch/Ink Strokes → Sketch Recognition Engine [HCI] → Circle/Line Objects → Constraint Inference Engine [Machine Learning] → Constraints between Objects → Model Synthesis/Beautification Engine [Theorem Proving] → (Partial) Drawing → Pattern Synthesis Engine [Program Synthesis] → Suggestions for Drawing Completion
Intent Specification: Keywords • SmartPhone Apps. Joint work with: Vu Le, Zhendong Su (UC-Davis)
Dimensions in Synthesis • Concept Language (Application): Programs; Straight-line programs; Automata; Queries; Sequences • User Intent (Ambiguity): Logic, Natural Language; Examples, Demonstrations/Traces • Search Technique (Algorithm): SAT/SMT solvers (Formal Methods); A*-style goal-directed search (AI); Version space algebras (Machine Learning) PPDP 2010: “Dimensions in Program Synthesis”, Gulwani.