
Notes on Sequence Binary Decision Diagrams: Relationship to Acyclic Automata and Complexities of Binary Set Operations

Shuhei Denzumi (1), Ryo Yoshinaka (2, 1), Shin-ichi Minato (1, 2), and Hiroki Arimura (1)

(1) Hokkaido University
(2) JST ERATO Minato Discrete Structure Manipulation System Project

Background
  • Research on string processing has become active.
    • Massive online data: the Internet and sensing networks.
    • String matching and string mining problems.
  • Data mining
    • Input data should be represented in a compact form
    • Computation directly on the compressed structure is needed

[Figure: Input → Compress → Data Structure → Operation → Result]

Notes on Sequence Binary Decision Diagrams: Relationship to Acyclic Automata and Complexities of Binary Set Operations, by Shuhei Denzumi, Ryo Yoshinaka, Shin-ichi Minato, and Hiroki Arimura, 2011-08-30 (TUE), Prague Stringology Conference 2011

Manipulatable & Compact
  • Manipulatable compact data structures
    • Represent data in compressed form
    • Provide operations that manipulate the data while it stays compressed
    • Have received much attention in recent years
  • Binary Decision Diagram (BDD)
    • LSI area
  • Deterministic Finite Automata (DFA)
    • Natural Language Processing area

[Figure: each Input is turned by Compaction into a Data Structure (D 1, D 2); an Operation on D 1 and D 2 yields D 3.]

What is Sequence BDD?
  • Sequence Binary Decision Diagram (SeqBDD, SDD).
    • Loekito, Bailey, and Pei (2009)
    • Graph structure
    • Represents finite sets of finite-length strings
  • The basic properties of SDDs are not yet known
    • Minimization
    • Size complexity
    • Operation time
  • Application
    • Data mining
    • Graph mining
    • Human genome sequencing

[Figure: Text, Text, Text → Sequence Binary Decision Diagram]

Family of BDDs
  • Compact representation for discrete structure
    • With rich algebraic operations

  • BDD [Bryant 1986]: Boolean functions
    • xy ∨ yz ∨ zx
    • ¬xyz ∨ x¬yz ∨ xy¬z
  • ZDD [Minato 1993]: sets of combinations
    • {{a}, {b}, {a, b}}
    • {{a}, {b}, {c}, {a, b, c}}
  • SDD [Loekito et al. 2009]: sets of strings
    • {abc, acb, bac, bca}
    • {a, b, ab, bab, abbab}

Result
  • Relationship to Acyclic Deterministic Finite Automata (ADFA)
    • Translation from an SDD to an ADFA and vice versa
    • An SDD is never larger than an ADFA
    • An SDD can be |Σ| times smaller than an ADFA
  • Computational complexity of binary set operations
    • Generalize eight set operations
    • Tight analysis of the time complexity of the binary set operation algorithm
  • Experimental results
    • SDDs can be smaller than ADFAs
    • Binary operation time

Preliminary

Definition

  • Σ: alphabet (totally ordered by ≺)
  • Internal node: labeled with a letter a, b, …, z of Σ
  • 1/0-terminal node: 1 / 0
  • 1/0-edge: the two kinds of outgoing edge of an internal node
  • SDD: directed acyclic graph
  • Each internal node S has a triple τ(S) = 〈S.lab, S.1, S.0〉
    • S.lab: label
    • S.1: 1-child
    • S.0: 0-child
  • Ordering rule
    • N.lab ≺ (N.0).lab whenever N.0 is an internal node

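[Figure: a node S with its label S.lab, 1-child S.1, and 0-child S.0, next to an example SDD with nodes labeled a, b, c, …, z and the terminals 1 and 0.]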

Semantics
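  • L(N): the set of strings that node N represents
  • L(1) = {ε} for the 1-terminal node
  • L(0) = {} for the 0-terminal node
  • L(N) = N.lab・L(N.1) ∪ L(N.0) for an internal node N
  • Each path from the root to the 1-terminal node represents a string (the labels of the nodes that the path leaves through a 1-edge).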

[Figure: an example SDD for {aa, ab, bb}; its nodes represent the sets {aa, ab, bb}, {a, b}, {bb}, {b}, {ε} (1-terminal), and {} (0-terminal).]
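The recursive definition of L(N) can be traced on the example set {aa, ab, bb}. What follows is a minimal sketch rather than the authors' implementation: the `Node` class, the use of Python's `True` / `False` for the 1-/0-terminal nodes, and the function name `language` are assumptions made for illustration.

```python
class Node:
    """Internal SDD node: label, 1-child and 0-child (illustrative only)."""
    __slots__ = ("lab", "one", "zero")

    def __init__(self, lab, one, zero):
        self.lab, self.one, self.zero = lab, one, zero


def language(n):
    """Return L(n) as a Python set of strings."""
    if n is True:            # 1-terminal: L = {ε}
        return {""}
    if n is False:           # 0-terminal: L = {}
        return set()
    # L(N) = N.lab・L(N.1) ∪ L(N.0)
    return {n.lab + w for w in language(n.one)} | language(n.zero)


# The example SDD for {aa, ab, bb}, built bottom-up.
n_b = Node("b", True, False)     # {b}
n_ab = Node("a", True, n_b)      # {a, b}
n_bb = Node("b", n_b, False)     # {bb}
root = Node("a", n_ab, n_bb)     # a・{a, b} ∪ {bb}

print(sorted(language(root)))    # ['aa', 'ab', 'bb']
```

Enumerating the set like this is only useful for checking small examples; the point of the data structure is to operate on the graph without expanding it.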

Comparison to ADFA
  • 1-terminal node: accept state
  • 0-terminal node: reject state

[Figure: the SDD and the ADFA for {aa, ab, bb} side by side, over the letters a, b, c, with the sets {a, b}, {bb}, and {b} marked on their nodes and states.]


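Reduction process

  • Suppression
    • Require N.1 ≠ 0-terminal node; a node N whose 1-child is the 0-terminal node is replaced by N.0
    • This is safe because a・{} ∪ L(N.0) = L(N.0)
    • In an ADFA, this corresponds to removing the edges that point to the dead state
  • Merging
    • τ(N) = τ(N’) ⇒ N = N’
    • In an ADFA, this corresponds to sharing all equivalent states
  • Theorem
    • Under these rules, the SDD for a given set is unique and minimal
    • This parallels the unique canonical form of ADFAs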

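[Figure: suppression replaces a node N labeled a whose 1-child is the 0-terminal by N.0; merging identifies two nodes N and N’ that have the same label x, the same N.1, and the same N.0.]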
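The two rules can be applied bottom-up to an arbitrary, unreduced SDD. The function below is a sketch under stated assumptions: `Node`, `reduce_sdd`, and the `table` / `memo` dictionaries are hypothetical names, terminals are modelled as Python's `True` / `False`, and nothing here is taken from the authors' code.

```python
class Node:
    """Internal SDD node: label, 1-child and 0-child (illustrative only)."""
    __slots__ = ("lab", "one", "zero")

    def __init__(self, lab, one, zero):
        self.lab, self.one, self.zero = lab, one, zero


def reduce_sdd(n, table=None, memo=None):
    """Return a reduced SDD representing the same set as n."""
    table = {} if table is None else table   # (lab, 1-child, 0-child) -> node
    memo = {} if memo is None else memo      # id of input node -> reduced node
    if not isinstance(n, Node):              # terminals are already reduced
        return n
    if id(n) in memo:
        return memo[id(n)]
    one = reduce_sdd(n.one, table, memo)
    zero = reduce_sdd(n.zero, table, memo)
    if one is False:
        result = zero                        # suppression: a・{} ∪ L(N.0) = L(N.0)
    else:
        key = (n.lab, one, zero)
        if key not in table:                 # merging: one node per triple
            table[key] = Node(*key)
        result = table[key]
    memo[id(n)] = result
    return result


# Unreduced input: two separate copies of the node for {b}, plus a node
# whose 1-child is the 0-terminal.
b1 = Node("b", True, False)
b2 = Node("b", True, False)
dead = Node("a", False, b1)                  # a・{} ∪ {b} = {b}
root = Node("a", dead, Node("b", b2, False))

r = reduce_sdd(root)
print(r.one is r.zero.one)                   # True: both copies of {b} are now one node
```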

Characteristic
  • Almost isomorphic to Acyclic Deterministic Finite Automata
  • BDD/ZDD techniques are applicable
  • Binary form
    • Simple recursive algorithm
    • Easy to implement
  • Rich collections of operations
  • Use of hash tables
    • To share equivalent nodes
    • To share intermediate computations

[Figure: SDD shown in relation to BDD/ZDD and ADFA.]

Relationship to Acyclic Automata

Size
  • An SDD node corresponds to an ADFA edge
  • The description size is proportional to
    • |N|: the number of internal nodes in SDD N
    • |A|: the number of edges in ADFA A

[Figure: ADFA edges labeled a, b, c and the corresponding SDD nodes labeled a, b, c.]

Theorem: Size compare
  • For an equivalent SDD and ADFA:
  • From an ADFA A to an SDD N: |N| ≤ |A| (an SDD is never larger than an ADFA)
  • From an SDD N to an ADFA A
  • An SDD can be |Σ| times smaller than an ADFA

0-child sharing

[Figure: an example of 0-child sharing among nodes labeled a, b, c, d, e.]
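One way to see why sharing a 0-child saves space is to translate a small SDD into an automaton and count edges. The sketch below assumes one natural translation, which may differ in detail from the one in the paper: a state is the root, any node reached by a 1-edge, or the 1-terminal, and its outgoing edges are read off the chain of 0-edges. `Node`, `to_adfa`, and `count_nodes` are hypothetical names, and the input is assumed to be reduced.

```python
class Node:
    """Internal SDD node: label, 1-child and 0-child (illustrative only)."""
    __slots__ = ("lab", "one", "zero")

    def __init__(self, lab, one, zero):
        self.lab, self.one, self.zero = lab, one, zero


def to_adfa(root):
    """Return (edges, accepting) for an automaton equivalent to a reduced SDD."""
    edges, accepting, stack = {}, set(), [root]
    while stack:
        state = stack.pop()
        if state in edges or state is False:
            continue
        out, n = {}, state
        while isinstance(n, Node):     # follow the chain of 0-edges
            out[n.lab] = n.one         # one automaton edge per SDD node visited
            stack.append(n.one)
            n = n.zero
        if n is True:                  # chain ends in the 1-terminal: ε is accepted
            accepting.add(state)
        edges[state] = out
    return edges, accepting


def count_nodes(root):
    """Number of distinct internal nodes reachable from root."""
    seen, stack = set(), [root]
    while stack:
        n = stack.pop()
        if isinstance(n, Node) and n not in seen:
            seen.add(n)
            stack += [n.one, n.zero]
    return len(seen)


# SDD for {aa, ab, bb}; n_b is the 0-child of n_ab and the 1-child of n_bb.
n_b = Node("b", True, False)
n_ab = Node("a", True, n_b)
n_bb = Node("b", n_b, False)
root = Node("a", n_ab, n_bb)

edges, accepting = to_adfa(root)
print(count_nodes(root), sum(len(out) for out in edges.values()))   # 4 5
```

The shared node `n_b` is stored once in the SDD, but its letter b shows up on two automaton edges, once from the state for {a, b} and once from the state for {b}.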

Example

{a^n b^i c^j : n = 0, …, 4; i, j = 0, 1}

[Figure: the ADFA A and the SDD S for this set, over the letters a, b, c.]

|S| = 6

|A| = 14

Experiment
  • Input: Canterbury corpus
    • BibleAll: bible.txt, BibleBi: all bigrams from bible.txt, Ecoli: E.coli.txt
    • Fac means that all factors (substrings) of the input data are stored

Binary Set Operation Algorithm

Set operation

  • A binary set operation ♢ ∈ {∪, ∩, \, …}
  • Input: two SDDs P, Q
  • Output: an SDD R such that L(R) = L(P) ♢ L(Q)

[Figure: P and Q enter a Binary Set Operation, which outputs P ♢ Q.]

Apply algorithm
  • Originally proposed for BDDs [Bryant 1986]; here applied to SDDs
  • Based on the definition L(N) = N.lab・L(N.1) ∪ L(N.0)
  • For an operation ♢, when P.lab = Q.lab:
    • L(P) ♢ L(Q) = P.lab・(L(P.1) ♢ L(Q.1)) ∪ (L(P.0) ♢ L(Q.0))

[Figure: nodes P = 〈a, P1, P0〉 and Q = 〈a, Q1, Q0〉 are combined into P ♢ Q = 〈a, P1 ♢ Q1, P0 ♢ Q0〉.]
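When the two labels differ, the same definition still yields a recursion, because by the ordering rule no string in the set with the larger label starts with the smaller letter. The case analysis below is a sketch derived from L(N) = N.lab・L(N.1) ∪ L(N.0), not a transcription of the slides. It assumes that ♢ acts on membership element by element and that ∅ ♢ ∅ = ∅, which holds for ∪, ∩, and \. A terminal node can be treated as if its label were larger than every letter.

```latex
\[
L(P) \diamond L(Q) =
\begin{cases}
x \cdot \bigl(L(P.1) \diamond L(Q.1)\bigr) \cup \bigl(L(P.0) \diamond L(Q.0)\bigr)
  & \text{if } P.\mathrm{lab} = Q.\mathrm{lab} = x,\\[2pt]
x \cdot \bigl(L(P.1) \diamond \emptyset\bigr) \cup \bigl(L(P.0) \diamond L(Q)\bigr)
  & \text{if } P.\mathrm{lab} = x \prec Q.\mathrm{lab},\\[2pt]
x \cdot \bigl(\emptyset \diamond L(Q.1)\bigr) \cup \bigl(L(P) \diamond L(Q.0)\bigr)
  & \text{if } Q.\mathrm{lab} = x \prec P.\mathrm{lab}.
\end{cases}
\]
```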

Hash table technique
  • Two key-value hash tables
  • Uniquetable
    • Key: 〈letter x, SDD node N1, SDD node N0〉
    • Value: SDD node N with τ(N) = 〈x, N1, N0〉
  • Opcache
    • Key: 〈operation id ♢, SDD node P, SDD node Q〉
    • Value: SDD node R such that R = P ♢ Q

[Figure: the Uniquetable maps a key triple 〈x, N1, N0〉 to the node N; the Opcache maps a key triple 〈♢, P, Q〉 to the node R = P ♢ Q.]

Node create process
  • Any SDD node needed during computation is created via this process
  • Once an internal node is registered in the Uniquetable, no equivalent node is created again.
  • Check the Uniquetable for key 〈x, N1, N0〉.
    • Exists: return the registered node.
    • Does not exist: create a new node, register it, and return it.
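The node creation step can be sketched as a single function over a dictionary. The names `Node`, `uniquetable`, and `getnode` are assumptions, as is the choice of Python's `True` / `False` for the terminals. Folding the suppression rule into the same function is also an assumption, although it is a common design in BDD/ZDD packages.

```python
class Node:
    """Internal SDD node: label, 1-child and 0-child (illustrative only)."""
    __slots__ = ("lab", "one", "zero")

    def __init__(self, lab, one, zero):
        self.lab, self.one, self.zero = lab, one, zero


uniquetable = {}   # key 〈x, N1, N0〉 -> the unique node N with τ(N) = 〈x, N1, N0〉


def getnode(x, n1, n0):
    """Return the unique reduced node for the triple 〈x, n1, n0〉."""
    if n1 is False:              # suppression: never build a node whose 1-child is 0
        return n0
    key = (x, n1, n0)
    node = uniquetable.get(key)
    if node is None:             # not registered yet: create and register
        node = Node(x, n1, n0)
        uniquetable[key] = node
    return node


b = getnode("b", True, False)
print(getnode("b", True, False) is b)   # True: the registered node is returned
print(getnode("a", False, b) is b)      # True: suppressed, no node is created
print(len(uniquetable))                 # 1
```

Because `Node` defines no `__eq__`, nodes hash by identity, so each key lookup does not depend on the size of the subgraphs below N1 and N0.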

Time complexity
  • When P ♢ Q is executed
    • Every recursive call consults the Opcache
    • At most |P| × |Q| distinct recursive calls are invoked
    • (Assume that hash table access takes constant time)
  • Naïve method
    • Prepare a table of size |P| × |Q|
  • This method
    • Creates no useless or redundant nodes
  • Theorem
    • Worst case: O(|P| |Q|) time
    • There exist examples that need Ω(|P| |Q|) time
    • Hence the upper and lower bounds match
  • Check the Opcache for key 〈♢, P, Q〉.
    • Exists: P ♢ Q has already been computed; return it.
    • Does not exist: continue the computation on the 0-side and the 1-side.
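Putting the Opcache check and the node creation step together gives a compact recursive procedure. The code below is a sketch under several assumptions, not the authors' implementation: all names (`Node`, `getnode`, `split`, `apply_op`, `language`) are hypothetical, terminals are Python's `True` / `False`, the order ≺ is Python's string order, and the operation is passed as a membership rule `f(in_p, in_q)` that must satisfy `f(False, False) == False`.

```python
import operator


class Node:
    """Internal SDD node: label, 1-child and 0-child (illustrative only)."""
    __slots__ = ("lab", "one", "zero")

    def __init__(self, lab, one, zero):
        self.lab, self.one, self.zero = lab, one, zero


uniquetable, opcache = {}, {}


def getnode(x, n1, n0):
    if n1 is False:                        # suppression rule
        return n0
    key = (x, n1, n0)
    if key not in uniquetable:             # merging rule
        uniquetable[key] = Node(x, n1, n0)
    return uniquetable[key]


def split(n, x):
    """Return the (1-part, 0-part) of n with respect to letter x."""
    if isinstance(n, Node) and n.lab == x:
        return n.one, n.zero
    return False, n                        # no string of L(n) starts with x


def apply_op(f, p, q):
    """Return the SDD for L(p) ♢ L(q), where f is the membership rule of ♢."""
    if not isinstance(p, Node) and not isinstance(q, Node):
        return f(p, q)                     # both terminals: is ε in the result?
    key = (f, p, q)
    if key in opcache:                     # P ♢ Q is already done
        return opcache[key]
    x = min(n.lab for n in (p, q) if isinstance(n, Node))
    p1, p0 = split(p, x)
    q1, q0 = split(q, x)
    r = getnode(x, apply_op(f, p1, q1), apply_op(f, p0, q0))
    opcache[key] = r
    return r


def difference(in_p, in_q):
    return in_p and not in_q


def language(n):
    """Expand an SDD into a Python set (for checking small examples only)."""
    if not isinstance(n, Node):
        return {""} if n else set()
    return {n.lab + w for w in language(n.one)} | language(n.zero)


b = getnode("b", True, False)
P = getnode("a", getnode("a", True, b), getnode("b", b, False))   # {aa, ab, bb}
Q = getnode("a", b, b)                                            # {ab, b}

print(sorted(language(apply_op(operator.or_, P, Q))))    # ['aa', 'ab', 'b', 'bb']
print(sorted(language(apply_op(operator.and_, P, Q))))   # ['ab']
print(sorted(language(apply_op(difference, P, Q))))      # ['aa', 'bb']
```

Every argument that reaches `apply_op` is a node of P or Q or a terminal, so for one operation the Opcache holds at most (|P| + 2)(|Q| + 2) entries. That count is in line with the O(|P| |Q|) bound stated above, given constant-time hash table access.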

Experiment
  • Operation time
    • Prepare two SDDs, each storing all factors of a random text of length n
    • Measure the time to compute each operation

Conclusion
  • Relationship to Acyclic Automata
    • An SDD can be |Σ| times smaller than an ADFA
    • For real data, SDDs are 10–20% more compact than ADFAs
  • Computational complexity of binary set operations
    • The worst-case time complexity is quadratic
    • A tight time bound is established
    • In our experiments, the operation time is almost linear
  • Future work
    • Efficient implementation of various operations
    • A substring index on SDDs
    • Factor SDD construction algorithm
