Notes on Sequence Binary Decision Diagrams: Relationship to Acyclic Automata and Complexities of Binary Set Operations

Shuhei Denzumi¹, Ryo Yoshinaka²,¹, Shin-ichi Minato¹,², and Hiroki Arimura¹

1) Hokkaido University

2) JST ERATO Minato Discrete Structure Manipulation System Project


Background

  • Research on string processing has become active.

    • Massive online data: The internet and sensing networks.

    • String matching and string mining problems.

  • Data mining

    • Input data should be represented in a compact form

    • Computation directly on the compressed structure is needed

[Figure: inputs are compressed into a data structure, and operations on that structure produce the result.]

Notes on Sequence Binary Decision Diagrams: Relationship to Acyclic Automata and Complexities of Binary Set Operations, by Shuhei Denzumi, Ryo Yoshinaka, Shin-ichi Minato, and Hiroki Arimura, 2011-08-30 (TUE), Prague Stringology Conference 2011


Manipulatable & Compact

  • Manipulatable compact data structures

    • Represent data in compressed form

    • Support operations that manipulate the data while it stays compressed

    • Have received much attention in recent years

  • Binary Decision Diagram (BDD)

    • LSI area

  • Deterministic Finite Automata (DFA)

    • Natural Language Processing area

[Figure: inputs are compacted into data structures D1 and D2, and an operation on them yields D3.]

What is Sequence BDD?

  • Sequence Binary Decision Diagram (SeqBDD, SDD).

    • Loekito, Bailey, and Pei (2009)

    • Graph structure

    • Represents finite sets of strings of finite length

  • Basic properties of SDDs have not been established

    • Minimization

    • Size complexity

    • Operation time

  • Application

    • Data mining

    • Graph mining

    • Human genome sequencing

[Figure: several texts stored in a Sequence Binary Decision Diagram.]

Family of BDDs

  • Compact representation for discrete structure

    • With rich algebraic operations

  • BDD [Bryant 1986]: Boolean functions

    • xy ∨ yz ∨ zx

    • ¬xyz ∨ x¬yz ∨ xy¬z

  • ZDD [Minato 1993]: sets of combinations

    • {{a}, {b}, {a, b}}

    • {{a}, {b}, {c}, {a, b, c}}

  • SDD [Loekito et al. 2009]: sets of strings

    • {abc, acb, bac, bca}

    • {a, b, ab, bab, abbab}

Result

  • Relationship to Acyclic Deterministic Finite Automata (ADFA)

    • Translation from an SDD to an ADFA and vice versa

    • An SDD is never larger than an ADFA

    • An SDD can be |Σ| times smaller than an ADFA

  • Computational complexity of binary set operations

    • Generalization of eight set operations

    • Tight analysis of the time complexity of the binary set operation algorithm

  • Experimental results

    • SDDs can be smaller than ADFAs

    • Binary operation time

Preliminary

Definition

  • Σ: alphabet (totally ordered by ≺)

  • Internal node: drawn with its label (a, b, …, z)

  • 1/0-terminal node: drawn as 1 / 0

  • 1/0-edge: the two kinds of outgoing edges of an internal node

  • SDD: directed acyclic graph

  • Each internal node S has a triple τ(S) = 〈S.lab, S.1, S.0〉

    • S.lab: label

    • S.1: 1-child

    • S.0: 0-child

  • Ordering rule

    • N.lab ≺ (N.0).lab
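The definition above can be sketched as a small data structure. This is a minimal sketch and not the authors' implementation: the names `Node`, `triple`, and `respects_order` are hypothetical, terminals are modelled as Python booleans, and Python's built-in string comparison stands in for the order ≺.

```python
from dataclasses import dataclass
from typing import Union

# Assumption of this sketch: True is the 1-terminal, False is the 0-terminal.
Terminal = bool


@dataclass(frozen=True)
class Node:
    lab: str                       # S.lab: the label, a letter of the alphabet
    one: Union["Node", Terminal]   # S.1: the 1-child
    zero: Union["Node", Terminal]  # S.0: the 0-child


def triple(s: Node):
    """Return tau(S) = <S.lab, S.1, S.0>."""
    return (s.lab, s.one, s.zero)


def respects_order(n) -> bool:
    """Check the ordering rule N.lab < (N.0).lab on every reachable internal node."""
    if isinstance(n, bool):
        return True  # terminals carry no label
    if isinstance(n.zero, Node) and not (n.lab < n.zero.lab):
        return False
    return respects_order(n.one) and respects_order(n.zero)


# A small diagram: following only 0-edges, the labels strictly increase.
b = Node("b", True, False)
ab = Node("a", True, b)
print(respects_order(ab))
```

The ordering rule only constrains 0-edges, so a 1-child may carry any label, including the label of its parent.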

[Figure: an internal node S with its label S.lab, its 1-child S.1, and its 0-child S.0.]

Semantics

  • L(N): the set of strings that N represents

  • L(1-terminal) = {ε}

  • L(0-terminal) = {}

  • L(N) = N.lab・L(N.1) ∪ L(N.0)

  • Each path from the root to the 1-terminal node represents a string: the labels of the nodes whose 1-edge is taken, read in order.
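The recursive definition of L(N) translates directly into code. The sketch below assumes a hypothetical encoding that is not from the slides: an internal node is a tuple `(lab, one, zero)`, `True` is the 1-terminal, and `False` is the 0-terminal.

```python
def language(n):
    """Return L(n) as a Python set of strings."""
    if n is True:       # 1-terminal: the set containing only the empty string
        return {""}
    if n is False:      # 0-terminal: the empty set
        return set()
    lab, one, zero = n  # L(N) = N.lab . L(N.1)  union  L(N.0)
    return {lab + w for w in language(one)} | language(zero)


# The set {aa, ab, bb} used as the running example in the slides.
b = ("b", True, False)   # represents {b}
ab = ("a", True, b)      # represents {a, b}
bb = ("b", b, False)     # represents {bb}
root = ("a", ab, bb)     # represents {aa, ab, bb}
print(sorted(language(root)))
```

Enumerating the language this way takes time proportional to the number of strings, which can be far larger than the diagram; it is meant only to make the semantics concrete.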

[Figure: an SDD for {aa, ab, bb}. Its internal nodes represent {aa, ab, bb}, {a, b}, {bb}, and {b}; the 1-terminal represents {ε} and the 0-terminal represents {}.]

Comparison to ADFA

  • 1-terminal node: accept state

  • 0-terminal node: reject state

[Figure: the SDD and an ADFA for {aa, ab, bb}, drawn side by side.]

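Reduction process

  • Suppression

    • Require N.1 ≠ 0-terminal node: a node N with label a whose 1-child is the 0-terminal is replaced by N.0, since a・{} ∪ L(N.0) = L(N.0)

    • In an ADFA, this corresponds to removing edges that point to the dead state

  • Merging

    • τ(N) = τ(N’) ⇒ N = N’

    • In an ADFA, this corresponds to sharing all equivalent states

  • Theorem

    • Under these rules, the SDD for a given set of strings is unique and minimal

    • This mirrors the unique canonical form of an ADFA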

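    [Figure: suppression replaces a node N labeled a whose 1-child is the 0-terminal by N.0; merging shares two nodes labeled x that have the same children N.1 and N.0.]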

    Characteristic

    • Almost isomorphic to Acyclic Deterministic Finite Automata

    • BDD/ZDD techniques are applicable

    • Binary form

      • Simple recursive algorithm

      • Easy to implement

    • Rich collections of operations

    • Use of hash tables

      • To share equivalent nodes

      • To share intermediate computations

    [Figure: SDD shown in relation to BDD/ZDD and ADFA.]

    Relationship to Acyclic Automata

    Size

    • An SDD node corresponds to an ADFA edge

    • The description size is proportional to:

      • |N|: the number of internal nodes in an SDD N

      • |A|: the number of edges in an ADFA A
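The node-to-edge correspondence can be illustrated with a small translation from an SDD to an automaton. This is a sketch under assumptions that are not from the slides: nodes are tuples `(lab, one, zero)`, `True` and `False` are the terminals, the function names are hypothetical, and the construction is one straightforward translation that is not claimed to be the authors' or to produce a minimal automaton. A state is an SDD node; walking its chain of 0-children yields one outgoing edge per node on the chain.

```python
def sdd_to_adfa(root):
    """Return (transitions, accepting) for an automaton whose states are SDD nodes."""
    transitions = {}   # state -> {letter: next state}
    accepting = set()
    stack = [root]
    while stack:
        state = stack.pop()
        if state in transitions:
            continue
        out = {}
        n = state
        while not isinstance(n, bool):  # walk the chain of 0-children
            lab, one, zero = n
            out[lab] = one              # one SDD node gives one edge
            stack.append(one)
            n = zero
        if n is True:                   # chain ends in the 1-terminal
            accepting.add(state)
        transitions[state] = out
    return transitions, accepting


def internal_nodes(root):
    """Collect the distinct internal nodes reachable from root."""
    seen, stack = set(), [root]
    while stack:
        n = stack.pop()
        if isinstance(n, bool) or n in seen:
            continue
        seen.add(n)
        stack += [n[1], n[2]]
    return seen


b = ("b", True, False)
ab = ("a", True, b)
bb = ("b", b, False)
root = ("a", ab, bb)                     # {aa, ab, bb}
trans, acc = sdd_to_adfa(root)
print(len(internal_nodes(root)), sum(len(o) for o in trans.values()))
```

In this example the node for {b} lies on the 0-chain of the node for {a, b} and is also a state of its own, so it contributes two automaton edges while being stored once in the SDD.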

    [Figure: ADFA edges labeled a, b, c and the corresponding SDD nodes labeled a, b, c.]

    Theorem: Size comparison

    • For an equivalent SDD and ADFA:

    • From an ADFA A to an SDD N: the SDD is never larger than the ADFA

    • From an SDD N to an ADFA A

    • An SDD can be |Σ| times smaller than an ADFA

    0-child sharing

    [Figure: example of 0-child sharing among nodes labeled a, b, c, d, e.]

    Example

    {a^n b^i c^j | n = 0, …, 4; i, j = 0, 1}

    [Figure: the ADFA A and the SDD S for this language.]

    |S| = 6

    |A| = 14

    Experiment

    • Input: Canterbury corpus

      • BibleAll: bible.txt, BibleBi: all bigrams from bible.txt, Ecoli: E.coli.txt

      • Fac means that all factors of the input data are stored

    Binary Set Operation Algorithm

    Set operation

    • A binary set operation ♢ ∈ {∪, ∩, \, …}

    • Input: two SDDs P, Q

    • Output: an SDD R such that L(R) = L(P) ♢ L(Q)

    [Figure: P and Q enter the binary set operation, which outputs P ♢ Q.]

    Apply algorithm

    • Originally proposed for BDDs [Bryant 1986], here applied to SDDs

    • Based on the definition L(N) = N.lab ・ L(N.1) ∪ L(N.0)

    • For the operation, when P.lab = Q.lab: L(P) ♢ L(Q) = P.lab ・ (L(P.1) ♢ L(Q.1)) ∪ (L(P.0) ♢ L(Q.0))
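The recursion can be sketched as follows, without any caching. Several things here are assumptions and not taken from the slides: nodes are tuples `(lab, one, zero)` with `True`/`False` as terminals, the operation is given as a Boolean function `op` with `op(False, False) == False`, and the two cases where the labels differ (or one argument is a terminal) are filled in by treating the missing branch as the empty set. The slide itself only states the case P.lab = Q.lab.

```python
def make(x, n1, n0):
    """Build the node <x, n1, n0>, applying the suppression rule."""
    return n0 if n1 is False else (x, n1, n0)


def label(n):
    return None if isinstance(n, bool) else n[0]


def apply(op, p, q):
    """Return a diagram R with L(R) = L(P) op L(Q); requires op(False, False) == False."""
    if isinstance(p, bool) and isinstance(q, bool):
        return bool(op(p, q))
    pl, ql = label(p), label(q)
    if ql is None or (pl is not None and pl < ql):
        # Only P has strings starting with pl; Q contributes the empty set there.
        return make(pl, apply(op, p[1], False), apply(op, p[2], q))
    if pl is None or ql < pl:
        # Symmetric case: only Q has strings starting with ql.
        return make(ql, apply(op, False, q[1]), apply(op, p, q[2]))
    # Same label: the case given on the slide.
    return make(pl, apply(op, p[1], q[1]), apply(op, p[2], q[2]))


def language(n):
    if isinstance(n, bool):
        return {""} if n else set()
    return {n[0] + w for w in language(n[1])} | language(n[2])


P = ("a", True, ("b", True, False))   # {a, b}
Q = ("b", True, ("c", True, False))   # {b, c}
print(sorted(language(apply(lambda s, t: s or t, P, Q))))
print(sorted(language(apply(lambda s, t: s and t, P, Q))))
print(sorted(language(apply(lambda s, t: s and not t, P, Q))))
```

Passing the operation as a Boolean function keeps one recursion for union, intersection, and difference; without a cache the same pair of subdiagrams may be recomputed many times.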

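    [Figure: P and Q both have root label a, with children P1, P0 and Q1, Q0; the result P ♢ Q has root label a, 1-child P1 ♢ Q1, and 0-child P0 ♢ Q0.]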

    Hash table technique

    • Key-value hash tables

    • Uniquetable

      • Key: 〈letter x, SDD node N1, SDD node N0〉

      • Value: SDD node N with τ(N) = 〈x, N1, N0〉

    • Opcache

      • Key: 〈operation id ♢, SDD node P, SDD node Q〉

      • Value: the SDD node R with R = P ♢ Q

    [Figure: the Uniquetable maps the key triple 〈x, N1, N0〉 to the node N; the Opcache maps the key triple 〈♢, P, Q〉 to the node R.]

    Node create process

    • Any SDD node needed during computation is created via this process

    • Once an internal node is registered in the Uniquetable, no equivalent node is created again.

    • Check the Uniquetable for the key 〈x, N1, N0〉.

      • If it exists: return the stored node.

      • If it does not exist: create a new node and return it.
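The node creation step can be sketched like this. The class name `NodeStore`, the method name `getnode`, and the integer-id encoding (0 and 1 for the terminals, larger ids for internal nodes) are assumptions of this sketch, and a Python `dict` stands in for the hash table. The suppression check from the reduction rules is included so that every node produced is already reduced.

```python
class NodeStore:
    """Sketch of node creation through a Uniquetable."""

    def __init__(self):
        self.uniquetable = {}   # key <x, N1, N0> -> node id
        self.triple = {}        # node id -> <x, N1, N0>
        self.next_id = 2        # ids 0 and 1 are reserved for the terminals

    def getnode(self, x, n1, n0):
        if n1 == 0:             # suppression: the 1-child is the 0-terminal
            return n0
        key = (x, n1, n0)
        if key in self.uniquetable:      # exists: return it
            return self.uniquetable[key]
        nid = self.next_id               # does not exist: create and register
        self.next_id += 1
        self.uniquetable[key] = nid
        self.triple[nid] = key
        return nid


store = NodeStore()
b1 = store.getnode("b", 1, 0)
b2 = store.getnode("b", 1, 0)
print(b1 == b2, len(store.uniquetable))
```

Because every node is created through `getnode`, two nodes with the same triple always receive the same id, so equivalence of diagrams reduces to comparing ids.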

    Time complexity

    • When P ♢ Q is executed

      • Every operation uses the Opcache

      • At most |P| × |Q| distinct recursive calls are invoked

      • (Assume that the access time to hash tables is constant)

    • Naïve method

      • Prepares a table of size |P| × |Q|

    • This method

      • Creates no useless or redundant nodes

    • Theorem

      • Worst case O(|P| |Q|) time

      • There are examples that need Ω(|P| |Q|) time

      • So the upper and lower bounds match

    • Check the Opcache for the key 〈♢, P, Q〉.

      • If it exists: P ♢ Q has already been computed, so return it.

      • If it does not exist: continue the computation on the 0-side and the 1-side.
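The effect of the Opcache on the number of recursive calls can be observed in a small sketch. Everything named here (`Manager`, `getnode`, `apply`, `size`, `words`, the integer ids with 0 and 1 as terminals, the Boolean function `op` with `op(False, False) == False`) is an assumption of this sketch and not the authors' code. Each Opcache key pairs a node of P or a terminal with a node of Q or a terminal, so in this sketch the cache can hold at most (|P| + 2)(|Q| + 2) entries, matching the quadratic bound up to the terminal cases.

```python
class Manager:
    def __init__(self):
        self.triple = {}        # node id -> <x, N1, N0>
        self.uniquetable = {}   # <x, N1, N0> -> node id
        self.opcache = {}       # <op name, P, Q> -> R
        self.next_id = 2        # 0 and 1 are the terminals

    def getnode(self, x, n1, n0):
        if n1 == 0:
            return n0
        key = (x, n1, n0)
        if key not in self.uniquetable:
            self.uniquetable[key] = self.next_id
            self.triple[self.next_id] = key
            self.next_id += 1
        return self.uniquetable[key]

    def apply(self, name, op, p, q):
        if p <= 1 and q <= 1:
            return int(op(bool(p), bool(q)))
        key = (name, p, q)
        if key in self.opcache:          # already computed: return it
            return self.opcache[key]
        pl = self.triple[p][0] if p > 1 else None
        ql = self.triple[q][0] if q > 1 else None
        if ql is None or (pl is not None and pl < ql):
            x, p1, p0 = self.triple[p]
            r = self.getnode(x, self.apply(name, op, p1, 0), self.apply(name, op, p0, q))
        elif pl is None or ql < pl:
            x, q1, q0 = self.triple[q]
            r = self.getnode(x, self.apply(name, op, 0, q1), self.apply(name, op, p, q0))
        else:
            x, p1, p0 = self.triple[p]
            _, q1, q0 = self.triple[q]
            r = self.getnode(x, self.apply(name, op, p1, q1), self.apply(name, op, p0, q0))
        self.opcache[key] = r
        return r

    def size(self, n):
        """Number of internal nodes reachable from n."""
        seen, stack = set(), [n]
        while stack:
            m = stack.pop()
            if m > 1 and m not in seen:
                seen.add(m)
                stack += list(self.triple[m][1:])
        return len(seen)

    def words(self, n):
        if n <= 1:
            return {""} if n == 1 else set()
        x, n1, n0 = self.triple[n]
        return {x + w for w in self.words(n1)} | self.words(n0)


m = Manager()
P = m.getnode("a", 1, m.getnode("b", 1, 0))   # {a, b}
Q = m.getnode("b", 1, m.getnode("c", 1, 0))   # {b, c}
R = m.apply("union", lambda s, t: s or t, P, Q)
print(sorted(m.words(R)), len(m.opcache) <= (m.size(P) + 2) * (m.size(Q) + 2))
```

Repeating the same call afterwards is answered from the Opcache without adding entries, which is the "already done, return it" branch of the flow above.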

    Experiment

    • Operation time

      • Prepare two SDDs for all factors of random texts of length n

      • Measure the time to compute each operation

    Conclusion

    • Relationship to Acyclic Automata

      • An SDD can be |Σ| times smaller than an ADFA

      • For real data, SDDs are 10 to 20% more compact than ADFAs

    • Computational complexity of binary set operations

      • The worst-case time complexity is quadratic

      • The bound is shown to be tight

      • In our experiments, the operation time is almost linear

    • Future work

      • Efficient implementation of various operations

      • A substring index based on SDDs

      • A construction algorithm for factor SDDs

    Thank you!

