
Hardcoding Finite Automata

Ernest Ketcha Ngassam
Prof. Bruce W. Watson
Prof. Derrick G. Kourie

Department of Computer Science
University of Pretoria
Fastar Research Group
http://fastar.cs.up.ac.za

Introductory Remarks

  • FA definition: (Σ, S, F, δ, s0)
    • Finite set of alphabet symbols (Σ)
    • Finite set of states (S)
    • Finite set of accepting states (F)
    • Transition function (δ)
    • Starting state (s0)
  • FA context
    • Chomsky hierarchy
    • Right linear grammar
  • Many FA applications
    • Pattern matching in text
    • Text indexing
    • Computational genetics
    • Network intrusion detection
    • Computer and natural virus scanning
    • Natural language translation
    • Spell checking
    • Etc.
  • FA implementations are therefore performance-sensitive
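
The five-tuple definition above maps naturally onto a small data structure. The following is a minimal sketch in C++ and is not taken from the presentation: the names FiniteAutomaton and make_example, the integer encoding of states, and the tiny example automaton are all illustrative assumptions.

```cpp
#include <map>
#include <set>
#include <utility>

// Hypothetical container for the five components (Σ, S, F, δ, s0).
struct FiniteAutomaton {
    std::set<char> alphabet;                    // Σ: alphabet symbols
    std::set<int> states;                       // S: states, encoded as integers
    std::set<int> accepting;                    // F: accepting states
    std::map<std::pair<int, char>, int> delta;  // δ: (state, symbol) -> next state
    int start;                                  // s0: starting state
};

// Illustrative automaton with two states that accepts only the string "a".
inline FiniteAutomaton make_example() {
    FiniteAutomaton fa;
    fa.alphabet.insert('a');
    fa.alphabet.insert('b');
    fa.states.insert(0);
    fa.states.insert(1);
    fa.accepting.insert(1);
    fa.delta[std::make_pair(0, 'a')] = 1;  // the only defined transition
    fa.start = 0;
    return fa;
}
```

Storing δ as a map keeps the sketch close to the mathematical definition; the table-driven and hardcoded implementations discussed later replace it with a transition table or with instructions.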

[Figure: Chomsky hierarchy with the language classes RLL, CFL, CSL, and UL]

Related Work

  • 1986, Pennello in “Very fast LR Parsing”
    • System that produces hardcoded parsers in assembly language
  • 1988, Horspool and Whitney in “Even Faster LR Parsing”
    • Used Pennello’s idea
    • Additional optimization strategies to reduce the code size
    • Some fine tuning
  • 1995, Bhamidipaty and Proebsting in “Very fast YACC-Compatible Parsers (For Very Little Effort)”
    • YACC produces table-driven parsers
    • The method produced directly executable hardcoded parsers in C
  • 2002, Kimmel in “Programming with Regular Expressions in C#”
    • Suggests implementing regular expressions in assembler

Conventional FA Implementation

  • Objective: determine whether a string is in the language represented by an FA
  • Key issue:
    • Transition table that embeds
      • Alphabet
      • States
      • Entries
    • Uses a function / “controller” recognize(str, transition): boolean
      • Checks for acceptance symbol by symbol from str
      • Traverses the transition table
      • Returns true or false

[Figure: transition table with entry (i, chk)]
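
The table-driven controller described above can be sketched as follows. This is an assumption-laden illustration rather than the authors' code: the name recognize_table, the extra accepting parameter, the use of -1 for a missing entry, the choice of state 0 as s0, and the mapping of symbols 'a', 'b', ... to columns 0, 1, ... are all choices made for this example.

```cpp
#include <cstddef>
#include <string>
#include <vector>

const int kNoTransition = -1;  // assumed marker for an empty table entry

// Table-driven recognizer: transition[state][column] holds the next state.
// Assumes state 0 is the starting state and symbol 'a' + k uses column k.
bool recognize_table(const std::string& str,
                     const std::vector<std::vector<int> >& transition,
                     const std::vector<bool>& accepting) {
    int state = 0;
    for (std::size_t k = 0; k < str.size(); ++k) {
        int column = str[k] - 'a';
        if (column < 0 ||
            column >= static_cast<int>(transition[state].size())) {
            return false;  // symbol is outside the alphabet
        }
        state = transition[state][column];  // one table lookup per symbol
        if (state == kNoTransition) {
            return false;  // no transition defined: reject
        }
    }
    return accepting[state];
}
```

The loop body is the same for every automaton; only the data in transition and accepting changes, which is what makes this version depend on memory accesses to the table.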

What Is a Hardcoded Algorithm?

  • No table as a data structure
  • Only primitive data types used
  • Data embedded into the algorithm
    • Data are part of the instructions
  • Uses a function recognize(str): boolean
    • Checks for acceptance symbol by symbol from str
    • Returns true or false

Instructions:

        read(str[0]);
        goto label_0;
    label_0:
        action_0;
        read(str[1]);
        goto label_1;
    label_1:
        action_1;
        read(str[2]);
        goto label_2;
        ...
    label_{n-1}:
        action_{n-1};
        goto decision;
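
To make the label-and-goto scheme above concrete, here is a minimal C++ sketch of a hardcoded recognizer for one specific automaton. The automaton (it accepts only the string "ab"), the function name recognize_hardcoded, and the label names are hypothetical and chosen purely for illustration.

```cpp
#include <cstddef>
#include <string>

// Hardcoded recognizer for an illustrative automaton accepting only "ab".
// There is no transition table: each state is a label, and the transitions
// are encoded directly in the comparisons and jumps.
bool recognize_hardcoded(const std::string& str) {
    std::size_t i = 0;  // position of the next symbol to read
    goto state_0;

state_0:
    if (i == str.size()) goto reject;     // input ended too early
    if (str[i++] == 'a') goto state_1;    // transition on 'a'
    goto reject;

state_1:
    if (i == str.size()) goto reject;
    if (str[i++] == 'b') goto state_2;    // transition on 'b'
    goto reject;

state_2:
    if (i == str.size()) goto accept;     // accepting state, input consumed
    goto reject;                          // no outgoing transitions

accept:
    return true;
reject:
    return false;
}
```

Changing the automaton means regenerating this function, since the transition data lives in the instructions rather than in a data structure.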

Table-Driven vs. Hardcoded Algorithms

  • The table-driven algorithm depends heavily on data
  • The hardcoded algorithm depends heavily on instructions
  • Both are computationally equivalent: O(len)
  • Need to perform empirical evaluation!

[Figure: hardcoded vs. table-driven]

Preliminary Experiments

  • Based on single-symbol recognition
    • Easy to implement
    • Problem domain restricted
  • Various implementation strategies for the hardcoded algorithm
    • High-level language (2 variations)
    • Low-level language (3 variations)
  • Baseline for string recognition
    • Table-driven algorithm reflects work for any transition function
    • Hardcoded algorithm reflects work for a specific transition function

[Figure: state s0 with symbols a, e, and d, shown next to its transition array]

The Experiment & Data Collection

  • Generate random transition array
  • Measure clock cycles using
    • The control program for table-driven (C++)
    • Hardcoded program (5 variations)
      • Switch statement (C++)
      • Nested conditionals (C++)
      • Linear search (ASM)
      • Jump table (ASM)
      • Direct jump (ASM)

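
As a rough illustration of the two high-level variations listed above, the sketch below hardcodes single-symbol recognition for one state, once with a switch statement and once with nested conditionals. The state, its three outgoing transitions, the target state numbers, and the names kReject, next_state_switch, and next_state_nested are hypothetical; the presentation does not show the generated code.

```cpp
const int kReject = -1;  // assumed return value when no transition exists

// Variation 1: switch statement. A compiler may turn this into a jump
// table or into a sequence of comparisons.
int next_state_switch(char symbol) {
    switch (symbol) {
        case 'a': return 3;
        case 'd': return 7;
        case 'e': return 2;
        default:  return kReject;
    }
}

// Variation 2: nested conditionals. The symbols are tested one after
// another, so the cost depends on where the symbol sits in the chain.
int next_state_nested(char symbol) {
    if (symbol == 'a') {
        return 3;
    } else {
        if (symbol == 'd') {
            return 7;
        } else {
            if (symbol == 'e') {
                return 2;
            }
        }
    }
    return kReject;
}
```

Both functions compute the same result for every symbol; they differ only in how the transition data is laid out as instructions, which is the property the experiment measures.
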
Preliminary Results

  • Just an indication of how to continue with the experiments
  • Hardcode outperforms table-driven (in low-level language)
  • Conclusion:
    • Rely on the jump table version for further experiments
    • Use it to explore cache effects

A Simple String Test Experiment

  • Language based on:
    • Accepting symbol (a)
    • Rejecting symbol (b)
  • In each of the n-1 states
    • a: triggers a transition to the next state
    • b: does not trigger a transition
  • Only string accepted: aaa…aaa (n-1 times)
    • Represents the worst-case scenario
    • Not concerned about reducing the FA
  • Use jump table and table-driven versions

[Figure: chain of states 1, 2, 3, …, n connected by transitions on a]

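
The test automaton described above is simple enough to build programmatically. The sketch below constructs its transition table and runs a table-driven check against it. It assumes states numbered 0 to n-1 with state n-1 as the only accepting state, column 0 for a and column 1 for b, and -1 for "no transition"; the names build_chain_table and accepts_chain are made up for this example.

```cpp
#include <string>
#include <vector>

// Builds the table for the n-state chain: in each of the first n-1 states,
// 'a' (column 0) leads to the next state and 'b' (column 1) has no entry.
std::vector<std::vector<int> > build_chain_table(int n) {
    std::vector<std::vector<int> > table(n, std::vector<int>(2, -1));
    for (int s = 0; s + 1 < n; ++s) {
        table[s][0] = s + 1;
    }
    return table;
}

// Returns true only for the string consisting of exactly n-1 'a' symbols.
bool accepts_chain(const std::string& str, int n) {
    const std::vector<std::vector<int> > table = build_chain_table(n);
    int state = 0;
    for (std::string::size_type k = 0; k < str.size(); ++k) {
        if (str[k] != 'a' && str[k] != 'b') return false;
        state = table[state][str[k] == 'a' ? 0 : 1];
        if (state == -1) return false;  // missing entry: reject
    }
    return state == n - 1;
}
```

Because the accepted string visits every state exactly once, recognizing it touches the whole table (or, in the hardcoded case, the whole code), which is consistent with the slide calling this the worst case.
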
Performance Based on a 2-Symbol Alphabet

[Figure: performance curves for table-driven (2-symbol alphabet), hardcode (2-symbol alphabet), hardcode (single state), and table-driven (single state)]

  • Remark:
    • Caching effect on the hardcoded version
      • L1 cache hits between 10 states and about 110 states
      • L1 cache misses between 160 states and about 360 states
      • L2 cache hits between 460 states and 1700 states
      • Slow L2 cache misses from 1800 states onward, at which point main memory is needed

The String Recognition Experiment

  • Two ways of implementing a string recognizer:
    • Implementation based on direct indexing
      • Table entry (i, val(str_k))
    • Implementation based on symbol searching
      • Table entry (i, pos(str_k)), with an array of alphabet symbols (a, b, c, d, e)
      • Binary search
      • Linear search
  • We used linear search

[Figure: states 1, 2, 3, …, n processing the input string]
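
The difference between the two indexing strategies above can be sketched in a few lines of C++. The reading of val(str_k) as "the symbol's own value selects the column" and of pos(str_k) as "the symbol's position in the array of alphabet symbols" is an interpretation of the slide, and the names column_direct and column_search are hypothetical.

```cpp
#include <cstddef>
#include <string>

// Direct indexing: the symbol's value is converted straight into a column.
// Assumes the alphabet is the contiguous range starting at 'a'.
int column_direct(char symbol) {
    return symbol - 'a';
}

// Symbol searching with linear search: scan the array of alphabet symbols
// and return the position of the symbol, or -1 if it is not present.
int column_search(char symbol, const std::string& alphabet) {
    for (std::size_t k = 0; k < alphabet.size(); ++k) {
        if (alphabet[k] == symbol) {
            return static_cast<int>(k);
        }
    }
    return -1;
}
```

Direct indexing costs a single arithmetic operation per symbol, while linear search costs up to one comparison per alphabet symbol but works for alphabets whose symbols are not contiguous.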

The String Recognition Experiment (continued)

  • Language based on:
    • 10-symbol alphabet
  • Number of states between 10 and 4000
  • Randomly generated accepting string of length n-1 (n is the automaton size)
  • Filling density of each automaton set to 41%

The String Recognition Experiment (continued)

[Figure: performance curves for hardcode searching, table-driven searching, hardcode direct index, and table-driven direct index]

  • Remarks and findings:
    • Caching effect on the hardcoded version
    • Noise due to the branch prediction buffer
      • Wrong guesses in the branch history buffer
    • Hardcoding outperforms table-driven up to a thousand states

Future Work

  • Dynamic Implementation of Finite Automata for Performance (DIFAP) using:
    • Table-driven
    • Linked list
    • Hardcode
    • Fine tuning
    • Constraints
    • Etc.
  • An adaptive method for DIFAP (A-DIFAP)
    • Adapts to the system’s/platform’s constraints at run-time
  • Programming-language-specific toolkit for DIFAP / A-DIFAP
    • Exploits the programming language’s features

Publications

  • Preliminary Experiments on Hardcoding Finite Automata. CIAA 2003
  • Hardcoding Finite State Automata Processing. SAICSIT 2003
  • Hardcoding Finite State Automata Processing. (Submitted to SACJ)
  • On Hardcoding Finite State Automata Processing. Technical Report T/UE 2003.
  • The Effect of Cache Memory on Hardcoded Finite Automata (To be submitted to SP&E)