Simultaneous transducers for data parallel xml parsing
This presentation is the property of its rightful owner.
Sponsored Links
1 / 24

Simultaneous Transducers for Data-Parallel XML Parsing PowerPoint PPT Presentation


  • 79 Views
  • Uploaded on
  • Presentation posted in: General

Simultaneous Transducers for Data-Parallel XML Parsing. Yinfei Pan, Ying Zhang, and Kenneth Chiu. Computer Science SUNY Binghamton. Problem. It is well-known that XML parsing is slow (DOM style) What we want to address? Build faster XML DOM parser How?

Download Presentation

Simultaneous Transducers for Data-Parallel XML Parsing

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Presentation Transcript


Simultaneous transducers for data parallel xml parsing

Simultaneous Transducers for Data-Parallel XML Parsing

Yinfei Pan, Ying Zhang, and Kenneth Chiu

Computer Science

SUNY Binghamton


Problem

Problem

  • It is well-known that XML parsing is slow (DOM style)

  • What we want to address?

    • Build faster XML DOM parser

  • How?

    • Using multiple cores to do parallel processing on single XML document


Parallel parse single xml document

Parallel parse Single XML document

  • We do it by:

    • Decompose the XML document into chunks

    • Parse each chunk in parallel, use one core per chunk

    • Merge the results of each chunk into final result

  • However, we met problem as:

    • Each chunk cannot be unambiguously parsed independently, since the true state of the parser when start a chunk is unknown until preceding chunks are parsed.

      • /foo

  • Element Content: <a>bar /foo</a>

  • Element end tag: …</bar></foo>…


Previous work

Previous work

  • We attempted to solve this problem in previous work by:

    • A very fast preparse scan (sequential)

    • Build an outline of the document (skeleton)

    • Skeleton are used to guide full parse by first decomposing XML document into well-formed fragments on well-defined unambiguous positions

    • The XML fragments are parsed separately on each core by Libxml2 APIs

    • Merge the results into final DOM with Libxml2 APIs


Skeleton

Skeleton


Problem of previous work

Problem of Previous Work

  • The preparse stage is sequential, which limits the overall performance scale only to 4 cores

  • Solution in this work: Parallelize the preparse stage

    • Preparse stage is simple and deterministic, and can easily be model into a Finite State Transducer (FST)

      • We call it FST since it will accept the XML document as input and output the skeleton.

    • We then transform this preparse finite state transducer (FST) to another finite state transducer, this transformed FST can start work at arbitrary start position of the XML document, so we then can just divide the XML document into equal sized chunks. This transformed FST is called the Simultaneous Finite Transducers (SFT)


The preparsing fst

>

C

3

5

>

2

C

4

/

6

!

(

END

)

"

C

1

C

2

0

1

<

"

C

1

>

(

START

)

>

'

(

END

)

C

5

3

7

4

/

'

C

1

The preparsing FST

  • The FST for preparsing, it has 8 states and 2 actions.


Example of running preparsing dfa

START

END

0

1

3

3

0

0

0

1

2

2

0

Example of running preparsing DFA

<foo>sample</foo>


Simultaneous finite transducers sft

Simultaneous Finite Transducers (SFT)

  • The SFT Consider all possibilities, and generate the preparsing results for all possible initial states as a chunk is preparsed.

    • Its start state is simply the set of all states of the FST

  • Potentially, each state of the SFT is an element from the power set of the states of the original FST.

  • For the actual execution, the SFT transitions from a set of states to another set of states


Execution of sft

a

a,b

0

1

2

a

b

a1

a2

{0 1 2}

a

b

{1 2}

{2}

a1(0)

a2(1)

Execution of SFT

Input: ab

FST:

Apply a1 and move result set

Apply a2 and move result set

[ {R0}, {R1}, {R2} ]

[ { }, {R0, R1}, {R2} ]

[ {}, {}, {R0, R1, R2} ]

SFT:


Transform fst to sft

<

a

{0 1 2 3}

a0(0)

a1(1)

>

/

a2(2)

a4(3)

a3(1)

a

a

a

/

>

>

{1}

{3}

{0}

{0 2 3}

a3(1)

a4(3)

a2(2)

a4(3)

<

a0(0)

<

a0(0)

a

a

>

{2}

a1(1)

a2(2)

Transform FST to SFT


Performance evaluation

Performance Evaluation

  • Machine 1:

    • Sun E6500 with 30 400 MHz US-II processors

    • Operating System: Solaris 10

    • Compiler: g++ 4.0 with the option -O3

  • Machine 2:

    • 8-core Linux machine with two Intel Xeon L5320 CPUs at 1.86 GHz.

    • Linux kernel version: 2.6.18

    • Compiler: g++ 4.0 with the option -O3


Test documents

Test documents

  • File 1: 1kzk.xml of size 34MBytes from Protein Data Bank

  • File 2: gen1.xml of size 29 MBytes generated from XMark project

  • File 3: gen2.xml of size 19 Mbytes generated from XML Benchmark project


1kzk xml

1kzk.xml


Gen1 xml

gen1.xml


Gen2 xml

gen2.xml


Preparse speedup on solaris

Preparse Speedup on Solaris


Preparse speedup on linux

Preparse Speedup on Linux


Full parsing speedups on machine 1

Full parsing speedups on Machine 1


Full parsing speedups on linux

Full parsing speedups on Linux


Thank you

Thank you

Questions?


Select the true context and form the final correct result

Select the true context and form the final correct result

  • When all chunks are finished, we can chain all the preparse results together to create the complete, correct skeleton.

    • The correct state at the end of the first chunk is unambiguous, since the first chunk starts in a known state.

    • This determines the state at the beginning of the second chunk, and is used to select the correct preparse result of the second chunk.

    • This in turn determines the correct state at the beginning of the third chunk

    • so on and so forth…


Parallel efficiency machine 1

Parallel efficiency (Machine 1)


Parallel efficiency machine 2

Parallel efficiency (Machine 2)


  • Login