Simultaneous Transducers for Data-Parallel XML Parsing

Yinfei Pan, Ying Zhang, and Kenneth Chiu

Computer Science

SUNY Binghamton

- XML parsing is well known to be slow, particularly DOM-style parsing
- What do we want to address?
- Build a faster XML DOM parser

- How?
- Use multiple cores to process a single XML document in parallel

- Our approach:
- Decompose the XML document into chunks
- Parse the chunks in parallel, one core per chunk
- Merge the per-chunk results into the final result

- However, we ran into a problem:
- A chunk cannot be parsed unambiguously on its own, because the true state of the parser at the start of a chunk is unknown until the preceding chunks have been parsed.
- Example: a chunk that begins with "/foo" could be either of the following

- Element content: <a>bar /foo</a>

- Element end tag: …</bar></foo>…

- In previous work, we attempted to solve this problem as follows:
- Run a very fast, sequential preparse scan
- The scan builds an outline of the document (the skeleton)
- The skeleton guides the full parse: the XML document is first decomposed into well-formed fragments at well-defined, unambiguous positions
- The XML fragments are parsed separately, one per core, using the Libxml2 APIs
- The results are merged into the final DOM using the Libxml2 APIs

- The preparse stage is sequential, which limits performance scaling to 4 cores
- Solution in this work: parallelize the preparse stage
- The preparse stage is simple and deterministic, so it can easily be modeled as a finite state transducer (FST)
- We call it a transducer because it accepts the XML document as input and outputs the skeleton.
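
The idea of a preparse transducer can be sketched in a few lines of Python. This is only an illustration under strong assumptions: it uses a simplified 4-state machine (not the 8-state FST from the slides), it ignores attributes, quotes, comments, and self-closing tags, and the names step and preparse are hypothetical.

```python
# Simplified preparse FST (an assumption for illustration, not the 8-state
# machine from the slides). It reads characters and emits skeleton events.
CONTENT, LT, START_TAG, END_TAG = range(4)

def step(state, ch):
    """Return (next_state, action) for one input character; action may be None."""
    if state == CONTENT:
        return (LT, None) if ch == "<" else (CONTENT, None)
    if state == LT:
        return (END_TAG, None) if ch == "/" else (START_TAG, None)
    if state == START_TAG:
        return (CONTENT, "START") if ch == ">" else (START_TAG, None)
    # state == END_TAG
    return (CONTENT, "END") if ch == ">" else (END_TAG, None)

def preparse(text, state=CONTENT, offset=0):
    """Run the FST over text; return (final_state, skeleton).

    Each skeleton entry is (action, position of the closing '>').
    """
    skeleton = []
    for i, ch in enumerate(text):
        state, action = step(state, ch)
        if action is not None:
            skeleton.append((action, offset + i))
    return state, skeleton

final_state, skeleton = preparse("<foo>sample</foo>")
print(final_state, skeleton)  # 0 [('START', 4), ('END', 16)]
```

The run is deterministic only because the initial state is known; started in the middle of a document, the same function would need to be told which state to begin in.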

- We then transform the preparse FST into another finite state transducer that can start working at an arbitrary position in the XML document, so we can simply divide the document into equal-sized chunks. We call the transformed machine the Simultaneous Finite Transducer (SFT).

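[Figure: state-transition diagram of the preparse FST, with states 0 to 7, transitions on the characters <, >, /, !, ' and " and on character classes (labeled C1, C2, ...), and the actions START and END.]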

- The FST for preparsing has 8 states and 2 actions (START and END).

[Figure: the preparse FST run on the input <foo>sample</foo>, showing the sequence of states visited and the START and END actions emitted.]

- The SFT considers all possibilities: as a chunk is preparsed, it generates the preparsing results for every possible initial state.
- Its start state is simply the set of all states of the FST

- Potentially, each state of the SFT is an element of the power set of the states of the original FST.
- During execution, the SFT transitions from one set of FST states to another
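
A minimal sketch of this set-of-states execution is shown below. It is an assumption-laden illustration: it reuses a simplified 4-state preparse machine rather than the 8-state FST from the slides, and the names step and sft_preparse are hypothetical. The groups dictionary plays the role of the current SFT state: its keys are the FST states still reachable, and each key carries the set of initial states that led there.

```python
# Simplified 4-state preparse FST (assumed for illustration only).
CONTENT, LT, START_TAG, END_TAG = range(4)
ALL_STATES = (CONTENT, LT, START_TAG, END_TAG)

def step(state, ch):
    """Return (next_state, action) for one input character; action may be None."""
    if state == CONTENT:
        return (LT, None) if ch == "<" else (CONTENT, None)
    if state == LT:
        return (END_TAG, None) if ch == "/" else (START_TAG, None)
    if state == START_TAG:
        return (CONTENT, "START") if ch == ">" else (START_TAG, None)
    return (CONTENT, "END") if ch == ">" else (END_TAG, None)

def sft_preparse(chunk, offset=0):
    """Preparse a chunk for every possible initial FST state at once.

    Returns {initial_state: (final_state, skeleton_for_that_initial_state)}.
    """
    results = {s: [] for s in ALL_STATES}
    # Current FST state -> set of initial states that have reached it.
    groups = {s: {s} for s in ALL_STATES}
    for i, ch in enumerate(chunk):
        moved = {}
        for state, initials in groups.items():
            nxt, action = step(state, ch)  # one FST step per distinct state
            if action is not None:
                for init in initials:
                    results[init].append((action, offset + i))
            # Move the whole result set to the next state, merging on collision.
            moved.setdefault(nxt, set()).update(initials)
        groups = moved
    final = {init: state for state, initials in groups.items() for init in initials}
    return {init: (final[init], results[init]) for init in ALL_STATES}

table = sft_preparse("/foo>bar")
for init, (final_state, skeleton) in table.items():
    print(init, final_state, skeleton)
```

For the chunk "/foo>bar", all four initial states converge to the content state after the first ">", but their outputs differ: starting in content yields no events, while starting just after "<" yields an END event. This is the ambiguity that only the preceding chunk can resolve.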

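[Figure: a three-state example FST (states 0, 1, 2; inputs a and b; actions a1 and a2) and the corresponding SFT, which moves from {0 1 2} to {1 2} to {2}, with actions a1(0) and a2(1).]

Input: ab

Apply a1 and move result set, then apply a2 and move result set:

[ {R0}, {R1}, {R2} ]

[ { }, {R0, R1}, {R2} ]

[ {}, {}, {R0, R1, R2} ]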

[Figure: an SFT with set-valued states {0 1 2 3}, {0 2 3}, {0}, {1}, {2}, and {3}; transitions on <, a, / and >; and actions tagged with the originating FST state: a0(0), a1(1), a2(2), a3(1), a4(3).]

- Machine 1:
- Sun E6500 with 30 US-II processors at 400 MHz
- Operating System: Solaris 10
- Compiler: g++ 4.0 with the option -O3

- Machine 2:
- 8-core Linux machine with two Intel Xeon L5320 CPUs at 1.86 GHz.
- Linux kernel version: 2.6.18
- Compiler: g++ 4.0 with the option -O3

- File 1: 1kzk.xml, 34 MBytes, from the Protein Data Bank
- File 2: gen1.xml, 29 MBytes, generated by the XMark project
- File 3: gen2.xml, 19 MBytes, generated by the XML Benchmark project

Thank you

Questions?

- When all chunks are finished, we can chain all the preparse results together to create the complete, correct skeleton.
- The correct state at the end of the first chunk is unambiguous, since the first chunk starts in a known state.
- This determines the state at the beginning of the second chunk, and is used to select the correct preparse result of the second chunk.
- This in turn determines the correct state at the beginning of the third chunk
- And so on for the remaining chunks.
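
The chaining step can be sketched as follows. This is a hedged illustration, not the authors' implementation: it assumes a simplified 4-state preparse machine, builds each chunk's table by running the machine once per initial state (a real SFT would share work between states), and uses hypothetical helper names (preparse_from, chunk_table, split_equal, chain). Each call to chunk_table is independent, so in a real system the calls could run on separate cores.

```python
# Simplified 4-state preparse FST (assumed for illustration only).
CONTENT, LT, START_TAG, END_TAG = range(4)
ALL_STATES = (CONTENT, LT, START_TAG, END_TAG)

def step(state, ch):
    """Return (next_state, action) for one input character; action may be None."""
    if state == CONTENT:
        return (LT, None) if ch == "<" else (CONTENT, None)
    if state == LT:
        return (END_TAG, None) if ch == "/" else (START_TAG, None)
    if state == START_TAG:
        return (CONTENT, "START") if ch == ">" else (START_TAG, None)
    return (CONTENT, "END") if ch == ">" else (END_TAG, None)

def preparse_from(chunk, state, offset):
    """Run the FST over chunk from a given state; return (final_state, skeleton)."""
    out = []
    for i, ch in enumerate(chunk):
        state, action = step(state, ch)
        if action is not None:
            out.append((action, offset + i))
    return state, out

def chunk_table(chunk, offset):
    """Per-chunk result table: initial state -> (final state, skeleton)."""
    return {s: preparse_from(chunk, s, offset) for s in ALL_STATES}

def split_equal(text, n):
    """Split text into at most n (offset, chunk) pairs of equal size (last may be shorter)."""
    size = max(1, -(-len(text) // n))  # ceiling division
    return [(o, text[o:o + size]) for o in range(0, len(text), size)]

def chain(tables, start_state=CONTENT):
    """Select the correct result of each chunk, left to right."""
    state, skeleton = start_state, []
    for table in tables:
        state, out = table[state]  # final state of chunk k picks the row of chunk k+1
        skeleton.extend(out)
    return state, skeleton

doc = "<foo><bar>sample</bar></foo>"
tables = [chunk_table(chunk, off) for off, chunk in split_equal(doc, 3)]
state, skeleton = chain(tables)
print(skeleton)  # [('START', 4), ('START', 9), ('END', 21), ('END', 27)]
```

In this example the chunk boundaries fall in the middle of the end tag for bar, yet the chained skeleton matches what a single sequential pass over the whole document produces.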