
XML Transmission Compaction The Quest for Streaming Updates Wednesday, 15 October 2003

Wednesday, 15 October 2003, 5:00pm – 5:30pm. James E. Hartley, FISD/SIIA Chief Technologist.



  1. XML Transmission Compaction The Quest for Streaming Updates Wednesday, 15 October 2003 5:00pm – 5:30pm James E. Hartley FISD/SIIA Chief Technologist

  2. Which of the Following Statements is True? • The World is Flat • The Moon is Made of Cheese • The Earth is the Center of the Universe • XML is Too Verbose for Market Data

  3. Which of the Following Statements is True? • The World is Flat: Christopher Columbus • The Moon is Made of Cheese: Neil Armstrong • The Earth is the Center of the Universe: Copernicus, Galileo, Kepler, Newton • XML is Too Verbose for Market Data: James Hartley??

  4. What is XML and What’s the Deal? • XML is a way of encoding data with descriptive tags facilitating data interchange • Example: Passing a date and time • Instead of just “2003/10/15 5:00 p.m.” <dateTime>2003-10-15T17:00:00+05:00</dateTime> • Example: Passing a “last trade” price • Instead of just “103.73” <trade><last>103.73</last><currency>USD</currency></trade>

  5. But Wait – That Does Seem Verbose!! • Date and Time… from 20 to 46 bytes • Last price… from 6 to 58 bytes • In fact, encoding of data in XML can take over 10 times the number of bytes!
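
The byte counts on this slide can be checked directly against the strings quoted on slide 4. The following is a minimal sketch, assuming plain ASCII text; the helper name `byte_sizes` is made up for illustration.

```python
# Plain values versus the XML-tagged forms quoted on slide 4.
samples = {
    "date and time": (
        "2003/10/15 5:00 p.m.",
        "<dateTime>2003-10-15T17:00:00+05:00</dateTime>",
    ),
    "last price": (
        "103.73",
        "<trade><last>103.73</last><currency>USD</currency></trade>",
    ),
}


def byte_sizes(plain, tagged):
    """Return (plain bytes, tagged bytes, expansion ratio) for ASCII text."""
    p = len(plain.encode("ascii"))
    t = len(tagged.encode("ascii"))
    return p, t, t / p


for name, (plain, tagged) in samples.items():
    p, t, ratio = byte_sizes(plain, tagged)
    print(f"{name}: {p} -> {t} bytes ({ratio:.1f}x)")
```

For these two short examples the expansion is roughly 2x and just under 10x; the "over 10 times" figure refers to XML encoding in general.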

  6. Let’s Consider MDDL – XML for Market Data • MDDL is the “Market Data Definition Language”: the XML specification to enable the interchange of information necessary to account, to analyze, and to trade financial instruments of the world's markets. • The industry standard for encoding market data in XML – for all your data needs!

  7. But Why Would We Want MDDL? • Common terms, definitions, and relationships of data used in the market data industry • Removes confusion on data and definition • Facilitates merging and data interchange • Neutral standard for encoding data • The list goes on – for more info, just ask!

  8. What If There Were… • An industry standard nomenclature for describing market data? • An industry standard data feed for requesting and distributing market data? • Would that be worth something to ya? Huh? • Regardless of position in industry – there is value (positives are greater than negatives)

  9. A Trade in MDDL – Ala Tokyo Stock Exchange 890 Bytes!

      <mddl version="2.2-beta">
        <header>
          <dateTime>2003-10-15T17:00:00.000+05:00</dateTime>
          <source>XTC Demonstration</source>
        </header>
        <snap><equityDomain><commonClass>
          <instrumentIdentifier>
            <code scheme="http://www.mddl.org/ext/scheme/symbol?SRC=XTKS">6501</code>
            <name>A Company in Your Neighborhood</name>
          </instrumentIdentifier>
          <sequence>0306</sequence>
          <session>1</session>
          <trade>
            <last>12375</last>
            <dateTime>2003-10-15T16:58:32.234+05:00</dateTime>
            <marketCenter>
              <code scheme="http://www.mddl.org/xtc/Examples/scheme/iso10383.xml">XTKS</code>
            </marketCenter>
            <size>200</size>
            <currency>JPY</currency>
            <status scheme="http://wws.mddl.org/xtc/Examples/scheme/tradeStatus.xml">normal</status>
          </trade>
        </commonClass></equityDomain></snap>
      </mddl>

  10. A Trade in MDDL – Ala Tokyo Stock Exchange 890 Bytes! (The message from slide 9, shown again.)

  11. How Do We Deal With This? • Identify which data elements actually are modified – these are “fields” • Remaining text is nothing more than markup • The remaining shell defines a “template” • The “template” would need to be transmitted once a day or so…
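
The template-plus-fields idea can be sketched as follows. This is only an illustration, not the XTC wire format: the template is a heavily shortened, hypothetical fragment of the trade message, and `expand` is an invented helper name. The receiver caches the template once and rebuilds the XML from the few field values that change per message.

```python
from string import Template

# Hypothetical, shortened template; a real one would carry the full MDDL shell.
TEMPLATE = Template(
    "<trade><last>$last</last><size>$size</size>"
    "<currency>JPY</currency></trade>"
)


def expand(fields):
    """Receiver side: rebuild the full XML from the cached template and the fields."""
    return TEMPLATE.substitute(fields)


fields = {"last": "12375", "size": "200"}
message = expand(fields)
payload_bytes = sum(len(value) for value in fields.values())

print(message)
print(f"{payload_bytes} field bytes sent instead of {len(message)} XML bytes")
```

Only the values in `fields` need to travel with each trade; everything else is markup the receiver already holds.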

  12. XTC: XML (X) Transmission (T) Compaction (C)

  13. Size of Data Transmitted - Primer • A “bit” is the atomic unit of electronic data • Its value may be “0” or “1” • 4 bits is a “nybble” • 2 nybbles is a “byte” (or 8 bits) • 2 bytes is a “word” • We want to minimize bytes-per-message

  14. A Trade in MDDL – Ala Tokyo Stock Exchange 890 Bytes! (The message from slide 9, shown again.)

  15. The Fields We Need to Worry About • Time of Message: “2003-10-15T17:00:00.000+05:00” • Ticker Symbol: “6501” • Sequence Number: “0306” • Last Trade Price: “12375” • Time of Trade: “2003-10-15T16:58:32.234+05:00” • Exchange of Trade: “XTKS” • Size of Trade: “200” • Trade Status: “normal” • Getting better – down to 84 bytes…
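
As a rough check of the 84-byte figure, the eight fields can be pulled out of the trade message and their lengths summed. The sketch below uses Python's standard `xml.etree.ElementTree` on a shortened copy of the slide 9 message (scheme attributes and the name, source, session, and currency elements are left out); the `FIELD_PATHS` table and `extract_fields` are names invented here.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the slide 9 trade message, keeping only the eight fields.
MESSAGE = """<mddl version="2.2-beta">
<header><dateTime>2003-10-15T17:00:00.000+05:00</dateTime></header>
<snap><equityDomain><commonClass>
<instrumentIdentifier><code>6501</code></instrumentIdentifier>
<sequence>0306</sequence>
<trade>
<last>12375</last>
<dateTime>2003-10-15T16:58:32.234+05:00</dateTime>
<marketCenter><code>XTKS</code></marketCenter>
<size>200</size>
<status>normal</status>
</trade>
</commonClass></equityDomain></snap>
</mddl>"""

COMMON = "snap/equityDomain/commonClass"

# Field name -> element path relative to the <mddl> root.
FIELD_PATHS = {
    "time of message": "header/dateTime",
    "ticker symbol": f"{COMMON}/instrumentIdentifier/code",
    "sequence number": f"{COMMON}/sequence",
    "last trade price": f"{COMMON}/trade/last",
    "time of trade": f"{COMMON}/trade/dateTime",
    "exchange of trade": f"{COMMON}/trade/marketCenter/code",
    "size of trade": f"{COMMON}/trade/size",
    "trade status": f"{COMMON}/trade/status",
}


def extract_fields(xml_text):
    """Return the variable field values of a trade message as strings."""
    root = ET.fromstring(xml_text)
    return {name: root.findtext(path) for name, path in FIELD_PATHS.items()}


fields = extract_fields(MESSAGE)
total = sum(len(value) for value in fields.values())
print(f"{len(fields)} fields, {total} bytes of field text")
```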

  16. Time of Message, Time of Trade • What if we sent a “heartbeat” message once per half-second (500 milliseconds)? • Then we could tell time as a “delta” from that frequent “timestamp” • 500 milliseconds can be delivered in 9 bits • Even fewer if we get creative…
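
A minimal sketch of the heartbeat idea, assuming times are tracked as milliseconds since midnight and that every event is stamped relative to the most recent heartbeat. The function names are hypothetical. Since 2^9 = 512 covers the 500 possible offsets, the delta always fits in 9 bits.

```python
HEARTBEAT_MS = 500  # heartbeat interval from the slide
DELTA_BITS = 9      # 2**9 = 512 values, enough for offsets 0..499


def encode_delta(event_ms, heartbeat_ms):
    """Sender: offset of an event from the most recent heartbeat, in milliseconds."""
    delta = event_ms - heartbeat_ms
    if not 0 <= delta < HEARTBEAT_MS:
        raise ValueError("event must fall within the current heartbeat interval")
    return delta


def decode_delta(delta, heartbeat_ms):
    """Receiver: recover the absolute time from the last heartbeat and the delta."""
    return heartbeat_ms + delta


# Milliseconds since midnight: heartbeat at 16:58:32.000, trade at 16:58:32.234.
heartbeat = ((16 * 60 + 58) * 60 + 32) * 1000
trade = heartbeat + 234

delta = encode_delta(trade, heartbeat)
print(delta, delta.bit_length(), decode_delta(delta, heartbeat) == trade)
```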

  17. Ticker Symbol • The Tokyo Stock Exchange uses 4-digit numbers for many of their stocks • Our system could map each unique stock to a specific number – and sort based on “most active” • 20 bits is enough to allow for 1,048,576 instruments in our system
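
One way to build such a mapping is sketched below, assuming activity is measured by counting trades in a sample log (the log and the function name are made up). Giving the most active stocks the smallest numbers presumably lets a variable-length encoding spend fewer bits on them.

```python
from collections import Counter


def build_symbol_table(trade_log):
    """Assign small integer IDs to symbols, most active first."""
    ranked = [symbol for symbol, _ in Counter(trade_log).most_common()]
    return {symbol: index for index, symbol in enumerate(ranked)}


# Made-up sample log of traded symbols, one entry per trade.
trade_log = ["6501", "7203", "6501", "9984", "6501", "7203"]
table = build_symbol_table(trade_log)

for symbol, index in table.items():
    print(symbol, index, f"{max(index.bit_length(), 1)} bit(s)")
```

Both sides need the same table, so it would have to be distributed alongside the template.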

  18. Sequence Number • A sequence number helps the receiver determine if there are missing messages • The number usually “wraps” to zero if it exceeds the current maximum – but it is prudent to allow for sufficient transactions • 12 bits allows for 4096 transactions per day on a particular stock
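
Wrap-around and gap detection for a 12-bit sequence number can be sketched with modular arithmetic. The helper names are hypothetical, and the sketch assumes that fewer than 4096 consecutive messages are ever lost, since a longer gap would be indistinguishable from a shorter one.

```python
SEQ_BITS = 12
SEQ_MODULO = 1 << SEQ_BITS  # 4096 distinct sequence numbers


def next_sequence(seq):
    """Sender: advance the sequence number, wrapping to zero after 4095."""
    return (seq + 1) % SEQ_MODULO


def missed_messages(last_seen, received):
    """Receiver: number of messages skipped between two sequence numbers."""
    return (received - last_seen - 1) % SEQ_MODULO


print(next_sequence(4095))          # wraps around
print(missed_messages(306, 307))    # consecutive messages
print(missed_messages(306, 310))    # a gap
print(missed_messages(4095, 2))     # a gap across the wrap
```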

  19. Last Trade Price, Size of Trade • Our Tokyo price is provided in Japanese Yen • 16 bits allows for 65535 yen… • We can play games with “lots” and “blocks” to report the number of stocks traded • 8 bits allows for a wide range of values…

  20. Exchange of Trade • There are a limited number of exchanges that can be referenced • In our case, “XTKS” is one of just a few exchanges that are legal for the TSE • 2 bits is enough to identify the exchange…

  21. Trade Status • As with the exchange, the trade status is one of a few values • Note: “status” is a fictitious field in MDDL • 5 bits allows 32 unique status values

  22. So, Let’s Check Our Count • Time of Message: 9 bits • Ticker Symbol: 20 bits • Sequence Number: 12 bits • Last Trade Price: 16 bits • Time of Trade: 9 bits • Exchange of Trade: 2 bits • Size of Trade: 8 bits • Trade Status: 5 bits • Not too bad – down to 81 bits (10.1 bytes)… • There is an additional 3-byte overhead (length, msgid)
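
The 81-bit layout can be illustrated with a simple bit-packing routine. This is a sketch under assumptions, not the actual XTC encoding: the field order, the field names, the padding to a whole number of bytes, and the sample values (for example, a trade size of 2 meaning two lots of 100 shares) are all invented for illustration.

```python
# (field name, width in bits), widths taken from the slide; order is assumed.
LAYOUT = [
    ("msg_time_delta", 9),
    ("ticker_id", 20),
    ("sequence", 12),
    ("last_price", 16),
    ("trade_time_delta", 9),
    ("exchange", 2),
    ("size", 8),
    ("status", 5),
]
TOTAL_BITS = sum(width for _, width in LAYOUT)


def pack(values):
    """Pack field values into bytes, first field in the most significant bits."""
    word = 0
    for name, width in LAYOUT:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word = (word << width) | value
    n_bytes = (TOTAL_BITS + 7) // 8
    pad = n_bytes * 8 - TOTAL_BITS  # unused low-order bits in the last byte
    return (word << pad).to_bytes(n_bytes, "big")


def unpack(data):
    """Reverse of pack(): recover the field values from the packed bytes."""
    pad = len(data) * 8 - TOTAL_BITS
    word = int.from_bytes(data, "big") >> pad
    values = {}
    for name, width in reversed(LAYOUT):
        values[name] = word & ((1 << width) - 1)
        word >>= width
    return values


trade = {
    "msg_time_delta": 0, "ticker_id": 0, "sequence": 306, "last_price": 12375,
    "trade_time_delta": 234, "exchange": 0, "size": 2, "status": 0,
}
packed = pack(trade)
print(TOTAL_BITS, len(packed), unpack(packed) == trade)
```

Rounded up to whole bytes, 81 bits occupies 11 bytes, so a byte-aligned message would carry 7 bits of padding unless messages are packed back to back.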

  23. With Careful Analysis and Structuring • Time of Message: 9 bits – can be done with 8 bits (hundreds) • Ticker Symbol: 20 bits – average of 10 bits • Sequence Number: 12 bits – average of 8 bits • Last Trade Price: 16 bits – average of 12 bits • Time of Trade: 9 bits – can be done with 8 bits • Exchange of Trade: 2 bits – can be removed (covered in overhead) • Size of Trade: 8 bits – average of 4 bits (and in overhead) • Trade Status: 5 bits – average of 4 bits • Much better – down to 54 bits (6.8 bytes)… • … And we haven’t even used a computer yet…
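
The two totals can be reproduced with a few lines of arithmetic. The table below simply restates the fixed widths from slide 22 and the averages from this slide; how those averages are achieved (presumably some form of variable-length coding) is not specified in the deck.

```python
# (field, fixed bits from slide 22, average bits after tuning from slide 23)
FIELDS = [
    ("Time of Message", 9, 8),
    ("Ticker Symbol", 20, 10),
    ("Sequence Number", 12, 8),
    ("Last Trade Price", 16, 12),
    ("Time of Trade", 9, 8),
    ("Exchange of Trade", 2, 0),
    ("Size of Trade", 8, 4),
    ("Trade Status", 5, 4),
]

fixed_bits = sum(fixed for _, fixed, _ in FIELDS)
average_bits = sum(average for _, _, average in FIELDS)

print(f"fixed:   {fixed_bits} bits = {fixed_bits / 8:.2f} bytes")
print(f"average: {average_bits} bits = {average_bits / 8:.2f} bytes")
```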

  24. Summing It All Up… • Once or twice a day – transmit “template” • And other framework – maybe 60K Bytes • Twice a second – transmit “heartbeat” • Containing about 24 bytes • With each trade – transmit “content” • Less than 9 bytes (after all techniques used)
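
To put these numbers in perspective, the daily volume of a compacted feed can be estimated and compared with sending the 890-byte XML message for every trade. The session length and trade count below are hypothetical inputs, not figures from the presentation, and `daily_bytes` is an invented helper.

```python
TEMPLATE_BYTES = 60_000   # template and framework, per transmission
HEARTBEAT_BYTES = 24      # per heartbeat
TRADE_BYTES = 9           # upper bound per trade from the slide
RAW_XML_BYTES = 890       # uncompacted MDDL trade message


def daily_bytes(session_seconds, trades, templates_per_day=2, heartbeats_per_second=2):
    """Estimated bytes per day for a template + heartbeat + content feed."""
    template = templates_per_day * TEMPLATE_BYTES
    heartbeat = session_seconds * heartbeats_per_second * HEARTBEAT_BYTES
    content = trades * TRADE_BYTES
    return template + heartbeat + content


# Hypothetical day: a 5-hour session with one million trades.
session_seconds = 5 * 60 * 60
trades = 1_000_000

compact = daily_bytes(session_seconds, trades)
raw = trades * RAW_XML_BYTES
print(f"compact: {compact:,} bytes, raw XML: {raw:,} bytes, ratio {raw / compact:.0f}x")
```

Under these assumptions the template and heartbeat overhead is small next to the per-trade content, so the saving is dominated by the reduction from 890 bytes to about 9 bytes per trade.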

  25. What Does It Mean? • A self-describing datafeed can be just as efficient as existing proprietary protocols • Bandwidth is not compromised • Processing power is not compromised • A self-describing datafeed allows content to be added dynamically • Increases availability of new features at the convenience of the provider

  26. What Does It All Mean? • XML (when properly implemented) facilitates merging and comparison of data • Like terms are compared • Different terms are easily merged • A self-describing datafeed allows content to be added dynamically • Increases availability of new features at the convenience of the provider

  27. This story will continue… Questions or Comments?
