1 / 53

Using Q4M a message queue storage engine for MySQL

Using Q4M a message queue storage engine for MySQL. Cybozu Labs, Inc. Kazuho Oku. Background. Who am I?. Name: Kazuho Oku ( 奥 一穂 ) Original Developer of Palmscape / Xiino The oldest web browser for Palm OS Working at Cybozu Labs since 2005

aggie
Download Presentation

Using Q4M a message queue storage engine for MySQL

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Using Q4Ma message queue storage engine for MySQL Cybozu Labs, Inc. Kazuho Oku

  2. Background Using Q4M

  3. Who am I? • Name: Kazuho Oku (奥一穂) • Original Developer of Palmscape / Xiino • The oldest web browser for Palm OS • Working at Cybozu Labs since 2005 • Research subsidiary of Cybozu, Inc. in Japan Using Q4M

  4. About Cybozu, Inc. • Japan’s largest groupware vendor • Mostly provides as software products, not as services • Some of our apps bundle MySQL as storage Using Q4M

  5. About Pathtraq • Started in Aug. 2007 • Web ranking service • One of Japan’s largest • like Alexa, but semi-realtime, and per-page • running on MySQL • Need for a fast and reliable message relay • for communication between the main server and content analysis server(s) Using Q4M

  6. Design Goals of Q4M • Robust • Do not lose data on OS crash or power failure • Fast • Transfer thousands of messages per second • Easy to Use • Use SQL for access / maintenance • Integration into MySQL • no more separate daemons to take care of Using Q4M

  7. What is a Message Queue? Using Q4M

  8. What is a Message Queue? • Middleware for persistent asynchronous communication • communicate between fixed pairs (parties) • a.k.a. Message Oriented Middleware • MQ is intermediate storage • RDBMS is persistent storage • Senders / receivers may go down Using Q4M

  9. Minimal Configuration of a MQ Queue • Senders and receivers access a single queue Sender Receiver Using Q4M

  10. MQ and Relays Queue Queue • Separate queue for sender and receiver • Messages relayed between queues Sender Receiver Relay Using Q4M

  11. Merits of Message Relays • Destination can be changed easily • Relays may transfer messages to different locations depending on their headers • Robustness against network failure • no loss or duplicates when the relay fails • Logging and Multicasting, etc. Using Q4M

  12. Message Brokers • Publish / subscribe model • Separation between components and their integration • Components read / write to predefined queues • Integration is definition of routing rules between the message queues • Messages are often transformed (filtered) within the relay agent Using Q4M

  13. What about Q4M? • Q4M itself is a message queue • Can connect Q4M instances to create a message relay • Provides API for creating message relays and brokers Using Q4M

  14. Performance of Q4M • over 7,000 mess/sec. • message size: avg. 512 bytes • syncing to disk • Outperforming most needs • if you need more, just scale out • Can coexist with other storage engines without sacrificing their performance see http://labs.cybozu.co.jp/blog/kazuhoatwork/2008/06/q4m_06_release_and_benchmarks.php Using Q4M

  15. Applications of Q4M Using Q4M

  16. Asynchronous Updates • Mixi (Japan's one of the largest SNS) uses Q4M to buffer writes to DB, to offload peak demands from http://alpha.mixi.co.jp/blog/?p=272 Using Q4M

  17. Connecting Distant Servers • Pathtraq uses Q4M to create a relay between its database and content analysis processes → Contents to be analyzed → Pathtraq DB Content Analysis Processes MySQL conn. over SSL,gzip ← Results of the analysis ← Using Q4M

  18. To Prefetch Data • livedoor Reader (web-based feed aggregator) uses Q4M to prefetch data from database to memcached • uses Q4M for scheduling web crawlers as well from http://d.hatena.ne.jp/mala/20081212/1229074359 Using Q4M

  19. Scheduling Web Crawlers Request Queue Retry Queue • Web crawlers with retry-on-error • Sample code included in Q4M dist. If failed to fetch, store URL in retry queue Store Result Spiders Read URL URL DB Re- scheduler Using Q4M

  20. Delayed Content Generation • Hatetter(RSS feed-to-twitter-API gateway) uses Q4M to delay content generation • Source code: github.com/yappo/website-hatetter Using Q4M

  21. User Notifications Queue(s) • For sending notifications from web services DB App. Logic SMTP Agent IM Agent Using Q4M

  22. Installing Q4M Using Q4M

  23. Installing Q4M • Compatible with MySQL 5.1 • Download from q4m.31tools.com • Binary releases available for some platforms • Installing from source: • requires source code of MySQL • ./configure && make && make install • run support-files/install.sql Using Q4M

  24. Configuration Options of Q4M • --with-sync=no|fsync|fdatasync|fcntl • Controls synchronization to disk • default: fdatasync on linux • --enable-mmap • Mmap’ed reads lead to higher throughput • default: yes • --with-delete=pwrite|msync • msyncrecommended on linux>=2.6.20 if you need really high performance Using Q4M

  25. Q4M Basics Using Q4M

  26. The Model Q4M table • Various publishers write to queue • Set of subscribers consume the entries in queue Publisher Publisher Subscribers Publisher Using Q4M

  27. Creating a Q4M Table • ENGINE=QUEUE creates a Q4M table • No primary keys or indexes • Sorted by insertion order (it’s a queue) mysql> CREATE TABLE qt ( -> id int(10) unsigned NOT NULL, -> message varchar(255) NOT NULL -> ) ENGINE=QUEUE; Query OK, 0 rows affected (0.42 sec) Using Q4M

  28. Modifying Data on a Q4M Table • No restrictions for INSERT and DELETE • No support for UPDATE mysql> INSERT INTO qt (id,message) -> VALUES -> (1,'Hello'), -> (2,'Bonjour'), -> (3,'Hola'); Query OK, 3 rows affected (0.02 sec) mysql> SELECT * FROM qt; +----+---------+ | id | message | +----+---------+ | 1 | Hello | | 2 | Bonjour | | 3 | Hola | +----+---------+ 3 rows in set (0.00 sec) Using Q4M

  29. SELECT from a Q4M Table • Works the same as other storage engines • SELECT COUNT(*) is cached mysql> SELECT * FROM qt; +----+---------+ | id | message | +----+---------+ | 1 | Hello | | 2 | Bonjour | | 3 | Hola | +----+---------+ 3 rows in set (0.00 sec) mysql> SELECT COUNT(*) FROM qt; +----------+ | COUNT(*) | +----------+ | 3 | +----------+ 1 row in set (0.00 sec) How to subscribe to a queue? Using Q4M

  30. Calling queue_wait() • After calling, only one row becomes visible from the connection mysql> SELECT * FROM qt; +----+---------+ | id | message | +----+---------+ | 1 | Hello | | 2 | Bonjour | | 3 | Hola | +----+---------+ 3 rows in set (0.00 sec) mysql> SELECT queue_wait('qt'); +------------------+ | queue_wait('qt') | +------------------+ | 1 | +------------------+ 1 row in set (0.00 sec) mysql> SELECT * FROM qt; +----+---------+ | id | message | +----+---------+ | 1 | Hello | +----+---------+ 1 row in set (0.00 sec) Using Q4M

  31. OWNER Mode and NON-OWNER Mode • In OWNER mode, only the OWNED row is visible • OWNED row becomes invisible from other connections • rows of other storage engines are visible NON-OWNER Mode 1,'Hello' 2,'Bonjour' 3,'Hola' OWNER Mode 1,'Hello' queue_wait() queue_end() queue_abort() Using Q4M

  32. Returning to NON-OWNER mode • By calling queue_abort, the connection returns to NON-OWNER mode mysql> SELECT QUEUE_ABORT(); +---------------+ | QUEUE_ABORT() | +---------------+ | 1 | +---------------+ 1 row in set (0.00 sec) mysql> SELECT * FROM qt; +----+---------+ | id | message | +----+---------+ | 1 | Hello | | 2 | Bonjour | | 3 | Hola | +----+---------+ 3 rows in set (0.01 sec) Using Q4M

  33. Consuming a Row • By calling queue_end, the OWNED row is deleted, and connection returns to NON-OWNER mode mysql> SELECT queue_wait('qt'); (snip) mysql> SELECT * FROM qt; +----+---------+ | id | message | +----+---------+ | 1 | Hello | +----+---------+ 1 row in set (0.01 sec) mysql> SELECT queue_end(); +-------------+ | queue_end() | +-------------+ | 1 | +-------------+ 1 row in set (0.01 sec) mysql> SELECT * FROM qt; +----+---------+ | id | message | +----+---------+ | 2 | Bonjour | | 3 | Hola | +----+---------+ 2 rows in set (0.00 sec) Using Q4M

  34. Writing a Subscriber • Call two functions: queue_wait, queue_end • Multiple subscribers can be run concurrently • each row in the queue is consumed only once while (true) { SELECT queue_wait('qt'); # switch to owner mode rows := SELECT * FROM qt; # obtain data if (count(rows) != 0) # if we have any data, then handle_row(rows[0]); # consume the row SELECT queue_end(); # erase the row from queue } Using Q4M

  35. Writing a Subscriber (cont'd) • Or call queue_wait as a condition • Warning: conflicts with trigger-based insertions while (true) { rows := SELECT * FROM qt WHERE queue_wait('qt'); if (count(rows) != 0) handle_row(rows[0]); SELECT queue_end(); } Using Q4M

  36. The Model – with code Q4M table while (true) { rows := SELECT * FROM qt WHERE queue_wait('qt'); if (count(rows) != 0) handle_row(rows[0]); SELECT queue_end(); } Publisher INSERT INTO queue ... INSERT INTO queue ... INSERT INTO queue ... Publisher Subscribers Publisher Using Q4M

  37. Three Functions in Detail Using Q4M

  38. queue_wait(table) • Enters OWNER mode • 0〜1 row becomes OWNED • Enters OWNER mode even if no rows were available • Default timeout: 60 seconds • Returns 1 if a row is OWNED (0 on timeout) • If called within OWNER mode, the owned row is deleted Using Q4M

  39. Revisiting Subscriber Code • Calls to queue_end just before queue_wait can be omitted while (true) { rows := SELECT * FROM qt WHERE queue_wait('qt'); if (count(rows) != 0) handle_row(rows[0]); SELECT queue_end(); } Using Q4M

  40. Conditional queue_wait() • Consume rows of certain condition • Rows that do not match will be left untouched • Only numeric columns can be checked • Fast - condition tested once per each row examples: SELECT queue_wait('table:(col_a*3)+col_b<col_c'); SELECT queue_wait('table:retry_count<5'); Using Q4M

  41. queue_wait(tbl_cond,[tbl_cond…,timeout]) • Accepts multiple tables and timeout • Data searched from leftmost table to right • Returns table index (the leftmost table is 1) of the newly owned row • Returns zero if no rows are being owned example: SELECT queue_wait('table_A','table_B',60); Using Q4M

  42. Functions for Exiting OWNER Mode • queue_end • Deletes the owned row and exits OWNER mode • queue_abort • Releases (instead of deleting) the owned row and exits OWNER mode • Close of a MySQL connection does the same thing Using Q4M

  43. Relaying and Routing Messages Using Q4M

  44. The Problem Q4M Table (source) Q4M Table (dest.) • Relay (or router) consists of more than 3 processes, 2 conns • No losses, no duplicates on crash or disconnection Relay Program Using Q4M

  45. Internal Row ID • Every row have a internal row ID • invisible from Q4M table definition • monotonically increasing 64-bit integer • Used for detecting duplicates • Use two functions to skip duplicates • Data loss prevented by using queue_wait / queue_end Using Q4M

  46. queue_rowid() • Returns row ID of the OWNED row (if any) • Returns NULL if no row is OWNED • Call when retrieving data from source Using Q4M

  47. queue_set_srcid(src_tbl_id, mode, src_row_id) • Call before inserting a row to destination table • Checks if the row is already inserted into the table, and ignores next INSERT if true • Parameters: • src_tbl_id - id to determine source table (0〜63) • mode - "a" to drop duplicates, "w" to reset • src_row_id - row ID obtained from source table Using Q4M

  48. Pseudo Code • Relays data from src_tbl to dest_tbl while (true) { # wait for data SELECT queue_wait(src_tbl) => src_db; # read row and rowid row := (SELECT * FROM src_tbl => src_db); rowid := (SELECT queue_rowid() => src_db); # insert the row after setting srcid SELECT queue_set_srcid(src_tbl_id, 'a', rowid) => dest_db; INSERT INTO dest_tbl (row) => dest_db; } Using Q4M

  49. q4m-forward • Simple forwarder script • installed into mysql-dir/bin usage: q4m-forward [options] src_addrdest_addr example: % support-files/q4m-forward \ "dbi:mysql:database=db1;table=tbl1;user=foo;password=XXX" \ "dbi:mysql:database=db2;table=tbl2;host=bar;user=foo" options: --reset reset duplicate check info. --sender=idx slot no. used for checking duplicates (0..63, default: 0) --help Using Q4M

  50. Limitations and the Future of Q4M Using Q4M

More Related