Sam Ghods VP Technology



Sam Ghods, VP Technology. The simplest way for businesses to share and access data, anywhere. 12 months ago: 400M files, 40M folders, one MySQL database. Need to scale! NoSQL! “NoSQL” goodies: easy to scale, just add machines, sharding handled by the database.

Presentation Transcript

Sam Ghods

VP Technology


12 months ago…

400M files

40M folders

One MySQL database

“NoSQL” goodies
  • Easy to scale
    • Just add machines!
    • Sharding handled by the database
    • Linearly scales, shared-nothing, no serious SPOF
  • Fast, fairly simple CRUD operations
  • Schema-less

If you use a NoSQL store, but need any advanced features in your data store, you have to rebuild them from scratch yourself.

Inter-Row Consistency: File trees must remain consistent
  • Folder A
    • Test File
    • Test File
  • Solution: unique index
  • Solution: lock folder A
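The unique-index solution above can be sketched roughly as follows. This is a minimal illustration and not the schema from the talk: the table and column names (`files`, `folder_id`, `name`) and the helper `add_file` are hypothetical, and Python's built-in `sqlite3` stands in for MySQL so the sketch runs without a server. The idea carries over: a unique index on (folder, name) lets the database itself reject the second "Test File" in Folder A.

```python
import sqlite3


def make_db():
    # In-memory SQLite database, used here as a stand-in for MySQL (assumption).
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE files ("
        "id INTEGER PRIMARY KEY, folder_id INTEGER NOT NULL, name TEXT NOT NULL)"
    )
    # The unique index rejects duplicate names within the same folder.
    conn.execute("CREATE UNIQUE INDEX idx_folder_name ON files (folder_id, name)")
    return conn


def add_file(conn, folder_id, name):
    """Insert a file; return False if that name already exists in the folder."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO files (folder_id, name) VALUES (?, ?)",
                (folder_id, name),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

With a store that has no unique indexes, the application would have to check for the name and insert it as two separate steps, which is where the locking alternative comes in.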
Inter-Row Consistency: Modify data structure and log event
  • Folder A
    • Test File 2
  • Rename event (must be logged together with the rename)
  • Solution: use transactions
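A transaction that covers both the rename and the event log could look like the sketch below. It is an illustration under assumptions, not code from the talk: the `files` and `events` tables, the `rename_file` helper, and the `fail_before_log` switch (which simulates a crash between the two writes) are all hypothetical, and `sqlite3` again stands in for MySQL.

```python
import sqlite3


def make_db():
    # In-memory SQLite database as a stand-in for MySQL (assumption).
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE files (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE events ("
        "id INTEGER PRIMARY KEY, file_id INTEGER NOT NULL, kind TEXT NOT NULL)"
    )
    conn.execute("INSERT INTO files (id, name) VALUES (1, 'Test File')")
    conn.commit()
    return conn


def rename_file(conn, file_id, new_name, fail_before_log=False):
    """Rename a file and log the rename event atomically."""
    try:
        with conn:  # one transaction: both writes commit, or neither does
            conn.execute(
                "UPDATE files SET name = ? WHERE id = ?", (new_name, file_id)
            )
            if fail_before_log:
                # Hypothetical failure injected between the two writes.
                raise RuntimeError("simulated crash between the two writes")
            conn.execute(
                "INSERT INTO events (file_id, kind) VALUES (?, 'rename')",
                (file_id,),
            )
        return True
    except RuntimeError:
        return False
```

If the simulated failure fires, the rename is rolled back along with the missing log entry, so the file tree and the event log never disagree.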
Inter-Row Consistency: Denormalizations
  • Folder A
    • Test File 1 (delete)
  • Denormalized copy: this must be deleted too
  • Solution: transactions
Indexes
  • Indexes are way more awesome than people give them credit for
    • Guaranteed to be consistent
    • Extremely fast
    • Data locality – Only access and pull the data you need
    • No maintenance required except initial ALTER cost
  • SELECT * FROM files ORDER BY name (or updated time, or size, etc.)
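To make the sorted-listing point concrete, here is a small sketch of an index serving an ORDER BY query. The schema, the index name `idx_folder_name`, and the function `sorted_listing_plan` are assumptions made for illustration, and SQLite's `EXPLAIN QUERY PLAN` is used in place of MySQL's `EXPLAIN`, so the plan text will differ from what MySQL prints.

```python
import sqlite3


def sorted_listing_plan():
    # In-memory SQLite database as a stand-in for MySQL (assumption).
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE files ("
        "id INTEGER PRIMARY KEY, folder_id INTEGER, name TEXT, size INTEGER)"
    )
    # Composite index: find a folder's files and read them already sorted by name.
    conn.execute("CREATE INDEX idx_folder_name ON files (folder_id, name)")
    conn.executemany(
        "INSERT INTO files (folder_id, name, size) VALUES (?, ?, ?)",
        [(1, "b.txt", 20), (1, "a.txt", 10), (2, "c.txt", 30)],
    )
    query = "SELECT name FROM files WHERE folder_id = ? ORDER BY name"
    names = [row[0] for row in conn.execute(query, (1,))]
    # Flatten the query plan rows into one string for inspection.
    plan = " ".join(
        str(col)
        for row in conn.execute("EXPLAIN QUERY PLAN " + query, (1,))
        for col in row
    )
    return names, plan
```

Calling `sorted_listing_plan()` returns the names in sorted order along with a plan string that mentions `idx_folder_name`, which suggests the database reads the rows from the index rather than sorting them afterwards. Sorting by updated time or size would need its own index.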
Tools
  • How do you know what’s happening in your data store?
    • SHOW FULL PROCESSLIST
    • innotop
  • Benchmarking tools
    • mysqlslap
  • pt-query-digest
    • github.com/box/anemometer
Maturity/Reliability
  • Biggest companies in the world have been using MySQL for primary data storage for over a decade
    • Facebook, Google, Twitter, every other company ever
  • When you’re dealing with the crown jewels of your company, you can’t experiment
HBase
  • Currently using it as a massive event-propagation store (which can be recreated from MySQL data)
  • Started a 3-person task force to learn and productionalize it
  • Considering moving more to it in the future, but we likely need a few more years of production experience
Final Thoughts
  • Don’t choose a database just because “it scales”
  • “Wade, don’t jump into new technologies.”
  • If you go with new technology, be aware that crazy things might happen
  • Make sure you’re not rebuilding MySQL

Hiring!

sam@box.com