
Infinispan, transactional key-value DataGrid and NoSQL database

Infinispan, transactional key-value DataGrid and NoSQL database. 11. April 2013. Alexander Petrov, Sr. Consultant at Inmeta Consulting. Current project: Skatteetaten Grid POC. Previous projects involving grid technologies:


Presentation Transcript


  1. Infinispan, transactional key-value DataGrid and NoSQL database • 11. April 2013 • Alexander Petrov

  2. Alexander Petrov • Sr. Consultant at Inmeta Consulting • Current project: Skatteetaten Grid POC • Previous projects involving grid technologies: • Mattilsynet food authority system • FrameSolution BPM framework, used in Lovisa National Court Authority (Norway) and Mattilsynet Food Authority • Other noteworthy projects • Coca Cola Basis ERP system for Coca Cola bottler factories • mPower Mobilitec: 300 million subscribers worldwide, delivering over 500,000 pieces of content every day

  3. Usage scenarios • Big data: databases are slow, memory is FAST! • Provides huge computing power • Tax calculation • Financial organizations • Government organizations use it for communication and data sharing between departments • Scientific computations • MMORPG games

  4. Agenda • General terminology relevant to distributed caching • Challenges of introducing distributed caching to an existing system • Metrics and tuning

  5. Distributed Caching - Concepts • Cache: JSR-107 • Java Data Grid: JSR-347 • In-memory data grid • Cluster • Distribution • Node: a member of a cluster • Transaction awareness • Colocation • Map / Reduce • Consistency

  6. Real World Use Case

  7. Typical J2EE backend

  8. Data access • Transaction scope • Locking / deadlocking • Flushing policies • Mixing the technology stack • Performance

  9. Legacy Cache

  10. Our end goal • Wow, we did it!

  11. Summary • Our custom cache is super fast, but its cache hit ratio is rather low. • Our custom cache tends to get dirty because updates to the shared data cannot be propagated. At the same time, the data regions are not fully separated. • Marshaling is a rather slow and heavy process. • We are facing a technological cocktail and we need to keep integrity.

  12. Replication • Write-through • Write-behind • Replication queue
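
The difference between write-through and write-behind can be sketched in plain Java. This is a minimal illustration, not Infinispan's implementation: the class `WriteBehindCache`, the `store` map standing in for a database or remote node, and the explicit `flush()` call are all assumptions made for the example.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Hypothetical illustration only; names are not part of any Infinispan API.
final class WriteBehindCache {
    private final Map<String, String> memory = new HashMap<>();
    private final Map<String, String> store;            // stands in for a database or remote node
    private final Queue<String> dirtyKeys = new ArrayDeque<>();
    private final boolean writeBehind;

    WriteBehindCache(Map<String, String> store, boolean writeBehind) {
        this.store = store;
        this.writeBehind = writeBehind;
    }

    void put(String key, String value) {
        memory.put(key, value);
        if (writeBehind) {
            dirtyKeys.add(key);       // write-behind: remember the key, defer the slow write
        } else {
            store.put(key, value);    // write-through: the store is updated before put returns
        }
    }

    String get(String key) {
        return memory.get(key);
    }

    // Drains the queue of pending writes; a real system would do this on a background thread.
    int flush() {
        int written = 0;
        String key;
        while ((key = dirtyKeys.poll()) != null) {
            store.put(key, memory.get(key));
            written++;
        }
        return written;
    }
}
```

With `writeBehind` set to `true`, a `put` returns before the store has seen the value, which is faster but leaves a window in which the store is stale. A queue that batches pending updates, as in `flush()` here, is the same idea a replication queue applies to cluster traffic.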

  13. Invalidation
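
As a rough sketch of the invalidation idea: a write on one node does not copy the new value to the other nodes, it only tells them to drop their stale copy, and they reload from a shared store on the next read. The class `InvalidationCluster`, its `Node` type, and the in-memory `sharedStore` are hypothetical names used for illustration only.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical single-process model of invalidation; no networking involved.
final class InvalidationCluster {
    private final Map<String, String> sharedStore = new HashMap<>();
    private final List<Node> nodes = new ArrayList<>();

    final class Node {
        private final Map<String, String> local = new HashMap<>();

        String get(String key) {
            // On a local miss, load from the shared store and keep a local copy.
            return local.computeIfAbsent(key, sharedStore::get);
        }

        void put(String key, String value) {
            sharedStore.put(key, value);
            local.put(key, value);
            for (Node other : nodes) {
                if (other != this) {
                    other.local.remove(key);   // invalidate: remove, do not replicate
                }
            }
        }

        boolean holdsLocally(String key) {
            return local.containsKey(key);
        }
    }

    Node addNode() {
        Node node = new Node();
        nodes.add(node);
        return node;
    }
}
```

The invalidation message carries only the key, so it stays small regardless of the value size; the cost moves to the next read on each invalidated node.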

  14. Distribution
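
In distribution mode each entry lives on a limited number of owner nodes rather than on every node. One common way to pick the owners is a hash ring, sketched below. The class `HashRing`, the `position` function built on `String.hashCode()`, and the node names are assumptions for the example; real grids typically use stronger hash functions and many ring positions per node.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical consistent-hashing sketch; not the hashing scheme of any specific product.
final class HashRing {
    private final TreeMap<Integer, String> ring = new TreeMap<>();

    void addNode(String node) {
        ring.put(position(node), node);
    }

    // Returns up to numOwners distinct nodes, walking clockwise from the key's ring position.
    List<String> owners(String key, int numOwners) {
        List<String> result = new ArrayList<>();
        if (ring.isEmpty()) {
            return result;
        }
        int wanted = Math.min(numOwners, ring.size());
        Integer pos = ring.ceilingKey(position(key));
        if (pos == null) {
            pos = ring.firstKey();          // wrap around the ring
        }
        while (result.size() < wanted) {
            result.add(ring.get(pos));
            pos = ring.higherKey(pos);
            if (pos == null) {
                pos = ring.firstKey();      // wrap around the ring
            }
        }
        return result;
    }

    private static int position(String s) {
        int h = s.hashCode();
        h ^= (h >>> 16);                    // mix high bits into low bits
        return h & 0x7fffffff;              // keep the position non-negative
    }
}
```

Every node computes the same owner list for a key without asking anyone, and adding a node only moves the keys that fall between it and its neighbour on the ring.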

  15. More terminology • Eviction • Least Recently Used • First In First Out • LIRS • Custom • Expiration • Invalidation
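
Eviction and expiration are easy to confuse, so here is a small sketch that does both: least-recently-used eviction bounds the number of entries, while expiration removes entries whose lifespan has passed. The class `LruExpiringCache` and the explicit `nowMillis` parameter (used instead of the system clock so the behaviour is easy to follow) are assumptions for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: LRU eviction via LinkedHashMap access order, plus a fixed lifespan.
final class LruExpiringCache<K, V> {
    private static final class Timed<T> {
        final T value;
        final long expiresAt;

        Timed(T value, long expiresAt) {
            this.value = value;
            this.expiresAt = expiresAt;
        }
    }

    private final long lifespanMillis;
    private final LinkedHashMap<K, Timed<V>> map;

    LruExpiringCache(final int maxEntries, long lifespanMillis) {
        this.lifespanMillis = lifespanMillis;
        // accessOrder = true keeps the least recently used entry at the head of the map.
        this.map = new LinkedHashMap<K, Timed<V>>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, Timed<V>> eldest) {
                return size() > maxEntries;   // eviction: drop the LRU entry when over capacity
            }
        };
    }

    void put(K key, V value, long nowMillis) {
        map.put(key, new Timed<>(value, nowMillis + lifespanMillis));
    }

    V get(K key, long nowMillis) {
        Timed<V> entry = map.get(key);
        if (entry == null) {
            return null;
        }
        if (nowMillis >= entry.expiresAt) {
            map.remove(key);                  // expiration: the entry outlived its lifespan
            return null;
        }
        return entry.value;
    }

    int size() {
        return map.size();
    }
}
```

Swapping the eviction policy for FIFO only requires passing `false` as the access-order flag, since the map then keeps insertion order.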

  16. Caching topologies – Mirrored Cache • Ref. data vs. transactional • Reference data: Good. Max 30000 reads/sec at 1k size • Transactional data: Good. Max 25000 writes/sec at 1k size

  17. Caching topologies – Replica Cache • Reference data: Good. 30000 reads/sec per server. Grows linearly by adding servers. • Transactional data: Not so good. Max 20000 writes/sec. Drops to 2500 if you add a 3rd server.

  18. Caching topologies – Partitioned Cache • Ref. data vs. transactional • Reference data: Good. Max 30000 reads/sec at 1k size • Transactional data: Good. Max 25000 writes/sec at 1k size

  19. Caching topologies – Partitioned Replica • Reference data (1 KB): Good. 30000 reads/sec per server. Grows linearly by adding servers. • Transactional data (1 KB): Good. 20000 writes/sec per server. Grows linearly by adding servers.

  20. How to define our topology • What is the size of our cluster? Reads vs. writes • Communication inside our grid • UDP, TCP • Synchronous vs. asynchronous • What about the transaction isolation? • Repeatable Read vs. Read Committed • What is the nature of our application? • Read-intensive data • CMS systems • Write-intensive data • Document management system

  21. Level 1 Cache / Near Cache • Level 1 cache is supported only in distribution mode • Level 1 cache might have a performance impact in certain systems

  22. Cache stores and loaders • Passivation • Activation • Hibernate

  23. Transactions, Isolation and Locking • Long-running transactions need to be avoided. • What is a long-running transaction? How long is actually long? • Read Committed vs. Repeatable Read

  24. Classic deadlock situation • TX1 wants to update A, B, C • TX2 wants to update C, B, A • A is locked by TX1 • C is locked by TX2
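
One common way to avoid the cycle on this slide is to make every transaction acquire its locks in the same global order, whatever order it wants to update the keys in. The sketch below shows that technique in plain Java; it is an illustration, not a description of how Infinispan handles deadlocks. The class `OrderedLocking` and its `updateAll` method are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch of deadlock avoidance through a fixed lock acquisition order.
final class OrderedLocking {
    private final Map<String, ReentrantLock> locks = new ConcurrentHashMap<>();

    // Locks all keys in sorted order, runs the update, then unlocks in reverse order.
    void updateAll(List<String> keys, Runnable update) {
        List<String> ordered = new ArrayList<>(keys);
        Collections.sort(ordered);               // A, B, C for both TX1 and TX2
        List<ReentrantLock> held = new ArrayList<>();
        try {
            for (String key : ordered) {
                ReentrantLock lock = locks.computeIfAbsent(key, k -> new ReentrantLock());
                lock.lock();
                held.add(lock);
            }
            update.run();
        } finally {
            for (int i = held.size() - 1; i >= 0; i--) {
                held.get(i).unlock();
            }
        }
    }
}
```

With this ordering TX2 asks for A first and simply waits until TX1 is done, so neither transaction can hold a lock the other needs while waiting for one the other holds.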

  25. Repeatable Read • What is returned? • TX1 • TX2
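
The question on this slide can be answered with a small model. Under Repeatable Read, a transaction that has read a key keeps seeing that first value even if another transaction commits a change in the meantime; under Read Committed it sees the newly committed value. The class `IsolationDemo`, its `Tx` type, and the `committed` map standing in for the cache are assumptions for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the two isolation levels; it ignores writes inside the transaction.
final class IsolationDemo {
    enum Isolation { READ_COMMITTED, REPEATABLE_READ }

    static final class Tx {
        private final Map<String, String> committed;      // shared, committed state
        private final Isolation isolation;
        private final Map<String, String> firstReads = new HashMap<>();

        Tx(Map<String, String> committed, Isolation isolation) {
            this.committed = committed;
            this.isolation = isolation;
        }

        String read(String key) {
            if (isolation == Isolation.READ_COMMITTED) {
                return committed.get(key);                 // always the latest committed value
            }
            if (!firstReads.containsKey(key)) {
                firstReads.put(key, committed.get(key));   // remember what was seen first
            }
            return firstReads.get(key);
        }
    }
}
```

If TX1 reads `k`, TX2 then commits a new value for `k`, and TX1 reads `k` again, the Repeatable Read transaction returns the old value and the Read Committed transaction returns the new one.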

  26. Cache statistics
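
The hit ratio, the figure most often read from cache statistics, is simply hits divided by total lookups. A minimal sketch of how such counters can be kept follows; the class `CountingCache` is a hypothetical name and is not how any particular product exposes its statistics.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: a map wrapper that counts hits and misses on reads.
final class CountingCache<K, V> {
    private final Map<K, V> data = new HashMap<>();
    private long hits;
    private long misses;

    void put(K key, V value) {
        data.put(key, value);
    }

    V get(K key) {
        V value = data.get(key);
        if (value == null) {
            misses++;
        } else {
            hits++;
        }
        return value;
    }

    // hits / (hits + misses); 0.0 when nothing has been read yet.
    double hitRatio() {
        long total = hits + misses;
        return total == 0 ? 0.0 : (double) hits / total;
    }
}
```

A low ratio means most reads still go to the slower backing system, so the cache is paying its memory cost without delivering much benefit.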

  27. Remoting statistics

  28. Locking statistics

  29. Marshaling data • Java serialization • Java externalization • Impact on performance • Generic domain.
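
To make the serialization versus externalization point concrete, the sketch below marshals the same two fields both ways using only the JDK. Default serialization writes field names and types into the stream; an `Externalizable` class writes only what its `writeExternal` method chooses. The classes `MarshalingDemo`, `PlainPerson`, and `CustomPerson` are hypothetical and exist only for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical comparison of default Java serialization and hand-written externalization.
final class MarshalingDemo {
    static final class PlainPerson implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        final int age;

        PlainPerson(String name, int age) {
            this.name = name;
            this.age = age;
        }
    }

    static final class CustomPerson implements Externalizable {
        String name;
        int age;

        public CustomPerson() { }                 // public no-arg constructor is required

        CustomPerson(String name, int age) {
            this.name = name;
            this.age = age;
        }

        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            out.writeUTF(name);                   // we decide exactly what goes on the wire
            out.writeInt(age);
        }

        @Override
        public void readExternal(ObjectInput in) throws IOException {
            name = in.readUTF();                  // must read in the same order as written
            age = in.readInt();
        }
    }

    static byte[] marshal(Object value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    static Object unmarshal(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Comparing `marshal(new PlainPerson("Ada", 36)).length` with `marshal(new CustomPerson("Ada", 36)).length` shows the externalized form is smaller, at the price of hand-maintained read and write code for every domain class.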

  30. Real World Use Case

  31. Data access • Transaction scope • Locking / deadlocking • Flushing policies • Mixing the technology stack • Performance

  32. Our end goal • Wow, we did it!

  33. The End • Thank you for your attention

  34. Used sources • http://www.alachisoft.com/ncache/caching-topology.html • http://www.infoq.com/news/2011/10/java-data-grid • https://github.com/datagrids/spec/wiki • http://www.jboss.org/infinispan/documentation • http://code.google.com/p/thrift-protobuf-compare/wiki/Benchmarking
