Inner Architecture of a Social Networking System

Petr Kunc, Jaroslav Škrabálek, Tomáš Pitner


Presentation Transcript


  1. Inner Architecture of a Social Networking System • Petr Kunc, Jaroslav Škrabálek, Tomáš Pitner

  2. Who am I? • Master's student at FI MU • Member of LaSArIS • Webtops • Modern web applications • Cloud (and distributed) solutions • First-time speaker at a conference

  3. Social network systems • Hundreds of millions of users => advanced software architecture and technologies • High performance • Scalability • Billions of rows

  4. Table of contents • What and why? • Takeplace • Which way? • Hadoop • HBase • Memcached • How? • Architecture and design • Was it worth it? • Testing

  5. Takeplace

  6. Takeplace and Social Networking • Web-based service facilitating the organization of events, based on meeting, sharing and communication • Emphasis on social and interpersonal interaction • Easy tool for commenting on conferences (feedback) • Professional user network: creating relations between people from the academic and professional worlds with common interests • Analysis and statistics • “To behave like Facebook, with relations like Twitter, and to be used as LinkedIn.”

  7. Functional requirements • Entities can create asymmetric relations • Posts • Walls and news feed • Comments and “like”

  8. Technology requirements • Linux and Cloud • Data-oriented application • High throughput • Heavy loads • Concurrent requests • Caching tool

  9. Relational databases • Fixed schema, ACID, indexes, joins • Problems • Scaling up dataset size • Read/write concurrency • Typical use of MySQL: Production => Memcached (losing ACID) => Costly server => Denormalizing => “materialize” most common queries => drop triggers, indexes • (compromises or expensive)

  10. HBase • Inspired by Google Bigtable • Regions • 4 dimensions • “multidimensional sorted persistent distributed key-value map” • Keys & values = arrays of bytes • Row, CF (column family), Column & Version

  11. Example

          {
            "aa" : {
              "cf" : {
                "c1" : data,
                "c2" : data
              },
              "cf2" : {
                "anyByteArray" : true
              }
            },
            "ab" : { … }
          }
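
The nested structure on this slide can be imitated with ordinary sorted maps. The following is a minimal in-memory sketch, not HBase's actual implementation or client API: the class name `SortedMapSketch` and its methods are hypothetical, and `String` keys stand in for the byte arrays HBase really uses. It only illustrates the four dimensions (row, column family, column, version) and the fact that rows are kept in sorted key order.

```java
import java.nio.charset.StandardCharsets;
import java.util.Comparator;
import java.util.NavigableMap;
import java.util.TreeMap;

// Illustrative only: String keys stand in for HBase's byte-array keys.
final class SortedMapSketch {
    // row -> column family -> column -> version (timestamp) -> value
    private final NavigableMap<String, NavigableMap<String,
            NavigableMap<String, NavigableMap<Long, byte[]>>>> rows = new TreeMap<>();

    /** Stores one version of a cell; intermediate maps are created on demand. */
    public void put(String row, String family, String column, long version, byte[] value) {
        rows.computeIfAbsent(row, r -> new TreeMap<>())
            .computeIfAbsent(family, f -> new TreeMap<>())
            // Versions are ordered newest first.
            .computeIfAbsent(column, c -> new TreeMap<>(Comparator.reverseOrder()))
            .put(version, value);
    }

    /** Returns the newest version of a cell, or null if the cell does not exist. */
    public byte[] getLatest(String row, String family, String column) {
        NavigableMap<String, NavigableMap<String, NavigableMap<Long, byte[]>>> families = rows.get(row);
        if (families == null) return null;
        NavigableMap<String, NavigableMap<Long, byte[]>> columns = families.get(family);
        if (columns == null) return null;
        NavigableMap<Long, byte[]> versions = columns.get(column);
        return versions == null ? null : versions.firstEntry().getValue();
    }

    /** Smallest row key in sort order, or null if the table is empty. */
    public String firstRowKey() {
        return rows.isEmpty() ? null : rows.firstKey();
    }

    private static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        SortedMapSketch table = new SortedMapSketch();
        table.put("ab", "cf", "c1", 1L, bytes("other row"));
        table.put("aa", "cf", "c1", 1L, bytes("old"));
        table.put("aa", "cf", "c1", 2L, bytes("new"));
        System.out.println(new String(table.getLatest("aa", "cf", "c1"), StandardCharsets.UTF_8));
        System.out.println(table.firstRowKey());
    }
}
```

Because the outer map is sorted, neighbouring row keys sit next to each other, which is what the later slide on row IDs relies on.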

  12. Hadoop • SW framework: backbone of the distributed environment • MapReduce • HDFS

  13. HBase • No real indexes • Automatic partitioning • Scales linearly and automatically • Parallel • Cheap • Not for everyone • Write once, read many • Built on top of Hadoop

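  14. Memcached • Distributed cache • Typical usage (cache-aside read):

          public Data getData(String query) {
              Data data = memcached.get(query);    // try the cache first
              if (data == null) {                  // cache miss
                  data = database.get(query);      // fall back to the database
                  memcached.set(query, data);      // populate the cache for later reads
              }
              return data;
          }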

  15. Architecture

  16. Architecture (2) • To be used in any system • Interface of services (REST, SOAP, …) • User tables • Services: Follow, Wall, Like and Discussion • Security

  17. Architecture (3) User ID transformation

  18. Data! • Three tables • Entities • Followers, Following, Blocked, Count, News • Walls • Info, text, likes • Discussions (similar to Walls)
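
As a rough illustration of the asymmetric relations stored per entity (Followers, Following, Blocked, Count), here is a minimal in-memory sketch. It is not the authors' schema or code: the class `FollowSketch`, its method names, and the rule that blocking removes an existing follow are all assumptions made for the example.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical model of the per-entity relation data; plain maps stand in for table rows.
final class FollowSketch {
    private final Map<String, Set<String>> following = new HashMap<>();
    private final Map<String, Set<String>> followers = new HashMap<>();
    private final Map<String, Set<String>> blocked = new HashMap<>();

    /** Creates a one-way relation: follower -> target. Returns false if it is refused. */
    public boolean follow(String follower, String target) {
        if (follower.equals(target)) return false;
        if (blocked.getOrDefault(target, Collections.emptySet()).contains(follower)) return false;
        following.computeIfAbsent(follower, k -> new LinkedHashSet<>()).add(target);
        followers.computeIfAbsent(target, k -> new LinkedHashSet<>()).add(follower);
        return true;
    }

    /** Assumed behaviour: blocking also drops an existing follow from the blocked entity. */
    public void block(String owner, String unwanted) {
        blocked.computeIfAbsent(owner, k -> new LinkedHashSet<>()).add(unwanted);
        Set<String> ownersFollowers = followers.get(owner);
        if (ownersFollowers != null) ownersFollowers.remove(unwanted);
        Set<String> unwantedFollowing = following.get(unwanted);
        if (unwantedFollowing != null) unwantedFollowing.remove(owner);
    }

    public boolean isFollowing(String follower, String target) {
        return following.getOrDefault(follower, Collections.emptySet()).contains(target);
    }

    /** Corresponds loosely to the "Count" entry on the slide. */
    public int followerCount(String entity) {
        return followers.getOrDefault(entity, Collections.emptySet()).size();
    }

    public static void main(String[] args) {
        FollowSketch graph = new FollowSketch();
        graph.follow("alice", "bob");
        System.out.println(graph.isFollowing("alice", "bob"));
        System.out.println(graph.isFollowing("bob", "alice"));
        System.out.println(graph.followerCount("bob"));
    }
}
```

Keeping both the Followers and the Following sets duplicates data, but it lets either direction be read with a single lookup, which matches the deck's preference for denormalized, read-friendly storage.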

  19. Storing data • Row IDs! Performance! • Lexically • Sequence scanner • UID (constant length) • yyyymmddhhmmssSSS • Inverted bytes -> newest to oldest
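
One way to read this slide is sketched below: a row key is a constant-length UID followed by the `yyyymmddhhmmssSSS` timestamp with every byte inverted, so that a plain lexicographic scan returns each user's newest rows first. This is an assumption-laden illustration, not the authors' code: the class `RowKeySketch`, the 8-digit UID width, and the use of non-negative numeric user IDs are all hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Arrays;
import java.util.Locale;

final class RowKeySketch {
    static final int UID_LENGTH = 8; // assumed width; IDs with more digits would break the layout
    private static final DateTimeFormatter STAMP = DateTimeFormatter.ofPattern("yyyyMMddHHmmssSSS");

    /** Left-pads a non-negative numeric user ID with zeros so every UID has the same length. */
    static String fixedUid(long userId) {
        return String.format(Locale.ROOT, "%0" + UID_LENGTH + "d", userId);
    }

    /** Row key = fixed-length UID + bitwise-inverted timestamp digits. */
    static byte[] rowKey(long userId, LocalDateTime postedAt) {
        byte[] uid = fixedUid(userId).getBytes(StandardCharsets.US_ASCII);
        byte[] stamp = STAMP.format(postedAt).getBytes(StandardCharsets.US_ASCII);
        byte[] key = Arrays.copyOf(uid, uid.length + stamp.length);
        for (int i = 0; i < stamp.length; i++) {
            // Inverting each byte reverses the sort order: a later timestamp gives a smaller key.
            key[uid.length + i] = (byte) ~stamp[i];
        }
        return key;
    }

    /** Unsigned lexicographic byte comparison, the ordering HBase applies to row keys. */
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (diff != 0) return diff;
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        byte[] older = rowKey(42, LocalDateTime.of(2012, 1, 1, 10, 0, 0));
        byte[] newer = rowKey(42, LocalDateTime.of(2012, 1, 1, 10, 0, 1));
        // A negative result means the newer post sorts before the older one.
        System.out.println(compare(newer, older) < 0);
    }
}
```

The constant-length UID prefix matters because it keeps all rows of one user contiguous, so a sequential scanner can start at the UID and read forward without touching other users' rows.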

  20. News feed • One by one (slow) • OR • Store news at each profile (great redundancy) • MEMCACHED! • Post put in DB => search followers => store minimized in Memcached => links to news feed => 1 normal query & 1 batch query to Memcached • TTL (LRU)
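
The write and read paths on this slide might look roughly like the sketch below. It is one interpretation under stated assumptions, not the authors' implementation: plain in-memory maps stand in for HBase and Memcached, the class `NewsFeedSketch` and its methods are hypothetical, and "minimized" is assumed to mean a truncated preview of the post.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class NewsFeedSketch {
    private final Map<String, String> database = new HashMap<>();          // stand-in for the DB: postId -> full post
    private final Map<String, Set<String>> followers = new HashMap<>();    // author -> followers
    private final Map<String, String> cachedPosts = new HashMap<>();       // stand-in for Memcached: postId -> preview
    private final Map<String, List<String>> cachedFeeds = new HashMap<>(); // stand-in for Memcached: user -> post IDs

    public void follow(String follower, String author) {
        followers.computeIfAbsent(author, a -> new LinkedHashSet<>()).add(follower);
    }

    /** Write path: DB first, then a minimized copy in the cache, then one link per follower. */
    public void publish(String author, String postId, String text) {
        database.put(postId, text);
        cachedPosts.put(postId, minimize(text));
        for (String follower : followers.getOrDefault(author, Collections.emptySet())) {
            // Only the post ID is stored per follower, so the post body is not duplicated.
            cachedFeeds.computeIfAbsent(follower, u -> new ArrayList<>()).add(0, postId);
        }
    }

    /** Read path: one lookup for the list of links, then one batch lookup for the posts. */
    public List<String> readFeed(String user) {
        List<String> ids = cachedFeeds.getOrDefault(user, Collections.emptyList());
        List<String> posts = new ArrayList<>();
        for (String id : ids) { // with a real client this loop would be a single multi-get
            String preview = cachedPosts.get(id);
            if (preview == null && database.containsKey(id)) {
                // The cached entry may have expired (TTL) or been evicted (LRU): reload and re-cache.
                preview = minimize(database.get(id));
                cachedPosts.put(id, preview);
            }
            if (preview != null) posts.add(preview);
        }
        return posts;
    }

    /** Assumed meaning of "minimized": keep at most the first 40 characters. */
    static String minimize(String text) {
        return text.length() <= 40 ? text : text.substring(0, 40) + "...";
    }

    public static void main(String[] args) {
        NewsFeedSketch feed = new NewsFeedSketch();
        feed.follow("bob", "alice");
        feed.publish("alice", "p1", "Hello from the conference");
        feed.publish("alice", "p2", "Slides are online");
        System.out.println(feed.readFeed("bob"));
    }
}
```

Storing links rather than full posts in each feed is the compromise between the two options listed first on the slide: reads stay cheap, and the redundancy is limited to short post IDs.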

  21. Conclusion • Pros • High volume data distribution • Scalability • High throughput • Heavy data load (write once, read many) • Cons • Losing relations, indexes, triggers, … • Responsibility for consistent data • Still not sure how it will behave when deployed in production
