HBase
OUTLINE • Basic • Data Model • Implementation • Architecture of HBase • HBase Server • HRegionServer
Basic • HBase directly uses or subclasses the parent Hadoop implementation
Basic • (Diagram: four Linux machines)
Basic • Database problems: • Growth of data • Complexity of installation and maintenance • Solution: Relational Database Management System (RDBMS) • Multi-RDBMS problems (across nodes): • JOIN • Not efficient • Rebalancing • Solution: NoSQL database
Basic • NoSQL database: • Distributed • Scalable • Easy to use • (e.g., put, get, alter, etc.)
Basic • List of NoSQL: • Open source • HBase (Yahoo!) • Cassandra (Facebook) • Commercial • SimpleDB (Amazon) • BigTable (Google)
Basic • HBase: • Hadoop's database • Version 0.20.6 released • Used with MapReduce
OUTLINE • Basic • Data Model • Implementation • Architecture of HBase • HBase Server • HRegionServer
Table • Members: row, column, timestamp
Table • Add <family, label> • Add column
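The table model on these slides (row, column, timestamp, with a column addressed by family plus label) can be pictured as a nested map. The following is a minimal Python sketch of that idea, not HBase code; the `put` and `get` helper names, the row key, and the sample values are assumptions made for illustration.

```python
from collections import defaultdict

# Illustrative in-memory model only:
# table[row_key][(family, label)][timestamp] = value
table = defaultdict(lambda: defaultdict(dict))

def put(row, family, label, timestamp, value):
    """Store one versioned cell."""
    table[row][(family, label)][timestamp] = value

def get(row, family, label):
    """Return the value with the newest timestamp, or None if the cell is absent."""
    versions = table.get(row, {}).get((family, label), {})
    if not versions:
        return None
    return versions[max(versions)]

# Two versions of the same cell; the read returns the newer one.
put("com.example.www", "cache", "html", 1, "<html>v1</html>")
put("com.example.www", "cache", "html", 2, "<html>v2</html>")
latest = get("com.example.www", "cache", "html")
```

Keeping the timestamp as the innermost key is what lets one cell hold several versions at once.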
Region • Expressed as: Region(start row key, end row key> & identifier • Example: Region1(com.yahoo.new.tw, com.def.www>, ID • (Diagram: region1, region2)
Sort • Rows are sorted by row key • Ordering is byte-wise • Labels are added under a column family
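Byte-ordered sorting can surprise readers who expect numeric or case-insensitive order. A small Python sketch, using made-up row keys, shows what comparing keys as raw bytes does:

```python
# Hypothetical row keys, chosen to show the effect of byte-wise comparison.
row_keys = ["row10", "row2", "Row1", "row1"]

# Encode to bytes and sort; Python compares bytes objects lexicographically.
sorted_keys = sorted(key.encode("utf-8") for key in row_keys)

for key in sorted_keys:
    print(key)
```

Uppercase letters sort before lowercase ones, and "row10" sorts before "row2" because the comparison runs byte by byte. Zero-padding numeric parts of a key (for example "row02") is a common way to get numeric order.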
Locking • (Diagram: User1, User2, User3, and User4 each issue an update)
OUTLINE • Basic • Data Model • Implementation • Architecture of HBase • HBase Server • HRegionServer
Architecture of HBase • (Diagram: Client, ZooKeeper, HM, and HR nodes running on top of HDFS, with NN and DN nodes in the cluster) • NN: NameNode • DN: DataNode • HM: HMaster • HR: HRegion
Rebalance • The regions on a single host grow • A region that crosses the size threshold is split at a row into two new regions of approximately equal size • Splitting continues until no region exceeds the threshold • Automatic
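The split rule above can be sketched in a few lines of Python. This is a toy model, not HBase's implementation: it measures region size by row count instead of bytes, and the function names and threshold value are assumptions made for illustration.

```python
def split_region(rows):
    """Split a sorted list of row keys at its middle row into two daughters."""
    mid = len(rows) // 2
    return rows[:mid], rows[mid:]

def rebalance(regions, threshold):
    """Keep splitting until no region holds more than `threshold` rows."""
    if threshold < 1:
        raise ValueError("threshold must be at least 1")
    result = []
    pending = list(regions)
    while pending:
        region = pending.pop()
        if len(region) > threshold:
            # Over the threshold: replace the region with its two daughters.
            pending.extend(split_region(region))
        else:
            result.append(region)
    return sorted(result)

# One region of ten rows, with a threshold of three rows per region.
rows = [f"row{i:02d}" for i in range(10)]
regions = rebalance([rows], threshold=3)
```

Because each split happens at a row boundary, every row key ends up in exactly one daughter region and the key ranges stay contiguous.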
HBase Master • The master node is lightly loaded • Assigns the replacement daughter regions • Recovers from RegionServer failures
RegionServer • Carries zero or more regions • Serves client read/write/scan requests • Random access • Splits regions automatically • Sends heartbeats to the Master
RegionInfo • Region metadata • The current list, state, recent history, and location of all regions afloat on the cluster • {NAME => 'docs', FAMILIES => [{NAME => 'cache', COMPRESSION => 'NONE', VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'false'}]}
HBase in operation • With a region size of 256 MB and each catalog row 1 KB in size: • -ROOT- addresses 2.6 x 10^5 .META. regions • .META. addresses 6.9 x 10^10 user regions • 1.8 x 10^19 (2^64) bytes of user data
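The figures on this slide follow from simple arithmetic. The Python sketch below reproduces them, assuming that 256 MB is the size of one region and that each catalog row (1 KB) describes exactly one region; the variable names are illustrative.

```python
REGION_SIZE = 256 * 2**20   # 256 MB per region (assumed to be the region size)
ROW_SIZE = 2**10            # 1 KB per catalog row

# One catalog region can hold this many rows, i.e. address this many regions.
rows_per_region = REGION_SIZE // ROW_SIZE

# -ROOT- is a single region, so it addresses this many .META. regions.
meta_regions = rows_per_region

# Each .META. region addresses the same number of user regions.
user_regions = meta_regions * rows_per_region

# Total addressable user data.
user_bytes = user_regions * REGION_SIZE

print(f"{meta_regions:.1e} META regions")
print(f"{user_regions:.1e} user regions")
print(f"{user_bytes:.1e} bytes of user data")
```

In powers of two: 2^28 / 2^10 = 2^18 rows per region, 2^18 x 2^18 = 2^36 user regions, and 2^36 x 2^28 = 2^64 bytes.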
HBase in operation • (Diagram: HBase Client, ZooKeeper, HM, R nodes, NN, and DN nodes; the client's request consults ROOT, then META, then the user region) • Read requests • Step 1: find the location of -ROOT- • Step 2: find the location of the .META. region • Step 3: find the user region • NN: NameNode • DN: DataNode • HM: HMaster • HR: RegionServer
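The three lookup steps can be sketched as a chain of table lookups. This Python model is a simplification, not the HBase client: all server names, keys, and the `locate` function are made up, and treating a region's start key as inclusive is an assumption of the sketch.

```python
import bisect

# All names below are hypothetical.
root_location = "rs1"                       # Step 1: the server hosting -ROOT-
root_table = {".META.,region1": "rs2"}      # -ROOT- maps .META. regions to servers
# .META. rows: (start row key of a user region, server hosting it), sorted by start key.
meta_table = [("", "rs3"), ("m", "rs4")]

def locate(row_key):
    """Return the servers visited while resolving row_key, in order."""
    hops = [root_location]                        # Step 1: find -ROOT-
    meta_server = root_table[".META.,region1"]    # Step 2: find the .META. region
    hops.append(meta_server)
    # Step 3: pick the user region whose start key is the largest one <= row_key.
    start_keys = [start for start, _ in meta_table]
    index = bisect.bisect_right(start_keys, row_key) - 1
    hops.append(meta_table[index][1])
    return hops

apple_hops = locate("apple")
zebra_hops = locate("zebra")
```

Keeping the .META. rows sorted by start key is what makes the binary search in step 3 possible.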
HBase in operation • (Diagram: HBase Client, ZooKeeper, HM, R nodes, NN, and DN nodes; the client interacts with the RegionServer directly) • Read requests • Client cache: saves the location information of ROOT, META, and the user region • NN: NameNode • DN: DataNode • HM: HMaster • HR: RegionServer
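The effect of the client cache can be shown with a small wrapper that remembers locations it has already resolved. This Python sketch is illustrative only; the `CachingClient` class, the region name, and the stand-in lookup function are assumptions, not part of the HBase client API.

```python
class CachingClient:
    """Remembers region locations so repeated requests skip the catalog lookup."""

    def __init__(self, lookup):
        self._lookup = lookup   # slow path: consult ROOT and META
        self._cache = {}        # region name -> server
        self.lookups = 0        # how many times the slow path ran

    def locate(self, region_name):
        if region_name not in self._cache:
            self.lookups += 1
            self._cache[region_name] = self._lookup(region_name)
        return self._cache[region_name]

def slow_lookup(region_name):
    """Stand-in for the three-step catalog lookup; returns a made-up server."""
    return "rs3"

client = CachingClient(slow_lookup)
first = client.locate("docs,region1")
second = client.locate("docs,region1")   # served from the cache
```

A real client would also need to drop a cached entry when the region moves or splits; that case is left out here.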
HBase in operation • (Diagram: the HBase Client interacts with a RegionServer; the RegionServer holds an HLog and its Regions; each Region contains HStores; each HStore has a MemStore and HFiles) • State of the RegionServer for a table
HBase in operation • (Diagram: HBase Client, RegionServer with an HLog and Regions; each Region contains HStores; each HStore has a MemStore and HFiles) • Client request to save data in a table
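How a write might flow through the components in the diagram can be sketched as follows. This is a toy Python model under stated assumptions: it collapses the RegionServer, Region, and HStore into one `Store` class, logs each write to the HLog before updating the MemStore, and flushes after a fixed number of rows instead of a byte size. None of the names are HBase APIs.

```python
class Store:
    """Toy write path: HLog first, then MemStore, flushed to sorted HFiles."""

    def __init__(self, flush_threshold=2):
        self.hlog = []        # append-only log of every write
        self.memstore = {}    # recent writes held in memory
        self.hfiles = []      # flushed, sorted, immutable snapshots
        self.flush_threshold = flush_threshold

    def put(self, row, value):
        self.hlog.append((row, value))   # log first so the write can be replayed
        self.memstore[row] = value
        if len(self.memstore) >= self.flush_threshold:
            # Flush: write the MemStore out as a sorted file and start fresh.
            self.hfiles.append(sorted(self.memstore.items()))
            self.memstore = {}

    def get(self, row):
        if row in self.memstore:
            return self.memstore[row]
        for hfile in reversed(self.hfiles):   # newest file first
            for key, value in hfile:
                if key == row:
                    return value
        return None

store = Store(flush_threshold=2)
store.put("b", "1")
store.put("a", "2")   # reaches the threshold and triggers a flush
store.put("c", "3")   # stays in the MemStore
```

Reads check the MemStore before the HFiles so that the most recent value for a row wins.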
Characteristics of HBase • Fault tolerance • Batch processing • Automatic partitioning • Scales linearly with new nodes