
P2P Control System based on Map/Reduce

P2P Control System based on Map/Reduce. Youngil Kim Awalin Sopan Sonia Ng Zeng. Outline. Introduction Concept of the Project System architecture Implementation – HDFS Implementation – System Analysis System Information Logger (SIL) System Information Gatherer (SIG) Map/Reduce


Presentation Transcript


  1. P2P Control System based on Map/Reduce Youngil Kim Awalin Sopan Sonia Ng Zeng

  2. Outline • Introduction • Concept of the Project • System architecture • Implementation – HDFS • Implementation – System Analysis • System Information Logger (SIL) • System Information Gatherer (SIG) • Map/Reduce • Implementation – Visualization • Implementation – P2P Application • Demo

  3. Introduction • How can we collect system information from many nodes? • With many nodes, it is hard to track which one has a problem • But… HDFS and Map/Reduce make it easy! • Gather each node's system information into HDFS • Analyze the system information using Map/Reduce • A kind of network management system, like HP’s OpenView

  4. Concept of the Project • A tool that gives an overview of the nodes in the P2P network • Preserves the decentralized nature of P2P • Can be run on any computer, inside or outside the P2P network, so the computer running the tool is not necessarily the “master” • If the tool is not running, the P2P network remains intact • Still, one can control the P2P network from the tool • The tool provides an interface for both overview and control • Therefore, the user does not need to be a networking expert to work with the system

  5. System Architecture • [Architecture diagram. Labels: four P2P app. nodes, each with p2p Local storage; P2P Network]

  6. System Architecture • [Architecture diagram. Labels: four P2P app. nodes, each marked as a Hadoop Slave with a Sys Info Logger and p2p Local storage; System Info Gatherer (Hadoop Master); HDFS; P2P Network]

  7. System Architecture • [Architecture diagram. Labels: as on the previous slide, plus System Information, System Control, and System Manager (Visualization)]

  8. Implementation – P2P Application • Implemented a minimal P2P application to show how our tool works • Demonstrates how to control the application or system on each node from the visualization • Has STOP/RESUME operations • Functions • Response to “QUERY” → report active/inactive status (overview) • Response to “CONTROL” → change node status based on the control argument (active/inactive)
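
The QUERY/CONTROL behaviour described on this slide can be sketched as follows. This is only an illustration, not the authors' code: the class name `P2PNode`, the space-separated message format, and the reply strings are all assumptions, since the slides do not give the wire format.

```python
class P2PNode:
    """Hypothetical minimal node: answers QUERY and obeys CONTROL."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.active = True  # assumed initial state: nodes start active

    def status(self):
        # Reply format is an assumption: "<node id> <active|inactive>"
        return f"{self.node_id} {'active' if self.active else 'inactive'}"

    def handle(self, message):
        # Assumed message format: "QUERY" or "CONTROL <active|inactive>"
        command, _, argument = message.strip().partition(" ")
        if command == "QUERY":
            return self.status()
        if command == "CONTROL":
            if argument == "active":      # RESUME
                self.active = True
            elif argument == "inactive":  # STOP
                self.active = False
            else:
                return "ERROR unknown control argument"
            return self.status()
        return "ERROR unknown command"
```

In this sketch a CONTROL message returns the new status, so the caller can confirm the change without sending a second QUERY.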

  9. Implementation - HDFS • Hadoop for DFS & Map/Reduce Framework • We use the bug cluster • Master: brood00 • Slaves: Currently tested with 5 nodes (bug51 ~ bug55) • Uses each node's local storage • Uses the “/tmp” directory because the home directory is not local storage but an NFS volume • Network Ports: • hdfs (9000), job tracker (9001) • Namenode Interface (50070), JobTracker Interface (50030)

  10. Implementation - System Analysis

  11. System Information Logger (SIL) • mr_syslog.py • Implemented in Python • Saves information in both local storage and HDFS • Gathers information every 10 secs • Creates log files named by time • Information for each node is saved in the following format • < 20110501_2252_bug51.log > • bug51 1304304720: mem(75.50), cpu(1.00), disk(10.00) • bug51 1304304724: mem(75.50), cpu(1.50), disk(10.00) • bug51 1304304727: mem(75.51), cpu(0.40), disk(10.00) • bug51 1304304729: mem(75.51), cpu(0.50), disk(10.00) • bug51 1304304732: mem(75.50), cpu(0.50), disk(10.00) • bug51 1304304734: mem(75.50), cpu(0.40), disk(10.00)
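
A minimal sketch of writing and reading one record in the log format shown on this slide. The helper names `format_record` and `parse_record` are hypothetical and are not taken from mr_syslog.py; only the line layout comes from the sample log.

```python
import re

# Pattern mirrors the sample lines on the slide:
#   bug51 1304304720: mem(75.50), cpu(1.00), disk(10.00)
LINE_RE = re.compile(
    r"^(?P<node>\S+) (?P<ts>\d+): "
    r"mem\((?P<mem>[\d.]+)\), cpu\((?P<cpu>[\d.]+)\), disk\((?P<disk>[\d.]+)\)$"
)


def format_record(node, ts, mem, cpu, disk):
    """Render one log line; ts is a Unix timestamp in seconds."""
    return f"{node} {ts}: mem({mem:.2f}), cpu({cpu:.2f}), disk({disk:.2f})"


def parse_record(line):
    """Return the fields of one log line as a dict, or None if malformed."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    return {
        "node": match["node"],
        "ts": int(match["ts"]),
        "mem": float(match["mem"]),
        "cpu": float(match["cpu"]),
        "disk": float(match["disk"]),
    }
```

Returning `None` for malformed lines lets a downstream job skip partial writes instead of failing on them.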

  12. System Information Gatherer (SIG) • Functions • Finds the current resource usage of each node using Map/Reduce • Currently, it shows the maximum values per one-minute time slot • Communication gateway between the nodes and the visualization tool • Sends “QUERY” to each P2P application to check the status of each node • Sends node status to the visualization tool • Node ID • Status (active/inactive) • CPU Usage • Memory Usage • Disk Storage
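
The "maximum values per minute time slot" idea can be sketched in plain Python as below. This is an assumption-laden illustration: the function names and the record tuple layout `(node, ts, cpu, mem, disk)` are hypothetical, and the slides do not say how the SIG actually assigns samples to slots.

```python
def minute_slot(ts):
    """Round a Unix timestamp (seconds) down to the start of its minute."""
    return ts - ts % 60


def max_per_minute(records):
    """Keep the per-field maximum for each (node, minute) pair.

    records: iterable of (node, ts, cpu, mem, disk) tuples (assumed layout).
    """
    slots = {}
    for node, ts, *usage in records:
        key = (node, minute_slot(ts))
        previous = slots.get(key)
        if previous is None:
            slots[key] = list(usage)
        else:
            slots[key] = [max(a, b) for a, b in zip(previous, usage)]
    return slots
```

Samples taken at 1304304724 and 1304304732, for example, fall into the same slot starting at 1304304720.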

  13. Map/Reduce • Map: • Input – each node's log file • Key: position in the file • Value: raw data, one line per key • Output • Key: node ID • Value: set of system information (CPU/memory/storage usage) • E.g.: < bug51, [30.0, 29.0, 12.0] >

  14. Map/Reduce • Reduce: • Input – from Map • Key: node ID • Value: list of system-information sets • E.g.: < bug51, [ [30.0, 29.0, 12.0], [33.0, 40.0, 9.0], … ] > • Output • Key: node ID • Value: maximum value for each piece of information • E.g.: < bug51, [33.0, 40.0, 12.0] >
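
The Map and Reduce steps on the last two slides can be imitated locally in plain Python. This is a sketch under assumptions, not the job the authors ran on Hadoop: `map_line`, `reduce_node`, and `run_local` are hypothetical names, the input is assumed to use the SIL log format, and the value order [CPU, memory, disk] follows the order listed on the Map slide.

```python
import re
from collections import defaultdict

FIELD_RE = re.compile(r"(mem|cpu|disk)\(([\d.]+)\)")


def map_line(offset, line):
    """Map: (file position, raw line) -> [(node ID, [cpu, mem, disk])]."""
    head, _, rest = line.partition(":")
    parts = head.split()  # expected: [node ID, timestamp]
    fields = {name: float(value) for name, value in FIELD_RE.findall(rest)}
    if len(parts) != 2 or len(fields) != 3:
        return []  # skip malformed lines
    return [(parts[0], [fields["cpu"], fields["mem"], fields["disk"]])]


def reduce_node(node_id, values):
    """Reduce: (node ID, list of [cpu, mem, disk]) -> per-field maximum."""
    return node_id, [max(column) for column in zip(*values)]


def run_local(lines):
    """Stand-in for the shuffle phase: group map output by key, then reduce."""
    grouped = defaultdict(list)
    for offset, line in enumerate(lines):
        for key, value in map_line(offset, line):
            grouped[key].append(value)
    return dict(reduce_node(key, values) for key, values in grouped.items())
```

Because the maximum is taken per field, the reduced triple need not correspond to any single sample, which matches the slide's example where 12.0 comes from a different record than 33.0 and 40.0.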

  15. Implementation - Visualization • Written in Java • Uses the Prefuse toolkit for a tabular visualization of node status • Nodes are controlled entirely through the right-click menu • Live communication with the nodes • To query the node status from the SIG • To send commands to the nodes in the P2P network in real time

  16. Visualization • Initial view of all nodes • After stopping Bug53

  17. Demo • System set-up and initialization (video file) • Show namenode & jobtracker interface • Show Map/Reduce jobs • Show Visualization tool • Changes of each status • Control each P2P application
